From patchwork Thu Oct 31 08:13:17 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13857704
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org,
 mgorman@suse.de, muchun.song@linux.dev, vbabka@kernel.org,
 akpm@linux-foundation.org, zokeefe@google.com, rientjes@google.com,
 peterx@redhat.com, catalin.marinas@arm.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, Qi Zheng
Subject: [PATCH v2 1/7] mm: khugepaged: retract_page_tables() use pte_offset_map_rw_nolock()
Date: Thu, 31 Oct 2024 16:13:17 +0800
Message-Id: <4c3f4aa29f38c013c4529a43bce846a3edd31523.1730360798.git.zhengqi.arch@bytedance.com>
X-Mailer: git-send-email 2.24.3 (Apple Git-128)

In retract_page_tables(), we may modify the pmd entry after acquiring the
pml and ptl, so we should also check whether the pmd entry is stable. Use
pte_offset_map_rw_nolock() + pmd_same() to do so; this also lets us drop
the call to pte_lockptr().

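For readers unfamiliar with the pattern, here is a minimal userspace sketch
of "snapshot without the lock, take the lock, then re-check the snapshot".
It is only an analogy, not kernel code: struct slot, slot_snapshot() and
slot_lock_if_same() are hypothetical names, and a C11 atomic_flag stands in
for the pml/ptl spinlocks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a pmd slot guarded by a lock. */
struct slot {
	_Atomic uint64_t entry;	/* plays the role of *pmd */
	atomic_flag locked;	/* plays the role of pml/ptl */
};

static void slot_lock(struct slot *s)
{
	while (atomic_flag_test_and_set(&s->locked))
		;	/* spin until the flag was previously clear */
}

static void slot_unlock(struct slot *s)
{
	atomic_flag_clear(&s->locked);
}

/* Lockless read, analogous to the pmd value saved in pgt_pmd. */
static uint64_t slot_snapshot(struct slot *s)
{
	return atomic_load(&s->entry);
}

/*
 * Take the lock and confirm the entry still equals the snapshot, which is
 * the role pmd_same() plays in the patch. Returns true with the lock held,
 * or false with the lock already released.
 */
static bool slot_lock_if_same(struct slot *s, uint64_t snap)
{
	assert(snap != 0);	/* callers bail out earlier on an empty slot */
	slot_lock(s);
	if (atomic_load(&s->entry) != snap) {
		slot_unlock(s);
		return false;
	}
	return true;
}
```

A caller that gets false simply skips this slot, which mirrors the
"continue" taken in retract_page_tables() when the pmd entry has changed.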
Signed-off-by: Qi Zheng
---
 mm/khugepaged.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 6f8d46d107b4b..6d76dde64f5fb 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1721,6 +1721,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 		spinlock_t *pml;
 		spinlock_t *ptl;
 		bool skipped_uffd = false;
+		pte_t *pte;
 
 		/*
 		 * Check vma->anon_vma to exclude MAP_PRIVATE mappings that
@@ -1756,11 +1757,25 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 					addr, addr + HPAGE_PMD_SIZE);
 		mmu_notifier_invalidate_range_start(&range);
 
+		pte = pte_offset_map_rw_nolock(mm, pmd, addr, &pgt_pmd, &ptl);
+		if (!pte) {
+			mmu_notifier_invalidate_range_end(&range);
+			continue;
+		}
+
 		pml = pmd_lock(mm, pmd);
-		ptl = pte_lockptr(mm, pmd);
 		if (ptl != pml)
 			spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
 
+		if (unlikely(!pmd_same(pgt_pmd, pmdp_get_lockless(pmd)))) {
+			pte_unmap_unlock(pte, ptl);
+			if (ptl != pml)
+				spin_unlock(pml);
+			mmu_notifier_invalidate_range_end(&range);
+			continue;
+		}
+		pte_unmap(pte);
+
 		/*
 		 * Huge page lock is still held, so normally the page table
 		 * must remain empty; and we have already skipped anon_vma

From patchwork Thu Oct 31 08:13:18 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13857705
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org,
 mgorman@suse.de, muchun.song@linux.dev, vbabka@kernel.org,
 akpm@linux-foundation.org, zokeefe@google.com, rientjes@google.com,
 peterx@redhat.com, catalin.marinas@arm.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, Qi Zheng
Subject: [PATCH v2 2/7] mm: introduce zap_nonpresent_ptes()
Date: Thu, 31 Oct 2024 16:13:18 +0800
Message-Id: <34ef26a763af7b2d8dee707b6f60ff85edda345f.1730360798.git.zhengqi.arch@bytedance.com>
X-Mailer: git-send-email 2.24.3 (Apple Git-128)

Similar to zap_present_ptes(), introduce zap_nonpresent_ptes() to handle
non-present PTEs, which improves code readability.

No functional change.

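The shape of this refactoring can be sketched in userspace as follows. This
is an illustrative model only, not kernel code: enum kind, zap_nonpresent()
and zap_range() are hypothetical names. The point it shows is that each
"continue" in the open-coded loop body becomes "return 1" in the helper, and
the caller advances by whatever count the helper returns.

```c
#include <assert.h>
#include <stdbool.h>

enum kind { NONE, PRESENT, SWAP };	/* hypothetical entry kinds */

/*
 * Handle the non-present entry at e[0]. Returns how many entries the
 * caller should step over, which is always at least 1.
 */
static int zap_nonpresent(enum kind *e, int max_nr, bool zap_swap,
			  int *swap_count)
{
	int nr = 0;

	assert(max_nr > 0 && e[0] == SWAP);
	if (!zap_swap)
		return 1;	/* was "continue" in the open-coded loop */

	/* Batch the run of consecutive swap entries, bounded by max_nr. */
	while (nr < max_nr && e[nr] == SWAP)
		e[nr++] = NONE;
	*swap_count -= nr;
	return nr;
}

static void zap_range(enum kind *e, int n, bool zap_swap, int *swap_count)
{
	int nr;

	for (int i = 0; i < n; i += nr) {
		nr = 1;
		if (e[i] == NONE)
			continue;
		if (e[i] == PRESENT)
			e[i] = NONE;
		else
			nr = zap_nonpresent(&e[i], n - i, zap_swap, swap_count);
	}
}
```

Because a skipped entry still returns 1, the loop makes forward progress in
the same way the original "continue" paths did with nr left at 1.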
Signed-off-by: Qi Zheng
Reviewed-by: Jann Horn
Acked-by: David Hildenbrand
---
 mm/memory.c | 136 ++++++++++++++++++++++++++++------------------------
 1 file changed, 73 insertions(+), 63 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 209885a4134f7..bd9ebe0f4471f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1587,6 +1587,76 @@ static inline int zap_present_ptes(struct mmu_gather *tlb,
 	return 1;
 }
 
+static inline int zap_nonpresent_ptes(struct mmu_gather *tlb,
+		struct vm_area_struct *vma, pte_t *pte, pte_t ptent,
+		unsigned int max_nr, unsigned long addr,
+		struct zap_details *details, int *rss)
+{
+	swp_entry_t entry;
+	int nr = 1;
+
+	entry = pte_to_swp_entry(ptent);
+	if (is_device_private_entry(entry) ||
+	    is_device_exclusive_entry(entry)) {
+		struct page *page = pfn_swap_entry_to_page(entry);
+		struct folio *folio = page_folio(page);
+
+		if (unlikely(!should_zap_folio(details, folio)))
+			return 1;
+		/*
+		 * Both device private/exclusive mappings should only
+		 * work with anonymous page so far, so we don't need to
+		 * consider uffd-wp bit when zap. For more information,
+		 * see zap_install_uffd_wp_if_needed().
+		 */
+		WARN_ON_ONCE(!vma_is_anonymous(vma));
+		rss[mm_counter(folio)]--;
+		if (is_device_private_entry(entry))
+			folio_remove_rmap_pte(folio, page, vma);
+		folio_put(folio);
+	} else if (!non_swap_entry(entry)) {
+		/* Genuine swap entries, hence a private anon pages */
+		if (!should_zap_cows(details))
+			return 1;
+
+		nr = swap_pte_batch(pte, max_nr, ptent);
+		rss[MM_SWAPENTS] -= nr;
+		free_swap_and_cache_nr(entry, nr);
+	} else if (is_migration_entry(entry)) {
+		struct folio *folio = pfn_swap_entry_folio(entry);
+
+		if (!should_zap_folio(details, folio))
+			return 1;
+		rss[mm_counter(folio)]--;
+	} else if (pte_marker_entry_uffd_wp(entry)) {
+		/*
+		 * For anon: always drop the marker; for file: only
+		 * drop the marker if explicitly requested.
+		 */
+		if (!vma_is_anonymous(vma) && !zap_drop_markers(details))
+			return 1;
+	} else if (is_guard_swp_entry(entry)) {
+		/*
+		 * Ordinary zapping should not remove guard PTE
+		 * markers. Only do so if we should remove PTE markers
+		 * in general.
+		 */
+		if (!zap_drop_markers(details))
+			return 1;
+	} else if (is_hwpoison_entry(entry) || is_poisoned_swp_entry(entry)) {
+		if (!should_zap_cows(details))
+			return 1;
+	} else {
+		/* We should have covered all the swap entry types */
+		pr_alert("unrecognized swap entry 0x%lx\n", entry.val);
+		WARN_ON_ONCE(1);
+	}
+	clear_not_present_full_ptes(vma->vm_mm, addr, pte, nr, tlb->fullmm);
+	zap_install_uffd_wp_if_needed(vma, addr, pte, nr, details, ptent);
+
+	return nr;
+}
+
 static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				struct vm_area_struct *vma, pmd_t *pmd,
 				unsigned long addr, unsigned long end,
@@ -1598,7 +1668,6 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	spinlock_t *ptl;
 	pte_t *start_pte;
 	pte_t *pte;
-	swp_entry_t entry;
 	int nr;
 
 	tlb_change_page_size(tlb, PAGE_SIZE);
@@ -1611,8 +1680,6 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	arch_enter_lazy_mmu_mode();
 	do {
 		pte_t ptent = ptep_get(pte);
-		struct folio *folio;
-		struct page *page;
 		int max_nr;
 
 		nr = 1;
@@ -1622,8 +1689,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 		if (need_resched())
 			break;
 
+		max_nr = (end - addr) / PAGE_SIZE;
 		if (pte_present(ptent)) {
-			max_nr = (end - addr) / PAGE_SIZE;
 			nr = zap_present_ptes(tlb, vma, pte, ptent, max_nr,
 					      addr, details, rss, &force_flush,
 					      &force_break);
@@ -1631,67 +1698,10 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				addr += nr * PAGE_SIZE;
 				break;
 			}
-			continue;
-		}
-
-		entry = pte_to_swp_entry(ptent);
-		if (is_device_private_entry(entry) ||
-		    is_device_exclusive_entry(entry)) {
-			page = pfn_swap_entry_to_page(entry);
-			folio = page_folio(page);
-			if (unlikely(!should_zap_folio(details, folio)))
-				continue;
-			/*
-			 * Both device private/exclusive mappings should only
-			 * work with anonymous page so far, so we don't need to
-			 * consider uffd-wp bit when zap. For more information,
-			 * see zap_install_uffd_wp_if_needed().
-			 */
-			WARN_ON_ONCE(!vma_is_anonymous(vma));
-			rss[mm_counter(folio)]--;
-			if (is_device_private_entry(entry))
-				folio_remove_rmap_pte(folio, page, vma);
-			folio_put(folio);
-		} else if (!non_swap_entry(entry)) {
-			max_nr = (end - addr) / PAGE_SIZE;
-			nr = swap_pte_batch(pte, max_nr, ptent);
-			/* Genuine swap entries, hence a private anon pages */
-			if (!should_zap_cows(details))
-				continue;
-			rss[MM_SWAPENTS] -= nr;
-			free_swap_and_cache_nr(entry, nr);
-		} else if (is_migration_entry(entry)) {
-			folio = pfn_swap_entry_folio(entry);
-			if (!should_zap_folio(details, folio))
-				continue;
-			rss[mm_counter(folio)]--;
-		} else if (pte_marker_entry_uffd_wp(entry)) {
-			/*
-			 * For anon: always drop the marker; for file: only
-			 * drop the marker if explicitly requested.
-			 */
-			if (!vma_is_anonymous(vma) &&
-			    !zap_drop_markers(details))
-				continue;
-		} else if (is_guard_swp_entry(entry)) {
-			/*
-			 * Ordinary zapping should not remove guard PTE
-			 * markers. Only do so if we should remove PTE markers
-			 * in general.
-			 */
-			if (!zap_drop_markers(details))
-				continue;
-		} else if (is_hwpoison_entry(entry) ||
-			   is_poisoned_swp_entry(entry)) {
-			if (!should_zap_cows(details))
-				continue;
 		} else {
-			/* We should have covered all the swap entry types */
-			pr_alert("unrecognized swap entry 0x%lx\n", entry.val);
-			WARN_ON_ONCE(1);
+			nr = zap_nonpresent_ptes(tlb, vma, pte, ptent, max_nr,
+						 addr, details, rss);
 		}
-		clear_not_present_full_ptes(mm, addr, pte, nr, tlb->fullmm);
-		zap_install_uffd_wp_if_needed(vma, addr, pte, nr, details, ptent);
 	} while (pte += nr, addr += PAGE_SIZE * nr, addr != end);
 
 	add_mm_rss_vec(mm, rss);

From patchwork Thu Oct 31 08:13:19 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13857706
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org,
 mgorman@suse.de, muchun.song@linux.dev, vbabka@kernel.org,
 akpm@linux-foundation.org, zokeefe@google.com, rientjes@google.com,
 peterx@redhat.com, catalin.marinas@arm.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, Qi Zheng
Subject: [PATCH v2 3/7] mm: introduce do_zap_pte_range()
Date: Thu, 31 Oct 2024 16:13:19 +0800
Message-Id: <1639ac32194f2b2590852f410fd3ce3595eb730b.1730360798.git.zhengqi.arch@bytedance.com>
X-Mailer: git-send-email 2.24.3 (Apple Git-128)

This commit introduces do_zap_pte_range() to actually zap the PTEs, which
will help improve code readability and facilitate secondary checking of
the processed PTEs in the future.

No functional change.

Signed-off-by: Qi Zheng
Reviewed-by: Jann Horn
Acked-by: David Hildenbrand
---
 mm/memory.c | 45 ++++++++++++++++++++++++++-------------------
 1 file changed, 26 insertions(+), 19 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index bd9ebe0f4471f..c1150e62dd073 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1657,6 +1657,27 @@ static inline int zap_nonpresent_ptes(struct mmu_gather *tlb,
 	return nr;
 }
 
+static inline int do_zap_pte_range(struct mmu_gather *tlb,
+				   struct vm_area_struct *vma, pte_t *pte,
+				   unsigned long addr, unsigned long end,
+				   struct zap_details *details, int *rss,
+				   bool *force_flush, bool *force_break)
+{
+	pte_t ptent = ptep_get(pte);
+	int max_nr = (end - addr) / PAGE_SIZE;
+
+	if (pte_none(ptent))
+		return 1;
+
+	if (pte_present(ptent))
+		return zap_present_ptes(tlb, vma, pte, ptent, max_nr,
+					addr, details, rss, force_flush,
+					force_break);
+
+	return zap_nonpresent_ptes(tlb, vma, pte, ptent, max_nr, addr,
+				   details, rss);
+}
+
 static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				struct vm_area_struct *vma, pmd_t *pmd,
 				unsigned long addr, unsigned long end,
@@ -1679,28 +1700,14 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	flush_tlb_batched_pending(mm);
 	arch_enter_lazy_mmu_mode();
 	do {
-		pte_t ptent = ptep_get(pte);
-		int max_nr;
-
-		nr = 1;
-		if (pte_none(ptent))
-			continue;
-
 		if (need_resched())
 			break;
 
-		max_nr = (end - addr) / PAGE_SIZE;
-		if (pte_present(ptent)) {
-			nr = zap_present_ptes(tlb, vma, pte, ptent, max_nr,
-					      addr, details, rss, &force_flush,
-					      &force_break);
-			if (unlikely(force_break)) {
-				addr += nr * PAGE_SIZE;
-				break;
-			}
-		} else {
-			nr = zap_nonpresent_ptes(tlb, vma, pte, ptent, max_nr,
-						 addr, details, rss);
+		nr = do_zap_pte_range(tlb, vma, pte, addr, end, details, rss,
+				      &force_flush, &force_break);
+		if (unlikely(force_break)) {
+			addr += nr * PAGE_SIZE;
+			break;
 		}
 	} while (pte += nr, addr += PAGE_SIZE * nr, addr != end);
 

From patchwork Thu Oct 31 08:13:20 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13857707
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org,
 mgorman@suse.de, muchun.song@linux.dev, vbabka@kernel.org,
 akpm@linux-foundation.org, zokeefe@google.com, rientjes@google.com,
 peterx@redhat.com, catalin.marinas@arm.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, Qi Zheng
Subject: [PATCH v2 4/7] mm: make zap_pte_range() handle full within-PMD range
Date: Thu, 31 Oct 2024 16:13:20 +0800
Message-Id: <63c92f1e2a2fa6267490ab9fad5b090316b39b98.1730360798.git.zhengqi.arch@bytedance.com>
X-Mailer: git-send-email
2.24.3 (Apple Git-128) In-Reply-To: References: MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: 766D6C0009 X-Stat-Signature: wxhrb68iai7rb1wom4n1ycmnozbtdna3 X-HE-Tag: 1730362446-342290 X-HE-Meta: U2FsdGVkX19vQImwaG6vjXXo0QmAvcMIRPz5pmyBhvILuROrdYFU0eTdvWEx59Wk7YGTN2x9NcJfoRIiIBPjRUdFLzHwqM5pw9vikniM5GuiSmjP79piwbuOWVYm4sC6U/qEwnSrw7RpwzRWHSUvtm1KVnB3gERJhM7A3qI3DiDBnb44Ssj0lm3LaczD2mGN4IeVGnE1IRPiYKfzSZgd20kISM7Iz/Jbp9PcEavgfO9Ten4l4Ct+rADeCH/IyyMmuLTU0OqlZJQezdPHiqPXovkn5Tne55ULGkbvdt85wYV1IWDGBU2Epu9IcA3zcUeoQI/c7067wtWyOggXkVp8Aw4cLck7qHQ8tZ8Tub4ZJgfjZy1WSLa4SAFeq6r+WVZIrriwqKYbk5Ycg339dCrW4K46BaaAR3ft+Hlh4R0uIkpyrEdpd+DKNRDqQMGmNZfvr5DTAa8r6y7+abwJ3m9w0nfYKJjx8BnbZh2WpWqdQTZ87hYiNnLAv5m9eT+6KYuxiagmY7841QLbZB69+Vwiiav9sD9M6QL/nHFCdPvZTzIYXiJoGwOkp5wJ0mn4HNId5tNeU6ZSYon23tq6pD41vxYI8dWbfL99Iz5zdAzunwq7LKnChc842CWLijNBtFdOizTohTun2BDQEBqQzMDl/x6bm8pgsJNNLxalKtyf/AhIMS4ycqSjyUP0C+xs+Wn+2J60h8r2j2kFm+FKpsf9KaJz1dcX4EPJQ+kowjyZ2AeZgAsTMKGyquzgZ8eb3D4io3gFU/f0JCKyZYKuJxEcMcTrtiKSA1Rgv+mLPTpoeF4rcQ2tYVldg4iRyKpkKawgtpGOhl4LNtx6wCcvucsrmueImYgBtYKGNOIsch/QqAPKEkLBvVkdv3YCYDGmDtV90cAeTcvW80VVqp5MftlxRG3Nq3AQBdBCQsIRuRwBRVwCaXA32pBdChIqOWG3BkrnONTAt006jLQBYFhhppt eMCU1CPZ t96lr+udVDPfZZjIgumljlbbJFwtTqw/Sp1/TmBi5C8J1oHU61z2LwftC7VpNLtH9ie9ZEWcMkltmBxYczqoUBG65/SmRJkBPkLdME0oNGll5zH791/Ocm/kNp3N/WkGO61RI/SY5P0ky17YJYcx31xP8K9PO0AQvPEarFVhU7JY4KiUiI69JbE6ad/fi8uOiMGM7lPrgqd9zPPHHwKxVq7bysyXG3+ZY1z8xsbxukZhvsciCjY1DZWE19YcPNjuc1PkpGG/d7kVgPeK7cRsk7ufPBShus/kq/RrsOFls23jCxM7LB88AcVAyViOo9877BiRe69EE6IXYUhjabytinIaDxjQ5VlKLtTZ05K594kdxe5U= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: In preparation for reclaiming empty PTE pages, this commit first makes zap_pte_range() to handle the full within-PMD range, so that we can more easily detect and free PTE pages in this function in subsequent commits. 
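The control flow this change introduces can be sketched in userspace as follows. This is only an illustrative model, not the kernel code: model_zap_pass(), model_zap_range(), MODEL_PAGE_SIZE and the "budget_pages" parameter are hypothetical names, with the budget standing in for an early break caused by need_resched() or force_break.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE 4096UL	/* hypothetical page size for the model */

/*
 * One pass over [addr, end): advances page by page but gives up after
 * budget_pages pages, modelling an early break out of the PTE loop.
 * Returns the address at which the pass stopped.
 */
static unsigned long model_zap_pass(unsigned long addr, unsigned long end,
				    unsigned long budget_pages)
{
	unsigned long done;

	assert(addr <= end && budget_pages > 0);
	assert((end - addr) % MODEL_PAGE_SIZE == 0);

	for (done = 0; addr != end && done < budget_pages; done++)
		addr += MODEL_PAGE_SIZE;

	return addr;
}

/*
 * Models the new behaviour: instead of returning a partial address to the
 * caller, keep retrying until the whole range has been handled.  *passes
 * records how many passes were needed.
 */
static unsigned long model_zap_range(unsigned long addr, unsigned long end,
				     unsigned long budget_pages, int *passes)
{
	*passes = 0;
	do {
		addr = model_zap_pass(addr, end, budget_pages);
		(*passes)++;
		/* the kernel version calls cond_resched() and resets
		 * force_flush/force_break here before "goto retry" */
	} while (addr != end);

	return addr;
}
```

With this shape the function returns only once addr == end, which is what lets a later patch decide at a single point whether the whole PTE page has become empty.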
Signed-off-by: Qi Zheng Reviewed-by: Jann Horn --- mm/memory.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/mm/memory.c b/mm/memory.c index c1150e62dd073..002aa4f454fa0 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1691,6 +1691,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, pte_t *pte; int nr; +retry: tlb_change_page_size(tlb, PAGE_SIZE); init_rss_vec(rss); start_pte = pte = pte_offset_map_lock(mm, pmd, addr, &ptl); @@ -1730,6 +1731,13 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, if (force_flush) tlb_flush_mmu(tlb); + if (addr != end) { + cond_resched(); + force_flush = false; + force_break = false; + goto retry; + } + return addr; } From patchwork Thu Oct 31 08:13:21 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13857708 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id AEA38E68958 for ; Thu, 31 Oct 2024 08:14:42 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 41D476B009D; Thu, 31 Oct 2024 04:14:42 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 3CAD36B009F; Thu, 31 Oct 2024 04:14:42 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 245A16B00A0; Thu, 31 Oct 2024 04:14:42 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 058066B009D for ; Thu, 31 Oct 2024 04:14:41 -0400 (EDT) Received: from smtpin16.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 8EA74ACC2F for ; Thu, 31 Oct 2024 08:14:41 +0000 (UTC) X-FDA: 82733185416.16.8C04764 Received: from mail-pj1-f54.google.com 
(mail-pj1-f54.google.com [209.85.216.54]) by imf14.hostedemail.com (Postfix) with ESMTP id E49C6100014 for ; Thu, 31 Oct 2024 08:14:09 +0000 (UTC) Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=FkNbDsp+; dmarc=pass (policy=quarantine) header.from=bytedance.com; spf=pass (imf14.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.216.54 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1730362318; a=rsa-sha256; cv=none; b=sCdwePPAP05dBrLcZRiVbBzS/xUVnUNmdoYw+mGrdrWBFzSowKOzcxm1hcmODzARFx27p0 x3Nj7enAepxD3sJWWoQybdBupQcMTjDRhdj4eh87G7iIdDcbS4y/UtnYAkdN9FKR5BQIW9 ldeq5gDyyelODml0sM7n01ws6Cn9/tM= ARC-Authentication-Results: i=1; imf14.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=FkNbDsp+; dmarc=pass (policy=quarantine) header.from=bytedance.com; spf=pass (imf14.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.216.54 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1730362318; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=/4RftZhl83d9xOlPt19CbxNfudxdUAgo7jhJTmA1K/8=; b=WwY6xWCaJWHYcEbp0rHG6fpQZHXRRmta+Sh8K+Pvwn4ThKEiwF5+z/3/KgDaYBNVG0/F13 1XXP5fzhibIwbFdkM2X07egYQQBybRlq0yF+H+DpoHiESt7vFKffy8hjJ00upISxiCjI1y 2ViYXDuPgUbzBGOkLHTKPCj2pG/YduI= Received: by mail-pj1-f54.google.com with SMTP id 98e67ed59e1d1-2e2bd0e2c4fso565520a91.3 for ; Thu, 31 Oct 2024 01:14:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1730362478; x=1730967278; darn=kvack.org; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=/4RftZhl83d9xOlPt19CbxNfudxdUAgo7jhJTmA1K/8=; b=FkNbDsp+4pKV8qJU9VK/DWJmtXxSBxdy6NZ1OV+sAN3TpDaLJlYBBON2fUOfN5GoQz LoPF2yNeJ3FMwdA6dP1WtePsYtktopJw4nVmFRQL1Zy2UWdbEB5MjClI8dL/kaXqZzor SlQbbeCw4ncxL7zToTKuwKtZZGAh3Vb9LWk2AtOZQ3uJb2qz3quicWCR+n0wkmQoE8sv hnjWudjTMehgztkxkpJ3jijHp3lkQXhVSOlI5LXOSV3iX0CbXTrmUNvJLPrchnsEkjB8 Dx/Y9YgTBNybApnStXjPg7bBgZFRZJX1vyeBxlTLu+R4F+P20E9ZdkM4yMLVTjJAiQRQ vDfA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730362478; x=1730967278; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=/4RftZhl83d9xOlPt19CbxNfudxdUAgo7jhJTmA1K/8=; b=rWJ8OYON8YmiaUzgfHf55A1Pnyg2byLX9X+PDMTw+Ed2vVTPht2ET7Uw4y4IYWu8T6 WwXyP47jIG92qtg11Rwv5WeoRALX2DE2Q2Jps5DdAbbvEIOTgJPRSYSfqZk0nfI8vxgu Dz8+gSAgGooP79l0WwJiaWxzo2v+K/c3eIf83fVJMUQrrv7ozgR8vBSf5RO03/U2svYu 8LeCLlQ05b14hgCQLGusIbpqfHpEwIDuTz6YqRZCb4L/BrdcLwOFP51MgnhxEZzKWQM+ wUn9vJzOCxQzrO9Afg9K7D7QhBAw1VDZdl/P+kVU2go1SHWCFVEYFs0XPmPz94/KeQmB 5HTw== X-Gm-Message-State: AOJu0YxOCAKsqtPGDZqBRacpF2+phi1dCFiTZabfIb3sAk4kbwxw4kQJ rfeBAsbNj/6porj97vcf/i2NBGXn7zmzVhOfJlJ5ZmayHPgwo8EXfNHMJLPXOZs= X-Google-Smtp-Source: AGHT+IG9/W35C0GmYjhqi67a7cjSq3d8MXDjUDLzpnbNphLEBFgOx4BfbHz6qsffBssbygHke8ycIQ== X-Received: by 2002:a17:90b:1fc3:b0:2e2:a661:596a with SMTP id 98e67ed59e1d1-2e93c1860femr3011162a91.13.1730362478602; Thu, 31 Oct 2024 01:14:38 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([203.208.167.149]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2e9201b2586sm5163996a91.0.2024.10.31.01.14.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Oct 2024 01:14:38 -0700 (PDT) From: Qi Zheng To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, mgorman@suse.de, muchun.song@linux.dev, vbabka@kernel.org, 
akpm@linux-foundation.org, zokeefe@google.com, rientjes@google.com, peterx@redhat.com, catalin.marinas@arm.com Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, Qi Zheng Subject: [PATCH v2 5/7] mm: pgtable: try to reclaim empty PTE page in madvise(MADV_DONTNEED) Date: Thu, 31 Oct 2024 16:13:21 +0800 Message-Id: X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: References: MIME-Version: 1.0 X-Stat-Signature: sgjz4tprnwg3m5qmbcdbknwzusfx9r74 X-Rspamd-Queue-Id: E49C6100014 X-Rspamd-Server: rspam08 X-Rspam-User: X-HE-Tag: 1730362449-603818 X-HE-Meta: U2FsdGVkX1+yZqlqyIMpgUECG2b0gGs55f7IW37KEy838N8q7jcSIN60CNxa3T/JdMEj0W7yVaZCT0vp+1Fvh6ZvKNDwuZTMGWrDBEZYicog0XFRzlAie56ha/npV1PeL3DU/4co4kE+sSXxnTYQw57tfSOE8jaNYIQfXmRAmvAJT2qkwFNDMppFQCauEWdue+ce8PRIS2SOdy96Ws19iyqIV1kk55fGAmbK+dnxTKWOT2tPiRoSCwwXjnhdVi4VGNFEHgLHiPlA3T3Cmt9WpYGvD/iEyVawV2MPL58HgAiayo85VVM6XMGCUv/II3kwF3ZdC7LZPS0XX60iApls1SMxEQ8dLoPyV4kDgynVGc6v4WfFy0NGWjcG4O67InARoi5toqnJGiWDY0cKbRd6nwnWQS1ROietyPd4+eZ5Y9UrquMjlELOVhcMCDC97vYUgyLrzXQAu3lMLLuqDvWPdKXmCE0UgOOfKs2UmgUQLeTWJd/4P0+joW3DwgDlkSFXobCugCqago833Likalavsp8LBapxT0P14+CnnBnJuhNiIc/On91F04l95YGDsejvLbqCvbGtLO2HLN/Vi6FXXJLVjxQKdrD6+5p2e6pwdJKO5Sc3GiT7Ik8n219Ty70VfrokQmUwrrAJ95r6HVIKcMXBzDYn2wOC0J0TTH4UmfdFtMveiuTjGxPM0TuUIdsxVvr0xbHtwicOMoKMgNKV935sqq9iP4d4blIcodTsCBdy9zQdFc9JqDPlNuJErIBQZ3vplOiXiDnbXW0iajY4Mq/Wf1F1qPqe5GUPBqF40j/8C7mcwfung2djjmsQBnp23JoQGhCMF63ptzH7V9gNzK8ev74EKGPvkQq+fh8By0Rtw/SCcFiGswoDNAp2NleB0rJYMKgcxz/Hc3r2lCbQb/dlneKDHY316IBUstGM6g/CYgH7E9zIsavHUf69qzmo6pm/y40k7BmWesH3Jig rDy5bMmK G8iN5m2dsQwyLzMsBvRtCkQW35VSH8+lONKdkG7Tx5DK1mzfbxF1U/84JxEHISBUmS3OTGf2GHuYjAnti3jmBYBGg454Yt36U5D/jZIgnmmt9+KHaYbBbCX5RXYmjHANYiCEGZhanKzzE/bEFPoludgm1E9KS+EX6ShhScpzDGTqbIyrFUGkB6/vs837VyIIcy9pksrgI2XEC5S4J/NmlfUo7j+bKAr3QTP8JBCHUB2ezOJAwvhzBt+mHbmOQGhyooGAjDBr1xb6FqRK5k9QlV2DnlSFRgweSDJU+JZNP0r/Z1ZMw95pcCHm946VvWJ7aR+KpJ/l7Rl62+m9ntK+tkW9jVgW3MIzoad/SioXNP74JcmU= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, 
version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe:

To achieve high performance, applications commonly use user-mode memory allocators such as jemalloc or tcmalloc. These allocators use madvise(MADV_DONTNEED or MADV_FREE) to release physical memory, but neither MADV_DONTNEED nor MADV_FREE releases page table memory, which can lead to very large page table memory usage.

The following is a memory usage snapshot of one process, observed on our server:

        VIRT:  55t
        RES:   590g
        VmPTE: 110g

In this case, most of the page table entries are empty. A PTE page whose entries are all empty can be freed back to the system for others to use.

As a first step, this commit synchronously frees empty PTE pages in the madvise(MADV_DONTNEED) case. Empty PTE pages are detected and freed in zap_pte_range(), and zap_details.reclaim_pt is added to exclude cases other than madvise(MADV_DONTNEED).

Once an empty PTE page is detected, we first try to take the pmd lock while holding the pte lock. If that succeeds, we clear the pmd entry directly (fast path). Otherwise, we wait until the pte lock is released, then re-acquire the pmd and pte locks and loop PTRS_PER_PTE times, checking pte_none(), to re-detect whether the PTE page is empty and free it (slow path).

For other cases such as madvise(MADV_FREE), scanning and freeing empty PTE pages asynchronously may be considered in the future.
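The choice between the fast path and the slow path described above can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not the kernel implementation: every model_* name is hypothetical, a zero value stands in for pte_none(), MODEL_PTRS_PER_PTE is fixed at 512 (as on x86-64 with 4K pages), and the "pmd_trylock_ok" flag stands in for the result of trying to take the pmd lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PTRS_PER_PTE 512	/* hypothetical: entries per PTE page */

typedef uint64_t model_pte_t;

struct model_pte_page {
	model_pte_t entries[MODEL_PTRS_PER_PTE];
};

enum model_reclaim { MODEL_KEEP, MODEL_FAST, MODEL_SLOW };

/* In this model an all-zero entry plays the role of pte_none(). */
static bool model_pte_none(model_pte_t pte)
{
	return pte == 0;
}

/* Count the empty entries in [idx, idx + nr), checking them one by one. */
static int model_count_pte_none(const struct model_pte_page *pt, int idx, int nr)
{
	int none_nr = 0;

	assert(idx >= 0 && nr > 0 && idx + nr <= MODEL_PTRS_PER_PTE);
	for (int i = idx; i < idx + nr; i++)
		if (model_pte_none(pt->entries[i]))
			none_nr++;
	return none_nr;
}

/*
 * Fast path: the zap loop already saw every entry empty and the pmd lock
 * could be taken without waiting.  Slow path: otherwise, rescan all
 * entries and reclaim only if the page is still empty.
 */
static enum model_reclaim model_reclaim_decision(const struct model_pte_page *pt,
						 int none_nr, bool pmd_trylock_ok)
{
	if (none_nr == MODEL_PTRS_PER_PTE && pmd_trylock_ok)
		return MODEL_FAST;
	if (model_count_pte_none(pt, 0, MODEL_PTRS_PER_PTE) == MODEL_PTRS_PER_PTE)
		return MODEL_SLOW;
	return MODEL_KEEP;
}
```

The rescan in the slow path matters because, in the real code, entries may have been populated again between dropping and re-taking the locks; the model only captures the decision, not the locking itself.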
The following code snippet shows the effect of the optimization:

        mmap 50G
        while (1) {
                for (; i < 1024 * 25; i++) {
                        touch 2M memory
                        madvise MADV_DONTNEED 2M
                }
        }

As we can see, the memory usage of VmPTE is reduced:

                        before          after
        VIRT           50.0 GB        50.0 GB
        RES             3.1 MB         3.1 MB
        VmPTE        102640 KB         240 KB

Signed-off-by: Qi Zheng
---
 include/linux/mm.h |  1 +
 mm/Kconfig         | 15 ++++++++++
 mm/Makefile        |  1 +
 mm/internal.h      | 23 ++++++++++++++++
 mm/madvise.c       |  4 ++-
 mm/memory.c        | 45 +++++++++++++++++++++++++++++-
 mm/pt_reclaim.c    | 68 ++++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 155 insertions(+), 2 deletions(-)
 create mode 100644 mm/pt_reclaim.c

diff --git a/include/linux/mm.h b/include/linux/mm.h index 3e4bb43035953..ce3936590fe72 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2319,6 +2319,7 @@ extern void pagefault_out_of_memory(void); struct zap_details { struct folio *single_folio; /* Locked folio to be unmapped */ bool even_cows; /* Zap COWed private pages too? */ + bool reclaim_pt; /* Need reclaim page tables? */ zap_flags_t zap_flags; /* Extra flags for zapping */ }; diff --git a/mm/Kconfig b/mm/Kconfig index 84000b0168086..681909e0a9fa3 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -1301,6 +1301,21 @@ config ARCH_HAS_USER_SHADOW_STACK The architecture has hardware support for userspace shadow call stacks (eg, x86 CET, arm64 GCS or RISC-V Zicfiss). +config ARCH_SUPPORTS_PT_RECLAIM + def_bool n + +config PT_RECLAIM + bool "reclaim empty user page table pages" + default y + depends on ARCH_SUPPORTS_PT_RECLAIM && MMU && SMP + select MMU_GATHER_RCU_TABLE_FREE + help + Try to reclaim empty user page table pages in paths other than munmap + and exit_mmap path. + + Note: now only empty user PTE page table pages will be reclaimed.
+ + source "mm/damon/Kconfig" endmenu diff --git a/mm/Makefile b/mm/Makefile index d5639b0361663..9d816323d247a 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -145,3 +145,4 @@ obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o obj-$(CONFIG_EXECMEM) += execmem.o obj-$(CONFIG_TMPFS_QUOTA) += shmem_quota.o +obj-$(CONFIG_PT_RECLAIM) += pt_reclaim.o diff --git a/mm/internal.h b/mm/internal.h index d5b93c5b63648..7aba395a9940f 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1508,4 +1508,27 @@ int walk_page_range_mm(struct mm_struct *mm, unsigned long start, unsigned long end, const struct mm_walk_ops *ops, void *private); +#ifdef CONFIG_PT_RECLAIM +bool try_get_and_clear_pmd(struct mm_struct *mm, pmd_t *pmd, pmd_t *pmdval); +void free_pte(struct mm_struct *mm, unsigned long addr, struct mmu_gather *tlb, + pmd_t pmdval); +void try_to_free_pte(struct mm_struct *mm, pmd_t *pmd, unsigned long addr, + struct mmu_gather *tlb); +#else +static inline bool try_get_and_clear_pmd(struct mm_struct *mm, pmd_t *pmd, + pmd_t *pmdval) +{ + return false; +} +static inline void free_pte(struct mm_struct *mm, unsigned long addr, + struct mmu_gather *tlb, pmd_t pmdval) +{ +} +static inline void try_to_free_pte(struct mm_struct *mm, pmd_t *pmd, + unsigned long addr, struct mmu_gather *tlb) +{ +} +#endif /* CONFIG_PT_RECLAIM */ + + #endif /* __MM_INTERNAL_H */ diff --git a/mm/madvise.c b/mm/madvise.c index 0ceae57da7dad..ee88652761d45 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -851,7 +851,9 @@ static int madvise_free_single_vma(struct vm_area_struct *vma, static long madvise_dontneed_single_vma(struct vm_area_struct *vma, unsigned long start, unsigned long end) { - zap_page_range_single(vma, start, end - start, NULL); + struct zap_details details = {.reclaim_pt = true,}; + + zap_page_range_single(vma, start, end - start, &details); return 0; } diff --git a/mm/memory.c b/mm/memory.c index 002aa4f454fa0..c4a8c18fbcfd7 100644 --- a/mm/memory.c +++ 
b/mm/memory.c @@ -1436,7 +1436,7 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma) static inline bool should_zap_cows(struct zap_details *details) { /* By default, zap all pages */ - if (!details) + if (!details || details->reclaim_pt) return true; /* Or, we zap COWed pages only if the caller wants to */ @@ -1678,6 +1678,30 @@ static inline int do_zap_pte_range(struct mmu_gather *tlb, details, rss); } +static inline int count_pte_none(pte_t *pte, int nr) +{ + int none_nr = 0; + + /* + * If PTE_MARKER_UFFD_WP is enabled, the uffd-wp PTEs may be + * re-installed, so we need to check pte_none() one by one. + * Otherwise, checking a single PTE in a batch is sufficient. + */ +#ifdef CONFIG_PTE_MARKER_UFFD_WP + for (;;) { + if (pte_none(ptep_get(pte))) + none_nr++; + if (--nr == 0) + break; + pte++; + } +#else + if (pte_none(ptep_get(pte))) + none_nr = nr; +#endif + return none_nr; +} + static unsigned long zap_pte_range(struct mmu_gather *tlb, struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, unsigned long end, @@ -1689,8 +1713,16 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, spinlock_t *ptl; pte_t *start_pte; pte_t *pte; + pmd_t pmdval; + bool can_reclaim_pt = false; + bool direct_reclaim = false; + unsigned long start = addr; + int none_nr = 0; int nr; + if (details && details->reclaim_pt && (end - start >= PMD_SIZE)) + can_reclaim_pt = true; + retry: tlb_change_page_size(tlb, PAGE_SIZE); init_rss_vec(rss); @@ -1706,12 +1738,16 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, nr = do_zap_pte_range(tlb, vma, pte, addr, end, details, rss, &force_flush, &force_break); + none_nr += count_pte_none(pte, nr); if (unlikely(force_break)) { addr += nr * PAGE_SIZE; break; } } while (pte += nr, addr += PAGE_SIZE * nr, addr != end); + if (addr == end && can_reclaim_pt && (none_nr == PTRS_PER_PTE)) + direct_reclaim = try_get_and_clear_pmd(mm, pmd, &pmdval); + add_mm_rss_vec(mm, rss); arch_leave_lazy_mmu_mode(); 
@@ -1738,6 +1774,13 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, goto retry; } + if (can_reclaim_pt) { + if (direct_reclaim) + free_pte(mm, start, tlb, pmdval); + else + try_to_free_pte(mm, pmd, start, tlb); + } + return addr; } diff --git a/mm/pt_reclaim.c b/mm/pt_reclaim.c new file mode 100644 index 0000000000000..fc055da40b615 --- /dev/null +++ b/mm/pt_reclaim.c @@ -0,0 +1,68 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include + +#include "internal.h" + +bool try_get_and_clear_pmd(struct mm_struct *mm, pmd_t *pmd, pmd_t *pmdval) +{ + spinlock_t *pml = pmd_lockptr(mm, pmd); + + if (!spin_trylock(pml)) + return false; + + *pmdval = pmdp_get_lockless(pmd); + pmd_clear(pmd); + spin_unlock(pml); + + return true; +} + +void free_pte(struct mm_struct *mm, unsigned long addr, struct mmu_gather *tlb, + pmd_t pmdval) +{ + pte_free_tlb(tlb, pmd_pgtable(pmdval), addr); + mm_dec_nr_ptes(mm); +} + +void try_to_free_pte(struct mm_struct *mm, pmd_t *pmd, unsigned long addr, + struct mmu_gather *tlb) +{ + pmd_t pmdval; + spinlock_t *pml, *ptl; + pte_t *start_pte, *pte; + int i; + + start_pte = pte_offset_map_rw_nolock(mm, pmd, addr, &pmdval, &ptl); + if (!start_pte) + return; + + pml = pmd_lock(mm, pmd); + if (ptl != pml) + spin_lock_nested(ptl, SINGLE_DEPTH_NESTING); + + if (unlikely(!pmd_same(pmdval, pmdp_get_lockless(pmd)))) + goto out_ptl; + + /* Check if it is empty PTE page */ + for (i = 0, pte = start_pte; i < PTRS_PER_PTE; i++, pte++) { + if (!pte_none(ptep_get(pte))) + goto out_ptl; + } + pte_unmap(start_pte); + + pmd_clear(pmd); + + if (ptl != pml) + spin_unlock(ptl); + spin_unlock(pml); + + free_pte(mm, addr, tlb, pmdval); + + return; +out_ptl: + pte_unmap_unlock(start_pte, ptl); + if (pml != ptl) + spin_unlock(pml); +} From patchwork Thu Oct 31 08:13:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13857709 Return-Path: 
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 90B2EE68944 for ; Thu, 31 Oct 2024 08:14:48 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1F8906B00A0; Thu, 31 Oct 2024 04:14:48 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 1A93A6B00A1; Thu, 31 Oct 2024 04:14:48 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 04A316B00A2; Thu, 31 Oct 2024 04:14:47 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id DC23E6B00A0 for ; Thu, 31 Oct 2024 04:14:47 -0400 (EDT) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 99B1D1C7185 for ; Thu, 31 Oct 2024 08:14:47 +0000 (UTC) X-FDA: 82733184408.20.8075214 Received: from mail-pj1-f41.google.com (mail-pj1-f41.google.com [209.85.216.41]) by imf19.hostedemail.com (Postfix) with ESMTP id 586C21A0004 for ; Thu, 31 Oct 2024 08:14:12 +0000 (UTC) Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=jx6qXOOZ; dmarc=pass (policy=quarantine) header.from=bytedance.com; spf=pass (imf19.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.216.41 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1730362430; a=rsa-sha256; cv=none; b=UBZm1/rMONqQZnMdjsJWG36QFopnmzNYTbJlFexl8bSx3QltOXpRsYchA+/PwPghnWvsKs wbv6iw00xLQWm6anYWFWCInyGxP/J6nDLdHqXxNVFzhqkM/0NyCPa4K7NONZq/broZ9SqP c3ELlGX8nheRijvfVsDx+KJ+LtiI9oY= ARC-Authentication-Results: i=1; imf19.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=jx6qXOOZ; dmarc=pass (policy=quarantine) 
header.from=bytedance.com; spf=pass (imf19.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.216.41 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1730362430; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=96N82BICgFqmbQqgF/vu6V3zKuLzfmtjKmdmS7DODoY=; b=k+0O1T1ubfcUj3LC/61rPnmIHW1OAmyO8YYpc8MQNykh63xRJ61EKMd2S40TcuRCcRWrzV 9W2WqpW8FyIY8Vz95oJpi37luhTC3oBI7P6spQcNA8IajqicM5xBmzWviP7fctJx9DOpmB Ct/opFOuPWO5U6RAstZCalfz8+WOd8E= Received: by mail-pj1-f41.google.com with SMTP id 98e67ed59e1d1-2e2a97c2681so486529a91.2 for ; Thu, 31 Oct 2024 01:14:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1730362484; x=1730967284; darn=kvack.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=96N82BICgFqmbQqgF/vu6V3zKuLzfmtjKmdmS7DODoY=; b=jx6qXOOZap74sYIm0iOmlxxyTh0kv1cpuBES1fghMbUhiD4VYFEjSZgJVvxiM9kZcr SXGNeAc05hvboLpXouO2UCTbcts/Rw4zCWf5CjZS/s8kgfQrxR52KsWZA6/+U30cQQTz 9ni0wEbmvIPMg1FmWWYwvBms8eb8qHb5fdtVWdFEYqoNfbNYBZXaDg5uobKNAKfxJgIj GBNVeMz+gb2R1GJafLsDj6K+cIv3pCkKcHPPxoUNVTydXmaCTOuBzOZ7x9nzxfg1tZpJ IJWwdL3K4yLGnEuYD/KS930IweI1phzEhxlLl1DpFWxpGVQwr5Q61S3YdSsn6pr4ABvJ NCpg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730362484; x=1730967284; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=96N82BICgFqmbQqgF/vu6V3zKuLzfmtjKmdmS7DODoY=; b=CavVAOgmG3FWQ9DC177Os2mCHkuchr4N3TyBKSI/aD+MOZeWAb6b8/NEjJwI4a2w9l 
Q48E/95Vckryq6jkUQflhlW0cIzYvzychzxflC6pai+GHM6fAiivFKPXuHIhXInct8GF OoDZsqDtd9dGKzX3GRJVYDOogbYdV8mp76YNIWqt+UpX4GzIqCYpT979QUx1bC6kddIN Xqm4mjsYv3PFRiWAXgR2QG08czvlK0eTvw0VNZP1g2SYbRD5a0tOH53wAXECeAIdAyzJ B0aaZHc3lOHXL7Rs25V0Dq89bkGH4zfPxEzAaIfhxmtBQPYpmG0MHSzVJJJVvepjRbxI flPA== X-Gm-Message-State: AOJu0YxCL8NBgrGBik8iXbe9nqCS/Nyhra8ohPgfapBdlGYgxYqkMtmp 51fmHEI/mV0DXv8YvziX6U4pfTV06XOfWtp5K0NjnLSWEEstxhRiknIyjbi0kmk= X-Google-Smtp-Source: AGHT+IHc3twInabx1E+TNdhn783JHqCF2GYMwoTNrA61R6GZcT2sSHxVlBm8Eg/JbQvoGhNtrFU0aA== X-Received: by 2002:a17:90b:3b4b:b0:2d8:8509:85cd with SMTP id 98e67ed59e1d1-2e92cf687efmr5271210a91.40.1730362484634; Thu, 31 Oct 2024 01:14:44 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([203.208.167.149]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2e9201b2586sm5163996a91.0.2024.10.31.01.14.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Oct 2024 01:14:44 -0700 (PDT) From: Qi Zheng To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, mgorman@suse.de, muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org, zokeefe@google.com, rientjes@google.com, peterx@redhat.com, catalin.marinas@arm.com Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, Qi Zheng Subject: [PATCH v2 6/7] x86: mm: free page table pages by RCU instead of semi RCU Date: Thu, 31 Oct 2024 16:13:22 +0800 Message-Id: X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: References: MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Queue-Id: 586C21A0004 X-Rspamd-Server: rspam01 X-Stat-Signature: hyoa4gi7httbnufzxz5gtnc9tjm98qy3 X-HE-Tag: 1730362452-511237 X-HE-Meta: 
U2FsdGVkX1+Wx0C+lXusLzYZMuk9S0ZmOQJxCWtPMUYZH+OsnqR06K3LMNAnd3lpFkTOq23w/9rUrGU+WwK+0s8Foea4fR51YzSrIDfGiFoOOs8YAwxjHvC5bPuDxxvlglCc1KlJuDEGcgQa5S/1Jmcsp0JrRQhhjGhJmUVtAX891kLKcDZStkSr27QZvMe36qHGbcKDqiAsFWtdv/L5LpXbKC1AtPzsA6Dvlht4PutKYdnBu9NaoTBKxovjR8t1KkxF2T3syOTegSDa7GSi9HoJWvPzzduTefEObgBigbU/azBgJnKdStMxzukTmoWhPLbuKUT6iY00F0/rY6gikbaGEFYA4Ox1ItJaTtSe9VJOFsw5jVr3cQ+KJ8DR6sj92VbjaocE5+HLBOALC50zTrzxpCc1Uf4XuTcBEuSQz7JuqYjJPYUXOibh1pSoBLtCOtp9s8AUKzerDPqJVosPhZex6W5Mnwv4lN32z3ytL/Hu6rWOcAduEcwyELjDFmAu65BPPW2GPqjzg8/WKhzW3J1iZAF5+rp5HXBH6uBA2VqHcG2b8QYYAj/SCGsupAMKzIN3KuVWPZDq6oUK8NqXh3pZzZxT4qputSy3GCPb/Pxl3/+VOFq1sGwX/MumoyXu5o1ZKAdMAijH7aoT8hUkAUVu48pimlweirc+QyoXQnP8MNkKlY7miP48SihVc0VcMqlsdDcNjh3eyvvt0Xg+HhpiRpMobkSeDpQ1MzlHCFLT08I4I3C1td7OyIGda9OjYV8n+2uJseYb3FX3UAj/amNzvJdQhA5YJ4oqbKHXXlU+IBuzNiXlDb3bLJlcY7A9q380cl4SjAbkxoIyDzFTGJMiUp+aIpfSMb9Man10bxGByknlgQawPHZE7UlZwEqQT3K7iy/iG/JbYa6zstA5jcPP/VQqvmeOQcmYBJRsLKp+MnmcYVnxV6nQIVCg+O/Jna+3yQWjdqn/vWqEFKz KxxGtEta 2Y0C0w9qDKxRtVEhQPIFssapOVokYLCrYKwPkdB4Aug1CDUpzo0kTRebzvV8K18AQ3HKBDxb+ZOWug1u0AzIHAe8johN2I/GynsLceaZGu5X7OcOB0+GvI2AjHEX8Qil/cK00g8nYXHCW4Lj5U6NIt6Gp4FmNIhm451BZE8Q2Bwv02t0zepW4XNMxnLizmEYwh/40zHPzBaWolmwgIeWy4HWOTwpPvOZ9cHBUe1OV9SUOdVmhs+9GbQVuXPwJOcEI1hfzNbdmneqjyQOkZ8S03OHIf5hPDoWNuNxSFQ85hponG7W1pZhiZPzCvg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Now, if CONFIG_MMU_GATHER_RCU_TABLE_FREE is selected, the page table pages will be freed by semi RCU, that is: - batch table freeing: asynchronous free by RCU - single table freeing: IPI + synchronous free In this way, the page table can be lockless traversed by disabling IRQ in paths such as fast GUP. But this is not enough to free the empty PTE page table pages in paths other that munmap and exit_mmap path, because IPI cannot be synchronized with rcu_read_lock() in pte_offset_map{_lock}(). 
In preparation for supporting reclamation of empty PTE page table pages, let single tables also be freed by RCU, like batch table freeing. Then we can also use pte_offset_map() etc. to prevent the PTE page from being freed.

Like pte_free_defer(), we can also safely use ptdesc->pt_rcu_head to free the page table pages:

 - The pt_rcu_head is unioned with pt_list and pmd_huge_pte.

 - For pt_list, it is used to manage the PGD page in x86. Fortunately tlb_remove_table() will not be used to free PGD pages, so it is safe to use pt_rcu_head.

 - For pmd_huge_pte, we will do zap_deposited_table() before freeing the PMD page, so it is also safe.

Signed-off-by: Qi Zheng
---
 arch/x86/include/asm/tlb.h | 19 +++++++++++++++++++
 arch/x86/kernel/paravirt.c |  7 +++++++
 arch/x86/mm/pgtable.c      | 10 +++++++++-
 mm/mmu_gather.c            |  9 ++++++++-
 4 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h index 580636cdc257b..e223b53a8b190 100644 --- a/arch/x86/include/asm/tlb.h +++ b/arch/x86/include/asm/tlb.h @@ -34,4 +34,23 @@ static inline void __tlb_remove_table(void *table) free_page_and_swap_cache(table); } +#ifdef CONFIG_PT_RECLAIM +static inline void __tlb_remove_table_one_rcu(struct rcu_head *head) +{ + struct page *page; + + page = container_of(head, struct page, rcu_head); + free_page_and_swap_cache(page); +} + +static inline void __tlb_remove_table_one(void *table) +{ + struct page *page; + + page = table; + call_rcu(&page->rcu_head, __tlb_remove_table_one_rcu); +} +#define __tlb_remove_table_one __tlb_remove_table_one +#endif /* CONFIG_PT_RECLAIM */ + #endif /* _ASM_X86_TLB_H */ diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c index fec3815335558..89688921ea62e 100644 --- a/arch/x86/kernel/paravirt.c +++ b/arch/x86/kernel/paravirt.c @@ -59,10 +59,17 @@ void __init native_pv_lock_init(void) static_branch_enable(&virt_spin_lock_key); } +#ifndef CONFIG_PT_RECLAIM static void native_tlb_remove_table(struct mmu_gather
*tlb, void *table) { tlb_remove_page(tlb, table); } +#else +static void native_tlb_remove_table(struct mmu_gather *tlb, void *table) +{ + tlb_remove_table(tlb, table); +} +#endif struct static_key paravirt_steal_enabled; struct static_key paravirt_steal_rq_enabled; diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index 5745a354a241c..69a357b15974a 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -19,12 +19,20 @@ EXPORT_SYMBOL(physical_mask); #endif #ifndef CONFIG_PARAVIRT +#ifndef CONFIG_PT_RECLAIM static inline void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table) { tlb_remove_page(tlb, table); } -#endif +#else +static inline +void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table) +{ + tlb_remove_table(tlb, table); +} +#endif /* !CONFIG_PT_RECLAIM */ +#endif /* !CONFIG_PARAVIRT */ gfp_t __userpte_alloc_gfp = GFP_PGTABLE_USER | PGTABLE_HIGHMEM; diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c index 99b3e9408aa0f..d948479ca09e6 100644 --- a/mm/mmu_gather.c +++ b/mm/mmu_gather.c @@ -311,10 +311,17 @@ static inline void tlb_table_invalidate(struct mmu_gather *tlb) } } +#ifndef __tlb_remove_table_one +static inline void __tlb_remove_table_one(void *table) +{ + __tlb_remove_table(table); +} +#endif + static void tlb_remove_table_one(void *table) { tlb_remove_table_sync_one(); - __tlb_remove_table(table); + __tlb_remove_table_one(table); } static void tlb_table_flush(struct mmu_gather *tlb) From patchwork Thu Oct 31 08:13:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13857710 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 83672E68958 for ; Thu, 31 Oct 2024 08:14:55 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 10E156B00A2; Thu, 31 
Oct 2024 04:14:55 -0400 (EDT)
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, mgorman@suse.de, muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org, zokeefe@google.com, rientjes@google.com, peterx@redhat.com, catalin.marinas@arm.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, Qi Zheng
Subject: [PATCH v2 7/7] x86: select ARCH_SUPPORTS_PT_RECLAIM if X86_64
Date: Thu, 31 Oct 2024 16:13:23 +0800
Message-Id: <8ed7cdb1e5fa5e0a357bec9178b8932d596573c9.1730360798.git.zhengqi.arch@bytedance.com>
X-Mailer: git-send-email 2.24.3 (Apple Git-128)
MIME-Version: 1.0

Now that x86 fully supports the CONFIG_PT_RECLAIM feature, and since
reclaiming PTE pages is profitable only on 64-bit systems, select
ARCH_SUPPORTS_PT_RECLAIM if X86_64.

Signed-off-by: Qi Zheng
---
 arch/x86/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 1d194fb2f979e..194baed21ae5c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -320,6 +320,7 @@ config X86
 	select FUNCTION_ALIGNMENT_4B
 	imply IMA_SECURE_AND_OR_TRUSTED_BOOT if EFI
 	select HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
+	select ARCH_SUPPORTS_PT_RECLAIM if X86_64
 
 config INSTRUCTION_DECODER
 	def_bool y
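
For readers who want to see the deferred-free idea behind __tlb_remove_table_one() in isolation, here is a minimal user-space sketch. It is not kernel code: struct fake_page, fake_call_rcu(), fake_grace_period_end() and demo() are hypothetical stand-ins invented for illustration, and the toy "grace period" is just an explicit function call. It only shows the shape of the pattern: embed an rcu_head in the object, queue a callback, and recover the enclosing object with container_of() when the callback finally runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* User-space stand-in; the kernel's struct rcu_head and call_rcu() differ. */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical page-table page descriptor with an embedded rcu_head. */
struct fake_page {
	struct rcu_head rcu_head;
};

static struct rcu_head *pending;	/* callbacks waiting for a grace period */
static int freed_count;			/* number of pages actually freed */

/* Toy call_rcu(): queue the callback instead of running it right away. */
static void fake_call_rcu(struct rcu_head *head,
			  void (*func)(struct rcu_head *head))
{
	head->func = func;
	head->next = pending;
	pending = head;
}

/* Toy end of grace period: run every queued callback. */
static void fake_grace_period_end(void)
{
	while (pending) {
		struct rcu_head *head = pending;

		pending = head->next;	/* unlink before the callback frees it */
		head->func(head);
	}
}

/* Same shape as __tlb_remove_table_one_rcu(): recover the page, free it. */
static void remove_table_one_rcu(struct rcu_head *head)
{
	struct fake_page *page = container_of(head, struct fake_page, rcu_head);

	free(page);
	freed_count++;
}

/* Same shape as __tlb_remove_table_one(): defer the free. */
static void remove_table_one(void *table)
{
	struct fake_page *page = table;

	fake_call_rcu(&page->rcu_head, remove_table_one_rcu);
}

/* Returns 0 if the free was deferred until the toy grace period ended. */
int demo(void)
{
	struct fake_page *page = malloc(sizeof(*page));
	int start = freed_count;
	int deferred;

	if (!page)
		return -1;
	remove_table_one(page);
	deferred = (freed_count == start);	/* not freed yet */
	fake_grace_period_end();
	return (deferred && freed_count == start + 1) ? 0 : 1;
}
```

Calling demo() from a main() should return 0: the page stays allocated after remove_table_one() and is released only once the queued callback runs, which is the property a lockless page-table walker under rcu_read_lock() would rely on in the real kernel.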