From patchwork Tue Feb 11 11:13:10 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dev Jain <dev.jain@arm.com>
X-Patchwork-Id: 13969516
From: Dev Jain <dev.jain@arm.com>
To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org,
 kirill.shutemov@linux.intel.com
Cc: npache@redhat.com, ryan.roberts@arm.com, anshuman.khandual@arm.com,
 catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com,
 apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org,
 baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu,
 haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org,
 yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com,
 wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com,
 surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com,
 zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Dev Jain <dev.jain@arm.com>
Subject: [PATCH v2 01/17] khugepaged: Generalize alloc_charge_folio()
Date: Tue, 11 Feb 2025 16:43:10 +0530
Message-Id: <20250211111326.14295-2-dev.jain@arm.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
In-Reply-To: <20250211111326.14295-1-dev.jain@arm.com>
References: <20250211111326.14295-1-dev.jain@arm.com>

Pass order to alloc_charge_folio() and update mTHP statistics.

Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 include/linux/huge_mm.h |  2 ++
 mm/huge_memory.c        |  4 ++++
 mm/khugepaged.c         | 17 +++++++++++------
 3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 93e509b6c00e..ffe47785854a 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -119,6 +119,8 @@ enum mthp_stat_item {
 	MTHP_STAT_ANON_FAULT_ALLOC,
 	MTHP_STAT_ANON_FAULT_FALLBACK,
 	MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE,
+	MTHP_STAT_COLLAPSE_ALLOC,
+	MTHP_STAT_COLLAPSE_ALLOC_FAILED,
 	MTHP_STAT_ZSWPOUT,
 	MTHP_STAT_SWPIN,
 	MTHP_STAT_SWPIN_FALLBACK,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 3d3ebdc002d5..996e802543f1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -615,6 +615,8 @@ static struct kobj_attribute _name##_attr = __ATTR_RO(_name)
 DEFINE_MTHP_STAT_ATTR(anon_fault_alloc, MTHP_STAT_ANON_FAULT_ALLOC);
 DEFINE_MTHP_STAT_ATTR(anon_fault_fallback, MTHP_STAT_ANON_FAULT_FALLBACK);
 DEFINE_MTHP_STAT_ATTR(anon_fault_fallback_charge, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
+DEFINE_MTHP_STAT_ATTR(collapse_alloc, MTHP_STAT_COLLAPSE_ALLOC);
+DEFINE_MTHP_STAT_ATTR(collapse_alloc_failed, MTHP_STAT_COLLAPSE_ALLOC_FAILED);
 DEFINE_MTHP_STAT_ATTR(zswpout, MTHP_STAT_ZSWPOUT);
 DEFINE_MTHP_STAT_ATTR(swpin, MTHP_STAT_SWPIN);
 DEFINE_MTHP_STAT_ATTR(swpin_fallback, MTHP_STAT_SWPIN_FALLBACK);
@@ -680,6 +682,8 @@ static struct attribute *any_stats_attrs[] = {
 #endif
 	&split_attr.attr,
 	&split_failed_attr.attr,
+	&collapse_alloc_attr.attr,
+	&collapse_alloc_failed_attr.attr,
 	NULL,
 };
 
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 5f0be134141e..4342003b1c33 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1074,21 +1074,26 @@ static int __collapse_huge_page_swapin(struct mm_struct *mm,
 }
 
 static int alloc_charge_folio(struct folio **foliop, struct mm_struct *mm,
-			      struct collapse_control *cc)
+			      int order, struct collapse_control *cc)
 {
 	gfp_t gfp = (cc->is_khugepaged ? alloc_hugepage_khugepaged_gfpmask() :
 		     GFP_TRANSHUGE);
 	int node = hpage_collapse_find_target_node(cc);
 	struct folio *folio;
 
-	folio = __folio_alloc(gfp, HPAGE_PMD_ORDER, node, &cc->alloc_nmask);
+	folio = __folio_alloc(gfp, order, node, &cc->alloc_nmask);
 	if (!folio) {
 		*foliop = NULL;
-		count_vm_event(THP_COLLAPSE_ALLOC_FAILED);
+		if (order == HPAGE_PMD_ORDER)
+			count_vm_event(THP_COLLAPSE_ALLOC_FAILED);
+		count_mthp_stat(order, MTHP_STAT_COLLAPSE_ALLOC_FAILED);
 		return SCAN_ALLOC_HUGE_PAGE_FAIL;
 	}
 
-	count_vm_event(THP_COLLAPSE_ALLOC);
+	if (order == HPAGE_PMD_ORDER)
+		count_vm_event(THP_COLLAPSE_ALLOC);
+	count_mthp_stat(order, MTHP_STAT_COLLAPSE_ALLOC);
+
 	if (unlikely(mem_cgroup_charge(folio, mm, gfp))) {
 		folio_put(folio);
 		*foliop = NULL;
@@ -1125,7 +1130,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	 */
 	mmap_read_unlock(mm);
 
-	result = alloc_charge_folio(&folio, mm, cc);
+	result = alloc_charge_folio(&folio, mm, HPAGE_PMD_ORDER, cc);
 	if (result != SCAN_SUCCEED)
 		goto out_nolock;
 
@@ -1851,7 +1856,7 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 	VM_BUG_ON(!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !is_shmem);
 	VM_BUG_ON(start & (HPAGE_PMD_NR - 1));
 
-	result = alloc_charge_folio(&new_folio, mm, cc);
+	result = alloc_charge_folio(&new_folio, mm, HPAGE_PMD_ORDER, cc);
 	if (result != SCAN_SUCCEED)
 		goto out;
 

From patchwork Tue Feb 11 11:13:11 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dev Jain <dev.jain@arm.com>
X-Patchwork-Id: 13969517
From: Dev Jain <dev.jain@arm.com>
To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org,
 kirill.shutemov@linux.intel.com
Cc: npache@redhat.com, ryan.roberts@arm.com, anshuman.khandual@arm.com,
 catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com,
 apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org,
 baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu,
 haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org,
 yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com,
 wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com,
 surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com,
 zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Dev Jain <dev.jain@arm.com>
Subject: [PATCH v2 02/17] khugepaged: Generalize hugepage_vma_revalidate()
Date: Tue, 11 Feb 2025 16:43:11 +0530
Message-Id: <20250211111326.14295-3-dev.jain@arm.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
In-Reply-To: <20250211111326.14295-1-dev.jain@arm.com>
References: <20250211111326.14295-1-dev.jain@arm.com>

After retaking the lock, we must check that the VMA is suitable for our
scan order. Hence, generalize hugepage_vma_revalidate().

Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 mm/khugepaged.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 4342003b1c33..3d105cacf855 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -919,7 +919,7 @@ static int hpage_collapse_find_target_node(struct collapse_control *cc)
 
 static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
 				   bool expect_anon,
-				   struct vm_area_struct **vmap,
+				   struct vm_area_struct **vmap, int order,
 				   struct collapse_control *cc)
 {
 	struct vm_area_struct *vma;
@@ -932,9 +932,9 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
 	if (!vma)
 		return SCAN_VMA_NULL;
 
-	if (!thp_vma_suitable_order(vma, address, PMD_ORDER))
+	if (!thp_vma_suitable_order(vma, address, order))
 		return SCAN_ADDRESS_RANGE;
-	if (!thp_vma_allowable_order(vma, vma->vm_flags, tva_flags, PMD_ORDER))
+	if (!thp_vma_allowable_order(vma, vma->vm_flags, tva_flags, order))
 		return SCAN_VMA_CHECK;
 	/*
 	 * Anon VMA expected, the address may be unmapped then
@@ -1135,7 +1135,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 		goto out_nolock;
 
 	mmap_read_lock(mm);
-	result = hugepage_vma_revalidate(mm, address, true, &vma, cc);
+	result = hugepage_vma_revalidate(mm, address, true, &vma, HPAGE_PMD_ORDER, cc);
 	if (result != SCAN_SUCCEED) {
 		mmap_read_unlock(mm);
 		goto out_nolock;
@@ -1169,7 +1169,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	 * mmap_lock.
 	 */
 	mmap_write_lock(mm);
-	result = hugepage_vma_revalidate(mm, address, true, &vma, cc);
+	result = hugepage_vma_revalidate(mm, address, true, &vma, HPAGE_PMD_ORDER, cc);
 	if (result != SCAN_SUCCEED)
 		goto out_up_write;
 	/* check if the pmd is still valid */
@@ -2779,7 +2779,7 @@ int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev,
 			mmap_read_lock(mm);
 			mmap_locked = true;
 			result = hugepage_vma_revalidate(mm, addr, false, &vma,
-							 cc);
+							 HPAGE_PMD_ORDER, cc);
 			if (result != SCAN_SUCCEED) {
 				last_fail = result;
 				goto out_nolock;

From patchwork Tue Feb 11 11:13:12 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dev Jain <dev.jain@arm.com>
X-Patchwork-Id: 13969518
From: Dev Jain <dev.jain@arm.com>
To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org,
 kirill.shutemov@linux.intel.com
Cc: npache@redhat.com, ryan.roberts@arm.com, anshuman.khandual@arm.com,
 catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com,
 apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org,
 baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu,
 haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org,
 yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com,
 wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com,
 surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com,
 zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Dev Jain <dev.jain@arm.com>
Subject: [PATCH v2 03/17] khugepaged: Generalize __collapse_huge_page_swapin()
Date: Tue, 11 Feb 2025 16:43:12 +0530
Message-Id: <20250211111326.14295-4-dev.jain@arm.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
In-Reply-To: <20250211111326.14295-1-dev.jain@arm.com>
References: <20250211111326.14295-1-dev.jain@arm.com>

If any PTE in our scan range is a swap entry, then use do_swap_page() to
swap in the corresponding folio.

Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 mm/khugepaged.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 3d105cacf855..221823c0d95f 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -999,17 +999,17 @@ static int check_pmd_still_valid(struct mm_struct *mm,
  */
 static int __collapse_huge_page_swapin(struct mm_struct *mm,
 				       struct vm_area_struct *vma,
-				       unsigned long haddr, pmd_t *pmd,
-				       int referenced)
+				       unsigned long addr, pmd_t *pmd,
+				       int referenced, int order)
 {
 	int swapped_in = 0;
 	vm_fault_t ret = 0;
-	unsigned long address, end = haddr + (HPAGE_PMD_NR * PAGE_SIZE);
+	unsigned long address, end = addr + (PAGE_SIZE << order);
 	int result;
 	pte_t *pte = NULL;
 	spinlock_t *ptl;
 
-	for (address = haddr; address < end; address += PAGE_SIZE) {
+	for (address = addr; address < end; address += PAGE_SIZE) {
 		struct vm_fault vmf = {
 			.vma = vma,
 			.address = address,
@@ -1154,7 +1154,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 		 * that case. Continuing to collapse causes inconsistency.
 		 */
 		result = __collapse_huge_page_swapin(mm, vma, address, pmd,
-						     referenced);
+						     referenced, HPAGE_PMD_ORDER);
 		if (result != SCAN_SUCCEED)
 			goto out_nolock;
 	}

From patchwork Tue Feb 11 11:13:13 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dev Jain <dev.jain@arm.com>
X-Patchwork-Id: 13969519
From: Dev Jain <dev.jain@arm.com>
To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org,
 kirill.shutemov@linux.intel.com
Cc: npache@redhat.com, ryan.roberts@arm.com, anshuman.khandual@arm.com,
 catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com,
 apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org,
 baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu,
 haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org,
 yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com,
 wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com,
 surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com,
 zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Dev Jain <dev.jain@arm.com>
Subject: [PATCH v2 04/17] khugepaged: Generalize __collapse_huge_page_isolate()
Date: Tue, 11 Feb 2025 16:43:13 +0530
Message-Id: <20250211111326.14295-5-dev.jain@arm.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
In-Reply-To: <20250211111326.14295-1-dev.jain@arm.com>
References: <20250211111326.14295-1-dev.jain@arm.com>

Scale down the scan range and the sysfs tunables (to be changed in
subsequent patches) according to the scan order, and isolate the folios.

Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 mm/khugepaged.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 221823c0d95f..0ea99df115cb 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -565,15 +565,17 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 					unsigned long address,
 					pte_t *pte,
 					struct collapse_control *cc,
-					struct list_head *compound_pagelist)
+					struct list_head *compound_pagelist,
+					int order)
 {
-	struct page *page = NULL;
 	struct folio *folio = NULL;
 	pte_t *_pte;
 	int none_or_zero = 0, shared = 0, result = SCAN_FAIL, referenced = 0;
 	bool writable = false;
+	unsigned int max_ptes_shared = khugepaged_max_ptes_shared >> (HPAGE_PMD_ORDER - order);
+	unsigned int max_ptes_none = khugepaged_max_ptes_none >> (HPAGE_PMD_ORDER - order);
 
-	for (_pte = pte; _pte < pte + HPAGE_PMD_NR;
+	for (_pte = pte; _pte < pte + (1UL << order);
 	     _pte++, address += PAGE_SIZE) {
 		pte_t pteval = ptep_get(_pte);
 		if (pte_none(pteval) || (pte_present(pteval) &&
@@ -581,7 +583,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 			++none_or_zero;
 			if (!userfaultfd_armed(vma) &&
 			    (!cc->is_khugepaged ||
-			     none_or_zero <= khugepaged_max_ptes_none)) {
+			     none_or_zero <= max_ptes_none)) {
 				continue;
 			} else {
 				result = SCAN_EXCEED_NONE_PTE;
@@ -597,20 +599,19 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 			result = SCAN_PTE_UFFD_WP;
 			goto out;
 		}
-		page = vm_normal_page(vma, address, pteval);
-		if (unlikely(!page) || unlikely(is_zone_device_page(page))) {
+		folio = vm_normal_folio(vma, address, pteval);
+		if (unlikely(!folio) || unlikely(folio_is_zone_device(folio))) {
 			result = SCAN_PAGE_NULL;
 			goto out;
 		}
 
-		folio = page_folio(page);
 		VM_BUG_ON_FOLIO(!folio_test_anon(folio), folio);
 
 		/* See hpage_collapse_scan_pmd(). */
 		if (folio_likely_mapped_shared(folio)) {
 			++shared;
 			if (cc->is_khugepaged &&
-			    shared > khugepaged_max_ptes_shared) {
+			    shared > max_ptes_shared) {
 				result = SCAN_EXCEED_SHARED_PTE;
 				count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
 				goto out;
@@ -1201,7 +1202,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	pte = pte_offset_map_lock(mm, &_pmd, address, &pte_ptl);
 	if (pte) {
 		result = __collapse_huge_page_isolate(vma, address, pte, cc,
-						      &compound_pagelist);
+						      &compound_pagelist, HPAGE_PMD_ORDER);
 		spin_unlock(pte_ptl);
 	} else {
 		result = SCAN_PMD_NULL;

From patchwork Tue Feb 11 11:13:14 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dev Jain <dev.jain@arm.com>
X-Patchwork-Id: 13969521
Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf14.hostedemail.com (Postfix) with ESMTP id ED890100009 for
; Tue, 11 Feb 2025 11:14:34 +0000 (UTC) From: Dev Jain To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org, kirill.shutemov@linux.intel.com Cc: npache@redhat.com, ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com,
apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com, surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain Subject: [PATCH v2 05/17] khugepaged: Generalize __collapse_huge_page_copy() Date: Tue, 11 Feb 2025 16:43:14 +0530 Message-Id: <20250211111326.14295-6-dev.jain@arm.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20250211111326.14295-1-dev.jain@arm.com> References: <20250211111326.14295-1-dev.jain@arm.com> MIME-Version: 1.0 X-Stat-Signature: 31gnzsrqwnsg6hfyo336ui61sgxrcksj X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: ED890100009 X-Rspam-User: X-HE-Tag: 1739272474-388043 X-HE-Meta: U2FsdGVkX1/nXigN3glqgTCO9bs/YPSJWbsioXO+rn0ewKUiwV1Q46N1aWdDNxECqLS60LiR7WaNLrBNIXSq7YaJnf8yF+x09A16a3e6JB0mToY7YxfKHY5yKCj44mFNXK45PYspKcrn/x6vMHPoN52Fswd25dab/Yvd0zFs54/Swx09e8jKlnIdOq3s8eOrc6E9hTLryi+W9UxpOkC1JWRvL7lcOLTFeJWrLmjVBRJCUyXsE4Kj6kuBIifyx+YeudXo98u6QepwPf7385bebSCekEFsNcoUcg0tY6l3tXJAuQGj2/cTKJyVXWWNzqvT3k4sldOc9iPhoKebhuCOaCfFdTsnnW90Rw5g0olxIbO1SDu9NAoyMSiyv5m5xufzPWoA1ciKH0FwzJYweyNTZHR4Ghdc7EA9eFIyWLDT5Nb3ivyOhzxsKof/O6OXtL3r2iYESgvVZYWDwrjJrDLp06UH2LYFOzP0cVg0B3uvhUMhLZ5eqXwpzC9B2GVubiAOdbS1Edna6iXD0KBn6pdaw0ZD3NZBpc8HrYvcZTqgSWnDugmoMP06UBROMQP6FHMA8rOrRrTYLUXrnjeGyehsb506haVYZaulUoqd+EXQkvt7qkDNeSBVRH3zmDo9fD8jkAmfL7MAY/kYJw2eKBGRpfr8nS942okrsnefgJG1LZFoYVCLeEgeASB+gdQccBgJwkxM9Z1fJQKav4sKOApyoL93Ml+/nhIOkAQuSL3mC8b9AFL3ybuTYjCwcMG4g7klVwlwmCjtE6AX/7sOb8SNtF4lqV/D19lxc/6sKtvzLmKBmyB8rmDA4mvOtQZa+DdlFhrdBgcAXUBtqRc1lS+hRVyD8MyccedXG9oGfoLjuWa70Uy8qCMIZlzjx9ZVjuTj0M5IS5N84gKTbZIsyQX5KD5mvEOigKTkOFin2aDMxhI3+c26bdP3nps3D0/0lTeJJrI+y6RIAwcltWMPRbJ 
cG4Qtbs1 2m+Lt9vPLEdA5go33iUE2t6xAoPLiI0pVbZ30WSzeHOEo6zckBjAstBj8e9panQ/BY1LwvaVL2Cesr6k3LRkBrIpILrsGyYz7AUqyv3Ym4swntCwoKQrhJN0NKDunuGsT4u8voTjQbDTEXHYD0L+tj0HZDK0FbQuQjZfOepcORkcX9Lioan0vINwElfLv+NWWJOHQqxedzP6y2irw5+FbldFB4cgL4/KkusgAelJTST9//YXyapx7Md9H7CzBum8LMxqrjLhlkHS4EV34Pl5y9FrVKw== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Generalize folio copying, PTE clearing and the failure path. Signed-off-by: Dev Jain --- mm/khugepaged.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 0ea99df115cb..99eb1f72a508 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -712,13 +712,14 @@ static void __collapse_huge_page_copy_succeeded(pte_t *pte, struct vm_area_struct *vma, unsigned long address, spinlock_t *ptl, - struct list_head *compound_pagelist) + struct list_head *compound_pagelist, + int order) { struct folio *src, *tmp; pte_t *_pte; pte_t pteval; - for (_pte = pte; _pte < pte + HPAGE_PMD_NR; + for (_pte = pte; _pte < pte + (1UL << order); _pte++, address += PAGE_SIZE) { pteval = ptep_get(_pte); if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) { @@ -765,7 +766,8 @@ static void __collapse_huge_page_copy_failed(pte_t *pte, pmd_t *pmd, pmd_t orig_pmd, struct vm_area_struct *vma, - struct list_head *compound_pagelist) + struct list_head *compound_pagelist, + int order) { spinlock_t *pmd_ptl; @@ -782,7 +784,7 @@ static void __collapse_huge_page_copy_failed(pte_t *pte, * Release both raw and compound pages isolated * in __collapse_huge_page_isolate. 
*/ - release_pte_pages(pte, pte + HPAGE_PMD_NR, compound_pagelist); + release_pte_pages(pte, pte + (1UL << order), compound_pagelist); } /* @@ -803,7 +805,7 @@ static void __collapse_huge_page_copy_failed(pte_t *pte, static int __collapse_huge_page_copy(pte_t *pte, struct folio *folio, pmd_t *pmd, pmd_t orig_pmd, struct vm_area_struct *vma, unsigned long address, spinlock_t *ptl, - struct list_head *compound_pagelist) + struct list_head *compound_pagelist, int order) { unsigned int i; int result = SCAN_SUCCEED; @@ -811,7 +813,7 @@ static int __collapse_huge_page_copy(pte_t *pte, struct folio *folio, /* * Copying pages' contents is subject to memory poison at any iteration. */ - for (i = 0; i < HPAGE_PMD_NR; i++) { + for (i = 0; i < (1 << order); i++) { pte_t pteval = ptep_get(pte + i); struct page *page = folio_page(folio, i); unsigned long src_addr = address + i * PAGE_SIZE; @@ -830,10 +832,10 @@ static int __collapse_huge_page_copy(pte_t *pte, struct folio *folio, if (likely(result == SCAN_SUCCEED)) __collapse_huge_page_copy_succeeded(pte, vma, address, ptl, - compound_pagelist); + compound_pagelist, order); else __collapse_huge_page_copy_failed(pte, pmd, orig_pmd, vma, - compound_pagelist); + compound_pagelist, order); return result; } @@ -1232,7 +1234,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address, result = __collapse_huge_page_copy(pte, folio, pmd, _pmd, vma, address, pte_ptl, - &compound_pagelist); + &compound_pagelist, HPAGE_PMD_ORDER); pte_unmap(pte); if (unlikely(result != SCAN_SUCCEED)) goto out_up_write; From patchwork Tue Feb 11 11:13:15 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dev Jain X-Patchwork-Id: 13969520 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 
3E7A8C0219B for ; Tue, 11 Feb 2025 11:14:51 +0000 (UTC) From: Dev Jain To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org, kirill.shutemov@linux.intel.com Cc: npache@redhat.com, ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com, surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain Subject: [PATCH v2 06/17] khugepaged: Abstract PMD-THP collapse Date: Tue, 11 Feb 2025 16:43:15 +0530 Message-Id: <20250211111326.14295-7-dev.jain@arm.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20250211111326.14295-1-dev.jain@arm.com> References: <20250211111326.14295-1-dev.jain@arm.com> MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Server: rspam08 X-Rspamd-Queue-Id: AA7D840011 X-Stat-Signature: mctg81mdo8n6ejcmxke6m4iidk1qexka X-HE-Tag: 1739272484-241790 X-HE-Meta:
U2FsdGVkX19an8WNNj1zwul3UNjFVaGVbBQJar+DFFA7bUKHSdxf49mUa4JVkcjjosKTzuCn1Oc2XLFhDK8E44mdShNWGG7DHYy0Q8aawhlKLqVX940+XdRz2IZBOt8MrBYBjwU8JM4g6F4W4AJrNf73CcyDW76CYceeL8+aL4URuYlrQiedVaXSTdfEOanxlbPoniiyNucSQ0XljPFwTDxsyraHtNnWh8pYD2l9J1Q5JPbmmix9eERIDRbFkZOVpX7/+JYR0pvBWGTpGCdXpNkiDa99scVmzsij63oxnNsr5LMrrrBhZg5/nyGgtlbXa97PsVEEW3i+bt9NSPGNvB81l+g4BswobiWusfsNi7MgruysbHVt0VF2dqtU1MPaaIgDWHvdv4ihNQtdqhSFlp+d0hlKEVdRnqmHxeNzcmW8IzXHJXd0TDG+o853U06SMjOh4UCx+7echiZZuU+QTmWloxHgOtO1bVvXGFYMZHvfDtB1nlfEA/R2V/3tl+xtjB4TcsGhrCBIfanEabLHO8TFwNfMO7lN+y25y6z3lR5/shEjfi3IZm3tnnSpRMciRl7xqyBwFDBrlRI5csx1Z6UHoIq3fNBDvz7c8NEjBntrrcqmISFhwda9mRKpo78c6xIXR9AEmxsKSHppwDSpbfdvQoYAODd7nSAmCpxSjVhMbSjMw62pI36pi3DrsNfaCCJxolEZvMBqDLmWGlFK5ZoKJwIRYg9hJpr8Qdj5hWa/qhcghjzikFEeY7YYZZpK8MkwgF0swJr5z+og2f5+NSouKMDSKl6Y/m15nBhNOl1VOI6icjILWAM0TGE35s59k84Q0HMbFgpUv+MDj637J+K3d+MsttMzRzFKa3NFN3ftY2X+IW3moWAqwqc/2eGcr+iIN/6VAgPl3uXvkDyTJUyIHxZad09izyDAGbDRjY4rozHx2WFK/CzFhsBig7TQZEp2uV/brEW2AFjRbqH ll7rwx5w 4+gvXI114rWtdFD8dGVY/hItq7mSt/6Y0HjP74Lu+KjGgAQ2/0ErrwfO1KVnb3ZN2fTFnZDU9iZSqDkmmkf+9WOZjMtQvQDEj2aCHi6so+tAhXd3GwFSpC1CTr3VCorK2vcvP+LdZ+vwNhMCnyabldCxRN6b2LhhcLsomYG5tT4ZsCnHesfCUlT1rliLSaKYnpyT/ThvAm2FcLGFi+0ZaY9/TrnRthxQ/BPRe X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Abstract away copying page contents, and setting the PMD, into vma_collapse_anon_folio_pmd(). 
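As a rough illustration of the resulting split, the userspace model below mimics the control flow only: the caller allocates the folio, dispatches on the order, and drops the folio unless the helper reports success. It is a sketch under stated assumptions, not kernel code: folio_model, collapse_pmd_model(), collapse_model(), installed and nr_put are hypothetical names, HPAGE_PMD_ORDER is assumed to be 9 (4K base pages), and the model starts from SCAN_FAIL before dispatching so that orders without a helper fail, which is a simplification of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define HPAGE_PMD_ORDER 9	/* assumption: 4K base pages, 2M PMD */

enum scan_result { SCAN_FAIL, SCAN_SUCCEED };

/* Toy stand-in for struct folio; only the order is tracked. */
struct folio_model {
	int order;
};

static struct folio_model *installed;	/* folio "mapped" by a successful collapse */
static int nr_put;			/* folios released by the caller */

/*
 * Hypothetical stand-in for vma_collapse_anon_folio_pmd(): it does the
 * copy and the PMD install and only reports a result. It never frees
 * the folio itself.
 */
static enum scan_result collapse_pmd_model(struct folio_model *folio, int copy_ok)
{
	if (!copy_ok)
		return SCAN_FAIL;
	installed = folio;
	return SCAN_SUCCEED;
}

/*
 * Hypothetical stand-in for collapse_huge_page(): allocates, dispatches
 * on the order, and drops the folio unless the helper succeeded.
 */
static enum scan_result collapse_model(int order, int copy_ok)
{
	/* Simplification: start from SCAN_FAIL so unsupported orders fail. */
	enum scan_result result = SCAN_FAIL;
	struct folio_model *folio;

	assert(order >= 0 && order <= HPAGE_PMD_ORDER);

	folio = malloc(sizeof(*folio));
	if (!folio)
		return SCAN_FAIL;
	folio->order = order;

	/* The patch takes mmap_write_lock(mm) at this point. */
	if (order == HPAGE_PMD_ORDER)
		result = collapse_pmd_model(folio, copy_ok);
	/* The patch drops mmap_write_unlock(mm) at this point. */

	if (result == SCAN_SUCCEED)
		folio = NULL;	/* ownership moved to the page tables */

	if (folio) {		/* mirrors the out_nolock folio_put() path */
		free(folio);
		nr_put++;
	}
	return result;
}
```

In this model, a successful PMD-order call leaves the folio in installed and releases nothing, while a failed copy or a non-PMD order releases the folio exactly once in the caller.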
Signed-off-by: Dev Jain --- mm/khugepaged.c | 140 +++++++++++++++++++++++++++--------------------- 1 file changed, 78 insertions(+), 62 deletions(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 99eb1f72a508..498cb5ad9ff1 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1109,76 +1109,27 @@ static int alloc_charge_folio(struct folio **foliop, struct mm_struct *mm, return SCAN_SUCCEED; } -static int collapse_huge_page(struct mm_struct *mm, unsigned long address, - int referenced, int unmapped, - struct collapse_control *cc) +static int vma_collapse_anon_folio_pmd(struct mm_struct *mm, unsigned long address, + struct vm_area_struct *vma, struct collapse_control *cc, pmd_t *pmd, + struct folio *folio) { LIST_HEAD(compound_pagelist); - pmd_t *pmd, _pmd; - pte_t *pte; pgtable_t pgtable; - struct folio *folio; spinlock_t *pmd_ptl, *pte_ptl; int result = SCAN_FAIL; - struct vm_area_struct *vma; struct mmu_notifier_range range; + pmd_t _pmd; + pte_t *pte; VM_BUG_ON(address & ~HPAGE_PMD_MASK); - /* - * Before allocating the hugepage, release the mmap_lock read lock. - * The allocation can take potentially a long time if it involves - * sync compaction, and we do not need to hold the mmap_lock during - * that. We will recheck the vma after taking it again in write mode. - */ - mmap_read_unlock(mm); - - result = alloc_charge_folio(&folio, mm, HPAGE_PMD_ORDER, cc); - if (result != SCAN_SUCCEED) - goto out_nolock; - - mmap_read_lock(mm); - result = hugepage_vma_revalidate(mm, address, true, &vma, HPAGE_PMD_ORDER, cc); - if (result != SCAN_SUCCEED) { - mmap_read_unlock(mm); - goto out_nolock; - } - - result = find_pmd_or_thp_or_none(mm, address, &pmd); - if (result != SCAN_SUCCEED) { - mmap_read_unlock(mm); - goto out_nolock; - } - - if (unmapped) { - /* - * __collapse_huge_page_swapin will return with mmap_lock - * released when it fails. So we jump out_nolock directly in - * that case. Continuing to collapse causes inconsistency. 
- */ - result = __collapse_huge_page_swapin(mm, vma, address, pmd, - referenced, HPAGE_PMD_ORDER); - if (result != SCAN_SUCCEED) - goto out_nolock; - } - - mmap_read_unlock(mm); - /* - * Prevent all access to pagetables with the exception of - * gup_fast later handled by the ptep_clear_flush and the VM - * handled by the anon_vma lock + PG_lock. - * - * UFFDIO_MOVE is prevented to race as well thanks to the - * mmap_lock. - */ - mmap_write_lock(mm); result = hugepage_vma_revalidate(mm, address, true, &vma, HPAGE_PMD_ORDER, cc); if (result != SCAN_SUCCEED) - goto out_up_write; + goto out; /* check if the pmd is still valid */ result = check_pmd_still_valid(mm, address, pmd); if (result != SCAN_SUCCEED) - goto out_up_write; + goto out; vma_start_write(vma); anon_vma_lock_write(vma->anon_vma); @@ -1223,7 +1174,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address, pmd_populate(mm, pmd, pmd_pgtable(_pmd)); spin_unlock(pmd_ptl); anon_vma_unlock_write(vma->anon_vma); - goto out_up_write; + goto out; } /* @@ -1237,7 +1188,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address, &compound_pagelist, HPAGE_PMD_ORDER); pte_unmap(pte); if (unlikely(result != SCAN_SUCCEED)) - goto out_up_write; + goto out; /* * The smp_wmb() inside __folio_mark_uptodate() ensures the @@ -1260,11 +1211,76 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address, deferred_split_folio(folio, false); spin_unlock(pmd_ptl); - folio = NULL; - result = SCAN_SUCCEED; -out_up_write: +out: + return result; +} + +static int collapse_huge_page(struct mm_struct *mm, unsigned long address, + int referenced, int unmapped, int order, + struct collapse_control *cc) +{ + struct vm_area_struct *vma; + int result = SCAN_FAIL; + struct folio *folio; + pmd_t *pmd; + + /* + * Before allocating the hugepage, release the mmap_lock read lock. 
+ * The allocation can take potentially a long time if it involves + * sync compaction, and we do not need to hold the mmap_lock during + * that. We will recheck the vma after taking it again in write mode. + */ + mmap_read_unlock(mm); + + result = alloc_charge_folio(&folio, mm, order, cc); + if (result != SCAN_SUCCEED) + goto out_nolock; + + mmap_read_lock(mm); + result = hugepage_vma_revalidate(mm, address, true, &vma, order, cc); + if (result != SCAN_SUCCEED) { + mmap_read_unlock(mm); + goto out_nolock; + } + + result = find_pmd_or_thp_or_none(mm, address, &pmd); + if (result != SCAN_SUCCEED) { + mmap_read_unlock(mm); + goto out_nolock; + } + + if (unmapped) { + /* + * __collapse_huge_page_swapin will return with mmap_lock + * released when it fails. So we jump out_nolock directly in + * that case. Continuing to collapse causes inconsistency. + */ + result = __collapse_huge_page_swapin(mm, vma, address, pmd, + referenced, order); + if (result != SCAN_SUCCEED) + goto out_nolock; + } + + mmap_read_unlock(mm); + /* + * Prevent all access to pagetables with the exception of + * gup_fast later handled by the ptep_clear_flush and the VM + * handled by the anon_vma lock + PG_lock. + * + * UFFDIO_MOVE is prevented to race as well thanks to the + * mmap_lock. 
+ */ + mmap_write_lock(mm); + + if (order == HPAGE_PMD_ORDER) + result = vma_collapse_anon_folio_pmd(mm, address, vma, cc, pmd, folio); + mmap_write_unlock(mm); + + if (result == SCAN_SUCCEED) + folio = NULL; + out_nolock: if (folio) folio_put(folio); @@ -1440,7 +1456,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm, pte_unmap_unlock(pte, ptl); if (result == SCAN_SUCCEED) { result = collapse_huge_page(mm, address, referenced, - unmapped, cc); + unmapped, HPAGE_PMD_ORDER, cc); /* collapse_huge_page will return with the mmap_lock released */ *mmap_locked = false; } From patchwork Tue Feb 11 11:13:16 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dev Jain X-Patchwork-Id: 13969522 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id A624DC021A2 for ; Tue, 11 Feb 2025 11:14:57 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 25AAB6B0092; Tue, 11 Feb 2025 06:14:57 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 20B286B0093; Tue, 11 Feb 2025 06:14:57 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0DC836B0095; Tue, 11 Feb 2025 06:14:57 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id E604F6B0092 for ; Tue, 11 Feb 2025 06:14:56 -0500 (EST) Received: from smtpin22.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 95FF112061F for ; Tue, 11 Feb 2025 11:14:56 +0000 (UTC) X-FDA: 83107406592.22.BB89E8B Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf19.hostedemail.com (Postfix) with ESMTP id 030A41A0009 for ; Tue, 11 Feb 2025 
11:14:54 +0000 (UTC) From: Dev Jain To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org, kirill.shutemov@linux.intel.com Cc: npache@redhat.com, ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com,
dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com, surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain Subject: [PATCH v2 07/17] khugepaged: Scan PTEs order-wise Date: Tue, 11 Feb 2025 16:43:16 +0530 Message-Id: <20250211111326.14295-8-dev.jain@arm.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20250211111326.14295-1-dev.jain@arm.com> References: <20250211111326.14295-1-dev.jain@arm.com> MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Queue-Id: 030A41A0009 X-Stat-Signature: 8im9p14qu668u4tx71nxzxwcd8z5ftzk X-Rspamd-Server: rspam03 X-HE-Tag: 1739272494-736973 X-HE-Meta: U2FsdGVkX19u26nilOaPesZ8SVj3/C6BSFl6EdjdCHKzIENBREDg6OZ8mux1EA+PhNrfYuT0M4JJO/iq5GL9KpuYzuj4NRuN8JHJBgWMu+95L4ockRK09MGGtuCpcMyfGbzxq75JRoqBOGN+lHc/+vK0aHqYAAZfcCqO4Akb1pLRNLOnHqMN8JaJDBt4kaX9kTcLS7h66INKSpddjTfwBImJhIpOrcziliIdGzWv9+QdCXtkdzec9sm1XOr2crkmd0C2+ioB7WhvCjoq5LuWwhYHw5M2l7dfy+WBhavX6Hzau5GChh6svhFaOOYE2g+/JizlU3SlgjgG3jFxkaCVGszjW9NBdx740j8X6pptxkzTSdwSvukDjsdyCvIpYZVCAAZBl06kxOUqwel3GZKguc/ZfsAQvbnQAMQYkHWtgAy/FHTS+R1QqcmXVF1iUz2IWv1f9un/YLRiVh4UjmWzjKixnkVX5uJ3ENkr2r6vbt9D4boUzUZNelVqeNEq6MPFWD+zVhROYWW6QrPXXt/8YlERh5K6JGEx8kztRsfeyT/FSqDamoORscN0tjUZsJhO6yOAk5IdL21JKwjcuhsqhXWlgCB0Lr+hpvJ202mFV6nhyN/MPgOV3vr5MmkcWw0p66lG5/puI6+Jy5QNn8EYZbnoQdWtuieQWB61RObssHM64mXWd30GH9Nwvj7/Uzs08Nl1gbtoMpDbvDdcF45NXb35Sbn9QcuCqQR/bXxdRw3VCx3Fh6ShMRpjv0RFmNmUUlTPYrlaLR6I4Q8wBYRYL5AjRePX7elS+vMxnRACv7iQLUd4m7EkpbHRd0hYu+wPvsC04nRqJfmetFIr+uam2yHtVv79Gj8CWAxUKieMhfoX/+YT9l3srhMV3a2xqr0pxYTnxw25GbYi/8HsyLqELVXWPJDV6MJBAev6JhV0Lyv3hROS7LAbVHtN/+4OBl1l5ESMJoywzAaRCB9LAUj AqSX4tuA 
BWfZrarm0TUU21xo0bve77FbB46lT+meYYdbad6b/0aL2Pcz/tSnXVrhQkTwyj0UWQKicPwt9syBhDZyJUFCIzEmvEKmK1vFh2iL9+NhE8oIOpx4HGrxgY0fE90nD/U8DXTYDmEy0x5s0MEwwNKyhQpZ+qD48/yTn62PUBMLPo+3ZEYpognGrlz6QXzCer3CXgGD2HvQ5OzTcLwTxymuApH4H6f2fOQFNoVBZ X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Scan the PTEs order-wise, using the mask of suitable orders for this VMA derived in conjunction with sysfs THP settings. Scale down the tunables (to be changed in subsequent patches); in case of collapse failure, we drop down to the next order. Otherwise, we try to jump to the highest possible order and then start a fresh scan. Note that madvise(MADV_COLLAPSE) has not been generalized. Signed-off-by: Dev Jain --- mm/khugepaged.c | 97 ++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 83 insertions(+), 14 deletions(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 498cb5ad9ff1..fbfd8a78ef51 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -21,6 +21,7 @@ #include #include #include +#include #include #include @@ -1295,36 +1296,57 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm, { pmd_t *pmd; pte_t *pte, *_pte; - int result = SCAN_FAIL, referenced = 0; - int none_or_zero = 0, shared = 0; - struct page *page = NULL; struct folio *folio = NULL; - unsigned long _address; + int result = SCAN_FAIL; spinlock_t *ptl; - int node = NUMA_NO_NODE, unmapped = 0; + unsigned int max_ptes_shared, max_ptes_none, max_ptes_swap; + int referenced, shared, none_or_zero, unmapped; + unsigned long _address, orig_address = address; + int node = NUMA_NO_NODE; bool writable = false; + unsigned long orders, orig_orders; + int order, prev_order; VM_BUG_ON(address & ~HPAGE_PMD_MASK); + orders = thp_vma_allowable_orders(vma, vma->vm_flags, + TVA_IN_PF | TVA_ENFORCE_SYSFS, THP_ORDERS_ALL_ANON); + orders = thp_vma_suitable_orders(vma, address, 
orders); + orig_orders = orders; + order = highest_order(orders); + + /* MADV_COLLAPSE needs to work irrespective of sysfs setting */ + if (!cc->is_khugepaged) + order = HPAGE_PMD_ORDER; + +scan_pte_range: + + max_ptes_shared = khugepaged_max_ptes_shared >> (HPAGE_PMD_ORDER - order); + max_ptes_none = khugepaged_max_ptes_none >> (HPAGE_PMD_ORDER - order); + max_ptes_swap = khugepaged_max_ptes_swap >> (HPAGE_PMD_ORDER - order); + referenced = 0, shared = 0, none_or_zero = 0, unmapped = 0; + + /* Check pmd after taking mmap lock */ result = find_pmd_or_thp_or_none(mm, address, &pmd); if (result != SCAN_SUCCEED) goto out; memset(cc->node_load, 0, sizeof(cc->node_load)); nodes_clear(cc->alloc_nmask); + pte = pte_offset_map_lock(mm, pmd, address, &ptl); if (!pte) { result = SCAN_PMD_NULL; goto out; } - for (_address = address, _pte = pte; _pte < pte + HPAGE_PMD_NR; + for (_address = address, _pte = pte; _pte < pte + (1UL << order); _pte++, _address += PAGE_SIZE) { pte_t pteval = ptep_get(_pte); if (is_swap_pte(pteval)) { ++unmapped; if (!cc->is_khugepaged || - unmapped <= khugepaged_max_ptes_swap) { + unmapped <= max_ptes_swap) { /* * Always be strict with uffd-wp * enabled swap entries. 
Please see @@ -1345,7 +1367,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm, ++none_or_zero; if (!userfaultfd_armed(vma) && (!cc->is_khugepaged || - none_or_zero <= khugepaged_max_ptes_none)) { + none_or_zero <= max_ptes_none)) { continue; } else { result = SCAN_EXCEED_NONE_PTE; @@ -1369,12 +1391,11 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm, if (pte_write(pteval)) writable = true; - page = vm_normal_page(vma, _address, pteval); - if (unlikely(!page) || unlikely(is_zone_device_page(page))) { + folio = vm_normal_folio(vma, _address, pteval); + if (unlikely(!folio) || unlikely(folio_is_zone_device(folio))) { result = SCAN_PAGE_NULL; goto out_unmap; } - folio = page_folio(page); if (!folio_test_anon(folio)) { result = SCAN_PAGE_ANON; @@ -1390,7 +1411,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm, if (folio_likely_mapped_shared(folio)) { ++shared; if (cc->is_khugepaged && - shared > khugepaged_max_ptes_shared) { + shared > max_ptes_shared) { result = SCAN_EXCEED_SHARED_PTE; count_vm_event(THP_SCAN_EXCEED_SHARED_PTE); goto out_unmap; @@ -1447,7 +1468,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm, result = SCAN_PAGE_RO; } else if (cc->is_khugepaged && (!referenced || - (unmapped && referenced < HPAGE_PMD_NR / 2))) { + (unmapped && referenced < (1UL << order) / 2))) { result = SCAN_LACK_REFERENCED_PAGE; } else { result = SCAN_SUCCEED; @@ -1456,10 +1477,58 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm, pte_unmap_unlock(pte, ptl); if (result == SCAN_SUCCEED) { result = collapse_huge_page(mm, address, referenced, - unmapped, HPAGE_PMD_ORDER, cc); + unmapped, order, cc); /* collapse_huge_page will return with the mmap_lock released */ *mmap_locked = false; + /* Skip over this range and decide order */ + if (result == SCAN_SUCCEED) + goto decide_order; + } + if (result != SCAN_SUCCEED) { + + /* Go to the next order */ + prev_order = order; + order = next_order(&orders, order); + if (order < 2) { + /* Skip 
over this range, and decide order */ + _address = address + (PAGE_SIZE << prev_order); + _pte = pte + (1UL << prev_order); + goto decide_order; + } + goto maybe_mmap_lock; } + +decide_order: + /* Immediately exit on exhaustion of range */ + if (_address == orig_address + (PAGE_SIZE << HPAGE_PMD_ORDER)) + goto out; + + /* Get highest order possible starting from address */ + order = count_trailing_zeros(_address >> PAGE_SHIFT); + + orders = orig_orders & ((1UL << (order + 1)) - 1); + if (!(orders & (1UL << order))) + order = next_order(&orders, order); + + /* This should never happen, since we are on an aligned address */ + BUG_ON(cc->is_khugepaged && order < 2); + + address = _address; + pte = _pte; + +maybe_mmap_lock: + if (!(*mmap_locked)) { + mmap_read_lock(mm); + *mmap_locked = true; + /* Validate VMA after retaking mmap_lock */ + result = hugepage_vma_revalidate(mm, address, true, &vma, + order, cc); + if (result != SCAN_SUCCEED) { + mmap_read_unlock(mm); + goto out; + } + } + goto scan_pte_range; out: trace_mm_khugepaged_scan_pmd(mm, &folio->page, writable, referenced, none_or_zero, result, unmapped); From patchwork Tue Feb 11 11:13:17 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dev Jain X-Patchwork-Id: 13969523 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 76F58C0219B for ; Tue, 11 Feb 2025 11:15:08 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id F2DD56B0095; Tue, 11 Feb 2025 06:15:07 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id EDD336B0096; Tue, 11 Feb 2025 06:15:07 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DA4146B0098; Tue, 11 Feb 2025 06:15:07 -0500 (EST) X-Delivered-To: linux-mm@kvack.org 
From: Dev Jain
Subject: [PATCH v2 08/17] khugepaged: Introduce vma_collapse_anon_folio()
Date: Tue, 11 Feb 2025 16:43:17 +0530
Message-Id: <20250211111326.14295-9-dev.jain@arm.com>
In-Reply-To: <20250211111326.14295-1-dev.jain@arm.com>
References: <20250211111326.14295-1-dev.jain@arm.com>
Similar to PMD collapse, take the write locks to stop pagetable walking. Copy page contents, clear the PTEs, remove folio pins, and (try to) unmap the old folios. Set the PTEs to the new folio using the set_ptes() API.
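The last step of the description, installing the new folio with one batched call, can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions, not kernel code: `toy_pte_t`, `toy_set_ptes()`, `toy_nr_pages()` and the flag values are hypothetical names invented for this illustration. The only idea taken from the patch is that the batched set writes `nr_pages` consecutive entries for one folio, with the PFN advancing by one per entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy PTE: a PFN plus a flags word. Not a kernel type. */
typedef struct {
	uint64_t pfn;
	uint32_t flags;
} toy_pte_t;

#define TOY_PTE_PRESENT 0x1u
#define TOY_PTE_WRITE   0x2u

/*
 * Toy stand-in for a batched PTE install: write nr consecutive entries,
 * starting from 'first', advancing the PFN by one per entry while the
 * flags stay the same for the whole batch.
 */
void toy_set_ptes(toy_pte_t *ptep, toy_pte_t first, size_t nr)
{
	assert(ptep != NULL);
	for (size_t i = 0; i < nr; i++) {
		ptep[i].pfn = first.pfn + i;
		ptep[i].flags = first.flags;
	}
}

/* Number of base pages covered by a folio of the given order. */
size_t toy_nr_pages(unsigned int order)
{
	return (size_t)1 << order;
}
```

In this model a 64K folio on a 4K-page system would be order 4, so a single call fills 16 entries; the real function additionally deals with locking, rmap and the LRU, which the toy leaves out.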
Signed-off-by: Dev Jain --- mm/khugepaged.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 92 insertions(+) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index fbfd8a78ef51..a674014b6563 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1217,6 +1217,96 @@ static int vma_collapse_anon_folio_pmd(struct mm_struct *mm, unsigned long addre return result; } +/* Similar to the PMD case except we have to batch set the PTEs */ +static int vma_collapse_anon_folio(struct mm_struct *mm, unsigned long address, + struct vm_area_struct *vma, struct collapse_control *cc, pmd_t *pmd, + struct folio *folio, int order) +{ + LIST_HEAD(compound_pagelist); + spinlock_t *pmd_ptl, *pte_ptl; + int result = SCAN_FAIL; + struct mmu_notifier_range range; + pmd_t _pmd; + pte_t *pte; + pte_t entry; + int nr_pages = folio_nr_pages(folio); + unsigned long haddress = address & HPAGE_PMD_MASK; + + VM_BUG_ON(address & ((PAGE_SIZE << order) - 1));; + + result = hugepage_vma_revalidate(mm, address, true, &vma, order, cc); + if (result != SCAN_SUCCEED) + goto out; + result = check_pmd_still_valid(mm, address, pmd); + if (result != SCAN_SUCCEED) + goto out; + + vma_start_write(vma); + anon_vma_lock_write(vma->anon_vma); + + mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, mm, haddress, + haddress + HPAGE_PMD_SIZE); + mmu_notifier_invalidate_range_start(&range); + + pmd_ptl = pmd_lock(mm, pmd); + _pmd = pmdp_collapse_flush(vma, haddress, pmd); + spin_unlock(pmd_ptl); + mmu_notifier_invalidate_range_end(&range); + tlb_remove_table_sync_one(); + + pte = pte_offset_map_lock(mm, &_pmd, address, &pte_ptl); + if (pte) { + result = __collapse_huge_page_isolate(vma, address, pte, cc, + &compound_pagelist, order); + spin_unlock(pte_ptl); + } else { + result = SCAN_PMD_NULL; + } + + if (unlikely(result != SCAN_SUCCEED)) { + if (pte) + pte_unmap(pte); + spin_lock(pmd_ptl); + BUG_ON(!pmd_none(*pmd)); + pmd_populate(mm, pmd, pmd_pgtable(_pmd)); + spin_unlock(pmd_ptl); + 
anon_vma_unlock_write(vma->anon_vma); + goto out; + } + + anon_vma_unlock_write(vma->anon_vma); + + result = __collapse_huge_page_copy(pte, folio, pmd, *pmd, + vma, address, pte_ptl, + &compound_pagelist, order); + pte_unmap(pte); + if (unlikely(result != SCAN_SUCCEED)) + goto out; + + __folio_mark_uptodate(folio); + entry = mk_pte(&folio->page, vma->vm_page_prot); + entry = maybe_mkwrite(pte_mkdirty(entry), vma); + + spin_lock(pte_ptl); + folio_ref_add(folio, nr_pages - 1); + folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE); + folio_add_lru_vma(folio, vma); + set_ptes(mm, address, pte, entry, nr_pages); + spin_unlock(pte_ptl); + spin_lock(pmd_ptl); + + /* See pmd_install() */ + smp_wmb(); + BUG_ON(!pmd_none(*pmd)); + pmd_populate(mm, pmd, pmd_pgtable(_pmd)); + update_mmu_cache_pmd(vma, haddress, pmd); + spin_unlock(pmd_ptl); + + result = SCAN_SUCCEED; +out: + return result; +} + static int collapse_huge_page(struct mm_struct *mm, unsigned long address, int referenced, int unmapped, int order, struct collapse_control *cc) @@ -1276,6 +1366,8 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address, if (order == HPAGE_PMD_ORDER) result = vma_collapse_anon_folio_pmd(mm, address, vma, cc, pmd, folio); + else + result = vma_collapse_anon_folio(mm, address, vma, cc, pmd, folio, order); mmap_write_unlock(mm);
From patchwork Tue Feb 11 11:13:18 2025
X-Patchwork-Submitter: Dev Jain
X-Patchwork-Id: 13969528
From: Dev Jain
Subject: [PATCH v2 09/17] khugepaged: Define collapse policy if a larger folio is already mapped
Date: Tue, 11 Feb 2025 16:43:18 +0530
Message-Id: <20250211111326.14295-10-dev.jain@arm.com>
In-Reply-To: <20250211111326.14295-1-dev.jain@arm.com>
References: <20250211111326.14295-1-dev.jain@arm.com>
As noted in [1], khugepaged's goal must be to collapse memory to the highest aligned order possible. Suppose khugepaged is scanning for 64K, and we have a 128K folio, whose first 64K half is VA-PA aligned and fully mapped. In such a case, it does not make sense to break this down into two 64K folios. On the other hand, if the first half is not aligned, or it is partially mapped, it makes sense for khugepaged to collapse this portion into a VA-PA aligned fully mapped 64K folio.
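The policy described above can be sketched as a standalone predicate over the PFNs that a range of PTEs points to. This is a simplified userspace model, not the patch's code: `toy_range_already_collapsed()` and `TOY_PFN_NONE` are hypothetical names, absent PTEs are represented by a sentinel value, and the patch's additional "same folio" check is left out because a flat PFN array cannot express it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker for "PTE not present" in this toy model. */
#define TOY_PFN_NONE UINT64_MAX

/*
 * Return true when the 2^order entries starting at pfns[0] already look
 * like a fully mapped, physically aligned, contiguous block, i.e. the
 * case where collapsing the range again would gain nothing.
 */
bool toy_range_already_collapsed(const uint64_t *pfns, unsigned int order)
{
	size_t nr = (size_t)1 << order;

	assert(pfns != NULL);

	/* The first PFN must be present and aligned to the scanned order. */
	if (pfns[0] == TOY_PFN_NONE || (pfns[0] & (nr - 1)) != 0)
		return false;

	for (size_t i = 1; i < nr; i++) {
		/* Every entry must be present and follow its predecessor. */
		if (pfns[i] == TOY_PFN_NONE || pfns[i] != pfns[i - 1] + 1)
			return false;
	}
	return true;
}
```

With 4K pages, the 64K example corresponds to `order = 4`: the first half of a well-placed 128K folio passes the check and is skipped, while a misaligned or partially mapped half fails it and remains a collapse candidate.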
[1] https://lore.kernel.org/all/aa647830-cf55-48f0-98c2-8230796e35b3@arm.com/ Signed-off-by: Dev Jain --- mm/khugepaged.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 65 insertions(+), 1 deletion(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index a674014b6563..0d0d8f415a2e 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -34,6 +34,7 @@ enum scan_result { SCAN_PMD_NULL, SCAN_PMD_NONE, SCAN_PMD_MAPPED, + SCAN_PTE_MAPPED_THP, SCAN_EXCEED_NONE_PTE, SCAN_EXCEED_SWAP_PTE, SCAN_EXCEED_SHARED_PTE, @@ -562,6 +563,14 @@ static bool is_refcount_suitable(struct folio *folio) return folio_ref_count(folio) == expected_refcount; } +/* Assumes an embedded PFN */ +static bool is_same_folio(pte_t *first_pte, pte_t *last_pte) +{ + struct folio *folio1 = page_folio(pte_page(ptep_get(first_pte))); + struct folio *folio2 = page_folio(pte_page(ptep_get(last_pte))); + return folio1 == folio2; +} + static int __collapse_huge_page_isolate(struct vm_area_struct *vma, unsigned long address, pte_t *pte, @@ -575,13 +584,22 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma, bool writable = false; unsigned int max_ptes_shared = khugepaged_max_ptes_shared >> (HPAGE_PMD_ORDER - order); unsigned int max_ptes_none = khugepaged_max_ptes_none >> (HPAGE_PMD_ORDER - order); + bool all_pfns_present = true; + bool all_pfns_contig = true; + bool first_pfn_aligned = true; + pte_t prev_pteval; for (_pte = pte; _pte < pte + (1UL << order); _pte++, address += PAGE_SIZE) { pte_t pteval = ptep_get(_pte); + if (_pte == pte) { + if (!IS_ALIGNED(pte_pfn(pteval), (1UL << order))) + first_pfn_aligned = false; + } if (pte_none(pteval) || (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) { ++none_or_zero; + all_pfns_present = false; if (!userfaultfd_armed(vma) && (!cc->is_khugepaged || none_or_zero <= max_ptes_none)) { @@ -660,6 +678,12 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma, goto out; } + if (all_pfns_contig && (pte != _pte) && 
!(all_pfns_present && + (pte_pfn(pteval) == pte_pfn(prev_pteval) + 1))) + all_pfns_contig = false; + + prev_pteval = pteval; + /* * Isolate the page to avoid collapsing an hugepage * currently in use by the VM. @@ -696,6 +720,10 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma, result = SCAN_PAGE_RO; } else if (unlikely(cc->is_khugepaged && !referenced)) { result = SCAN_LACK_REFERENCED_PAGE; + } else if ((result == SCAN_SUCCEED) && (order != HPAGE_PMD_ORDER) && all_pfns_present && + all_pfns_contig && first_pfn_aligned && + is_same_folio(pte, pte + (1UL << order) - 1)) { + result = SCAN_PTE_MAPPED_THP; } else { result = SCAN_SUCCEED; trace_mm_collapse_huge_page_isolate(&folio->page, none_or_zero, @@ -1398,6 +1426,8 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm, bool writable = false; unsigned long orders, orig_orders; int order, prev_order; + bool all_pfns_present, all_pfns_contig, first_pfn_aligned; + pte_t prev_pteval; VM_BUG_ON(address & ~HPAGE_PMD_MASK); @@ -1417,6 +1447,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm, max_ptes_none = khugepaged_max_ptes_none >> (HPAGE_PMD_ORDER - order); max_ptes_swap = khugepaged_max_ptes_swap >> (HPAGE_PMD_ORDER - order); referenced = 0, shared = 0, none_or_zero = 0, unmapped = 0; + all_pfns_present = true, all_pfns_contig = true, first_pfn_aligned = true; /* Check pmd after taking mmap lock */ result = find_pmd_or_thp_or_none(mm, address, &pmd); @@ -1435,8 +1466,14 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm, for (_address = address, _pte = pte; _pte < pte + (1UL << order); _pte++, _address += PAGE_SIZE) { pte_t pteval = ptep_get(_pte); + if (_pte == pte) { + if (!IS_ALIGNED(pte_pfn(pteval), (1UL << order))) + first_pfn_aligned = false; + } + if (is_swap_pte(pteval)) { ++unmapped; + all_pfns_present = false; if (!cc->is_khugepaged || unmapped <= max_ptes_swap) { /* @@ -1457,6 +1494,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm, } if 
(pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) { ++none_or_zero; + all_pfns_present = false; if (!userfaultfd_armed(vma) && (!cc->is_khugepaged || none_or_zero <= max_ptes_none)) { @@ -1546,6 +1584,17 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm, goto out_unmap; } + + /* + * PFNs not contig, if either at least one PFN not present, or the previous + * and this PFN are not contig + */ + if (all_pfns_contig && (pte != _pte) && !(all_pfns_present && + (pte_pfn(pteval) == pte_pfn(prev_pteval) + 1))) + all_pfns_contig = false; + + prev_pteval = pteval; + /* * If collapse was initiated by khugepaged, check that there is * enough young pte to justify collapsing the page @@ -1567,15 +1616,30 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm, } out_unmap: pte_unmap_unlock(pte, ptl); + + /* + * We skip if the following conditions are true: + * 1) All PTEs point to consecutive PFNs + * 2) All PFNs belong to the same folio + * 3) The PFNs are PA-aligned to the order we are scanning for + */ + if ((result == SCAN_SUCCEED) && (order != HPAGE_PMD_ORDER) && all_pfns_present && + all_pfns_contig && first_pfn_aligned && + is_same_folio(pte, pte + (1UL << order) - 1)) { + result = SCAN_PTE_MAPPED_THP; + goto decide_order; + } + if (result == SCAN_SUCCEED) { result = collapse_huge_page(mm, address, referenced, unmapped, order, cc); /* collapse_huge_page will return with the mmap_lock released */ *mmap_locked = false; /* Skip over this range and decide order */ - if (result == SCAN_SUCCEED) + if (result == SCAN_SUCCEED || result == SCAN_PTE_MAPPED_THP) goto decide_order; } + if (result != SCAN_SUCCEED) { /* Go to the next order */
From patchwork Tue Feb 11 11:13:19 2025
X-Patchwork-Submitter: Dev Jain
X-Patchwork-Id: 13969524
From: Dev Jain
Subject: [PATCH v2 10/17] khugepaged: Exit early on fully-mapped aligned mTHP
Date: Tue, 11 Feb 2025 16:43:19 +0530
Message-Id: <20250211111326.14295-11-dev.jain@arm.com>
In-Reply-To: <20250211111326.14295-1-dev.jain@arm.com>
References: <20250211111326.14295-1-dev.jain@arm.com>
Since mTHP orders under consideration by khugepaged are also candidates for the fault handler, the case we hit frequently is that khugepaged scans a region for order-x, whereas an order-x folio was already installed by the fault handler there. Therefore, exit early; this prevents a timeout in the khugepaged selftest.
Earlier this was not a problem because a PMD-hugepage will get checked by find_pmd_or_thp_or_none(), and the previous patch does not solve this problem because it will do the entire PTE scan to exit. Signed-off-by: Dev Jain --- mm/khugepaged.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 0d0d8f415a2e..baa5b44968ac 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -626,6 +626,11 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma, VM_BUG_ON_FOLIO(!folio_test_anon(folio), folio); + if (_pte == pte && (order != HPAGE_PMD_ORDER) && (folio_order(folio) == order) && + test_bit(PG_head, &folio->page.flags) && !folio_test_partially_mapped(folio)) { + result = SCAN_PTE_MAPPED_THP; + goto out; + } /* See hpage_collapse_scan_pmd(). */ if (folio_likely_mapped_shared(folio)) { ++shared; @@ -1532,6 +1537,16 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm, goto out_unmap; } + /* Exit early: There is high chance of this due to faulting */ + if (_pte == pte && (order != HPAGE_PMD_ORDER) && (folio_order(folio) == order) && + test_bit(PG_head, &folio->page.flags) && !folio_test_partially_mapped(folio)) { + pte_unmap_unlock(pte, ptl); + _address = address + (PAGE_SIZE << order); + _pte = pte + (1UL << order); + result = SCAN_PTE_MAPPED_THP; + goto decide_order; + } + /* * We treat a single page as shared if any part of the THP * is shared. 
"False negatives" from
From patchwork Tue Feb 11 11:13:20 2025
X-Patchwork-Submitter: Dev Jain
X-Patchwork-Id: 13969527
From: Dev Jain
To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org, kirill.shutemov@linux.intel.com
Cc: npache@redhat.com, ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com, surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain
Subject: [PATCH v2 11/17] khugepaged: Enable sysfs to control order of collapse
Date: Tue, 11 Feb 2025 16:43:20 +0530
Message-Id: <20250211111326.14295-12-dev.jain@arm.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
In-Reply-To: <20250211111326.14295-1-dev.jain@arm.com>
References: <20250211111326.14295-1-dev.jain@arm.com>
MIME-Version: 1.0

Activate khugepaged for anonymous collapse if even a single order is
enabled. Subsequent patches will update this condition.

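The effect of relaxing the check can be illustrated with a small user-space
model. This is only a sketch under stated assumptions: the three bitmask
parameters stand in for the kernel's huge_anon_orders_always,
huge_anon_orders_madvise and huge_anon_orders_inherit, MODEL_PMD_ORDER is an
assumed value (order 9, as with 4K base pages), and the model_* helper names
are hypothetical. It covers only the anonymous-memory part of the check, not
the file-backed or shmem parts.

```c
#include <stdbool.h>

/* Assumed value for illustration: PMD order with 4K base pages. */
#define MODEL_PMD_ORDER 9

/* Old behaviour: only the PMD-order bit of each mask is considered. */
static bool model_pmd_only_enabled(unsigned long always, unsigned long madvise,
				   unsigned long inherit, bool global_enabled)
{
	unsigned long pmd_bit = 1UL << MODEL_PMD_ORDER;

	return (always & pmd_bit) || (madvise & pmd_bit) ||
	       ((inherit & pmd_bit) && global_enabled);
}

/* New behaviour: any enabled order is enough to activate khugepaged. */
static bool model_any_order_enabled(unsigned long always, unsigned long madvise,
				    unsigned long inherit, bool global_enabled)
{
	return always || madvise || (inherit && global_enabled);
}
```

With only order 4 set in the "always" mask, the old check reports disabled
while the new one reports enabled, which is the case this patch is meant to
cover.
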
Signed-off-by: Dev Jain
---
 mm/khugepaged.c | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index baa5b44968ac..37cfa7beba3d 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -415,24 +415,20 @@ static inline int hpage_collapse_test_exit_or_disable(struct mm_struct *mm)
 	       test_bit(MMF_DISABLE_THP, &mm->flags);
 }
 
-static bool hugepage_pmd_enabled(void)
+static bool thp_enabled(void)
 {
 	/*
 	 * We cover the anon, shmem and the file-backed case here; file-backed
 	 * hugepages, when configured in, are determined by the global control.
-	 * Anon pmd-sized hugepages are determined by the pmd-size control.
+	 * Anon mTHPs are determined by the per-size control.
 	 * Shmem pmd-sized hugepages are also determined by its pmd-size control,
 	 * except when the global shmem_huge is set to SHMEM_HUGE_DENY.
 	 */
 	if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
 	    hugepage_global_enabled())
 		return true;
-	if (test_bit(PMD_ORDER, &huge_anon_orders_always))
-		return true;
-	if (test_bit(PMD_ORDER, &huge_anon_orders_madvise))
-		return true;
-	if (test_bit(PMD_ORDER, &huge_anon_orders_inherit) &&
-	    hugepage_global_enabled())
+	if (huge_anon_orders_always || huge_anon_orders_madvise ||
+	    (huge_anon_orders_inherit && hugepage_global_enabled()))
 		return true;
 	if (IS_ENABLED(CONFIG_SHMEM) && shmem_hpage_pmd_enabled())
 		return true;
@@ -475,9 +471,9 @@ void khugepaged_enter_vma(struct vm_area_struct *vma,
 			  unsigned long vm_flags)
 {
 	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
-	    hugepage_pmd_enabled()) {
-		if (thp_vma_allowable_order(vma, vm_flags, TVA_ENFORCE_SYSFS,
-					    PMD_ORDER))
+	    thp_enabled()) {
+		if (thp_vma_allowable_orders(vma, vm_flags, TVA_ENFORCE_SYSFS,
+					     THP_ORDERS_ALL_ANON))
 			__khugepaged_enter(vma->vm_mm);
 	}
 }
@@ -2679,8 +2675,8 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 			progress++;
 			break;
 		}
-		if (!thp_vma_allowable_order(vma, vma->vm_flags,
-					TVA_ENFORCE_SYSFS, PMD_ORDER)) {
+		if (!thp_vma_allowable_orders(vma, vma->vm_flags,
+					TVA_ENFORCE_SYSFS, THP_ORDERS_ALL_ANON)) {
 skip:
 			progress++;
 			continue;
@@ -2704,6 +2700,10 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 				  khugepaged_scan.address + HPAGE_PMD_SIZE >
 				  hend);
 			if (IS_ENABLED(CONFIG_SHMEM) && !vma_is_anonymous(vma)) {
+				if (!thp_vma_allowable_order(vma, vma->vm_flags,
+						TVA_ENFORCE_SYSFS, PMD_ORDER))
+					break;
+
 				struct file *file = get_file(vma->vm_file);
 				pgoff_t pgoff = linear_page_index(vma,
 						khugepaged_scan.address);
@@ -2782,7 +2782,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 
 static int khugepaged_has_work(void)
 {
-	return !list_empty(&khugepaged_scan.mm_head) && hugepage_pmd_enabled();
+	return !list_empty(&khugepaged_scan.mm_head) && thp_enabled();
 }
 
 static int khugepaged_wait_event(void)
@@ -2855,7 +2855,7 @@ static void khugepaged_wait_work(void)
 		return;
 	}
 
-	if (hugepage_pmd_enabled())
+	if (thp_enabled())
 		wait_event_freezable(khugepaged_wait, khugepaged_wait_event());
 }
 
@@ -2886,7 +2886,7 @@ static void set_recommended_min_free_kbytes(void)
 	int nr_zones = 0;
 	unsigned long recommended_min;
 
-	if (!hugepage_pmd_enabled()) {
+	if (!thp_enabled()) {
 		calculate_min_free_kbytes();
 		goto update_wmarks;
 	}
@@ -2936,7 +2936,7 @@ int start_stop_khugepaged(void)
 	int err = 0;
 
 	mutex_lock(&khugepaged_mutex);
-	if (hugepage_pmd_enabled()) {
+	if (thp_enabled()) {
 		if (!khugepaged_thread)
 			khugepaged_thread = kthread_run(khugepaged, NULL,
 							"khugepaged");
@@ -2962,7 +2962,7 @@ int start_stop_khugepaged(void)
 void khugepaged_min_free_kbytes_update(void)
 {
 	mutex_lock(&khugepaged_mutex);
-	if (hugepage_pmd_enabled() && khugepaged_thread)
+	if (thp_enabled() && khugepaged_thread)
 		set_recommended_min_free_kbytes();
 	mutex_unlock(&khugepaged_mutex);
 }

From patchwork Tue Feb 11 11:13:21 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dev Jain
X-Patchwork-Id: 13969525
From: Dev Jain
To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org, kirill.shutemov@linux.intel.com
Cc: npache@redhat.com, ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com, surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain
Subject: [PATCH v2 12/17] khugepaged: Enable variable-sized VMA collapse
Date: Tue, 11 Feb 2025 16:43:21 +0530
Message-Id: <20250211111326.14295-13-dev.jain@arm.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
In-Reply-To: <20250211111326.14295-1-dev.jain@arm.com>
References: <20250211111326.14295-1-dev.jain@arm.com>
MIME-Version: 1.0

Applications in general may have many VMAs smaller than PMD size. It is
therefore essential that khugepaged be able to collapse these VMAs.

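The order selection this patch performs for anonymous VMAs can be modelled
in user space as follows. This is a minimal sketch, not the kernel code: the
model_* names are hypothetical, MODEL_PAGE_SIZE assumes 4K base pages, and
the helpers only imitate what highest_order(), next_order() and round_down()
do in the kernel.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed base page size */

/* Index of the highest set bit, or -1 if no bit is set. */
static int model_highest_order(unsigned long orders)
{
	int order = -1;

	while (orders) {
		orders >>= 1;
		order++;
	}
	return order;
}

/* Round x down to a multiple of size; size must be a power of two. */
static unsigned long model_round_down(unsigned long x, unsigned long size)
{
	assert(size && (size & (size - 1)) == 0);
	return x & ~(size - 1);
}

/*
 * Return the largest enabled order for which the order-aligned end of the
 * VMA is not below scan_addr, or -1 if no enabled order qualifies.
 */
static int model_pick_order(unsigned long orders, unsigned long scan_addr,
			    unsigned long vm_end)
{
	while (orders) {
		int order = model_highest_order(orders);
		unsigned long hend = model_round_down(vm_end,
						      MODEL_PAGE_SIZE << order);

		if (scan_addr <= hend)
			return order;
		orders &= ~(1UL << order);	/* try the next lower order */
	}
	return -1;
}
```

For a VMA ending at 0x150000 with orders 9, 4 and 2 enabled and a scan
address of 0x140000, order 9 is rejected because the PMD-aligned end is 0,
and order 4 (64K) is chosen.
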
Signed-off-by: Dev Jain
---
 mm/khugepaged.c | 68 +++++++++++++++++++++++++++++--------------------
 1 file changed, 41 insertions(+), 27 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 37cfa7beba3d..048f990d8507 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1413,7 +1413,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 static int hpage_collapse_scan_pmd(struct mm_struct *mm,
 				   struct vm_area_struct *vma,
 				   unsigned long address, bool *mmap_locked,
-				   struct collapse_control *cc)
+				   unsigned long orders, struct collapse_control *cc)
 {
 	pmd_t *pmd;
 	pte_t *pte, *_pte;
@@ -1425,22 +1425,14 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
 	unsigned long _address, orig_address = address;
 	int node = NUMA_NO_NODE;
 	bool writable = false;
-	unsigned long orders, orig_orders;
+	unsigned long orig_orders;
 	int order, prev_order;
 	bool all_pfns_present, all_pfns_contig, first_pfn_aligned;
 	pte_t prev_pteval;
 
-	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
-
-	orders = thp_vma_allowable_orders(vma, vma->vm_flags,
-			TVA_IN_PF | TVA_ENFORCE_SYSFS, THP_ORDERS_ALL_ANON);
-	orders = thp_vma_suitable_orders(vma, address, orders);
 	orig_orders = orders;
 	order = highest_order(orders);
-
-	/* MADV_COLLAPSE needs to work irrespective of sysfs setting */
-	if (!cc->is_khugepaged)
-		order = HPAGE_PMD_ORDER;
+	VM_BUG_ON(address & ((PAGE_SIZE << order) - 1));
 
 scan_pte_range:
 
@@ -1667,7 +1659,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
 decide_order:
 
 	/* Immediately exit on exhaustion of range */
-	if (_address == orig_address + (PAGE_SIZE << HPAGE_PMD_ORDER))
+	if (_address == orig_address + (PAGE_SIZE << (highest_order(orig_orders))))
 		goto out;
 
 	/* Get highest order possible starting from address */
@@ -2636,6 +2628,9 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 	struct mm_struct *mm;
 	struct vm_area_struct *vma;
 	int progress = 0;
+	unsigned long orders;
+	int order;
+	bool is_file_vma;
 
 	VM_BUG_ON(!pages);
 	lockdep_assert_held(&khugepaged_mm_lock);
@@ -2675,19 +2670,40 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 			progress++;
 			break;
 		}
-		if (!thp_vma_allowable_orders(vma, vma->vm_flags,
-					TVA_ENFORCE_SYSFS, THP_ORDERS_ALL_ANON)) {
+		orders = thp_vma_allowable_orders(vma, vma->vm_flags,
+					TVA_ENFORCE_SYSFS, THP_ORDERS_ALL_ANON);
+		if (!orders) {
 skip:
 			progress++;
 			continue;
 		}
-		hstart = round_up(vma->vm_start, HPAGE_PMD_SIZE);
-		hend = round_down(vma->vm_end, HPAGE_PMD_SIZE);
+
+		/* We can collapse anonymous VMAs less than PMD_SIZE */
+		is_file_vma = IS_ENABLED(CONFIG_SHMEM) && !vma_is_anonymous(vma);
+		if (is_file_vma) {
+			order = HPAGE_PMD_ORDER;
+			if (!(orders & (1UL << order)))
+				goto skip;
+			hend = round_down(vma->vm_end, PAGE_SIZE << order);
+		}
+		else {
+			/* select the highest possible order for the VMA */
+			order = highest_order(orders);
+			while (orders) {
+				hend = round_down(vma->vm_end, PAGE_SIZE << order);
+				if (khugepaged_scan.address <= hend)
+					break;
+				order = next_order(&orders, order);
+			}
+		}
+		if (!orders)
+			goto skip;
 		if (khugepaged_scan.address > hend)
 			goto skip;
+		hstart = round_up(vma->vm_start, PAGE_SIZE << order);
 		if (khugepaged_scan.address < hstart)
 			khugepaged_scan.address = hstart;
-		VM_BUG_ON(khugepaged_scan.address & ~HPAGE_PMD_MASK);
+		VM_BUG_ON(khugepaged_scan.address & ((PAGE_SIZE << order) - 1));
 
 		while (khugepaged_scan.address < hend) {
 			bool mmap_locked = true;
@@ -2697,13 +2713,9 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 				goto breakouterloop;
 
 			VM_BUG_ON(khugepaged_scan.address < hstart ||
-				  khugepaged_scan.address + HPAGE_PMD_SIZE >
+				  khugepaged_scan.address + (PAGE_SIZE << order) >
 				  hend);
-			if (IS_ENABLED(CONFIG_SHMEM) && !vma_is_anonymous(vma)) {
-				if (!thp_vma_allowable_order(vma, vma->vm_flags,
-						TVA_ENFORCE_SYSFS, PMD_ORDER))
-					break;
-
+			if (is_file_vma) {
 				struct file *file = get_file(vma->vm_file);
 				pgoff_t pgoff = linear_page_index(vma,
 						khugepaged_scan.address);
@@ -2725,15 +2737,15 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 				}
 			} else {
 				*result = hpage_collapse_scan_pmd(mm, vma,
-						khugepaged_scan.address, &mmap_locked, cc);
+						khugepaged_scan.address, &mmap_locked, orders, cc);
 			}
 
 			if (*result == SCAN_SUCCEED)
 				++khugepaged_pages_collapsed;
 
 			/* move to next address */
-			khugepaged_scan.address += HPAGE_PMD_SIZE;
-			progress += HPAGE_PMD_NR;
+			khugepaged_scan.address += (PAGE_SIZE << order);
+			progress += (1UL << order);
 			if (!mmap_locked)
 				/*
 				 * We released mmap_lock so break loop. Note
@@ -3060,7 +3072,9 @@ int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev,
 			fput(file);
 		} else {
 			result = hpage_collapse_scan_pmd(mm, vma, addr,
-							 &mmap_locked, cc);
+							 &mmap_locked,
+							 BIT(HPAGE_PMD_ORDER),
+							 cc);
 		}
 		if (!mmap_locked)
 			*prev = NULL;  /* Tell caller we dropped mmap_lock */

From patchwork Tue Feb 11 11:13:22 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dev Jain
X-Patchwork-Id: 13969526
From: Dev Jain
To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org, kirill.shutemov@linux.intel.com
Cc: npache@redhat.com, ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com, surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain
Subject: [PATCH v2 13/17] khugepaged: Lock all VMAs mapping the PTE table
Date: Tue, 11 Feb 2025 16:43:22 +0530
Message-Id: <20250211111326.14295-14-dev.jain@arm.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
In-Reply-To: <20250211111326.14295-1-dev.jain@arm.com>
References: <20250211111326.14295-1-dev.jain@arm.com>
MIME-Version: 1.0

Now that khugepaged can handle VMAs of any size, the process may fault on
a VMA other than the one under collapse while both VMAs are mapped by the
same PTE table. In that case, the fault handler installs a new PTE table
after khugepaged has isolated the old one. Therefore, walk the address
range covered by the PTE table, look up every VMA in it, and write-lock
each of them.

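The walk described above can be sketched in user space. Everything here is
an illustrative assumption rather than kernel API: struct model_vma stands
in for a VMA, model_lookup() imitates a lookup that returns the region
containing an address (or NULL in a hole), a boolean flag stands in for the
write lock, and the sizes assume 4K pages with 2M covered per PTE table.
Regions are assumed to be page-aligned and non-empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL			/* assumed base page size */
#define MODEL_PMD_SIZE  (512 * MODEL_PAGE_SIZE)	/* assumed span of one PTE table */

/* Hypothetical stand-in for a VMA: the half-open range [start, end). */
struct model_vma {
	unsigned long start;
	unsigned long end;
	bool write_locked;
};

/* Return the region containing addr, or NULL if addr falls in a hole. */
static struct model_vma *model_lookup(struct model_vma *vmas, size_t n,
				      unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (vmas[i].start <= addr && addr < vmas[i].end)
			return &vmas[i];
	return NULL;
}

/*
 * Mark every region overlapping [haddress, haddress + MODEL_PMD_SIZE) as
 * write-locked and return how many regions were marked.
 */
static int model_lock_range(struct model_vma *vmas, size_t n,
			    unsigned long haddress)
{
	unsigned long start = haddress;
	unsigned long end = haddress + MODEL_PMD_SIZE;
	int locked = 0;

	assert((haddress & (MODEL_PMD_SIZE - 1)) == 0);
	while (start < end) {
		struct model_vma *vma = model_lookup(vmas, n, start);

		if (!vma) {
			start += MODEL_PAGE_SIZE;	/* hole: try the next page */
			continue;
		}
		vma->write_locked = true;
		locked++;
		start = vma->end;	/* skip the rest of this region */
	}
	return locked;
}
```

A region that begins inside the range but ends beyond it is still locked,
and the loop then terminates because the cursor has moved past the end of
the range.
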
Note that rmap can still reach the PTE table through folios not under
collapse; this is fine, since it interferes with neither the PTEs nor the
folios under collapse, and rmap cannot fill the PMD.

Signed-off-by: Dev Jain
---
 mm/khugepaged.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 048f990d8507..e1c2c5b89f6d 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1139,6 +1139,23 @@ static int alloc_charge_folio(struct folio **foliop, struct mm_struct *mm,
 	return SCAN_SUCCEED;
 }
 
+static void take_vma_locks_per_pte(struct mm_struct *mm, unsigned long haddress)
+{
+	struct vm_area_struct *vma;
+	unsigned long start = haddress;
+	unsigned long end = haddress + HPAGE_PMD_SIZE;
+
+	while (start < end) {
+		vma = vma_lookup(mm, start);
+		if (!vma) {
+			start += PAGE_SIZE;
+			continue;
+		}
+		vma_start_write(vma);
+		start = vma->vm_end;
+	}
+}
+
 static int vma_collapse_anon_folio_pmd(struct mm_struct *mm, unsigned long address,
 		struct vm_area_struct *vma, struct collapse_control *cc, pmd_t *pmd,
 		struct folio *folio)
@@ -1270,7 +1287,9 @@ static int vma_collapse_anon_folio(struct mm_struct *mm, unsigned long address,
 	if (result != SCAN_SUCCEED)
 		goto out;
 
-	vma_start_write(vma);
+	/* Faulting may fill the PMD after flush; lock all VMAs mapping this PTE */
+	take_vma_locks_per_pte(mm, haddress);
+
 	anon_vma_lock_write(vma->anon_vma);
 
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, mm, haddress,

From patchwork Tue Feb 11 11:13:23 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dev Jain
X-Patchwork-Id: 13969529
From: Dev Jain
To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org, kirill.shutemov@linux.intel.com
Cc: npache@redhat.com, ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com, surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain
Subject: [PATCH v2 14/17] khugepaged: Reset scan address to correct alignment
Date: Tue, 11 Feb 2025 16:43:23 +0530
Message-Id: <20250211111326.14295-15-dev.jain@arm.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
In-Reply-To: <20250211111326.14295-1-dev.jain@arm.com>
References: <20250211111326.14295-1-dev.jain@arm.com>
MIME-Version: 1.0

There are two situations to handle:

1) After retaking the mmap lock, the next VMA has expanded downwards.
2) After khugepaged sleeps and resumes, it picks up the starting address
   from the global struct khugepaged_scan, and hence picks up the same
   VMA as in the previous cycle.

In both cases, khugepaged_scan.address > hstart, and the address may not
be aligned to the order now being scanned. Therefore, explicitly align the
address to that order. Previously this was not a problem, since the
address was always PMD-aligned.

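The alignment rule can be checked with a short user-space sketch. The
function name and MODEL_PAGE_SIZE (4K) are assumptions made for
illustration; the arithmetic mirrors the round_up()/round_down() pattern
for power-of-two sizes.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed base page size */

/*
 * Return the address scanning should resume from: the first order-aligned
 * address in the VMA if scan_addr lies before it, otherwise scan_addr
 * rounded down to the alignment of the given order.
 */
static unsigned long model_align_scan_address(unsigned long scan_addr,
					      unsigned long vm_start, int order)
{
	unsigned long size, hstart;

	assert(order >= 0);
	size = MODEL_PAGE_SIZE << order;
	hstart = (vm_start + size - 1) & ~(size - 1);	/* round up */

	if (scan_addr < hstart)
		return hstart;
	return scan_addr & ~(size - 1);			/* round down */
}
```

For order 4 (64K) and a VMA starting at 0x123000, a stale scan address of
0x148000 is pulled back to 0x140000, so the alignment assertion that follows
in the scan loop holds.
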
Signed-off-by: Dev Jain
---
 mm/khugepaged.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index e1c2c5b89f6d..7c9a758f6817 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2722,6 +2722,9 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 		hstart = round_up(vma->vm_start, PAGE_SIZE << order);
 		if (khugepaged_scan.address < hstart)
 			khugepaged_scan.address = hstart;
+		else
+			khugepaged_scan.address = round_down(khugepaged_scan.address, PAGE_SIZE << order);
+
 		VM_BUG_ON(khugepaged_scan.address & ((PAGE_SIZE << order) - 1));
 
 		while (khugepaged_scan.address < hend) {

From patchwork Tue Feb 11 11:13:24 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dev Jain
X-Patchwork-Id: 13969530
(Postfix) with ESMTP id 07457A000C for ; Tue, 11 Feb 2025 11:16:17 +0000 (UTC) Authentication-Results: imf25.hostedemail.com; dkim=none; spf=pass (imf25.hostedemail.com: domain of dev.jain@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=dev.jain@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1739272578; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=H8SbKuVuxb+XOcV3roWDwWULe89EsbuOyo+3V6RkLR0=; b=zVUVw9Afi1whwvUbtlrJhTpAohq+nfnPBlCbl3WN81YjbNGvBJuBE3PhPqSuKUCcP0D9Nf 39HpAz280y7RmuSA678QMjGNHiRXsCUmKK4eLb8GEI2WhF69QRYEX/ExA3jCPiagEeo4tS iGxilu281kVSbq+aby8crGBixQUheOo= ARC-Authentication-Results: i=1; imf25.hostedemail.com; dkim=none; spf=pass (imf25.hostedemail.com: domain of dev.jain@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=dev.jain@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1739272578; a=rsa-sha256; cv=none; b=SyawDHuA+oMrppwpgHgTjC70uTz90L/MTk5J9K3jbV/eImyv/QFtMfsBMXQS7UVo7w0Q06 a3xnGf/EDVuZ5TW8SASxEY/m5HaJBRvk6FKzk6s8nF8IAWrn21cgw3Nw7IJWTkF8ZC277O 7nCFZ5qO0RYVNyPCB744RvMoV2Fi04s= Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C80F513D5; Tue, 11 Feb 2025 03:16:38 -0800 (PST) Received: from K4MQJ0H1H2.emea.arm.com (K4MQJ0H1H2.blr.arm.com [10.162.40.80]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id CB9143F5A1; Tue, 11 Feb 2025 03:16:07 -0800 (PST) From: Dev Jain To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org, kirill.shutemov@linux.intel.com Cc: npache@redhat.com, ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org, 
vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com, surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain Subject: [PATCH v2 15/17] khugepaged: Delay cond_resched() Date: Tue, 11 Feb 2025 16:43:24 +0530 Message-Id: <20250211111326.14295-16-dev.jain@arm.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20250211111326.14295-1-dev.jain@arm.com> References: <20250211111326.14295-1-dev.jain@arm.com> MIME-Version: 1.0 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 07457A000C X-Stat-Signature: txqo1wam7dwoqi56tbuxusjtfncabbmx X-Rspam-User: X-HE-Tag: 1739272577-280693 X-HE-Meta: 
U2FsdGVkX1/UQfXC9ywBeduyH5SecmZiayXV9sd13bKcp13uIdSvUS6XbW90wcN8W+OTe7MjJz2pfQvvSARlcn4nDj5HE1CH+1pROb33qLC1G3Prx0oxPz/OVi/LKS2qYzjnWFdgfAv28mH9cVZGMFq6FWR8vSRhzm6ftkJs4jogIAjBWcUyTHuzf9jkdtiYO6mP1EmGTAGH2C6gI4jQCBW8+q+e2/HOJpjpImCJd6gfd64xcg7LJaF3IFdh1G+KCAJE4PY/0pHML25RhWBARoROVlLf/T5BTZEHgNTiT0lhoHgyMpiRTlTPLB1EpXyCp5SS1gkBWdkzeYNBF2f2Cw8OtPyg3U7RRHoNm/1vdy9ew3Udl0xGB4tCYD8WXz69hcbV8DWrzBOiGcIau5A/TTzG82Obx58Wl3sJWBv/g33Ddtkrk+HGaaNix9ju/d/auxEsXxnjiKLz4m+6SkHZDxlSh1pSPQdqvZqSNs4Ut7Xt9QKQhZGFmMEwSveWQkxBqAmdaNJF+PjzJAy6JUu1IkbG7gdoiMmdTz7TkybR8YGeHNfife+a496/6+dSKq0yxwP0SBtF+Aya1MEE8GN+vehrT8E0a2NFPMpeMGtNutUOH3hhqWdl3u2KpoFQJfy0nsycWgkBzS2prVvbjez8+IBmY9ePUkCrXLs2C9CgtaJLIk7sYvOaqT2MAgu0fP/y6YcU5BrSFxBdhZ9nfxUbu2tCuKJeKL+3veWBCVOWs2kmHyfXYyi08u82V0jQ9m41KuGumRJTgT1c0X7EvwdQT6WhFh6S04nEBSHiAEboYHkowY7YDbBcFuTNYcrV2V8w4Zi7IB2pWTvjEV8Ldn1H+NNdeSe+0A/zIkOqEhk9bEralrvGe4tmd0+bzIs5baU8sHMTtAwdbI801YzjjnBOuPzPM1mhmzNY9/PNdmolEvsVcvPcdlBhvTA5PEc9NTmJ4n6RQIJKnAyMtC+vL+j DVoWmMF8 BBscC757zgYmEU0kVbxy6ATphKWrIWy7ubEksJpfP1H4mufiuNNgJcz8IC8e16OWjFzk6kUi94iVxAjjTfIloUu//7Enq4Ty0UnK39nrDqYW1EQxvUamaaGU19heU2vUle8waio+IohfiHNt5FjwVBhvgTJL84Zvpr+htiGfO2IOS9Nns9DrZSwCb0/DBgGDbsAsC7An4cSosjfep8DY5vXfnqKpSKPHqeUiSR4MeQ8N9QN5q9ChqfkoVYPBKvAaofDW4b2Kse6p+9+TX+Y87p8ntEQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Post scanning VMAs less than PMD-size, cond_resched() may get called at a frequency of 1 << order worth of pte scan. Earlier, this was at a PMD-worth scan. Therefore, manually enforce the previous behaviour; not doing this causes the khugepaged selftest to timeout. 
Signed-off-by: Dev Jain
---
 mm/khugepaged.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 7c9a758f6817..d2bb008b95e7 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2650,6 +2650,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 	unsigned long orders;
 	int order;
 	bool is_file_vma;
+	int prev_progress = 0;
 
 	VM_BUG_ON(!pages);
 	lockdep_assert_held(&khugepaged_mm_lock);
@@ -2730,7 +2731,10 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 		while (khugepaged_scan.address < hend) {
 			bool mmap_locked = true;
 
-			cond_resched();
+			if (progress - prev_progress >= HPAGE_PMD_NR) {
+				cond_resched();
+				prev_progress = progress;
+			}
 			if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
 				goto breakouterloop;
 

From patchwork Tue Feb 11 11:13:25 2025
X-Patchwork-Submitter: Dev Jain
X-Patchwork-Id: 13969531
From: Dev Jain
To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org,
 kirill.shutemov@linux.intel.com
Cc: npache@redhat.com, ryan.roberts@arm.com, anshuman.khandual@arm.com,
 catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com,
 apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org,
 baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu,
 haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org,
 yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com,
 wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com,
 surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com,
 zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain
Subject: [PATCH v2 16/17] khugepaged: Implement strict policy for mTHP collapse
Date: Tue, 11 Feb 2025 16:43:25 +0530
Message-Id: <20250211111326.14295-17-dev.jain@arm.com>
In-Reply-To: <20250211111326.14295-1-dev.jain@arm.com>
References: <20250211111326.14295-1-dev.jain@arm.com>

As noted in the discussion thread ending at [1], avoid the creep problem by
collapsing to mTHPs only if max_ptes_none is zero or 511. Along with this,
make mTHP collapse conditions stricter by removing scaling of max_ptes_shared
and max_ptes_swap, and consider collapse only if there are no shared or swap
PTEs in the range.

[1] https://lore.kernel.org/all/8114d47b-b383-4d6e-ab65-a0e88b99c873@arm.com/

Signed-off-by: Dev Jain
---
 mm/khugepaged.c | 37 ++++++++++++++++++++++++++++++++-----
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index d2bb008b95e7..b589f889bb5a 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -417,6 +417,17 @@ static inline int hpage_collapse_test_exit_or_disable(struct mm_struct *mm)
 
 static bool thp_enabled(void)
 {
+	bool anon_pmd_enabled = (test_bit(PMD_ORDER, &huge_anon_orders_always) ||
+				 test_bit(PMD_ORDER, &huge_anon_orders_madvise) ||
+				 (test_bit(PMD_ORDER, &huge_anon_orders_inherit) &&
+				  hugepage_global_enabled()));
+
+	/*
+	 * If PMD_ORDER is ineligible for collapse, check if mTHP collapse policy is obeyed;
+	 * see Documentation/admin-guide/mm/transhuge.rst
+	 */
+	bool anon_collapse_mthp = (khugepaged_max_ptes_none == 0 ||
+				   khugepaged_max_ptes_none == HPAGE_PMD_NR - 1);
 	/*
 	 * We cover the anon, shmem and the file-backed case here; file-backed
 	 * hugepages, when configured in, are determined by the global control.
@@ -427,8 +438,9 @@ static bool thp_enabled(void)
 	if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
 	    hugepage_global_enabled())
 		return true;
-	if (huge_anon_orders_always || huge_anon_orders_madvise ||
-	    (huge_anon_orders_inherit && hugepage_global_enabled()))
+	if ((huge_anon_orders_always || huge_anon_orders_madvise ||
+	     (huge_anon_orders_inherit && hugepage_global_enabled())) &&
+	    (anon_pmd_enabled || anon_collapse_mthp))
 		return true;
 	if (IS_ENABLED(CONFIG_SHMEM) && shmem_hpage_pmd_enabled())
 		return true;
@@ -578,13 +590,16 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 	pte_t *_pte;
 	int none_or_zero = 0, shared = 0, result = SCAN_FAIL, referenced = 0;
 	bool writable = false;
-	unsigned int max_ptes_shared = khugepaged_max_ptes_shared >> (HPAGE_PMD_ORDER - order);
+	unsigned int max_ptes_shared = khugepaged_max_ptes_shared;
 	unsigned int max_ptes_none = khugepaged_max_ptes_none >> (HPAGE_PMD_ORDER - order);
 	bool all_pfns_present = true;
 	bool all_pfns_contig = true;
 	bool first_pfn_aligned = true;
 	pte_t prev_pteval;
 
+	if (order != HPAGE_PMD_ORDER)
+		max_ptes_shared = 0;
+
 	for (_pte = pte; _pte < pte + (1UL << order);
 	     _pte++, address += PAGE_SIZE) {
 		pte_t pteval = ptep_get(_pte);
@@ -1453,11 +1468,16 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
 	order = highest_order(orders);
 	VM_BUG_ON(address & ((PAGE_SIZE << order) - 1));
 
+	max_ptes_none = khugepaged_max_ptes_none;
+	max_ptes_shared = khugepaged_max_ptes_shared;
+	max_ptes_swap = khugepaged_max_ptes_swap;
+
 scan_pte_range:
 
-	max_ptes_shared = khugepaged_max_ptes_shared >> (HPAGE_PMD_ORDER - order);
+	if (order != HPAGE_PMD_ORDER)
+		max_ptes_shared = max_ptes_swap = 0;
+
 	max_ptes_none = khugepaged_max_ptes_none >> (HPAGE_PMD_ORDER - order);
-	max_ptes_swap = khugepaged_max_ptes_swap >> (HPAGE_PMD_ORDER - order);
 	referenced = 0, shared = 0, none_or_zero = 0, unmapped = 0;
 	all_pfns_present = true, all_pfns_contig = true, first_pfn_aligned = true;
 
@@ -2651,6 +2671,11 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 	int order;
 	bool is_file_vma;
 	int prev_progress = 0;
+	bool collapse_mthp = true;
+
+	/* Avoid the creep problem; see Documentation/admin-guide/mm/transhuge.rst */
+	if (khugepaged_max_ptes_none && khugepaged_max_ptes_none != HPAGE_PMD_NR - 1)
+		collapse_mthp = false;
 
 	VM_BUG_ON(!pages);
 	lockdep_assert_held(&khugepaged_mm_lock);
@@ -2710,6 +2735,8 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 		/* select the highest possible order for the VMA */
 		order = highest_order(orders);
 		while (orders) {
+			if (order != HPAGE_PMD_ORDER && !collapse_mthp)
+				goto skip;
 			hend = round_down(vma->vm_end, PAGE_SIZE << order);
 			if (khugepaged_scan.address <= hend)
 				break;

From patchwork Tue Feb 11 11:13:26 2025
X-Patchwork-Submitter: Dev Jain
X-Patchwork-Id: 13969532
From: Dev Jain
To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org,
 kirill.shutemov@linux.intel.com
Cc: npache@redhat.com, ryan.roberts@arm.com, anshuman.khandual@arm.com,
 catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com,
 apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org,
 baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu,
 haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org,
 yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com,
 wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com,
 surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com,
 zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain
Subject: [PATCH v2 17/17] Documentation: transhuge: Define khugepaged mTHP collapse policy
Date: Tue, 11 Feb 2025 16:43:26 +0530
Message-Id: <20250211111326.14295-18-dev.jain@arm.com>
In-Reply-To: <20250211111326.14295-1-dev.jain@arm.com>
References: <20250211111326.14295-1-dev.jain@arm.com>

Update documentation to reflect the mTHP-specific changes for khugepaged.

Signed-off-by: Dev Jain
---
 Documentation/admin-guide/mm/transhuge.rst | 49 +++++++++++++++++-----
 1 file changed, 38 insertions(+), 11 deletions(-)

diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index dff8d5985f0f..6a513fa81005 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -63,7 +63,7 @@ often.
 THP can be enabled system wide or restricted to certain tasks or even
 memory ranges inside task's address space. Unless THP is completely
 disabled, there is ``khugepaged`` daemon that scans memory and
-collapses sequences of basic pages into PMD-sized huge pages.
+collapses sequences of basic pages into huge pages.
 
 The THP behaviour is controlled via :ref:`sysfs `
 interface and using madvise(2) and prctl(2) system calls.
@@ -212,20 +212,16 @@ this behaviour by writing 0 to shrink_underused, and enable it by writing
 	echo 0 > /sys/kernel/mm/transparent_hugepage/shrink_underused
 	echo 1 > /sys/kernel/mm/transparent_hugepage/shrink_underused
 
-khugepaged will be automatically started when PMD-sized THP is enabled
+khugepaged will be automatically started when THP is enabled
 (either of the per-size anon control or the top-level control are set
 to "always" or "madvise"), and it'll be automatically shutdown when
-PMD-sized THP is disabled (when both the per-size anon control and the
-top-level control are "never")
+THP is disabled (when all of the per-size anon controls and the
+top-level control are "never"). mTHP collapse is supported only for
+private-anonymous memory.
 
 Khugepaged controls
 -------------------
 
-.. note::
-   khugepaged currently only searches for opportunities to collapse to
-   PMD-sized THP and no attempt is made to collapse to other THP
-   sizes.
-
 khugepaged runs usually at low frequency so while one may not want to
 invoke defrag algorithms synchronously during the page faults, it
 should be worth invoking defrag at least in khugepaged. However it's
@@ -254,8 +250,9 @@ The khugepaged progress can be seen in the number of pages collapsed (note
 that this counter may not be an exact count of the number of pages
 collapsed, since "collapsed" could mean multiple things: (1) A PTE mapping
 being replaced by a PMD mapping, or (2) All 4K physical pages replaced by
-one 2M hugepage. Each may happen independently, or together, depending on
-the type of memory and the failures that occur. As such, this value should
+one 2M hugepage, or (3) A portion of the PTE-mapped 4K pages replaced by
+a mapping to an mTHP. Each may happen independently, or together, depending
+on the type of memory and the failures that occur. As such, this value should
 be interpreted roughly as a sign of progress, and counters in /proc/vmstat
 consulted for more accurate accounting)::
 
@@ -294,6 +291,36 @@ that THP is shared. Exceeding the number would block the collapse::
 
 A higher value may increase memory footprint for some workloads.
 
+Khugepaged specifics for anon-mTHP collapse
+-------------------------------------------
+
+The objective of khugepaged is to collapse memory to the highest aligned order
+possible. If it fails on PMD order, it will greedily try the lower orders.
+
+The tunables max_ptes_shared and max_ptes_swap are considered to be zero for
+mTHP collapsing; i.e., the memory range must not have any shared or swap PTE
+for it to be eligible for mTHP collapse.
+
+The tunable max_ptes_none is scaled downwards, according to the order of
+the collapse. For example, if max_ptes_none = 511, and khugepaged tries to
+collapse to order 4, then the memory range under consideration will become
+a candidate for collapse only when the number of none PTEs (out of the 16 PTEs)
+does not exceed: 511 >> (9 - 4) = 15.
+
+mTHP collapse is supported only if max_ptes_none is either zero or 511 (one less
+than the number of entries in the PTE table). Any other value, given the scaling
+logic presented above, produces what we call the "creep" problem; let the bitmask
+00110000 denote a memory range mapped by 8 consecutive pagetable entries, where 0
+denotes an empty pte and 1, a pte embedding a physical folio. Let max_ptes_none = 50%
+(i.e., max_ptes_none = 256, which implies 256 >> (9 - 3) = 4 for order-3). If order-2 and
+order-3 are enabled, khugepaged may do the following: it scans the range for order-3, but
+since the percentage of none ptes = 6/8 * 100 = 75%, it drops down to order-2.
+It successfully collapses to order-2 for the first 4 PTEs, and the memory range becomes:
+11110000
+Now, from the order-3 PoV, the range has 4 out of 8 PTEs filled, and the range has now
+suddenly become eligible for order-3 collapse. So, we can creep into large order
+collapses in a very inefficient manner.
+
 Boot parameters
 ===============
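
The max_ptes_none scaling rule and the creep scenario in the documentation
hunk above can be replayed with a small userspace sketch in C. This is an
illustration only, not kernel code: the names scaled_max_ptes_none(),
count_none(), try_collapse() and creep_occurs() are hypothetical, PMD_ORDER is
assumed to be 9 (4K base pages, 512 PTEs per page table), and a successful
collapse is modelled simply as marking every entry in the range populated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PMD_ORDER 9 /* assumption: 4K base pages, 512 PTEs per page table */

/* Scale the PMD-level max_ptes_none tunable down to a smaller collapse order. */
unsigned int scaled_max_ptes_none(unsigned int max_ptes_none, int order)
{
	assert(order >= 0 && order <= PMD_ORDER);
	return max_ptes_none >> (PMD_ORDER - order);
}

/* Count the empty (none) entries among the 1 << order PTEs starting at 'start'. */
unsigned int count_none(const bool *ptes, size_t start, int order)
{
	unsigned int none = 0;

	for (size_t i = start; i < start + ((size_t)1 << order); i++)
		if (!ptes[i])
			none++;
	return none;
}

/*
 * Hypothetical model of a collapse attempt: it succeeds if the number of none
 * PTEs does not exceed the scaled limit, and then populates the whole range.
 */
bool try_collapse(bool *ptes, size_t start, int order, unsigned int max_ptes_none)
{
	if (count_none(ptes, start, order) > scaled_max_ptes_none(max_ptes_none, order))
		return false;
	for (size_t i = start; i < start + ((size_t)1 << order); i++)
		ptes[i] = true;
	return true;
}

/*
 * Replay the 00110000 example: report whether order-3 first fails, order-2
 * then succeeds, and a second order-3 attempt succeeds as a result.
 */
bool creep_occurs(unsigned int max_ptes_none)
{
	bool ptes[8] = { false, false, true, true, false, false, false, false };
	bool first_o3 = try_collapse(ptes, 0, 3, max_ptes_none);
	bool o2 = try_collapse(ptes, 0, 2, max_ptes_none);
	bool second_o3 = try_collapse(ptes, 0, 3, max_ptes_none);

	return !first_o3 && o2 && second_o3;
}
```

Under this model, creep_occurs(256) is true, while creep_occurs(0) and
creep_occurs(511) are false, which is consistent with restricting mTHP
collapse to those two values of max_ptes_none.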