From patchwork Fri Sep 30 14:19:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12995513 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C1B21C433F5 for ; Fri, 30 Sep 2022 14:19:55 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 38B6A6B0073; Fri, 30 Sep 2022 10:19:55 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 315678D0003; Fri, 30 Sep 2022 10:19:55 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 167938D0002; Fri, 30 Sep 2022 10:19:55 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id EDCB96B0073 for ; Fri, 30 Sep 2022 10:19:54 -0400 (EDT) Received: from smtpin06.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id AC75B1C7032 for ; Fri, 30 Sep 2022 14:19:54 +0000 (UTC) X-FDA: 79968960708.06.A0BB6D9 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf08.hostedemail.com (Postfix) with ESMTP id 23B0A16001F for ; Fri, 30 Sep 2022 14:19:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1664547593; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=14k6alGur8/dUo2gK36W7ANUFkyrFHKBXBjIPk/s/yc=; b=KTomB3+KrfxnXLcH1NUglw7L5/HH82u5e51xA9NJSISjVsxwL2BHkIb18kdke2KAFi0V7I 5EzXQBGM9dE4D4yWocmCGSE83OOSBvw5j2lU47G1TAk85L3bkyYIvH5lvKdxuaf1nuwicJ sZRowVfwHXDxXzs45mxJrjPzxAgwWTs= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-271-bM6wMhxeMee1M1Ad6jU0sQ-1; Fri, 30 Sep 2022 10:19:49 -0400 X-MC-Unique: bM6wMhxeMee1M1Ad6jU0sQ-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 99B20185A7A9; Fri, 30 Sep 2022 14:19:48 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.194.187]) by smtp.corp.redhat.com (Postfix) with ESMTP id D0A9E1121314; Fri, 30 Sep 2022 14:19:45 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, linux-kselftest@vger.kernel.org, David Hildenbrand , Andrew Morton , Shuah Khan , Hugh Dickins , Vlastimil Babka , Peter Xu , Andrea Arcangeli , "Matthew Wilcox (Oracle)" , Jason Gunthorpe , John Hubbard Subject: [PATCH v1 1/7] selftests/vm: add test to measure MADV_UNMERGEABLE performance Date: Fri, 30 Sep 2022 16:19:25 +0200 Message-Id: <20220930141931.174362-2-david@redhat.com> In-Reply-To: <20220930141931.174362-1-david@redhat.com> References: <20220930141931.174362-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 ARC-Authentication-Results: i=1; imf08.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=KTomB3+K; spf=pass (imf08.hostedemail.com: domain of david@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1664547594; a=rsa-sha256; cv=none; b=PRtr37B1P46RaMrR/OZvpwqaqSwdVVTdFgTAmY8/fiv0jxLuky8h1/91MAxg9WGyCFdyG9 7p4Jnik/2dJaGxA3hMIi68BmWnGVkv156yZrgB272ciG9ARVaSLFpXyUzDAWYIUX0q62hX Yd8aJAI/m+bDVCpgBXgeMVu1LZEaChY= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1664547594; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=14k6alGur8/dUo2gK36W7ANUFkyrFHKBXBjIPk/s/yc=; b=JfvXFKgwmolJIuhDLKUbnuOCUeS2SZpApqaUHMJ42VkOB0OVqtSSPy/JYZANYLORKJ4eUg 3nMqSWhravSeHNf8GiYoyBNYoqXU3vLIPVuAAAmsxk2bJzdZ/AxnB+nbM54gyzW/8bXQYT eM5GjQRSUAP7AEHorxoG6teJpQ+R/CA= X-Stat-Signature: mcuemsm1u91wy7ub4utj6stwje4hn33u X-Rspamd-Queue-Id: 23B0A16001F Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=KTomB3+K; spf=pass (imf08.hostedemail.com: domain of david@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Rspam-User: X-Rspamd-Server: rspam07 X-HE-Tag: 1664547593-547830 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Let's add a test to measure performance of KSM breaking not triggered via COW, but triggered by disabling KSM on an area filled with KSM pages via MADV_UNMERGEABLE. Signed-off-by: David Hildenbrand Acked-by: Peter Xu --- tools/testing/selftests/vm/ksm_tests.c | 76 +++++++++++++++++++++++++- 1 file changed, 74 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/vm/ksm_tests.c b/tools/testing/selftests/vm/ksm_tests.c index f5e4e0bbd081..353eee96aeba 100644 --- a/tools/testing/selftests/vm/ksm_tests.c +++ b/tools/testing/selftests/vm/ksm_tests.c @@ -40,6 +40,7 @@ enum ksm_test_name { CHECK_KSM_NUMA_MERGE, KSM_MERGE_TIME, KSM_MERGE_TIME_HUGE_PAGES, + KSM_UNMERGE_TIME, KSM_COW_TIME }; @@ -108,7 +109,10 @@ static void print_help(void) " -P evaluate merging time and speed.\n" " For this test, the size of duplicated memory area (in MiB)\n" " must be provided using -s option\n" - " -H evaluate merging time and speed of area allocated mostly with huge pages\n" + " -H evaluate merging time and speed of area allocated mostly with huge pages\n" + " For this test, the size of duplicated memory area (in MiB)\n" + " must be provided using -s option\n" + " -D evaluate unmerging time and speed when disabling KSM.\n" " For this test, the size of duplicated memory area (in MiB)\n" " must be provided using -s option\n" " -C evaluate the time required to break COW of merged pages.\n\n"); @@ -188,6 +192,16 @@ static int ksm_merge_pages(void *addr, size_t size, struct timespec start_time, return 0; } +static int ksm_unmerge_pages(void *addr, size_t size, + struct timespec start_time, int timeout) +{ + if (madvise(addr, size, MADV_UNMERGEABLE)) { + perror("madvise"); + return 1; + } + return 0; +} + static bool assert_ksm_pages_count(long dupl_page_count) { unsigned long max_page_sharing, pages_sharing, pages_shared; @@ -560,6 +574,53 @@ static int ksm_merge_time(int mapping, int prot, int timeout, size_t map_size) return KSFT_FAIL; } +static int ksm_unmerge_time(int mapping, int prot, int timeout, size_t map_size) +{ + void *map_ptr; + struct timespec start_time, end_time; + unsigned long scan_time_ns; + + map_size *= MB; + + map_ptr = allocate_memory(NULL, prot, mapping, '*', map_size); + if (!map_ptr) + return KSFT_FAIL; + if (clock_gettime(CLOCK_MONOTONIC_RAW, &start_time)) { + perror("clock_gettime"); + goto err_out; + } + if (ksm_merge_pages(map_ptr, map_size, start_time, timeout)) + goto err_out; + + if (clock_gettime(CLOCK_MONOTONIC_RAW, &start_time)) { + perror("clock_gettime"); + goto err_out; + } + if (ksm_unmerge_pages(map_ptr, map_size, start_time, timeout)) + goto err_out; + if (clock_gettime(CLOCK_MONOTONIC_RAW, &end_time)) { + perror("clock_gettime"); + goto err_out; + } + + scan_time_ns = (end_time.tv_sec - start_time.tv_sec) * NSEC_PER_SEC + + (end_time.tv_nsec - start_time.tv_nsec); + + printf("Total size: %lu MiB\n", map_size / MB); + printf("Total time: %ld.%09ld s\n", scan_time_ns / NSEC_PER_SEC, + scan_time_ns % NSEC_PER_SEC); + printf("Average speed: %.3f MiB/s\n", (map_size / MB) / + ((double)scan_time_ns / NSEC_PER_SEC)); + + munmap(map_ptr, map_size); + return KSFT_PASS; + +err_out: + printf("Not OK\n"); + munmap(map_ptr, map_size); + return KSFT_FAIL; +} + static int ksm_cow_time(int mapping, int prot, int timeout, size_t page_size) { void *map_ptr; @@ -644,7 +705,7 @@ int main(int argc, char *argv[]) bool merge_across_nodes = KSM_MERGE_ACROSS_NODES_DEFAULT; long size_MB = 0; - while ((opt = getopt(argc, argv, "ha:p:l:z:m:s:MUZNPCH")) != -1) { + while ((opt = getopt(argc, argv, "ha:p:l:z:m:s:MUZNPCHD")) != -1) { switch (opt) { case 'a': prot = str_to_prot(optarg); @@ -701,6 +762,9 @@ int main(int argc, char *argv[]) case 'H': test_name = KSM_MERGE_TIME_HUGE_PAGES; break; + case 'D': + test_name = KSM_UNMERGE_TIME; + break; case 'C': test_name = KSM_COW_TIME; break; @@ -762,6 +826,14 @@ int main(int argc, char *argv[]) ret = ksm_merge_hugepages_time(MAP_PRIVATE | MAP_ANONYMOUS, prot, ksm_scan_limit_sec, size_MB); break; + case KSM_UNMERGE_TIME: + if (size_MB == 0) { + printf("Option '-s' is required.\n"); + return KSFT_FAIL; + } + ret = ksm_unmerge_time(MAP_PRIVATE | MAP_ANONYMOUS, prot, + ksm_scan_limit_sec, size_MB); + break; case KSM_COW_TIME: ret = ksm_cow_time(MAP_PRIVATE | MAP_ANONYMOUS, prot, ksm_scan_limit_sec, page_size); From patchwork Fri Sep 30 14:19:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12995514 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9D125C433FE for ; Fri, 30 Sep 2022 14:20:00 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3D8766B0074; Fri, 30 Sep 2022 10:20:00 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 3619C8D0003; Fri, 30 Sep 2022 10:20:00 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1DBDA8D0002; Fri, 30 Sep 2022 10:20:00 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 0960D6B0074 for ; Fri, 30 Sep 2022 10:20:00 -0400 (EDT) Received: from smtpin15.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id D79F1ABF26 for ; Fri, 30 Sep 2022 14:19:59 +0000 (UTC) X-FDA: 79968960918.15.B2AFABD Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf11.hostedemail.com (Postfix) with ESMTP id 6750B40008 for ; Fri, 30 Sep 2022 14:19:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1664547598; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=h/FXbSZy3Kth2MbXXSfAiotPyH32a5Ri5Xru6xlLiiI=; b=H70TQFE1exTXNuD/c7XJULdQzc7I067hhfU/r+ZSZ6UgBvkJ5eabKgFu2ZGN8geW+kW3RU tzdcdnCKrH7Ny+lZj3aSaF8R+r657zz2OX1qE3IjvDV7qez6+9CUBXKkmb0KWeash5gMnK DJtNqeBJz4fGqMxrUt+q3kvAfbB3ZDo= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-81-xX1KmHDqPWyeq4x3g8JPWw-1; Fri, 30 Sep 2022 10:19:53 -0400 X-MC-Unique: xX1KmHDqPWyeq4x3g8JPWw-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id E3D0380C8C3; Fri, 30 Sep 2022 14:19:52 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.194.187]) by smtp.corp.redhat.com (Postfix) with ESMTP id 05DD71121314; Fri, 30 Sep 2022 14:19:48 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, linux-kselftest@vger.kernel.org, David Hildenbrand , Andrew Morton , Shuah Khan , Hugh Dickins , Vlastimil Babka , Peter Xu , Andrea Arcangeli , "Matthew Wilcox (Oracle)" , Jason Gunthorpe , John Hubbard Subject: [PATCH v1 2/7] mm/ksm: simplify break_ksm() to not rely on VM_FAULT_WRITE Date: Fri, 30 Sep 2022 16:19:26 +0200 Message-Id: <20220930141931.174362-3-david@redhat.com> In-Reply-To: <20220930141931.174362-1-david@redhat.com> References: <20220930141931.174362-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1664547599; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=h/FXbSZy3Kth2MbXXSfAiotPyH32a5Ri5Xru6xlLiiI=; b=knzC8+JHgFkWQMBAhLozFqPCoZ+1dViG3VJWmfGOsnnOBKVFHUT5hhbz/b/ZiHfKRSLRu6 yYXf4ohujQVVe/2not1qRRNKLZ3XUQcbBW/pgUVNDBvtm+PqqhEhEqDixOnpGXMmjj5ClW ZkRquCWNMRT0P61qCNlSzwMLwwqtiVs= ARC-Authentication-Results: i=1; imf11.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=H70TQFE1; spf=pass (imf11.hostedemail.com: domain of david@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1664547599; a=rsa-sha256; cv=none; b=dQXLdc+7dLM+TKLD9VIk1jvE1ji/WMbS5l8oJo4O6RkleqStk7HTXly4KNl1LjxATfWvxG Nbw22HsxdsHvdIrjZiQGEIReKaIUFvP/zCB7qL7WF+BuPOAUAD4gIIPuSA+YD2iEZklzKH qzIbcoWBR1e9cQxdUkyhKoa5Zh14deY= X-Rspam-User: X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: 6750B40008 Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=H70TQFE1; spf=pass (imf11.hostedemail.com: domain of david@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Stat-Signature: 9y9uh3n8i4f6mmhejrqbzd3rtmgem6c1 X-HE-Tag: 1664547599-548168 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Now that GUP no longer requires VM_FAULT_WRITE, break_ksm() is the sole remaining user of VM_FAULT_WRITE. As we also want to stop triggering a fake write fault and instead use FAULT_FLAG_UNSHARE -- similar to GUP-triggered unsharing when taking a R/O pin on a shared anonymous page (including KSM pages), let's stop relying on VM_FAULT_WRITE. Let's rework break_ksm() to not rely on the return value of handle_mm_fault() anymore to figure out whether COW-breaking was successful. Simply perform another follow_page() lookup to verify the result. While this makes break_ksm() slightly less efficient, we can simplify handle_mm_fault() a little and easily switch to FAULT_FLAG_UNSHARE without introducing similar KSM-specific behavior for FAULT_FLAG_UNSHARE. In my setup (AMD Ryzen 9 3900X), running the KSM selftest to test unmerge performance on 2 GiB (taskset 0x8 ./ksm_tests -D -s 2048), this results in a performance degradation of ~4% -- 5% (old: ~5250 MiB/s, new: ~5010 MiB/s). I don't think that we particularly care about that performance drop when unmerging. If it ever turns out to be an actual performance issue, we can think about a better alternative for FAULT_FLAG_UNSHARE -- let's just keep it simple for now. Signed-off-by: David Hildenbrand Acked-by: Peter Xu --- mm/ksm.c | 25 +++++++++++++------------ 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/mm/ksm.c b/mm/ksm.c index 0cd2f4b62334..e8d987fb379e 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -473,26 +473,27 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr) vm_fault_t ret = 0; do { + bool ksm_page = false; + cond_resched(); page = follow_page(vma, addr, FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE); if (IS_ERR_OR_NULL(page)) break; if (PageKsm(page)) - ret = handle_mm_fault(vma, addr, - FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE, - NULL); - else - ret = VM_FAULT_WRITE; + ksm_page = true; put_page(page); - } while (!(ret & (VM_FAULT_WRITE | VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV | VM_FAULT_OOM))); + + if (!ksm_page) + return 0; + ret = handle_mm_fault(vma, addr, + FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE, + NULL); + } while (!(ret & (VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV | VM_FAULT_OOM))); /* - * We must loop because handle_mm_fault() may back out if there's - * any difficulty e.g. if pte accessed bit gets updated concurrently. - * - * VM_FAULT_WRITE is what we have been hoping for: it indicates that - * COW has been broken, even if the vma does not permit VM_WRITE; - * but note that a concurrent fault might break PageKsm for us. + * We must loop until we no longer find a KSM page because + * handle_mm_fault() may back out if there's any difficulty e.g. if + * pte accessed bit gets updated concurrently. * * VM_FAULT_SIGBUS could occur if we race with truncation of the * backing file, which also invalidates anonymous pages: that's From patchwork Fri Sep 30 14:19:27 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12995515 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9EB4DC433F5 for ; Fri, 30 Sep 2022 14:20:30 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 44DFB6B0073; Fri, 30 Sep 2022 10:20:30 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 3D7198D0003; Fri, 30 Sep 2022 10:20:30 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2516A8D0002; Fri, 30 Sep 2022 10:20:30 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 110476B0073 for ; Fri, 30 Sep 2022 10:20:30 -0400 (EDT) Received: from smtpin25.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 9C29A41571 for ; Fri, 30 Sep 2022 14:20:29 +0000 (UTC) X-FDA: 79968962178.25.328356A Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf26.hostedemail.com (Postfix) with ESMTP id 6E6C7140011 for ; Fri, 30 Sep 2022 14:20:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1664547626; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=BgHuUZqK9el1Q359iaqosMQiNhGcGN8eqaVP266/i4g=; b=GxQ2j4lLzFfyw6uSNdL8wNvYPOC8QO06G2a2+xNsuzMkDNfvh6+zBlD2QY5We91F7Bxry0 7Gxze69epWyV7/JJGFySquTOIRPvB0kzVn/1PxnIx/m/EB2AzT3Krn0Ecup4hRFL7S3yHl cQZdAic7Jed0RWnr9wMWoFh1J5rDW7s= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-567-ew032SwtPdytGZ38382Xcg-1; Fri, 30 Sep 2022 10:20:23 -0400 X-MC-Unique: ew032SwtPdytGZ38382Xcg-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id CB375858F13; Fri, 30 Sep 2022 14:20:22 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.194.187]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3271C1121315; Fri, 30 Sep 2022 14:19:53 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, linux-kselftest@vger.kernel.org, David Hildenbrand , Andrew Morton , Shuah Khan , Hugh Dickins , Vlastimil Babka , Peter Xu , Andrea Arcangeli , "Matthew Wilcox (Oracle)" , Jason Gunthorpe , John Hubbard Subject: [PATCH v1 3/7] mm: remove VM_FAULT_WRITE Date: Fri, 30 Sep 2022 16:19:27 +0200 Message-Id: <20220930141931.174362-4-david@redhat.com> In-Reply-To: <20220930141931.174362-1-david@redhat.com> References: <20220930141931.174362-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1664547629; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=BgHuUZqK9el1Q359iaqosMQiNhGcGN8eqaVP266/i4g=; b=HsWqPl38PCdpY4X+PWUoc0ejx09/5W7LX7jUXrh9kDwYXMPwdqNYPy+hakKdCVzxxbc9yo EpYHhI6dsRAh5MKKFc6WeWHPYgO9No1Y8W+vBEI5uxESvTIyKGifi/UlrEmSbkl30sFgUa wgMMgvt3ImlVl0oKVeKr/g5MgFG/ZN0= ARC-Authentication-Results: i=1; imf26.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=GxQ2j4lL; spf=pass (imf26.hostedemail.com: domain of david@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1664547629; a=rsa-sha256; cv=none; b=n9WUw0hY8zZy6qNaD4DLqpmQLt2UqTTU0LwehyoSIueVoAXp/NnfiMztNZkhE0KKZU+59e mjwYT7ZeoeJBrByM36uCsW5kvkzrTOuEYy7W7UdKH6EzMjrsceg/uYBLdyMLqINhloBbEx Xp9yQgOSYRhJDopjcSba3d3Scuyidjw= Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=GxQ2j4lL; spf=pass (imf26.hostedemail.com: domain of david@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Rspam-User: X-Stat-Signature: gnh1n1e8grwkqhqusjm8s5kejfd7uock X-Rspamd-Queue-Id: 6E6C7140011 X-Rspamd-Server: rspam08 X-HE-Tag: 1664547627-377698 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: All users -- GUP and KSM -- are gone, let's just remove it. Signed-off-by: David Hildenbrand Acked-by: Peter Xu --- include/linux/mm_types.h | 3 --- mm/huge_memory.c | 2 +- mm/memory.c | 9 ++++----- 3 files changed, 5 insertions(+), 9 deletions(-) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 8f30f262431c..6a1375dcb4ac 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -807,7 +807,6 @@ typedef __bitwise unsigned int vm_fault_t; * @VM_FAULT_OOM: Out Of Memory * @VM_FAULT_SIGBUS: Bad access * @VM_FAULT_MAJOR: Page read from storage - * @VM_FAULT_WRITE: Special case for get_user_pages * @VM_FAULT_HWPOISON: Hit poisoned small page * @VM_FAULT_HWPOISON_LARGE: Hit poisoned large page. Index encoded * in upper bits @@ -828,7 +827,6 @@ enum vm_fault_reason { VM_FAULT_OOM = (__force vm_fault_t)0x000001, VM_FAULT_SIGBUS = (__force vm_fault_t)0x000002, VM_FAULT_MAJOR = (__force vm_fault_t)0x000004, - VM_FAULT_WRITE = (__force vm_fault_t)0x000008, VM_FAULT_HWPOISON = (__force vm_fault_t)0x000010, VM_FAULT_HWPOISON_LARGE = (__force vm_fault_t)0x000020, VM_FAULT_SIGSEGV = (__force vm_fault_t)0x000040, @@ -854,7 +852,6 @@ enum vm_fault_reason { { VM_FAULT_OOM, "OOM" }, \ { VM_FAULT_SIGBUS, "SIGBUS" }, \ { VM_FAULT_MAJOR, "MAJOR" }, \ - { VM_FAULT_WRITE, "WRITE" }, \ { VM_FAULT_HWPOISON, "HWPOISON" }, \ { VM_FAULT_HWPOISON_LARGE, "HWPOISON_LARGE" }, \ { VM_FAULT_SIGSEGV, "SIGSEGV" }, \ diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 84bf1d5f6b7e..b351c1d4f858 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1376,7 +1376,7 @@ vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf) if (pmdp_set_access_flags(vma, haddr, vmf->pmd, entry, 1)) update_mmu_cache_pmd(vma, vmf->address, vmf->pmd); spin_unlock(vmf->ptl); - return VM_FAULT_WRITE; + return 0; } unlock_fallback: diff --git a/mm/memory.c b/mm/memory.c index e49faa0a1f9a..6e2f47d05f2b 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3240,7 +3240,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) } delayacct_wpcopy_end(); - return (page_copied && !unshare) ? VM_FAULT_WRITE : 0; + return 0; oom_free_new: put_page(new_page); oom: @@ -3304,14 +3304,14 @@ static vm_fault_t wp_pfn_shared(struct vm_fault *vmf) return finish_mkwrite_fault(vmf); } wp_page_reuse(vmf); - return VM_FAULT_WRITE; + return 0; } static vm_fault_t wp_page_shared(struct vm_fault *vmf) __releases(vmf->ptl) { struct vm_area_struct *vma = vmf->vma; - vm_fault_t ret = VM_FAULT_WRITE; + vm_fault_t ret = 0; get_page(vmf->page); @@ -3462,7 +3462,7 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf) return 0; } wp_page_reuse(vmf); - return VM_FAULT_WRITE; + return 0; } else if (unshare) { /* No anonymous page -> nothing to do. */ pte_unmap_unlock(vmf->pte, vmf->ptl); @@ -3960,7 +3960,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) if (vmf->flags & FAULT_FLAG_WRITE) { pte = maybe_mkwrite(pte_mkdirty(pte), vma); vmf->flags &= ~FAULT_FLAG_WRITE; - ret |= VM_FAULT_WRITE; } rmap_flags |= RMAP_EXCLUSIVE; } From patchwork Fri Sep 30 14:19:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12995516 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2555AC433F5 for ; Fri, 30 Sep 2022 14:20:34 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9CD176B0074; Fri, 30 Sep 2022 10:20:33 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9555F8D0003; Fri, 30 Sep 2022 10:20:33 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7A80D8D0002; Fri, 30 Sep 2022 10:20:33 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 6886A6B0074 for ; Fri, 30 Sep 2022 10:20:33 -0400 (EDT) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id 498CDA0FF6 for ; Fri, 30 Sep 2022 14:20:33 +0000 (UTC) X-FDA: 79968962346.02.21FEE93 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf12.hostedemail.com (Postfix) with ESMTP id C20FE40004 for ; Fri, 30 Sep 2022 14:20:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1664547632; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=AYManVlpCvt6iYiv3mntijvloAa1rW/lzIzdVimSiKE=; b=is9OulGwAm8tdzpIUeyhsY7C/MQE38xWj9iPXp3iSt43nyNOUFf/p1DvqPzBE/9b4z7Kgj tQMhhym92YShrbIOqiktuVcwucgFZshvpo1z/6CALEO0M+XidvU9/D5eADOyr4vjlR/4S3 ccVorqq2tpWR1pX4sInhMXKwo78emwc= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-436-QCmRDKFMPtK1hafPaPCFGA-1; Fri, 30 Sep 2022 10:20:31 -0400 X-MC-Unique: QCmRDKFMPtK1hafPaPCFGA-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 31D02101A52A; Fri, 30 Sep 2022 14:20:30 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.194.187]) by smtp.corp.redhat.com (Postfix) with ESMTP id 691AE112132E; Fri, 30 Sep 2022 14:20:23 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, linux-kselftest@vger.kernel.org, David Hildenbrand , Andrew Morton , Shuah Khan , Hugh Dickins , Vlastimil Babka , Peter Xu , Andrea Arcangeli , "Matthew Wilcox (Oracle)" , Jason Gunthorpe , John Hubbard Subject: [PATCH v1 4/7] mm/ksm: fix KSM COW breaking with userfaultfd-wp via FAULT_FLAG_UNSHARE Date: Fri, 30 Sep 2022 16:19:28 +0200 Message-Id: <20220930141931.174362-5-david@redhat.com> In-Reply-To: <20220930141931.174362-1-david@redhat.com> References: <20220930141931.174362-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1664547632; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=AYManVlpCvt6iYiv3mntijvloAa1rW/lzIzdVimSiKE=; b=FHbxYHBYpj4Cset08cBP+Y57CFHnML9Fw7fpfJ0k8/sklteE8MKfhymKZWJtITTknFHuzz GnKnbqz9NrjZKDEvklPbAbsqmjOVvxQDCFAJaucEP7y1P3XjULNk6dsWx791FxbtanuD++ eZPpEOKg5EEUqUMqlOM+z7uLWCum244= ARC-Authentication-Results: i=1; imf12.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=is9OulGw; spf=pass (imf12.hostedemail.com: domain of david@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1664547632; a=rsa-sha256; cv=none; b=r/1iRRf0MkmFxEflnn+Vyd455DSPLcwVZdIPpYqAPviTwXAConfOmdGzEOx351a7Lr3RZK S39yKxWNMw5ULZub9zeAWLfNUgL+XpMxR3buyFuBQjaSxz8bwucA/kmB4xj0vOBm/m5gyt BqC3f445Gyx1qNDX8cdKUCY82PCOcxs= X-Rspam-User: X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: C20FE40004 Authentication-Results: imf12.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=is9OulGw; spf=pass (imf12.hostedemail.com: domain of david@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Stat-Signature: hdk7ch3wiesigc5k81i6i6x4ra6j5yy4 X-HE-Tag: 1664547632-939455 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Let's stop breaking COW via a fake write fault and let's use FAULT_FLAG_UNSHARE instead. This avoids any wrong side effects of the fake write fault, such as mapping the PTE writable and marking the pte dirty/softdirty. Also, this fixes KSM interaction with userfaultfd-wp: when we have a KSM page that's write-protected by userfaultfd, break_ksm()->handle_mm_fault() will fail with VM_FAULT_SIGBUS and will simpy return in break_ksm() with 0. The warning in dmesg indicates this wrong handling: [ 230.096368] FAULT_FLAG_ALLOW_RETRY missing 881 [ 230.100822] CPU: 1 PID: 1643 Comm: ksm-uffd-wp [...] [ 230.110124] Hardware name: [...] [ 230.117775] Call Trace: [ 230.120227] [ 230.122334] dump_stack_lvl+0x44/0x5c [ 230.126010] handle_userfault.cold+0x14/0x19 [ 230.130281] ? tlb_finish_mmu+0x65/0x170 [ 230.134207] ? uffd_wp_range+0x65/0xa0 [ 230.137959] ? _raw_spin_unlock+0x15/0x30 [ 230.141972] ? do_wp_page+0x50/0x590 [ 230.145551] __handle_mm_fault+0x9f5/0xf50 [ 230.149652] ? mmput+0x1f/0x40 [ 230.152712] handle_mm_fault+0xb9/0x2a0 [ 230.156550] break_ksm+0x141/0x180 [ 230.159964] unmerge_ksm_pages+0x60/0x90 [ 230.163890] ksm_madvise+0x3c/0xb0 [ 230.167295] do_madvise.part.0+0x10c/0xeb0 [ 230.171396] ? do_syscall_64+0x67/0x80 [ 230.175157] __x64_sys_madvise+0x5a/0x70 [ 230.179082] do_syscall_64+0x58/0x80 [ 230.182661] ? do_syscall_64+0x67/0x80 [ 230.186413] entry_SYSCALL_64_after_hwframe+0x63/0xcd -------------------------------------------------------------------------- #include #include #include #include #include #include #include #include #include #include #define MMAP_SIZE (2 * 1024 * 1024u) static char *map; int uffd; static int setup_uffd(void) { struct uffdio_api uffdio_api; struct uffdio_register uffdio_register; struct uffdio_writeprotect uffd_writeprotect; struct uffdio_range uffd_range; uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK); if (uffd < 0) { fprintf(stderr, "syscall() failed: %d\n", errno); return -errno; } uffdio_api.api = UFFD_API; uffdio_api.features = UFFD_FEATURE_PAGEFAULT_FLAG_WP; if (ioctl(uffd, UFFDIO_API, &uffdio_api) < 0) { fprintf(stderr, "UFFDIO_API failed: %d\n", errno); return -errno; } if (!(uffdio_api.features & UFFD_FEATURE_PAGEFAULT_FLAG_WP)) { fprintf(stderr, "UFFD_FEATURE_WRITEPROTECT missing\n"); return -ENOSYS; } /* Register UFFD-WP */ uffdio_register.range.start = (unsigned long) map; uffdio_register.range.len = MMAP_SIZE; uffdio_register.mode = UFFDIO_REGISTER_MODE_WP; if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) < 0) { fprintf(stderr, "UFFDIO_REGISTER failed: %d\n", errno); return -errno; } /* Writeprotect the range. */ uffd_writeprotect.range.start = (unsigned long) map; uffd_writeprotect.range.len = MMAP_SIZE; uffd_writeprotect.mode = UFFDIO_WRITEPROTECT_MODE_WP; if (ioctl(uffd, UFFDIO_WRITEPROTECT, &uffd_writeprotect)) { fprintf(stderr, "UFFDIO_WRITEPROTECT failed: %d\n", errno); return -errno; } return 0; } int main(int argc, char **argv) { int ksm_fd, ret; ksm_fd = open("/sys/kernel/mm/ksm/run", O_RDWR); if (ksm_fd < 0) { fprintf(stderr, "KSM not available\n"); return -errno; } map = mmap(NULL, MMAP_SIZE, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0); if (map == MAP_FAILED) { fprintf(stderr, "mmap() failed\n"); return -errno; } ret = madvise(map, MMAP_SIZE, MADV_NOHUGEPAGE); if (ret) { fprintf(stderr, "MADV_NOHUGEPAGE failed\n"); return -errno; } /* Fill with same value and trigger merging. */ memset(map, 0xff, MMAP_SIZE); ret = madvise(map, MMAP_SIZE, MADV_MERGEABLE); if (ret) { fprintf(stderr, "MADV_MERGEABLE failed\n"); return -errno; } /* * Run KSM to trigger merging and wait a bit until merging should be * done. */ if (write(ksm_fd, "1", 1) != 1) { fprintf(stderr, "Running KSM failed\n"); } sleep(10); /* Write-protect the range with UFFD. */ if (setup_uffd()) return 1; /* Trigger unsharing. */ ret = madvise(map, MMAP_SIZE, MADV_UNMERGEABLE); if (ret) { fprintf(stderr, "MADV_UNMERGEABLE failed\n"); return -errno; } return 0; } -------------------------------------------------------------------------- Consequently, we will no longer trigger a fake write fault and break COW without any such side-effects. This is primarily a fix for KSM+userfaultfd-wp, however, the fake write fault was always questionable. As this fix is not easy to backport and it's not very critical, let's not cc stable. Fixes: 529b930b87d9 ("userfaultfd: wp: hook userfault handler to write protection fault") Signed-off-by: David Hildenbrand Acked-by: Peter Xu --- mm/ksm.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/mm/ksm.c b/mm/ksm.c index e8d987fb379e..4d7bcf7da7c3 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -453,17 +453,15 @@ static inline bool ksm_test_exit(struct mm_struct *mm) } /* - * We use break_ksm to break COW on a ksm page: it's a stripped down + * We use break_ksm to break COW on a ksm page by triggering unsharing, + * such that the ksm page will get replaced by an exclusive anonymous page. * - * if (get_user_pages(addr, 1, FOLL_WRITE, &page, NULL) == 1) - * put_page(page); - * - * but taking great care only to touch a ksm page, in a VM_MERGEABLE vma, + * We take great care only to touch a ksm page, in a VM_MERGEABLE vma, * in case the application has unmapped and remapped mm,addr meanwhile. * Could a ksm page appear anywhere else? Actually yes, in a VM_PFNMAP * mmap of /dev/mem, where we would not want to touch it. * - * FAULT_FLAG/FOLL_REMOTE are because we do this outside the context + * FAULT_FLAG_REMOTE/FOLL_REMOTE are because we do this outside the context * of the process that owns 'vma'. We also do not want to enforce * protection keys here anyway. */ @@ -487,7 +485,7 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr) if (!ksm_page) return 0; ret = handle_mm_fault(vma, addr, - FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE, + FAULT_FLAG_UNSHARE | FAULT_FLAG_REMOTE, NULL); } while (!(ret & (VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV | VM_FAULT_OOM))); /* From patchwork Fri Sep 30 14:19:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12995517 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 11BC4C433FE for ; Fri, 30 Sep 2022 14:20:37 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A41366B0078; Fri, 30 Sep 2022 10:20:36 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9CA8E8D0003; Fri, 30 Sep 2022 10:20:36 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 81DA48D0002; Fri, 30 Sep 2022 10:20:36 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 6C5806B0078 for ; Fri, 30 Sep 2022 10:20:36 -0400 (EDT) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 2F6ADABF26 for ; Fri, 30 Sep 2022 14:20:36 +0000 (UTC) X-FDA: 79968962472.20.6FF99FC Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf26.hostedemail.com (Postfix) with ESMTP id BFDF5140021 for ; Fri, 30 Sep 2022 14:20:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1664547634; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5Os3ebntrNckY9ikYeoLSgVh4tT4cYPUxPafC1slc+E=; b=agUYkZXHcYsfF7Wsa/RAZkhDCUvIPfqMKi9JGJeCRwA62L+0LcRq+7hpyqcUEipR8fcPIp Yb4aFSYwAPdsahWQoxVkxPARfsigLIy1COPFaqgYeHCExEf0xHK+F4dWXez0p40bExVeLU tBRrmIazGPWTLk/i7rxlhfWMkIjjPjs= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-168-gjvSGRHeOl-RFUWtXhoFUQ-1; Fri, 30 Sep 2022 10:20:33 -0400 X-MC-Unique: gjvSGRHeOl-RFUWtXhoFUQ-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id C1544101E150; Fri, 30 Sep 2022 14:20:32 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.194.187]) by smtp.corp.redhat.com (Postfix) with ESMTP id 92C461121315; Fri, 30 Sep 2022 14:20:30 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, linux-kselftest@vger.kernel.org, David Hildenbrand , Andrew Morton , Shuah Khan , Hugh Dickins , Vlastimil Babka , Peter Xu , Andrea Arcangeli , "Matthew Wilcox (Oracle)" , Jason Gunthorpe , John Hubbard Subject: [PATCH v1 5/7] mm/pagewalk: add walk_page_range_vma() Date: Fri, 30 Sep 2022 16:19:29 +0200 Message-Id: <20220930141931.174362-6-david@redhat.com> In-Reply-To: <20220930141931.174362-1-david@redhat.com> References: <20220930141931.174362-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1664547635; a=rsa-sha256; cv=none; b=5r0kfO20siYI2aweM/71jwihNQ3Qm1xWBE5YvIMjPAUUWOf3KNeH7rEWr/LNEozXvolQcH WAhDDO2tJYsobj7KlMG1Fq7zQJBVCH7YuEHY7aUuO9y8AciVtE3Eyhi6zmVJdZjuIqHTEb Vg7WWCVTPQYpuKWomI6K4P5tPDpVdyA= ARC-Authentication-Results: i=1; imf26.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=agUYkZXH; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf26.hostedemail.com: domain of david@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=david@redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1664547635; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=5Os3ebntrNckY9ikYeoLSgVh4tT4cYPUxPafC1slc+E=; b=JN5B93hjLF+JxZQW0MCpP9l+23edAWh4TtonIRP+V2f9jOY8bIDS33MQ132/gWyLElgpta JueNZHeGARzL6wu6UWaaCgd3U9NPaLmUxTdZA5L3v4By/RFyo4hdTFt2ZZGjI9WAfEG8Li rdn4VW51aenWZyro8HgrFQq+u84EvbM= X-Stat-Signature: axkijw43nnusspxxdeui35kns8fqb318 X-Rspamd-Queue-Id: BFDF5140021 X-Rspamd-Server: rspam04 Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=agUYkZXH; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf26.hostedemail.com: domain of david@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=david@redhat.com X-Rspam-User: X-HE-Tag: 1664547635-23538 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Let's add walk_page_range_vma(), which is similar to walk_page_vma(), however, is only interested in a subset of the VMA range. To be used in KSM code to stop using follow_page() next. Signed-off-by: David Hildenbrand --- include/linux/pagewalk.h | 3 +++ mm/pagewalk.c | 27 +++++++++++++++++++++++++++ 2 files changed, 30 insertions(+) diff --git a/include/linux/pagewalk.h b/include/linux/pagewalk.h index f3fafb731ffd..2f8f6cc980b4 100644 --- a/include/linux/pagewalk.h +++ b/include/linux/pagewalk.h @@ -99,6 +99,9 @@ int walk_page_range_novma(struct mm_struct *mm, unsigned long start, unsigned long end, const struct mm_walk_ops *ops, pgd_t *pgd, void *private); +int walk_page_range_vma(struct vm_area_struct *vma, unsigned long start, + unsigned long end, const struct mm_walk_ops *ops, + void *private); int walk_page_vma(struct vm_area_struct *vma, const struct mm_walk_ops *ops, void *private); int walk_page_mapping(struct address_space *mapping, pgoff_t first_index, diff --git a/mm/pagewalk.c b/mm/pagewalk.c index 131b2b335b2c..757c075da231 100644 --- a/mm/pagewalk.c +++ b/mm/pagewalk.c @@ -514,6 +514,33 @@ int walk_page_range_novma(struct mm_struct *mm, unsigned long start, return __walk_page_range(start, end, &walk); } +int walk_page_range_vma(struct vm_area_struct *vma, unsigned long start, + unsigned long end, const struct mm_walk_ops *ops, + void *private) +{ + struct mm_walk walk = { + .ops = ops, + .mm = vma->vm_mm, + .vma = vma, + .private = private, + }; + int err; + + if (start >= end || !walk.mm) + return -EINVAL; + if (start < vma->vm_start || end > vma->vm_end) + return -EINVAL; + + mmap_assert_locked(walk.mm); + + err = walk_page_test(start, end, &walk); + if (err > 0) + return 0; + if (err < 0) + return err; + return __walk_page_range(start, end, &walk); +} + int walk_page_vma(struct vm_area_struct *vma, const struct mm_walk_ops *ops, void *private) { From patchwork Fri Sep 30 14:19:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12995518 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 07C01C4332F for ; Fri, 30 Sep 2022 14:20:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 90A048D0002; Fri, 30 Sep 2022 10:20:51 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 892176B007E; Fri, 30 Sep 2022 10:20:51 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 732C78D0002; Fri, 30 Sep 2022 10:20:51 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 5FB736B007D for ; Fri, 30 Sep 2022 10:20:51 -0400 (EDT) Received: from smtpin27.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 24DFC41577 for ; Fri, 30 Sep 2022 14:20:51 +0000 (UTC) X-FDA: 79968963102.27.446B8E3 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf03.hostedemail.com (Postfix) with ESMTP id CD5122000B for ; Fri, 30 Sep 2022 14:20:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1664547650; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=QdWmhT+XCWVUBSGE3ZoFfg/O04jXqhSLTEOmUwlj3I8=; b=JyFXUR7ZTukTDqSBmzhTf0BteO513nY8Oai/mn1btshLc+JXFKBx7EP25jSzwwo/Lj0Jcz 63bXsVMQrUK1wtLQ2kuWEOUGUW4MsgGW1bi+FrTavXbHvjJ+DPmUqZW42+dg0p7QLqN5ZG DrH4FFpeNkK4P5+dDWrT87p3ws0nMME= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-499-ZiHXnUcHNF6VF1dflaLRxQ-1; Fri, 30 Sep 2022 10:20:47 -0400 X-MC-Unique: ZiHXnUcHNF6VF1dflaLRxQ-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 8F6342A5955A; Fri, 30 Sep 2022 14:20:46 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.194.187]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0CB731121314; Fri, 30 Sep 2022 14:20:32 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, linux-kselftest@vger.kernel.org, David Hildenbrand , Andrew Morton , Shuah Khan , Hugh Dickins , Vlastimil Babka , Peter Xu , Andrea Arcangeli , "Matthew Wilcox (Oracle)" , Jason Gunthorpe , John Hubbard Subject: [PATCH v1 6/7] mm/ksm: convert break_ksm() to use walk_page_range_vma() Date: Fri, 30 Sep 2022 16:19:30 +0200 Message-Id: <20220930141931.174362-7-david@redhat.com> In-Reply-To: <20220930141931.174362-1-david@redhat.com> References: <20220930141931.174362-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 ARC-Authentication-Results: i=1; imf03.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=JyFXUR7Z; spf=pass (imf03.hostedemail.com: domain of david@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1664547650; a=rsa-sha256; cv=none; b=4FjdYbxWBM5P/aqVdzefA/P85BKpUaCydPN2xuwVrjWk2d6SMPaJfDQS7E4pyiDFFPEQT8 oTM7kHoKTnxWrq10kWiYUHAW76QJHY+1+9tGhh/3E9BFFtayASszpHqg/giHdAE1/Ie4q7 5z1r1w9s9LLs2PtscNGS1hD9YPkjuns= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1664547650; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=QdWmhT+XCWVUBSGE3ZoFfg/O04jXqhSLTEOmUwlj3I8=; b=7S0sAWvSuQ28SytAZKT3645JWLtiLiJ2vhKn0JQGfAEg/WhpUg8v4oxnVqHZ3nPXOBNSbS Zx5uYDdBim01UL5yVEvHeBMZsz8oYkd+wm99PWBOTgXw7/9wLWNXC4FNWp/yeDXATcG4WV 6Lkp0BcyV3gv/wY2jfOGlh2GNE6gYV4= X-Stat-Signature: i8oeuthdyed9kbcdumm1mpfjrfngu9xh X-Rspamd-Queue-Id: CD5122000B Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=JyFXUR7Z; spf=pass (imf03.hostedemail.com: domain of david@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Rspam-User: X-Rspamd-Server: rspam07 X-HE-Tag: 1664547650-691075 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: FOLL_MIGRATION exists only for the purpose of break_ksm(), and actually, there is not even the need to wait for the migration to finish, we only want to know if we're dealing with a KSM page. Using follow_page() just to identify a KSM page overcomplicates GUP code. Let's use walk_page_range_vma() instead, because we don't actually care about the page itself, we only need to know a single property -- no need to even grab a reference on the page. In my setup (AMD Ryzen 9 3900X), running the KSM selftest to test unmerge performance on 2 GiB (taskset 0x8 ./ksm_tests -D -s 2048), this results in a performance degradation of ~4% (old: ~5010 MiB/s, new: ~4800 MiB/s). I don't think we particularly care for now. Signed-off-by: David Hildenbrand Signed-off-by: David Hildenbrand --- mm/ksm.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 62 insertions(+), 8 deletions(-) diff --git a/mm/ksm.c b/mm/ksm.c index 4d7bcf7da7c3..814c1a37c323 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -39,6 +39,7 @@ #include #include #include +#include #include #include "internal.h" @@ -452,6 +453,60 @@ static inline bool ksm_test_exit(struct mm_struct *mm) return atomic_read(&mm->mm_users) == 0; } +int break_ksm_pud_entry(pud_t *pud, unsigned long addr, unsigned long next, + struct mm_walk *walk) +{ + /* We only care about page tables to walk to a single base page. */ + if (pud_leaf(*pud) || !pud_present(*pud)) + return 1; + return 0; +} + +int break_ksm_pmd_entry(pmd_t *pmd, unsigned long addr, unsigned long next, + struct mm_walk *walk) +{ + bool *ksm_page = walk->private; + struct page *page = NULL; + pte_t *pte, ptent; + spinlock_t *ptl; + + /* We only care about page tables to walk to a single base page. */ + if (pmd_leaf(*pmd) || !pmd_present(*pmd)) + return 1; + + /* + * We only lookup a single page (a) no need to iterate; and (b) + * always return 1 to exit immediately and not iterate in the caller. + */ + pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl); + ptent = *pte; + + if (pte_none(ptent)) + return 1; + if (!pte_present(ptent)) { + swp_entry_t entry = pte_to_swp_entry(ptent); + + /* + * We only care about migration of KSM pages. As KSM pages + * remain KSM pages until freed, no need to wait here for + * migration to end to identify such. + */ + if (is_migration_entry(entry)) + page = pfn_swap_entry_to_page(entry); + } else { + page = vm_normal_page(walk->vma, addr, ptent); + } + if (page && PageKsm(page)) + *ksm_page = true; + pte_unmap_unlock(pte, ptl); + return 1; +} + +static const struct mm_walk_ops break_ksm_ops = { + .pud_entry = break_ksm_pud_entry, + .pmd_entry = break_ksm_pmd_entry, +}; + /* * We use break_ksm to break COW on a ksm page by triggering unsharing, * such that the ksm page will get replaced by an exclusive anonymous page. @@ -467,20 +522,19 @@ static inline bool ksm_test_exit(struct mm_struct *mm) */ static int break_ksm(struct vm_area_struct *vma, unsigned long addr) { - struct page *page; vm_fault_t ret = 0; + if (WARN_ON_ONCE(!IS_ALIGNED(addr, PAGE_SIZE))) + return -EINVAL; + do { bool ksm_page = false; cond_resched(); - page = follow_page(vma, addr, - FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE); - if (IS_ERR_OR_NULL(page)) - break; - if (PageKsm(page)) - ksm_page = true; - put_page(page); + ret = walk_page_range_vma(vma, addr, addr + PAGE_SIZE, + &break_ksm_ops, &ksm_page); + if (WARN_ON_ONCE(ret < 0)) + return ret; if (!ksm_page) return 0; From patchwork Fri Sep 30 14:19:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12995519 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9C806C433F5 for ; Fri, 30 Sep 2022 14:21:03 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4007A8D0003; Fri, 30 Sep 2022 10:21:03 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 3892C6B007E; Fri, 30 Sep 2022 10:21:03 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 202868D0003; Fri, 30 Sep 2022 10:21:03 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 12B306B007D for ; Fri, 30 Sep 2022 10:21:03 -0400 (EDT) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id CBEB5A0FF6 for ; Fri, 30 Sep 2022 14:21:02 +0000 (UTC) X-FDA: 79968963564.14.37F2827 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf06.hostedemail.com (Postfix) with ESMTP id 60F88180008 for ; Fri, 30 Sep 2022 14:21:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1664547661; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NRwNrzzGpz8bO60Pn63vXNqmewRV7ohymI+Kzy3OQVY=; b=VT+Lz1lpMVZKuRSEJPXHYXG5yi11DJMm8Ln/YS/I35+ypDIx1jS06xEVpLkQ0S0JBKfODN PdLuAVuhfTWygGZs6EMo/N3BBkwMFRYXT11gCt/N3SvK+9zZQc2SoGTz+U+JhtqPKrhXVr yABTImqPcFm88EmwqbXMnNJ2uZAio7E= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-192-ced8YZfqPOWKMAfTZSCfpQ-1; Fri, 30 Sep 2022 10:20:56 -0400 X-MC-Unique: ced8YZfqPOWKMAfTZSCfpQ-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 03AAE855305; Fri, 30 Sep 2022 14:20:56 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.194.187]) by smtp.corp.redhat.com (Postfix) with ESMTP id DE74A112132C; Fri, 30 Sep 2022 14:20:46 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, linux-kselftest@vger.kernel.org, David Hildenbrand , Andrew Morton , Shuah Khan , Hugh Dickins , Vlastimil Babka , Peter Xu , Andrea Arcangeli , "Matthew Wilcox (Oracle)" , Jason Gunthorpe , John Hubbard Subject: [PATCH v1 7/7] mm/gup: remove FOLL_MIGRATION Date: Fri, 30 Sep 2022 16:19:31 +0200 Message-Id: <20220930141931.174362-8-david@redhat.com> In-Reply-To: <20220930141931.174362-1-david@redhat.com> References: <20220930141931.174362-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1664547662; a=rsa-sha256; cv=none; b=Wy9+rJ9x6Gbj34Ext2qrSSQaEL/h8TkJ7gJRMsQ8CuGBqH5GiaPVPSydshRNa6WYaUl59J NFy3gJTMlH+AOffvsUqw1gpypq0sr1OgGzXCbmhojxjp3WtLCTAxSVRRAvXxSFrm/XhYXf gXbKDRghXBcJU5RMzAWnJhUOOra8sfU= ARC-Authentication-Results: i=1; imf06.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=VT+Lz1lp; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf06.hostedemail.com: domain of david@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=david@redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1664547662; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=NRwNrzzGpz8bO60Pn63vXNqmewRV7ohymI+Kzy3OQVY=; b=Ss3AaJBbswmj2wY9k04Men97Xb617zjHlnfUpon5foRr7OOw820SlU035W4tLPa0PDGNGv QI8Cu7OxJ3EvUbNBGPlI6+tQbZO4bBu9YYr4NM1/q1ju22PgElT8+r55R06ZpkEzMw06eM jjYdUfrwSMUSqZU7KAs0E1mzSiHME9M= X-Stat-Signature: 78gatk53ax1yapuk5uupi9aga61d4n1q X-Rspamd-Queue-Id: 60F88180008 X-Rspamd-Server: rspam04 Authentication-Results: imf06.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=VT+Lz1lp; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf06.hostedemail.com: domain of david@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=david@redhat.com X-Rspam-User: X-HE-Tag: 1664547662-415274 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Fortunately, the last user (KSM) is gone, so let's just remove this rather special code from generic GUP handling -- especially because KSM never required the PMD handling as KSM only deals with individual base pages. Signed-off-by: David Hildenbrand --- include/linux/mm.h | 1 - mm/gup.c | 55 +++++----------------------------------------- 2 files changed, 5 insertions(+), 51 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index e56dd8f7eae1..4c176e308ead 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2942,7 +2942,6 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address, * and return without waiting upon it */ #define FOLL_NOFAULT 0x80 /* do not fault in pages */ #define FOLL_HWPOISON 0x100 /* check page is hwpoisoned */ -#define FOLL_MIGRATION 0x400 /* wait for page to replace migration entry */ #define FOLL_TRIED 0x800 /* a retry, previous pass started an IO */ #define FOLL_REMOTE 0x2000 /* we are working on non-current tsk/mm */ #define FOLL_ANON 0x8000 /* don't do file mappings */ diff --git a/mm/gup.c b/mm/gup.c index ce00a4c40da8..37195c549f68 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -537,30 +537,13 @@ static struct page *follow_page_pte(struct vm_area_struct *vma, if (WARN_ON_ONCE((flags & (FOLL_PIN | FOLL_GET)) == (FOLL_PIN | FOLL_GET))) return ERR_PTR(-EINVAL); -retry: if (unlikely(pmd_bad(*pmd))) return no_page_table(vma, flags); ptep = pte_offset_map_lock(mm, pmd, address, &ptl); pte = *ptep; - if (!pte_present(pte)) { - swp_entry_t entry; - /* - * KSM's break_ksm() relies upon recognizing a ksm page - * even while it is being migrated, so for that case we - * need migration_entry_wait(). - */ - if (likely(!(flags & FOLL_MIGRATION))) - goto no_page; - if (pte_none(pte)) - goto no_page; - entry = pte_to_swp_entry(pte); - if (!is_migration_entry(entry)) - goto no_page; - pte_unmap_unlock(ptep, ptl); - migration_entry_wait(mm, pmd, address); - goto retry; - } + if (!pte_present(pte)) + goto no_page; if (pte_protnone(pte) && !gup_can_follow_protnone(flags)) goto no_page; @@ -682,28 +665,8 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma, return page; return no_page_table(vma, flags); } -retry: - if (!pmd_present(pmdval)) { - /* - * Should never reach here, if thp migration is not supported; - * Otherwise, it must be a thp migration entry. - */ - VM_BUG_ON(!thp_migration_supported() || - !is_pmd_migration_entry(pmdval)); - - if (likely(!(flags & FOLL_MIGRATION))) - return no_page_table(vma, flags); - - pmd_migration_entry_wait(mm, pmd); - pmdval = READ_ONCE(*pmd); - /* - * MADV_DONTNEED may convert the pmd to null because - * mmap_lock is held in read mode - */ - if (pmd_none(pmdval)) - return no_page_table(vma, flags); - goto retry; - } + if (!pmd_present(pmdval)) + return no_page_table(vma, flags); if (pmd_devmap(pmdval)) { ptl = pmd_lock(mm, pmd); page = follow_devmap_pmd(vma, address, pmd, flags, &ctx->pgmap); @@ -717,18 +680,10 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma, if (pmd_protnone(pmdval) && !gup_can_follow_protnone(flags)) return no_page_table(vma, flags); -retry_locked: ptl = pmd_lock(mm, pmd); - if (unlikely(pmd_none(*pmd))) { - spin_unlock(ptl); - return no_page_table(vma, flags); - } if (unlikely(!pmd_present(*pmd))) { spin_unlock(ptl); - if (likely(!(flags & FOLL_MIGRATION))) - return no_page_table(vma, flags); - pmd_migration_entry_wait(mm, pmd); - goto retry_locked; + return no_page_table(vma, flags); } if (unlikely(!pmd_trans_huge(*pmd))) { spin_unlock(ptl);