From patchwork Wed Jun 29 03:54:15 2022
X-Patchwork-Submitter: "Sierra Guiza, Alejandro (Alex)"
X-Patchwork-Id: 12899270
From: Alex Sierra
Subject: [PATCH v7 03/14] mm: handling Non-LRU pages returned by vm_normal_pages
Date: Tue, 28 Jun 2022 22:54:15 -0500
Message-ID: <20220629035426.20013-4-alex.sierra@amd.com>
In-Reply-To: <20220629035426.20013-1-alex.sierra@amd.com>
References: <20220629035426.20013-1-alex.sierra@amd.com>
MIME-Version: 1.0
With DEVICE_COHERENT, we'll soon have vm_normal_pages() return
device-managed anonymous pages that are not LRU pages. Although they
behave like normal pages for purposes of mapping in CPU page tables and
for COW, they do not support LRU lists, NUMA migration or THP.

Callers of follow_page() that expect LRU pages are also checked for
device zone pages due to the DEVICE_COHERENT type.

Signed-off-by: Alex Sierra
Acked-by: Felix Kuehling (v2)
Reviewed-by: Alistair Popple (v6)
Acked-by: David Hildenbrand
---
 fs/proc/task_mmu.c | 2 +-
 mm/huge_memory.c   | 2 +-
 mm/khugepaged.c    | 9 ++++++---
 mm/ksm.c           | 6 +++---
 mm/madvise.c       | 4 ++--
 mm/memory.c        | 9 ++++++++-
 mm/mempolicy.c     | 2 +-
 mm/migrate.c       | 4 ++--
 mm/mlock.c         | 2 +-
 mm/mprotect.c      | 2 +-
 10 files changed, 26 insertions(+), 16 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 2d04e3470d4c..2dd8c8a66924 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1792,7 +1792,7 @@ static struct page *can_gather_numa_stats(pte_t pte, struct vm_area_struct *vma,
 		return NULL;
 
 	page = vm_normal_page(vma, addr, pte);
-	if (!page)
+	if (!page || is_zone_device_page(page))
 		return NULL;
 
 	if (PageReserved(page))
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 834f288b3769..c47e95b02244 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2910,7 +2910,7 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
 		if (IS_ERR(page))
 			continue;
 
-		if (!page)
+		if (!page || is_zone_device_page(page))
 			continue;
 
 		if (!is_transparent_hugepage(page))
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 16be62d493cd..671ac7800e53 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -618,7 +618,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 			goto out;
 		}
 		page = vm_normal_page(vma, address, pteval);
-		if (unlikely(!page)) {
+		if (unlikely(!page) || unlikely(is_zone_device_page(page))) {
 			result = SCAN_PAGE_NULL;
 			goto out;
 		}
@@ -1267,7 +1267,7 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 			writable = true;
 
 		page = vm_normal_page(vma, _address, pteval);
-		if (unlikely(!page)) {
+		if (unlikely(!page) || unlikely(is_zone_device_page(page))) {
 			result = SCAN_PAGE_NULL;
 			goto out_unmap;
 		}
@@ -1479,7 +1479,8 @@ void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
 		goto abort;
 
 	page = vm_normal_page(vma, addr, *pte);
-
+	if (WARN_ON_ONCE(page && is_zone_device_page(page)))
+		page = NULL;
 	/*
 	 * Note that uprobe, debugger, or MAP_PRIVATE may change the
 	 * page table, but the new page will not be a subpage of hpage.
@@ -1497,6 +1498,8 @@ void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
 		if (pte_none(*pte))
 			continue;
 		page = vm_normal_page(vma, addr, *pte);
+		if (WARN_ON_ONCE(page && is_zone_device_page(page)))
+			goto abort;
 		page_remove_rmap(page, vma, false);
 	}
 
diff --git a/mm/ksm.c b/mm/ksm.c
index 54f78c9eecae..831b18a7a50b 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -475,7 +475,7 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
 		cond_resched();
 		page = follow_page(vma, addr,
 				FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE);
-		if (IS_ERR_OR_NULL(page))
+		if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
 			break;
 		if (PageKsm(page))
 			ret = handle_mm_fault(vma, addr,
@@ -560,7 +560,7 @@ static struct page *get_mergeable_page(struct rmap_item *rmap_item)
 		goto out;
 
 	page = follow_page(vma, addr, FOLL_GET);
-	if (IS_ERR_OR_NULL(page))
+	if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
 		goto out;
 	if (PageAnon(page)) {
 		flush_anon_page(vma, page, addr);
@@ -2308,7 +2308,7 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
 			if (ksm_test_exit(mm))
 				break;
 			*page = follow_page(vma, ksm_scan.address, FOLL_GET);
-			if (IS_ERR_OR_NULL(*page)) {
+			if (IS_ERR_OR_NULL(*page) || is_zone_device_page(*page)) {
 				ksm_scan.address += PAGE_SIZE;
 				cond_resched();
 				continue;
diff --git a/mm/madvise.c b/mm/madvise.c
index 0316bbc6441b..e252635fe935 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -421,7 +421,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 			continue;
 
 		page = vm_normal_page(vma, addr, ptent);
-		if (!page)
+		if (!page || is_zone_device_page(page))
 			continue;
 
 		/*
@@ -639,7 +639,7 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
 		}
 
 		page = vm_normal_page(vma, addr, ptent);
-		if (!page)
+		if (!page || is_zone_device_page(page))
 			continue;
 
 		/*
diff --git a/mm/memory.c b/mm/memory.c
index 7a089145cad4..e18555af9024 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -624,6 +624,13 @@ struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
 		if (is_zero_pfn(pfn))
 			return NULL;
 		if (pte_devmap(pte))
+/*
+ * NOTE: New users of ZONE_DEVICE will not set pte_devmap() and will have
+ * refcounts incremented on their struct pages when they are inserted into
+ * PTEs, thus they are safe to return here. Legacy ZONE_DEVICE pages that set
+ * pte_devmap() do not have refcounts. Example of legacy ZONE_DEVICE is
+ * MEMORY_DEVICE_FS_DAX type in pmem or virtio_fs drivers.
+ */
 			return NULL;
 
 		print_bad_pte(vma, addr, pte, NULL);
@@ -4693,7 +4700,7 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
 	pte = pte_modify(old_pte, vma->vm_page_prot);
 
 	page = vm_normal_page(vma, vmf->address, pte);
-	if (!page)
+	if (!page || is_zone_device_page(page))
 		goto out_map;
 
 	/* TODO: handle PTE-mapped THP */
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index d39b01fd52fe..abc26890fc95 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -523,7 +523,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
 		if (!pte_present(*pte))
 			continue;
 		page = vm_normal_page(vma, addr, *pte);
-		if (!page)
+		if (!page || is_zone_device_page(page))
 			continue;
 		/*
 		 * vm_normal_page() filters out zero pages, but there might
diff --git a/mm/migrate.c b/mm/migrate.c
index 6c1ea61f39d8..a98a219d12ab 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1620,7 +1620,7 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr,
 		goto out;
 
 	err = -ENOENT;
-	if (!page)
+	if (!page || is_zone_device_page(page))
 		goto out;
 
 	err = 0;
@@ -1810,7 +1810,7 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
 		if (IS_ERR(page))
 			goto set_status;
 
-		if (page) {
+		if (page && !is_zone_device_page(page)) {
 			err = page_to_nid(page);
 			put_page(page);
 		} else {
diff --git a/mm/mlock.c b/mm/mlock.c
index 716caf851043..b14e929084cc 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -333,7 +333,7 @@ static int mlock_pte_range(pmd_t *pmd, unsigned long addr,
 		if (!pte_present(*pte))
 			continue;
 		page = vm_normal_page(vma, addr, *pte);
-		if (!page)
+		if (!page || is_zone_device_page(page))
 			continue;
 		if (PageTransCompound(page))
 			continue;
diff --git a/mm/mprotect.c b/mm/mprotect.c
index ba5592655ee3..e034aae2a98b 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -95,7 +95,7 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
 				continue;
 
 			page = vm_normal_page(vma, addr, oldpte);
-			if (!page || PageKsm(page))
+			if (!page || is_zone_device_page(page) || PageKsm(page))
 				continue;
 
 			/* Also skip shared copy-on-write pages */