From patchwork Wed Nov 8 07:59:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Ghiti X-Patchwork-Id: 13449667 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D81DEC4332F for ; Wed, 8 Nov 2023 08:01:54 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 7A6998D00A6; Wed, 8 Nov 2023 03:01:54 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 7576D8D00A2; Wed, 8 Nov 2023 03:01:54 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5F93E8D00A6; Wed, 8 Nov 2023 03:01:54 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 4FA358D00A2 for ; Wed, 8 Nov 2023 03:01:54 -0500 (EST) Received: from smtpin03.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 2ECE51CB7B8 for ; Wed, 8 Nov 2023 08:01:54 +0000 (UTC) X-FDA: 81434043348.03.B820A86 Received: from mail-wm1-f53.google.com (mail-wm1-f53.google.com [209.85.128.53]) by imf15.hostedemail.com (Postfix) with ESMTP id 3E79DA0012 for ; Wed, 8 Nov 2023 08:01:52 +0000 (UTC) Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=rivosinc-com.20230601.gappssmtp.com header.s=20230601 header.b=mwjVmSLP; dmarc=none; spf=pass (imf15.hostedemail.com: domain of alexghiti@rivosinc.com designates 209.85.128.53 as permitted sender) smtp.mailfrom=alexghiti@rivosinc.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1699430512; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=xUvzkpfZJslkqpYiEl2gbMDcHOpTegWwJJ2kuhRtB18=; b=w/uRjyBQA3DvlzYlXtPmgeZrAd5mgPiRi3qS4wD6inXTvAlMH1+v6TqJzbgY3BvuJ+Ip8t uspBL1u1Z4gGCCGuWd22UEek0dUSPhmwwhE0B1ZFUC2EutDLmslY/2FP2bBdTWftBUbWUW dm0JnErbveSt6zR3fpWa7l96J/f/1wM= ARC-Authentication-Results: i=1; imf15.hostedemail.com; dkim=pass header.d=rivosinc-com.20230601.gappssmtp.com header.s=20230601 header.b=mwjVmSLP; dmarc=none; spf=pass (imf15.hostedemail.com: domain of alexghiti@rivosinc.com designates 209.85.128.53 as permitted sender) smtp.mailfrom=alexghiti@rivosinc.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1699430512; a=rsa-sha256; cv=none; b=Sb+sD/yg/mBs5k+mpatam/A81Q6a5Ag4NA6/fh1lkQfjHyoecfQJdDgYE3XMntK+H1F2b6 5SOjL4jM+TjaH6Lok2bzUBq15PC2PNApvD60A/t1HqG/ujyIAu4P5O6cheGKB36PJj8G3h avZP7M7iDsu9NOWtQtqZ4Q73lP+bWy8= Received: by mail-wm1-f53.google.com with SMTP id 5b1f17b1804b1-40838915cecso48012515e9.2 for ; Wed, 08 Nov 2023 00:01:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20230601.gappssmtp.com; s=20230601; t=1699430511; x=1700035311; darn=kvack.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=xUvzkpfZJslkqpYiEl2gbMDcHOpTegWwJJ2kuhRtB18=; b=mwjVmSLPp/k9t/dljXsc3jm2KMkfIgvzSfd235nOYmTjtx+o/SaFJiReIW9pWyKsBP BAq5M6pX7kM4BBHFww5EliPvtZiMrPp0JgGA1beHwN8UGziq+/Udg1gCwMNIe6KfrJl4 TeCynGLgAEWrOND0dojDNcp96E9NIksKlnYctkV5nsHMJnKdRdUwTnFqHD9sES0BDqbW YW2KkTMmLWvCd4rJ1/lzOTAK8stoo+1OKHyCSVQekMc3Cq0LWlzEmxhJjqSRbm8cWuhP x9WbNiPruHK/UZkD70HcbPrLQ+zLaDJcg+TMdLHIvbssAPb4C9wpCCnczhQTN4MibF5P zjCw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699430511; x=1700035311; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=xUvzkpfZJslkqpYiEl2gbMDcHOpTegWwJJ2kuhRtB18=; b=FiOJ/XSiZVtGssXVm73pp8C2c9vJcJ6mhc3vgshPem0hXzCF0dktuyOdvLD27aceJU Pa9vvxKGprUquUZUuc/Oj4yD4WU1iC5aIwPOYKKAkV+k0LcwsgqYDHYd7Fti0WPDwOXV gRTS3QXmVLm/Ai3wwo1ec4dZn8/5KM0Tj/rTIDLi+8fVoO0HDyL2r9OSTLfCIfHGxo0R 1/AfM0rc7as3aXbblIKcYCUQSex1CmdHIHEDn/riRPyLaLIQHAjYDF4PqIVGz/g7Semp 6+yn9Os3WE/LMPlYBeYLe08qVOv2BWhmF/c5Mro3/rloFgyajkDSkj51RsyUhqWnU/vt sXRA== X-Gm-Message-State: AOJu0Yy0cvtp+KG4XDycf8JTOb+g+F0LFotLsl72wJ+Ki17RKeGfsjI8 GAU19aokrLXN+Fx07Ikf3K9OeQ== X-Google-Smtp-Source: AGHT+IH2dGhcK2G/D5OBlYtTn1fjQvYZbS0LB5a8gEupyNYYUH9bxRLuspT7rFEherKSYR5gqKTojA== X-Received: by 2002:a05:600c:4e87:b0:405:82c0:d9d9 with SMTP id f7-20020a05600c4e8700b0040582c0d9d9mr1032390wmq.41.1699430510733; Wed, 08 Nov 2023 00:01:50 -0800 (PST) Received: from alex-rivos.home (amontpellier-656-1-456-62.w92-145.abo.wanadoo.fr. [92.145.124.62]) by smtp.gmail.com with ESMTPSA id fc13-20020a05600c524d00b004068de50c64sm18545572wmb.46.2023.11.08.00.01.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 08 Nov 2023 00:01:50 -0800 (PST) From: Alexandre Ghiti To: Paul Walmsley , Palmer Dabbelt , Albert Ou , Mike Rapoport , linux-mm@kvack.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org Cc: Alexandre Ghiti Subject: [PATCH RESEND v2 2/2] riscv: Fix set_memory_XX() and set_direct_map_XX() by splitting huge linear mappings Date: Wed, 8 Nov 2023 08:59:30 +0100 Message-Id: <20231108075930.7157-3-alexghiti@rivosinc.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20231108075930.7157-1-alexghiti@rivosinc.com> References: <20231108075930.7157-1-alexghiti@rivosinc.com> MIME-Version: 1.0 X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: 3E79DA0012 X-Stat-Signature: q814ot69rkbhhw71e47xcuedids6a3yd X-Rspam-User: X-HE-Tag: 1699430512-882892 X-HE-Meta: U2FsdGVkX1+GgMO6a3oiALlsVJyAKIzd+L2bTJNFeaa/BiRRY7a7rsLUdwqJ6vkPP6uJtiJhM5T3eWtAGG/XZe8p7x58C4unoF4qZMxHpjEnzsFZDS1II7ry37ErDuBrXitTM0bkuYhUCJTsM9Ba2ytTWUOHBid1gpK+Xtd1bCfaHIXNsVDEc9nHfU92FAr5oUSA2nCf1ObaYCWNrTSduXCie2ZMQqK5dktESnTsBFQqFN4A7TuRvvEmvNeLT37F1SHKXFLLZyY0vzi5XZ1cHgRkV4ZPDKj6/HI0abkyCr7dRZw+lbMvioHoSvKSbIv9WEfkQ1/VQwNh3S1m5SctocOejSeMA3PRrZ5OcL+fy8pM+Qg0Tvv+Yffw1K/bc1z6YaLmTWC15Vna6jaG5wZzODIAHb5v3Kdyr9SpAKaeBAoEZtZjg9ubNRpB7LZCF1uf9wBTMXCr4ufMjOb4TdoWio1fL5QxNpjQmYAoXAJc07of/6L7h8WIy/0LDEvUCtx84u9oHlPQ0KEKQc+bXxIuPB4EWbr/SaT8AbAOoJpr8q5pa+2fS56E/GWcy0jU+2170DJalQUNkhg3dUoSl29CFETzXoi+G8MYYpckd4SbWkeOu8xYimK5J1XTvynSjHOf4RcjWnK2nNIJVtN9FAf6DyS3HytmtXqPrNR6obtHSFK5ckZdTWnUpeeBpByhnW28rZ95CMn30hIh825zh4C1mfxaYyIug1FOX/r6A3HbYWb6qHuQGRp0eL1nQwFQqky1Ihn+rKIZjGeBOiXuqM0/mi2OlwbGhfJEJyQq2VPRB9CYPcZTfAevWJvEOdpXzro2RZIQAW68YHApPgTOToCz3jTRql5EffXp77CiB7JWjYOGYEaM8An4BjsAfzlOQlY1ID7GYVn/WOVz7t0INRpfK5D117S7A0Enqpjcm8np0tFalN5iG8g/atm854UGSg+dAaOqhForcZI4/0xwD4v 7yHEHzsN V32QAexKki1s25T9YzpMJZqikl3qnv923fEPSxTKi7QQGooFjVv6EEwJRX11vLWDHSshcD4Wh8GNkG2qJQIJdOIwr8bxylnYPLVEk7uvez1RHJ+1s5GjM5QP74wxvQuqwNsciNasylGV3xQDJtvb4mpFic08ZclgqSouYv5g0dM0RgZ21ElIu3B/pImrgr3kArDKShJa4F80aPymCNbGHeMbBBq4sNe/15O+RRAPOj/KNiLbt4AO5BdTC8L1vzr1n0iTIqoL+U2vA9Ryz/WI/0Gs4YaPjC/ificx5cwlVgLNuKakXM7nZcAKVOdhQdAibw2mnnYcYqGnGLoMJBlMH6j7ihlG7qDpqaXHD2EqRIMhXV5WtchdB10mtapHZ0J1ngLCsSoZWL509kSosmiT34YSh+lWId+XP4IoKYcLNU29dAyagegtm4VPT4BmgFaVAtjau4YlCMrkhma2znAvSR/cyzjew6s9AK+qSZ0hEjiEuNmliciSq/XKN13Euredkmnq/hUhcet/4U+4pkRiXNqgjy1sZdPPVHcm+ X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: When STRICT_KERNEL_RWX is set, any change of permissions on any kernel mapping (vmalloc/modules/kernel text...etc) should be applied on its linear mapping alias. The problem is that the riscv kernel uses huge mappings for the linear mapping and walk_page_range_novma() does not split those huge mappings. So this patchset implements such split in order to apply fine-grained permissions on the linear mapping. Below is the difference before and after (the first PUD mapping is split into PTE/PMD mappings): Before: ---[ Linear mapping ]--- 0xffffaf8000080000-0xffffaf8000200000 0x0000000080080000 1536K PTE D A G . . W R V 0xffffaf8000200000-0xffffaf8077c00000 0x0000000080200000 1914M PMD D A G . . W R V 0xffffaf8077c00000-0xffffaf8078800000 0x00000000f7c00000 12M PMD D A G . . . R V 0xffffaf8078800000-0xffffaf8078c00000 0x00000000f8800000 4M PMD D A G . . W R V 0xffffaf8078c00000-0xffffaf8079200000 0x00000000f8c00000 6M PMD D A G . . . R V 0xffffaf8079200000-0xffffaf807e600000 0x00000000f9200000 84M PMD D A G . . W R V 0xffffaf807e600000-0xffffaf807e716000 0x00000000fe600000 1112K PTE D A G . . W R V 0xffffaf807e717000-0xffffaf807e71a000 0x00000000fe717000 12K PTE D A G . . W R V 0xffffaf807e71d000-0xffffaf807e71e000 0x00000000fe71d000 4K PTE D A G . . W R V 0xffffaf807e722000-0xffffaf807e800000 0x00000000fe722000 888K PTE D A G . . W R V 0xffffaf807e800000-0xffffaf807fe00000 0x00000000fe800000 22M PMD D A G . . W R V 0xffffaf807fe00000-0xffffaf807ff54000 0x00000000ffe00000 1360K PTE D A G . . W R V 0xffffaf807ff55000-0xffffaf8080000000 0x00000000fff55000 684K PTE D A G . . W R V 0xffffaf8080000000-0xffffaf8400000000 0x0000000100000000 14G PUD D A G . . W R V After: ---[ Linear mapping ]--- 0xffffaf8000080000-0xffffaf8000200000 0x0000000080080000 1536K PTE D A G . . W R V 0xffffaf8000200000-0xffffaf8077c00000 0x0000000080200000 1914M PMD D A G . . W R V 0xffffaf8077c00000-0xffffaf8078800000 0x00000000f7c00000 12M PMD D A G . . . R V 0xffffaf8078800000-0xffffaf8078a00000 0x00000000f8800000 2M PMD D A G . . W R V 0xffffaf8078a00000-0xffffaf8078c00000 0x00000000f8a00000 2M PTE D A G . . W R V 0xffffaf8078c00000-0xffffaf8079200000 0x00000000f8c00000 6M PMD D A G . . . R V 0xffffaf8079200000-0xffffaf807e600000 0x00000000f9200000 84M PMD D A G . . W R V 0xffffaf807e600000-0xffffaf807e716000 0x00000000fe600000 1112K PTE D A G . . W R V 0xffffaf807e717000-0xffffaf807e71a000 0x00000000fe717000 12K PTE D A G . . W R V 0xffffaf807e71d000-0xffffaf807e71e000 0x00000000fe71d000 4K PTE D A G . . W R V 0xffffaf807e722000-0xffffaf807e800000 0x00000000fe722000 888K PTE D A G . . W R V 0xffffaf807e800000-0xffffaf807fe00000 0x00000000fe800000 22M PMD D A G . . W R V 0xffffaf807fe00000-0xffffaf807ff54000 0x00000000ffe00000 1360K PTE D A G . . W R V 0xffffaf807ff55000-0xffffaf8080000000 0x00000000fff55000 684K PTE D A G . . W R V 0xffffaf8080000000-0xffffaf8080800000 0x0000000100000000 8M PMD D A G . . W R V 0xffffaf8080800000-0xffffaf8080af6000 0x0000000100800000 3032K PTE D A G . . W R V 0xffffaf8080af6000-0xffffaf8080af8000 0x0000000100af6000 8K PTE D A G . X . R V 0xffffaf8080af8000-0xffffaf8080c00000 0x0000000100af8000 1056K PTE D A G . . W R V 0xffffaf8080c00000-0xffffaf8081a00000 0x0000000100c00000 14M PMD D A G . . W R V 0xffffaf8081a00000-0xffffaf8081a40000 0x0000000101a00000 256K PTE D A G . . W R V 0xffffaf8081a40000-0xffffaf8081a44000 0x0000000101a40000 16K PTE D A G . X . R V 0xffffaf8081a44000-0xffffaf8081a52000 0x0000000101a44000 56K PTE D A G . . W R V 0xffffaf8081a52000-0xffffaf8081a54000 0x0000000101a52000 8K PTE D A G . X . R V ... 0xffffaf809e800000-0xffffaf80c0000000 0x000000011e800000 536M PMD D A G . . W R V 0xffffaf80c0000000-0xffffaf8400000000 0x0000000140000000 13G PUD D A G . . W R V Note that this also fixes memfd_secret() syscall which uses set_direct_map_invalid_noflush() and set_direct_map_default_noflush() to remove the pages from the linear mapping. Below is the kernel page table while a memfd_secret() syscall is running, you can see all the !valid page table entries in the linear mapping: ... 0xffffaf8082240000-0xffffaf8082241000 0x0000000102240000 4K PTE D A G . . W R . 0xffffaf8082241000-0xffffaf8082250000 0x0000000102241000 60K PTE D A G . . W R V 0xffffaf8082250000-0xffffaf8082252000 0x0000000102250000 8K PTE D A G . . W R . 0xffffaf8082252000-0xffffaf8082256000 0x0000000102252000 16K PTE D A G . . W R V 0xffffaf8082256000-0xffffaf8082257000 0x0000000102256000 4K PTE D A G . . W R . 0xffffaf8082257000-0xffffaf8082258000 0x0000000102257000 4K PTE D A G . . W R V 0xffffaf8082258000-0xffffaf8082259000 0x0000000102258000 4K PTE D A G . . W R . 0xffffaf8082259000-0xffffaf808225a000 0x0000000102259000 4K PTE D A G . . W R V 0xffffaf808225a000-0xffffaf808225c000 0x000000010225a000 8K PTE D A G . . W R . 0xffffaf808225c000-0xffffaf8082266000 0x000000010225c000 40K PTE D A G . . W R V 0xffffaf8082266000-0xffffaf8082268000 0x0000000102266000 8K PTE D A G . . W R . 0xffffaf8082268000-0xffffaf8082284000 0x0000000102268000 112K PTE D A G . . W R V 0xffffaf8082284000-0xffffaf8082288000 0x0000000102284000 16K PTE D A G . . W R . 0xffffaf8082288000-0xffffaf808229c000 0x0000000102288000 80K PTE D A G . . W R V 0xffffaf808229c000-0xffffaf80822a0000 0x000000010229c000 16K PTE D A G . . W R . 0xffffaf80822a0000-0xffffaf80822a5000 0x00000001022a0000 20K PTE D A G . . W R V 0xffffaf80822a5000-0xffffaf80822a6000 0x00000001022a5000 4K PTE D A G . . . R V 0xffffaf80822a6000-0xffffaf80822ab000 0x00000001022a6000 20K PTE D A G . . W R V ... And when the memfd_secret() fd is released, the linear mapping is correctly reset: ... 0xffffaf8082240000-0xffffaf80822a5000 0x0000000102240000 404K PTE D A G . . W R V 0xffffaf80822a5000-0xffffaf80822a6000 0x00000001022a5000 4K PTE D A G . . . R V 0xffffaf80822a6000-0xffffaf80822af000 0x00000001022a6000 36K PTE D A G . . W R V ... Signed-off-by: Alexandre Ghiti --- arch/riscv/mm/pageattr.c | 270 +++++++++++++++++++++++++++++++++------ 1 file changed, 230 insertions(+), 40 deletions(-) diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c index 161d0b34c2cb..fc5fc4f785c4 100644 --- a/arch/riscv/mm/pageattr.c +++ b/arch/riscv/mm/pageattr.c @@ -5,6 +5,7 @@ #include #include +#include #include #include #include @@ -25,19 +26,6 @@ static unsigned long set_pageattr_masks(unsigned long val, struct mm_walk *walk) return new_val; } -static int pageattr_pgd_entry(pgd_t *pgd, unsigned long addr, - unsigned long next, struct mm_walk *walk) -{ - pgd_t val = READ_ONCE(*pgd); - - if (pgd_leaf(val)) { - val = __pgd(set_pageattr_masks(pgd_val(val), walk)); - set_pgd(pgd, val); - } - - return 0; -} - static int pageattr_p4d_entry(p4d_t *p4d, unsigned long addr, unsigned long next, struct mm_walk *walk) { @@ -96,7 +84,6 @@ static int pageattr_pte_hole(unsigned long addr, unsigned long next, } static const struct mm_walk_ops pageattr_ops = { - .pgd_entry = pageattr_pgd_entry, .p4d_entry = pageattr_p4d_entry, .pud_entry = pageattr_pud_entry, .pmd_entry = pageattr_pmd_entry, @@ -105,12 +92,181 @@ static const struct mm_walk_ops pageattr_ops = { .walk_lock = PGWALK_RDLOCK, }; +#ifdef CONFIG_64BIT +static int __split_linear_mapping_pmd(pud_t *pudp, + unsigned long vaddr, unsigned long end) +{ + pmd_t *pmdp; + unsigned long next; + + pmdp = pmd_offset(pudp, vaddr); + + do { + next = pmd_addr_end(vaddr, end); + + if (next - vaddr >= PMD_SIZE && + vaddr <= (vaddr & PMD_MASK) && end >= next) + continue; + + if (pmd_leaf(*pmdp)) { + struct page *pte_page; + unsigned long pfn = _pmd_pfn(*pmdp); + pgprot_t prot = __pgprot(pmd_val(*pmdp) & ~_PAGE_PFN_MASK); + pte_t *ptep_new; + int i; + + pte_page = alloc_page(GFP_KERNEL); + if (!pte_page) + return -ENOMEM; + + ptep_new = (pte_t *)page_address(pte_page); + for (i = 0; i < PTRS_PER_PTE; ++i, ++ptep_new) + set_pte(ptep_new, pfn_pte(pfn + i, prot)); + + smp_wmb(); + + set_pmd(pmdp, pfn_pmd(page_to_pfn(pte_page), PAGE_TABLE)); + } + } while (pmdp++, vaddr = next, vaddr != end); + + return 0; +} + +static int __split_linear_mapping_pud(p4d_t *p4dp, + unsigned long vaddr, unsigned long end) +{ + pud_t *pudp; + unsigned long next; + int ret; + + pudp = pud_offset(p4dp, vaddr); + + do { + next = pud_addr_end(vaddr, end); + + if (next - vaddr >= PUD_SIZE && + vaddr <= (vaddr & PUD_MASK) && end >= next) + continue; + + if (pud_leaf(*pudp)) { + struct page *pmd_page; + unsigned long pfn = _pud_pfn(*pudp); + pgprot_t prot = __pgprot(pud_val(*pudp) & ~_PAGE_PFN_MASK); + pmd_t *pmdp_new; + int i; + + pmd_page = alloc_page(GFP_KERNEL); + if (!pmd_page) + return -ENOMEM; + + pmdp_new = (pmd_t *)page_address(pmd_page); + for (i = 0; i < PTRS_PER_PMD; ++i, ++pmdp_new) + set_pmd(pmdp_new, + pfn_pmd(pfn + ((i * PMD_SIZE) >> PAGE_SHIFT), prot)); + + smp_wmb(); + + set_pud(pudp, pfn_pud(page_to_pfn(pmd_page), PAGE_TABLE)); + } + + ret = __split_linear_mapping_pmd(pudp, vaddr, next); + if (ret) + return ret; + } while (pudp++, vaddr = next, vaddr != end); + + return 0; +} + +static int __split_linear_mapping_p4d(pgd_t *pgdp, + unsigned long vaddr, unsigned long end) +{ + p4d_t *p4dp; + unsigned long next; + int ret; + + p4dp = p4d_offset(pgdp, vaddr); + + do { + next = p4d_addr_end(vaddr, end); + + /* + * If [vaddr; end] contains [vaddr & P4D_MASK; next], we don't + * need to split, we'll change the protections on the whole P4D. + */ + if (next - vaddr >= P4D_SIZE && + vaddr <= (vaddr & P4D_MASK) && end >= next) + continue; + + if (p4d_leaf(*p4dp)) { + struct page *pud_page; + unsigned long pfn = _p4d_pfn(*p4dp); + pgprot_t prot = __pgprot(p4d_val(*p4dp) & ~_PAGE_PFN_MASK); + pud_t *pudp_new; + int i; + + pud_page = alloc_page(GFP_KERNEL); + if (!pud_page) + return -ENOMEM; + + /* + * Fill the pud level with leaf puds that have the same + * protections as the leaf p4d. + */ + pudp_new = (pud_t *)page_address(pud_page); + for (i = 0; i < PTRS_PER_PUD; ++i, ++pudp_new) + set_pud(pudp_new, + pfn_pud(pfn + ((i * PUD_SIZE) >> PAGE_SHIFT), prot)); + + /* + * Make sure the pud filling is not reordered with the + * p4d store which could result in seeing a partially + * filled pud level. + */ + smp_wmb(); + + set_p4d(p4dp, pfn_p4d(page_to_pfn(pud_page), PAGE_TABLE)); + } + + ret = __split_linear_mapping_pud(p4dp, vaddr, next); + if (ret) + return ret; + } while (p4dp++, vaddr = next, vaddr != end); + + return 0; +} + +static int __split_linear_mapping_pgd(pgd_t *pgdp, + unsigned long vaddr, + unsigned long end) +{ + unsigned long next; + int ret; + + do { + next = pgd_addr_end(vaddr, end); + /* We never use PGD mappings for the linear mapping */ + ret = __split_linear_mapping_p4d(pgdp, vaddr, next); + if (ret) + return ret; + } while (pgdp++, vaddr = next, vaddr != end); + + return 0; +} + +static int split_linear_mapping(unsigned long start, unsigned long end) +{ + return __split_linear_mapping_pgd(pgd_offset_k(start), start, end); +} +#endif /* CONFIG_64BIT */ + static int __set_memory(unsigned long addr, int numpages, pgprot_t set_mask, pgprot_t clear_mask) { int ret; unsigned long start = addr; unsigned long end = start + PAGE_SIZE * numpages; + unsigned long __maybe_unused lm_start; + unsigned long __maybe_unused lm_end; struct pageattr_masks masks = { .set_mask = set_mask, .clear_mask = clear_mask @@ -120,11 +276,67 @@ static int __set_memory(unsigned long addr, int numpages, pgprot_t set_mask, return 0; mmap_write_lock(&init_mm); + +#ifdef CONFIG_64BIT + /* + * We are about to change the permissions of a kernel mapping, we must + * apply the same changes to its linear mapping alias, which may imply + * splitting a huge mapping. + */ + + if (is_vmalloc_or_module_addr((void *)start)) { + struct vm_struct *area = NULL; + int i, page_start; + + area = find_vm_area((void *)start); + page_start = (start - (unsigned long)area->addr) >> PAGE_SHIFT; + + for (i = page_start; i < page_start + numpages; ++i) { + lm_start = (unsigned long)page_address(area->pages[i]); + lm_end = lm_start + PAGE_SIZE; + + ret = split_linear_mapping(lm_start, lm_end); + if (ret) + goto unlock; + + ret = walk_page_range_novma(&init_mm, lm_start, lm_end, + &pageattr_ops, NULL, &masks); + if (ret) + goto unlock; + } + } else if (is_kernel_mapping(start) || is_linear_mapping(start)) { + lm_start = (unsigned long)lm_alias(start); + lm_end = (unsigned long)lm_alias(end); + + ret = split_linear_mapping(lm_start, lm_end); + if (ret) + goto unlock; + + ret = walk_page_range_novma(&init_mm, lm_start, lm_end, + &pageattr_ops, NULL, &masks); + if (ret) + goto unlock; + } + ret = walk_page_range_novma(&init_mm, start, end, &pageattr_ops, NULL, &masks); + +unlock: + mmap_write_unlock(&init_mm); + + /* + * We can't use flush_tlb_kernel_range() here as we may have split a + * hugepage that is larger than that, so let's flush everything. + */ + flush_tlb_all(); +#else + ret = walk_page_range_novma(&init_mm, start, end, &pageattr_ops, NULL, + &masks); + mmap_write_unlock(&init_mm); flush_tlb_kernel_range(start, end); +#endif return ret; } @@ -159,36 +371,14 @@ int set_memory_nx(unsigned long addr, int numpages) int set_direct_map_invalid_noflush(struct page *page) { - int ret; - unsigned long start = (unsigned long)page_address(page); - unsigned long end = start + PAGE_SIZE; - struct pageattr_masks masks = { - .set_mask = __pgprot(0), - .clear_mask = __pgprot(_PAGE_PRESENT) - }; - - mmap_read_lock(&init_mm); - ret = walk_page_range(&init_mm, start, end, &pageattr_ops, &masks); - mmap_read_unlock(&init_mm); - - return ret; + return __set_memory((unsigned long)page_address(page), 1, + __pgprot(0), __pgprot(_PAGE_PRESENT)); } int set_direct_map_default_noflush(struct page *page) { - int ret; - unsigned long start = (unsigned long)page_address(page); - unsigned long end = start + PAGE_SIZE; - struct pageattr_masks masks = { - .set_mask = PAGE_KERNEL, - .clear_mask = __pgprot(0) - }; - - mmap_read_lock(&init_mm); - ret = walk_page_range(&init_mm, start, end, &pageattr_ops, &masks); - mmap_read_unlock(&init_mm); - - return ret; + return __set_memory((unsigned long)page_address(page), 1, + PAGE_KERNEL, __pgprot(0)); } #ifdef CONFIG_DEBUG_PAGEALLOC