From patchwork Mon Apr 13 12:53:00 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 11485539
From: Nicholas Piggin
To: linux-mm@kvack.org
Subject: [PATCH v2 1/4] mm/vmalloc: fix vmalloc_to_page for huge vmap mappings
Date: Mon, 13 Apr 2020 22:53:00 +1000
Message-Id: <20200413125303.423864-2-npiggin@gmail.com>
X-Mailer: git-send-email 2.23.0
In-Reply-To: <20200413125303.423864-1-npiggin@gmail.com>
References: <20200413125303.423864-1-npiggin@gmail.com>
Cc: linux-arch@vger.kernel.org, Catalin Marinas, x86@kernel.org,
 linuxppc-dev@lists.ozlabs.org, Nicholas Piggin,
 linux-kernel@vger.kernel.org, Ingo Molnar, Borislav Petkov,
 "H. Peter Anvin", Thomas Gleixner, Will Deacon,
 linux-arm-kernel@lists.infradead.org

vmalloc_to_page returns NULL for addresses mapped by larger pages[*].
Whether or not a vmap is huge depends on the architecture details,
alignments, boot options, etc., which the caller can not be expected to
know. Therefore HUGE_VMAP is a regression for vmalloc_to_page.

This change teaches vmalloc_to_page about larger pages, and returns the
struct page that corresponds to the offset within the large page. This
makes the API agnostic to mapping implementation details.

[*] As explained by commit 029c54b095995 ("mm/vmalloc.c: huge-vmap: fail
    gracefully on unexpected huge vmap mappings")

Signed-off-by: Nicholas Piggin
---
 mm/vmalloc.c | 40 ++++++++++++++++++++++++++--------------
 1 file changed, 26 insertions(+), 14 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 399f219544f7..1afec7def23f 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -36,6 +36,7 @@
 #include
 #include
+#include
 #include
 #include
@@ -272,7 +273,9 @@ int is_vmalloc_or_module_addr(const void *x)
 }
 
 /*
- * Walk a vmap address to the struct page it maps.
+ * Walk a vmap address to the struct page it maps. Huge vmap mappings will
+ * return the tail page that corresponds to the base page address, which
+ * matches small vmap mappings.
  */
 struct page *vmalloc_to_page(const void *vmalloc_addr)
 {
@@ -292,25 +295,33 @@ struct page *vmalloc_to_page(const void *vmalloc_addr)
 
 	if (pgd_none(*pgd))
 		return NULL;
+	if (WARN_ON_ONCE(pgd_leaf(*pgd)))
+		return NULL; /* XXX: no allowance for huge pgd */
+	if (WARN_ON_ONCE(pgd_bad(*pgd)))
+		return NULL;
+
 	p4d = p4d_offset(pgd, addr);
 	if (p4d_none(*p4d))
 		return NULL;
-	pud = pud_offset(p4d, addr);
+	if (p4d_leaf(*p4d))
+		return p4d_page(*p4d) + ((addr & ~P4D_MASK) >> PAGE_SHIFT);
+	if (WARN_ON_ONCE(p4d_bad(*p4d)))
+		return NULL;
 
-	/*
-	 * Don't dereference bad PUD or PMD (below) entries. This will also
-	 * identify huge mappings, which we may encounter on architectures
-	 * that define CONFIG_HAVE_ARCH_HUGE_VMAP=y. Such regions will be
-	 * identified as vmalloc addresses by is_vmalloc_addr(), but are
-	 * not [unambiguously] associated with a struct page, so there is
-	 * no correct value to return for them.
-	 */
-	WARN_ON_ONCE(pud_bad(*pud));
-	if (pud_none(*pud) || pud_bad(*pud))
+	pud = pud_offset(p4d, addr);
+	if (pud_none(*pud))
+		return NULL;
+	if (pud_leaf(*pud))
+		return pud_page(*pud) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
+	if (WARN_ON_ONCE(pud_bad(*pud)))
 		return NULL;
+
 	pmd = pmd_offset(pud, addr);
-	WARN_ON_ONCE(pmd_bad(*pmd));
-	if (pmd_none(*pmd) || pmd_bad(*pmd))
+	if (pmd_none(*pmd))
+		return NULL;
+	if (pmd_leaf(*pmd))
+		return pmd_page(*pmd) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
+	if (WARN_ON_ONCE(pmd_bad(*pmd)))
 		return NULL;
 
 	ptep = pte_offset_map(pmd, addr);
@@ -318,6 +329,7 @@ struct page *vmalloc_to_page(const void *vmalloc_addr)
 	if (pte_present(pte))
 		page = pte_page(pte);
 	pte_unmap(ptep);
+
 	return page;
 }
 EXPORT_SYMBOL(vmalloc_to_page);

From patchwork Mon Apr 13 12:53:01 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 11485543
From: Nicholas Piggin
To: linux-mm@kvack.org
Subject: [PATCH v2 2/4] mm: Move ioremap page table mapping function to mm/
Date: Mon, 13 Apr 2020 22:53:01 +1000
Message-Id: <20200413125303.423864-3-npiggin@gmail.com>
X-Mailer: git-send-email 2.23.0
In-Reply-To: <20200413125303.423864-1-npiggin@gmail.com>
References: <20200413125303.423864-1-npiggin@gmail.com>
Cc: linux-arch@vger.kernel.org, Catalin Marinas, x86@kernel.org,
 linuxppc-dev@lists.ozlabs.org, Nicholas Piggin,
 linux-kernel@vger.kernel.org, Ingo Molnar, Borislav Petkov,
 "H. Peter Anvin", Thomas Gleixner, Will Deacon,
 linux-arm-kernel@lists.infradead.org

ioremap_page_range is a generic function to create a kernel virtual
mapping. Move it to mm/vmalloc.c and rename it vmap_range.

For clarity with this move, also:
- Rename vunmap_page_range (vmap_range's inverse) to vunmap_range.
- Rename vmap_page_range (which takes a page array) to vmap_pages_range.

Signed-off-by: Nicholas Piggin
---
 include/linux/vmalloc.h |   3 +
 lib/ioremap.c           | 182 +++---------------------------
 mm/vmalloc.c            | 239 ++++++++++++++++++++++++++++++++++++----
 3 files changed, 239 insertions(+), 185 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 0507a162ccd0..eb8a5080e472 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -173,6 +173,9 @@ extern struct vm_struct *find_vm_area(const void *addr);
 extern int map_vm_area(struct vm_struct *area, pgprot_t prot,
 			struct page **pages);
 #ifdef CONFIG_MMU
+int vmap_range(unsigned long addr,
+		unsigned long end, phys_addr_t phys_addr, pgprot_t prot,
+		unsigned int max_page_shift);
 extern int map_kernel_range_noflush(unsigned long start, unsigned long size,
 				    pgprot_t prot, struct page **pages);
 extern void unmap_kernel_range_noflush(unsigned long addr, unsigned long size);
diff --git a/lib/ioremap.c b/lib/ioremap.c
index 3f0e18543de8..7e383bdc51ad 100644
--- a/lib/ioremap.c
+++ b/lib/ioremap.c
@@ -60,176 +60,26 @@ static inline int ioremap_pud_enabled(void) { return 0; }
 static inline int ioremap_pmd_enabled(void) { return 0; }
 #endif	/* CONFIG_HAVE_ARCH_HUGE_VMAP */
 
-static int ioremap_pte_range(pmd_t *pmd, unsigned long addr,
-		unsigned long end, phys_addr_t phys_addr, pgprot_t prot)
-{
-	pte_t *pte;
-	u64 pfn;
-
-	pfn = phys_addr >> PAGE_SHIFT;
-	pte = pte_alloc_kernel(pmd, addr);
-	if (!pte)
-		return -ENOMEM;
-	do {
-		BUG_ON(!pte_none(*pte));
-		set_pte_at(&init_mm, addr, pte, pfn_pte(pfn, prot));
-		pfn++;
-	} while (pte++, addr += PAGE_SIZE, addr != end);
-	return 0;
-}
-
-static int ioremap_try_huge_pmd(pmd_t *pmd, unsigned long addr,
-				unsigned long end, phys_addr_t phys_addr,
-				pgprot_t prot)
-{
-	if (!ioremap_pmd_enabled())
-		return 0;
-
-	if ((end - addr) != PMD_SIZE)
-		return 0;
-
-	if (!IS_ALIGNED(addr, PMD_SIZE))
-		return 0;
-
-	if (!IS_ALIGNED(phys_addr, PMD_SIZE))
-		return 0;
-
-	if (pmd_present(*pmd) && !pmd_free_pte_page(pmd, addr))
-		return 0;
-
-	return pmd_set_huge(pmd, phys_addr, prot);
-}
-
-static inline int ioremap_pmd_range(pud_t *pud, unsigned long addr,
-		unsigned long end, phys_addr_t phys_addr, pgprot_t prot)
-{
-	pmd_t *pmd;
-	unsigned long next;
-
-	pmd = pmd_alloc(&init_mm, pud, addr);
-	if (!pmd)
-		return -ENOMEM;
-	do {
-		next = pmd_addr_end(addr, end);
-
-		if (ioremap_try_huge_pmd(pmd, addr, next, phys_addr, prot))
-			continue;
-
-		if (ioremap_pte_range(pmd, addr, next, phys_addr, prot))
-			return -ENOMEM;
-	} while (pmd++, phys_addr += (next - addr), addr = next, addr != end);
-	return 0;
-}
-
-static int ioremap_try_huge_pud(pud_t *pud, unsigned long addr,
-				unsigned long end, phys_addr_t phys_addr,
-				pgprot_t prot)
-{
-	if (!ioremap_pud_enabled())
-		return 0;
-
-	if ((end - addr) != PUD_SIZE)
-		return 0;
-
-	if (!IS_ALIGNED(addr, PUD_SIZE))
-		return 0;
-
-	if (!IS_ALIGNED(phys_addr, PUD_SIZE))
-		return 0;
-
-	if (pud_present(*pud) && !pud_free_pmd_page(pud, addr))
-		return 0;
-
-	return pud_set_huge(pud, phys_addr, prot);
-}
-
-static inline int ioremap_pud_range(p4d_t *p4d, unsigned long addr,
-		unsigned long end, phys_addr_t phys_addr, pgprot_t prot)
-{
-	pud_t *pud;
-	unsigned long next;
-
-	pud = pud_alloc(&init_mm, p4d, addr);
-	if (!pud)
-		return -ENOMEM;
-	do {
-		next = pud_addr_end(addr, end);
-
-		if (ioremap_try_huge_pud(pud, addr, next, phys_addr, prot))
-			continue;
-
-		if (ioremap_pmd_range(pud, addr, next, phys_addr, prot))
-			return -ENOMEM;
-	} while (pud++, phys_addr += (next - addr), addr = next, addr != end);
-	return 0;
-}
-
-static int ioremap_try_huge_p4d(p4d_t *p4d, unsigned long addr,
-				unsigned long end, phys_addr_t phys_addr,
-				pgprot_t prot)
-{
-	if (!ioremap_p4d_enabled())
-		return 0;
-
-	if ((end - addr) != P4D_SIZE)
-		return 0;
-
-	if (!IS_ALIGNED(addr, P4D_SIZE))
-		return 0;
-
-	if (!IS_ALIGNED(phys_addr, P4D_SIZE))
-		return 0;
-
-	if (p4d_present(*p4d) && !p4d_free_pud_page(p4d, addr))
-		return 0;
-
-	return p4d_set_huge(p4d, phys_addr, prot);
-}
-
-static inline int ioremap_p4d_range(pgd_t *pgd, unsigned long addr,
-		unsigned long end, phys_addr_t phys_addr, pgprot_t prot)
-{
-	p4d_t *p4d;
-	unsigned long next;
-
-	p4d = p4d_alloc(&init_mm, pgd, addr);
-	if (!p4d)
-		return -ENOMEM;
-	do {
-		next = p4d_addr_end(addr, end);
-
-		if (ioremap_try_huge_p4d(p4d, addr, next, phys_addr, prot))
-			continue;
-
-		if (ioremap_pud_range(p4d, addr, next, phys_addr, prot))
-			return -ENOMEM;
-	} while (p4d++, phys_addr += (next - addr), addr = next, addr != end);
-	return 0;
-}
-
 int ioremap_page_range(unsigned long addr,
 		       unsigned long end, phys_addr_t phys_addr, pgprot_t prot)
 {
-	pgd_t *pgd;
-	unsigned long start;
-	unsigned long next;
-	int err;
-
-	might_sleep();
-	BUG_ON(addr >= end);
-
-	start = addr;
-	pgd = pgd_offset_k(addr);
-	do {
-		next = pgd_addr_end(addr, end);
-		err = ioremap_p4d_range(pgd, addr, next, phys_addr, prot);
-		if (err)
-			break;
-	} while (pgd++, phys_addr += (next - addr), addr = next, addr != end);
-
-	flush_cache_vmap(start, end);
+	unsigned int max_page_shift = PAGE_SHIFT;
+
+	/*
+	 * Due to the max_page_shift parameter to vmap_range, platforms must
+	 * enable all smaller sizes to take advantage of a given size,
+	 * otherwise fall back to small pages.
+ */ + if (ioremap_pmd_enabled()) { + max_page_shift = PMD_SHIFT; + if (ioremap_pud_enabled()) { + max_page_shift = PUD_SHIFT; + if (ioremap_p4d_enabled()) + max_page_shift = P4D_SHIFT; + } + } - return err; + return vmap_range(addr, end, phys_addr, prot, max_page_shift); } #ifdef CONFIG_GENERIC_IOREMAP diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 1afec7def23f..b1bc2fcae4e0 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -128,7 +128,7 @@ static void vunmap_p4d_range(pgd_t *pgd, unsigned long addr, unsigned long end) } while (p4d++, addr = next, addr != end); } -static void vunmap_page_range(unsigned long addr, unsigned long end) +static void vunmap_range(unsigned long addr, unsigned long end) { pgd_t *pgd; unsigned long next; @@ -143,7 +143,208 @@ static void vunmap_page_range(unsigned long addr, unsigned long end) } while (pgd++, addr = next, addr != end); } -static int vmap_pte_range(pmd_t *pmd, unsigned long addr, +static int vmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end, + phys_addr_t phys_addr, pgprot_t prot) +{ + pte_t *pte; + u64 pfn; + + pfn = phys_addr >> PAGE_SHIFT; + pte = pte_alloc_kernel(pmd, addr); + if (!pte) + return -ENOMEM; + do { + BUG_ON(!pte_none(*pte)); + set_pte_at(&init_mm, addr, pte, pfn_pte(pfn, prot)); + pfn++; + } while (pte++, addr += PAGE_SIZE, addr != end); + return 0; +} + +static int vmap_try_huge_pmd(pmd_t *pmd, unsigned long addr, unsigned long end, + phys_addr_t phys_addr, pgprot_t prot, + unsigned int max_page_shift) +{ + if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP)) + return 0; + + if (max_page_shift < PMD_SHIFT) + return 0; + + if ((end - addr) != PMD_SIZE) + return 0; + + if (!IS_ALIGNED(addr, PMD_SIZE)) + return 0; + + if (!IS_ALIGNED(phys_addr, PMD_SIZE)) + return 0; + + if (pmd_present(*pmd) && !pmd_free_pte_page(pmd, addr)) + return 0; + + return pmd_set_huge(pmd, phys_addr, prot); +} + +static inline int vmap_pmd_range(pud_t *pud, unsigned long addr, + unsigned long end, phys_addr_t phys_addr, 
pgprot_t prot, + unsigned int max_page_shift) +{ + pmd_t *pmd; + unsigned long next; + + pmd = pmd_alloc(&init_mm, pud, addr); + if (!pmd) + return -ENOMEM; + do { + next = pmd_addr_end(addr, end); + + if (vmap_try_huge_pmd(pmd, addr, next, phys_addr, prot, + max_page_shift)) + continue; + + if (vmap_pte_range(pmd, addr, next, phys_addr, prot)) + return -ENOMEM; + } while (pmd++, phys_addr += (next - addr), addr = next, addr != end); + return 0; +} + +static int vmap_try_huge_pud(pud_t *pud, unsigned long addr, + unsigned long end, phys_addr_t phys_addr, pgprot_t prot, + unsigned int max_page_shift) +{ + if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP)) + return 0; + + if (max_page_shift < PUD_SHIFT) + return 0; + + if ((end - addr) != PUD_SIZE) + return 0; + + if (!IS_ALIGNED(addr, PUD_SIZE)) + return 0; + + if (!IS_ALIGNED(phys_addr, PUD_SIZE)) + return 0; + + if (pud_present(*pud) && !pud_free_pmd_page(pud, addr)) + return 0; + + return pud_set_huge(pud, phys_addr, prot); +} + +static inline int vmap_pud_range(p4d_t *p4d, unsigned long addr, + unsigned long end, phys_addr_t phys_addr, pgprot_t prot, + unsigned int max_page_shift) +{ + pud_t *pud; + unsigned long next; + + pud = pud_alloc(&init_mm, p4d, addr); + if (!pud) + return -ENOMEM; + do { + next = pud_addr_end(addr, end); + + if (vmap_try_huge_pud(pud, addr, next, phys_addr, prot, + max_page_shift)) + continue; + + if (vmap_pmd_range(pud, addr, next, phys_addr, prot, + max_page_shift)) + return -ENOMEM; + } while (pud++, phys_addr += (next - addr), addr = next, addr != end); + return 0; +} + +static int vmap_try_huge_p4d(p4d_t *p4d, unsigned long addr, + unsigned long end, phys_addr_t phys_addr, pgprot_t prot, + unsigned int max_page_shift) +{ + if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP)) + return 0; + + if (max_page_shift < P4D_SHIFT) + return 0; + + if ((end - addr) != P4D_SIZE) + return 0; + + if (!IS_ALIGNED(addr, P4D_SIZE)) + return 0; + + if (!IS_ALIGNED(phys_addr, P4D_SIZE)) + return 0; + + if 
(p4d_present(*p4d) && !p4d_free_pud_page(p4d, addr)) + return 0; + + return p4d_set_huge(p4d, phys_addr, prot); +} + +static inline int vmap_p4d_range(pgd_t *pgd, unsigned long addr, + unsigned long end, phys_addr_t phys_addr, pgprot_t prot, + unsigned int max_page_shift) +{ + p4d_t *p4d; + unsigned long next; + + p4d = p4d_alloc(&init_mm, pgd, addr); + if (!p4d) + return -ENOMEM; + do { + next = p4d_addr_end(addr, end); + + if (vmap_try_huge_p4d(p4d, addr, next, phys_addr, prot, + max_page_shift)) + continue; + + if (vmap_pud_range(p4d, addr, next, phys_addr, prot, + max_page_shift)) + return -ENOMEM; + } while (p4d++, phys_addr += (next - addr), addr = next, addr != end); + return 0; +} + +static int vmap_range_noflush(unsigned long addr, + unsigned long end, phys_addr_t phys_addr, pgprot_t prot, + unsigned int max_page_shift) +{ + pgd_t *pgd; + unsigned long start; + unsigned long next; + int err; + + might_sleep(); + BUG_ON(addr >= end); + + start = addr; + pgd = pgd_offset_k(addr); + do { + next = pgd_addr_end(addr, end); + err = vmap_p4d_range(pgd, addr, next, phys_addr, prot, + max_page_shift); + if (err) + break; + } while (pgd++, phys_addr += (next - addr), addr = next, addr != end); + + return err; +} + +int vmap_range(unsigned long addr, + unsigned long end, phys_addr_t phys_addr, pgprot_t prot, + unsigned int max_page_shift) +{ + int ret; + + ret = vmap_range_noflush(addr, end, phys_addr, prot, max_page_shift); + flush_cache_vmap(addr, end); + + return ret; +} + +static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end, pgprot_t prot, struct page **pages, int *nr) { pte_t *pte; @@ -169,7 +370,7 @@ static int vmap_pte_range(pmd_t *pmd, unsigned long addr, return 0; } -static int vmap_pmd_range(pud_t *pud, unsigned long addr, +static int vmap_pages_pmd_range(pud_t *pud, unsigned long addr, unsigned long end, pgprot_t prot, struct page **pages, int *nr) { pmd_t *pmd; @@ -180,13 +381,13 @@ static int vmap_pmd_range(pud_t *pud, 
unsigned long addr, return -ENOMEM; do { next = pmd_addr_end(addr, end); - if (vmap_pte_range(pmd, addr, next, prot, pages, nr)) + if (vmap_pages_pte_range(pmd, addr, next, prot, pages, nr)) return -ENOMEM; } while (pmd++, addr = next, addr != end); return 0; } -static int vmap_pud_range(p4d_t *p4d, unsigned long addr, +static int vmap_pages_pud_range(p4d_t *p4d, unsigned long addr, unsigned long end, pgprot_t prot, struct page **pages, int *nr) { pud_t *pud; @@ -197,13 +398,13 @@ static int vmap_pud_range(p4d_t *p4d, unsigned long addr, return -ENOMEM; do { next = pud_addr_end(addr, end); - if (vmap_pmd_range(pud, addr, next, prot, pages, nr)) + if (vmap_pages_pmd_range(pud, addr, next, prot, pages, nr)) return -ENOMEM; } while (pud++, addr = next, addr != end); return 0; } -static int vmap_p4d_range(pgd_t *pgd, unsigned long addr, +static int vmap_pages_p4d_range(pgd_t *pgd, unsigned long addr, unsigned long end, pgprot_t prot, struct page **pages, int *nr) { p4d_t *p4d; @@ -214,7 +415,7 @@ static int vmap_p4d_range(pgd_t *pgd, unsigned long addr, return -ENOMEM; do { next = p4d_addr_end(addr, end); - if (vmap_pud_range(p4d, addr, next, prot, pages, nr)) + if (vmap_pages_pud_range(p4d, addr, next, prot, pages, nr)) return -ENOMEM; } while (p4d++, addr = next, addr != end); return 0; @@ -226,7 +427,7 @@ static int vmap_p4d_range(pgd_t *pgd, unsigned long addr, * * Ie. 
pte at addr+N*PAGE_SIZE shall point to pfn corresponding to pages[N] */ -static int vmap_page_range_noflush(unsigned long start, unsigned long end, +static int vmap_pages_range_noflush(unsigned long start, unsigned long end, pgprot_t prot, struct page **pages) { pgd_t *pgd; @@ -239,7 +440,7 @@ static int vmap_page_range_noflush(unsigned long start, unsigned long end, pgd = pgd_offset_k(addr); do { next = pgd_addr_end(addr, end); - err = vmap_p4d_range(pgd, addr, next, prot, pages, &nr); + err = vmap_pages_p4d_range(pgd, addr, next, prot, pages, &nr); if (err) return err; } while (pgd++, addr = next, addr != end); @@ -247,12 +448,12 @@ static int vmap_page_range_noflush(unsigned long start, unsigned long end, return nr; } -static int vmap_page_range(unsigned long start, unsigned long end, +static int vmap_pages_range(unsigned long start, unsigned long end, pgprot_t prot, struct page **pages) { int ret; - ret = vmap_page_range_noflush(start, end, prot, pages); + ret = vmap_pages_range_noflush(start, end, prot, pages); flush_cache_vmap(start, end); return ret; } @@ -1238,7 +1439,7 @@ EXPORT_SYMBOL_GPL(unregister_vmap_purge_notifier); */ static void unmap_vmap_area(struct vmap_area *va) { - vunmap_page_range(va->va_start, va->va_end); + vunmap_range(va->va_start, va->va_end); } /* @@ -1699,7 +1900,7 @@ static void vb_free(const void *addr, unsigned long size) rcu_read_unlock(); BUG_ON(!vb); - vunmap_page_range((unsigned long)addr, (unsigned long)addr + size); + vunmap_range((unsigned long)addr, (unsigned long)addr + size); if (debug_pagealloc_enabled_static()) flush_tlb_kernel_range((unsigned long)addr, @@ -1854,7 +2055,7 @@ void *vm_map_ram(struct page **pages, unsigned int count, int node, pgprot_t pro kasan_unpoison_vmalloc(mem, size); - if (vmap_page_range(addr, addr + size, prot, pages) < 0) { + if (vmap_pages_range(addr, addr + size, prot, pages) < 0) { vm_unmap_ram(mem, count); return NULL; } @@ -2020,7 +2221,7 @@ void __init vmalloc_init(void) int 
map_kernel_range_noflush(unsigned long addr, unsigned long size, pgprot_t prot, struct page **pages) { - return vmap_page_range_noflush(addr, addr + size, prot, pages); + return vmap_pages_range_noflush(addr, addr + size, prot, pages); } /** @@ -2039,7 +2240,7 @@ int map_kernel_range_noflush(unsigned long addr, unsigned long size, */ void unmap_kernel_range_noflush(unsigned long addr, unsigned long size) { - vunmap_page_range(addr, addr + size); + vunmap_range(addr, addr + size); } EXPORT_SYMBOL_GPL(unmap_kernel_range_noflush); @@ -2056,7 +2257,7 @@ void unmap_kernel_range(unsigned long addr, unsigned long size) unsigned long end = addr + size; flush_cache_vunmap(addr, end); - vunmap_page_range(addr, end); + vunmap_range(addr, end); flush_tlb_kernel_range(addr, end); } EXPORT_SYMBOL_GPL(unmap_kernel_range); @@ -2067,7 +2268,7 @@ int map_vm_area(struct vm_struct *area, pgprot_t prot, struct page **pages) unsigned long end = addr + get_vm_area_size(area); int err; - err = vmap_page_range(addr, end, prot, pages); + err = vmap_pages_range(addr, end, prot, pages); return err > 0 ? 
0 : err; } From patchwork Mon Apr 13 12:53:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 11485545 From: Nicholas Piggin To: linux-mm@kvack.org Subject: [PATCH v2 3/4] mm: HUGE_VMAP arch query functions cleanup Date: Mon, 13 Apr 2020 22:53:02 +1000 Message-Id: <20200413125303.423864-4-npiggin@gmail.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20200413125303.423864-1-npiggin@gmail.com> References: <20200413125303.423864-1-npiggin@gmail.com>
Cc: linux-arch@vger.kernel.org, Catalin Marinas , x86@kernel.org, linuxppc-dev@lists.ozlabs.org, Nicholas Piggin , linux-kernel@vger.kernel.org, Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , Thomas Gleixner , Will Deacon , linux-arm-kernel@lists.infradead.org This changes the awkward approach where architectures provide init functions to determine which levels they can provide large mappings for, to one where the arch is queried for each call. This permits odd configurations (PUD but not PMD), and will make it easier to constant-fold dead code away if the arch inlines unsupported levels. This also adds a prot argument to the arch query. This is unused currently but could help with some architectures (for example, some powerpc implementations can't map uncacheable memory with large pages). The name is changed from ioremap to vmap, as it will be used more generally in the next patch. Signed-off-by: Nicholas Piggin --- arch/arm64/mm/mmu.c | 8 ++-- arch/powerpc/mm/book3s64/radix_pgtable.c | 6 +-- arch/x86/mm/ioremap.c | 6 +-- include/linux/io.h | 3 -- include/linux/vmalloc.h | 10 +++++ lib/ioremap.c | 51 ++---------------------- mm/vmalloc.c | 9 +++++ 7 files changed, 33 insertions(+), 60 deletions(-) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index a374e4f51a62..b8e381c46fa1 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -1244,12 +1244,12 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys, int *size, pgprot_t prot) return dt_virt; } -int __init arch_ioremap_p4d_supported(void) +bool arch_vmap_p4d_supported(pgprot_t prot) { return 0; } -int __init arch_ioremap_pud_supported(void) +bool arch_vmap_pud_supported(pgprot_t prot) { /* * Only 4k granule supports level 1 block mappings. 
@@ -1259,9 +1259,9 @@ int __init arch_ioremap_pud_supported(void) !IS_ENABLED(CONFIG_PTDUMP_DEBUGFS); } -int __init arch_ioremap_pmd_supported(void) +bool arch_vmap_pmd_supported(pgprot_t prot) { - /* See arch_ioremap_pud_supported() */ + /* See arch_vmap_pud_supported() */ return !IS_ENABLED(CONFIG_PTDUMP_DEBUGFS); } diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c index 8f9edf07063a..5130e7912dd4 100644 --- a/arch/powerpc/mm/book3s64/radix_pgtable.c +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c @@ -1091,13 +1091,13 @@ void radix__ptep_modify_prot_commit(struct vm_area_struct *vma, set_pte_at(mm, addr, ptep, pte); } -int __init arch_ioremap_pud_supported(void) +bool arch_vmap_pud_supported(pgprot_t prot) { /* HPT does not cope with large pages in the vmalloc area */ return radix_enabled(); } -int __init arch_ioremap_pmd_supported(void) +bool arch_vmap_pmd_supported(pgprot_t prot) { return radix_enabled(); } @@ -1191,7 +1191,7 @@ int pmd_free_pte_page(pmd_t *pmd, unsigned long addr) return 1; } -int __init arch_ioremap_p4d_supported(void) +bool arch_vmap_p4d_supported(pgprot_t prot) { return 0; } diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index 18c637c0dc6f..bb4b75c344e4 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/ioremap.c @@ -481,12 +481,12 @@ void iounmap(volatile void __iomem *addr) } EXPORT_SYMBOL(iounmap); -int __init arch_ioremap_p4d_supported(void) +bool arch_vmap_p4d_supported(pgprot_t prot) { return 0; } -int __init arch_ioremap_pud_supported(void) +bool arch_vmap_pud_supported(pgprot_t prot) { #ifdef CONFIG_X86_64 return boot_cpu_has(X86_FEATURE_GBPAGES); @@ -495,7 +495,7 @@ int __init arch_ioremap_pud_supported(void) #endif } -int __init arch_ioremap_pmd_supported(void) +bool arch_vmap_pmd_supported(pgprot_t prot) { return boot_cpu_has(X86_FEATURE_PSE); } diff --git a/include/linux/io.h b/include/linux/io.h index 8394c56babc2..2832e051bc2e 100644 --- a/include/linux/io.h +++ 
b/include/linux/io.h @@ -33,9 +33,6 @@ static inline int ioremap_page_range(unsigned long addr, unsigned long end, #ifdef CONFIG_HAVE_ARCH_HUGE_VMAP void __init ioremap_huge_init(void); -int arch_ioremap_p4d_supported(void); -int arch_ioremap_pud_supported(void); -int arch_ioremap_pmd_supported(void); #else static inline void ioremap_huge_init(void) { } #endif diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h index eb8a5080e472..291313a7e663 100644 --- a/include/linux/vmalloc.h +++ b/include/linux/vmalloc.h @@ -84,6 +84,16 @@ struct vmap_area { }; }; +#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP +bool arch_vmap_p4d_supported(pgprot_t prot); +bool arch_vmap_pud_supported(pgprot_t prot); +bool arch_vmap_pmd_supported(pgprot_t prot); +#else +static inline bool arch_vmap_p4d_supported(pgprot_t prot) { return false; } +static inline bool arch_vmap_pud_supported(pgprot_t prot) { return false; } +static inline bool arch_vmap_pmd_supported(pgprot_t prot) { return false; } +#endif + /* * Highlevel APIs for driver use */ diff --git a/lib/ioremap.c b/lib/ioremap.c index 7e383bdc51ad..0a1ddf1a1286 100644 --- a/lib/ioremap.c +++ b/lib/ioremap.c @@ -14,10 +14,9 @@ #include #include +static unsigned int __read_mostly max_page_shift = PAGE_SHIFT; + #ifdef CONFIG_HAVE_ARCH_HUGE_VMAP -static int __read_mostly ioremap_p4d_capable; -static int __read_mostly ioremap_pud_capable; -static int __read_mostly ioremap_pmd_capable; static int __read_mostly ioremap_huge_disabled; static int __init set_nohugeiomap(char *str) @@ -29,56 +28,14 @@ early_param("nohugeiomap", set_nohugeiomap); void __init ioremap_huge_init(void) { - if (!ioremap_huge_disabled) { - if (arch_ioremap_p4d_supported()) - ioremap_p4d_capable = 1; - if (arch_ioremap_pud_supported()) - ioremap_pud_capable = 1; - if (arch_ioremap_pmd_supported()) - ioremap_pmd_capable = 1; - } -} - -static inline int ioremap_p4d_enabled(void) -{ - return ioremap_p4d_capable; -} - -static inline int ioremap_pud_enabled(void) -{ - return 
ioremap_pud_capable; + if (!ioremap_huge_disabled) + max_page_shift = P4D_SHIFT; } - -static inline int ioremap_pmd_enabled(void) -{ - return ioremap_pmd_capable; -} - -#else /* !CONFIG_HAVE_ARCH_HUGE_VMAP */ -static inline int ioremap_p4d_enabled(void) { return 0; } -static inline int ioremap_pud_enabled(void) { return 0; } -static inline int ioremap_pmd_enabled(void) { return 0; } #endif /* CONFIG_HAVE_ARCH_HUGE_VMAP */ int ioremap_page_range(unsigned long addr, unsigned long end, phys_addr_t phys_addr, pgprot_t prot) { - unsigned int max_page_shift = PAGE_SHIFT; - - /* - * Due to the max_page_shift parameter to vmap_range, platforms must - * enable all smaller sizes to take advantage of a given size, - * otherwise fall back to small pages. - */ - if (ioremap_pmd_enabled()) { - max_page_shift = PMD_SHIFT; - if (ioremap_pud_enabled()) { - max_page_shift = PUD_SHIFT; - if (ioremap_p4d_enabled()) - max_page_shift = P4D_SHIFT; - } - } - return vmap_range(addr, end, phys_addr, prot, max_page_shift); } diff --git a/mm/vmalloc.c b/mm/vmalloc.c index b1bc2fcae4e0..c898d16ddd25 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -171,6 +171,9 @@ static int vmap_try_huge_pmd(pmd_t *pmd, unsigned long addr, unsigned long end, if (max_page_shift < PMD_SHIFT) return 0; + if (!arch_vmap_pmd_supported(prot)) + return 0; + if ((end - addr) != PMD_SIZE) return 0; @@ -219,6 +222,9 @@ static int vmap_try_huge_pud(pud_t *pud, unsigned long addr, if (max_page_shift < PUD_SHIFT) return 0; + if (!arch_vmap_pud_supported(prot)) + return 0; + if ((end - addr) != PUD_SIZE) return 0; @@ -268,6 +274,9 @@ static int vmap_try_huge_p4d(p4d_t *p4d, unsigned long addr, if (max_page_shift < P4D_SHIFT) return 0; + if (!arch_vmap_p4d_supported(prot)) + return 0; + if ((end - addr) != P4D_SIZE) return 0; From patchwork Mon Apr 13 12:53:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 11485547 
From: Nicholas Piggin To: linux-mm@kvack.org Subject: [PATCH v2 4/4] mm/vmalloc: Hugepage vmalloc mappings Date: Mon, 13 Apr 2020 22:53:03 +1000 Message-Id: <20200413125303.423864-5-npiggin@gmail.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20200413125303.423864-1-npiggin@gmail.com> References: <20200413125303.423864-1-npiggin@gmail.com> Cc: linux-arch@vger.kernel.org, Catalin Marinas , x86@kernel.org, linuxppc-dev@lists.ozlabs.org, Nicholas Piggin , linux-kernel@vger.kernel.org, Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , Thomas Gleixner , Will Deacon , linux-arm-kernel@lists.infradead.org For platforms that define HAVE_ARCH_HUGE_VMAP and support PMD vmap mappings, have vmalloc attempt to allocate PMD-sized pages first, before falling back to small pages. Allocations which use something other than PAGE_KERNEL protections are not permitted to use huge pages yet, because not all callers expect this (e.g., module allocations vs strict module rwx). This gives a 6x reduction in dTLB misses for a `git diff` (of linux), from 45600 to 6500 and a 2.2% reduction in cycles on a 2-node POWER9. This can result in more internal fragmentation and memory overhead for a given allocation. It can also cause greater NUMA imbalance on hashdist allocations. There may be other callers that expect small pages under vmalloc but use PAGE_KERNEL; I'm not sure if it's feasible to catch them all. An alternative would be to add a new function or flag which enables large mappings, and use that in callers. Signed-off-by: Nicholas Piggin --- include/linux/vmalloc.h | 2 + mm/vmalloc.c | 135 +++++++++++++++++++++++++++++----------- 2 files changed, 102 insertions(+), 35 deletions(-) diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h index 291313a7e663..853b82eac192 100644 --- a/include/linux/vmalloc.h +++ b/include/linux/vmalloc.h @@ -24,6 +24,7 @@ struct notifier_block; /* in notifier.h */ #define VM_UNINITIALIZED 0x00000020 /* vm_struct is not fully initialized */ #define VM_NO_GUARD 0x00000040 /* don't add guard page */ #define VM_KASAN 0x00000080 /* has allocated kasan shadow memory */ +#define VM_HUGE_PAGES 0x00000100 /* may use huge pages */ /* * VM_KASAN is used slighly differently depending on CONFIG_KASAN_VMALLOC. 
@@ -58,6 +59,7 @@ struct vm_struct { unsigned long size; unsigned long flags; struct page **pages; + unsigned int page_order; unsigned int nr_pages; phys_addr_t phys_addr; const void *caller; diff --git a/mm/vmalloc.c b/mm/vmalloc.c index c898d16ddd25..7b7e992c5ff1 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -436,7 +436,7 @@ static int vmap_pages_p4d_range(pgd_t *pgd, unsigned long addr, * * Ie. pte at addr+N*PAGE_SIZE shall point to pfn corresponding to pages[N] */ -static int vmap_pages_range_noflush(unsigned long start, unsigned long end, +static int vmap_small_pages_range_noflush(unsigned long start, unsigned long end, pgprot_t prot, struct page **pages) { pgd_t *pgd; @@ -457,13 +457,44 @@ static int vmap_pages_range_noflush(unsigned long start, unsigned long end, return nr; } +static int vmap_pages_range_noflush(unsigned long start, unsigned long end, + pgprot_t prot, struct page **pages, + unsigned int page_shift) +{ + if (page_shift == PAGE_SHIFT) { + return vmap_small_pages_range_noflush(start, end, prot, pages); + } else { + unsigned long addr = start; + unsigned int i, nr = (end - start) >> page_shift; + + for (i = 0; i < nr; i++) { + int err; + + err = vmap_range_noflush(addr, + addr + (1UL << page_shift), + __pa(page_address(pages[i])), prot, + page_shift); + if (err) + return err; + + addr += 1UL << page_shift; + } + + return 0; + } +} + static int vmap_pages_range(unsigned long start, unsigned long end, - pgprot_t prot, struct page **pages) + pgprot_t prot, struct page **pages, + unsigned int page_shift) { int ret; - ret = vmap_pages_range_noflush(start, end, prot, pages); + BUG_ON(page_shift < PAGE_SHIFT); + + ret = vmap_pages_range_noflush(start, end, prot, pages, page_shift); flush_cache_vmap(start, end); + return ret; } @@ -2064,7 +2095,7 @@ void *vm_map_ram(struct page **pages, unsigned int count, int node, pgprot_t pro kasan_unpoison_vmalloc(mem, size); - if (vmap_pages_range(addr, addr + size, prot, pages) < 0) { + if (vmap_pages_range(addr, 
addr + size, prot, pages, PAGE_SHIFT) < 0) { vm_unmap_ram(mem, count); return NULL; } @@ -2230,7 +2261,7 @@ void __init vmalloc_init(void) int map_kernel_range_noflush(unsigned long addr, unsigned long size, pgprot_t prot, struct page **pages) { - return vmap_pages_range_noflush(addr, addr + size, prot, pages); + return vmap_pages_range_noflush(addr, addr + size, prot, pages, PAGE_SHIFT); } /** @@ -2277,7 +2308,7 @@ int map_vm_area(struct vm_struct *area, pgprot_t prot, struct page **pages) unsigned long end = addr + get_vm_area_size(area); int err; - err = vmap_pages_range(addr, end, prot, pages); + err = vmap_pages_range(addr, end, prot, pages, PAGE_SHIFT); return err > 0 ? 0 : err; } @@ -2325,9 +2356,11 @@ static struct vm_struct *__get_vm_area_node(unsigned long size, if (unlikely(!size)) return NULL; - if (flags & VM_IOREMAP) - align = 1ul << clamp_t(int, get_count_order_long(size), - PAGE_SHIFT, IOREMAP_MAX_ORDER); + if (flags & VM_IOREMAP) { + align = max(align, + 1ul << clamp_t(int, get_count_order_long(size), + PAGE_SHIFT, IOREMAP_MAX_ORDER)); + } area = kzalloc_node(sizeof(*area), gfp_mask & GFP_RECLAIM_MASK, node); if (unlikely(!area)) @@ -2534,7 +2567,7 @@ static void __vunmap(const void *addr, int deallocate_pages) struct page *page = area->pages[i]; BUG_ON(!page); - __free_pages(page, 0); + __free_pages(page, area->page_order); } atomic_long_sub(area->nr_pages, &nr_vmalloc_pages); @@ -2672,26 +2705,29 @@ void *vmap(struct page **pages, unsigned int count, EXPORT_SYMBOL(vmap); static void *__vmalloc_node(unsigned long size, unsigned long align, - gfp_t gfp_mask, pgprot_t prot, - int node, const void *caller); + gfp_t gfp_mask, pgprot_t prot, unsigned long vm_flags, + int node, const void *caller); static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, - pgprot_t prot, int node) + pgprot_t prot, unsigned int page_shift, + int node) { struct page **pages; + unsigned long addr = (unsigned long)area->addr; + unsigned long size = 
get_vm_area_size(area); + unsigned int page_order = page_shift - PAGE_SHIFT; unsigned int nr_pages, array_size, i; const gfp_t nested_gfp = (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO; const gfp_t alloc_mask = gfp_mask | __GFP_NOWARN; const gfp_t highmem_mask = (gfp_mask & (GFP_DMA | GFP_DMA32)) ? - 0 : - __GFP_HIGHMEM; + 0 : __GFP_HIGHMEM; - nr_pages = get_vm_area_size(area) >> PAGE_SHIFT; + nr_pages = size >> page_shift; array_size = (nr_pages * sizeof(struct page *)); /* Please note that the recursion is strictly bounded. */ if (array_size > PAGE_SIZE) { pages = __vmalloc_node(array_size, 1, nested_gfp|highmem_mask, - PAGE_KERNEL, node, area->caller); + PAGE_KERNEL, 0, node, area->caller); } else { pages = kmalloc_node(array_size, nested_gfp, node); } @@ -2704,14 +2740,13 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, area->pages = pages; area->nr_pages = nr_pages; + area->page_order = page_order; for (i = 0; i < area->nr_pages; i++) { struct page *page; - if (node == NUMA_NO_NODE) - page = alloc_page(alloc_mask|highmem_mask); - else - page = alloc_pages_node(node, alloc_mask|highmem_mask, 0); + page = alloc_pages_node(node, + alloc_mask|highmem_mask, page_order); if (unlikely(!page)) { /* Successfully allocated i pages, free them in __vunmap() */ @@ -2725,8 +2760,9 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, } atomic_long_add(area->nr_pages, &nr_vmalloc_pages); - if (map_vm_area(area, prot, pages)) + if (vmap_pages_range(addr, addr + size, prot, pages, page_shift) < 0) goto fail; + return area->addr; fail: @@ -2760,22 +2796,39 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, pgprot_t prot, unsigned long vm_flags, int node, const void *caller) { - struct vm_struct *area; + struct vm_struct *area = NULL; void *addr; unsigned long real_size = size; + unsigned long real_align = align; + unsigned int shift = PAGE_SHIFT; size = PAGE_ALIGN(size); if (!size || (size >> PAGE_SHIFT) > 
totalram_pages()) goto fail; - area = __get_vm_area_node(real_size, align, VM_ALLOC | VM_UNINITIALIZED | + if (IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP) && + (vm_flags & VM_HUGE_PAGES)) { + unsigned long size_per_node; + + size_per_node = size; + if (node == NUMA_NO_NODE) + size_per_node /= num_online_nodes(); + if (size_per_node >= PMD_SIZE) + shift = PMD_SHIFT; + } + +again: + align = max(real_align, 1UL << shift); + size = ALIGN(real_size, align); + + area = __get_vm_area_node(size, align, VM_ALLOC | VM_UNINITIALIZED | vm_flags, start, end, node, gfp_mask, caller); if (!area) goto fail; - addr = __vmalloc_area_node(area, gfp_mask, prot, node); + addr = __vmalloc_area_node(area, gfp_mask, prot, shift, node); if (!addr) - return NULL; + goto fail; /* * In this function, newly allocated vm_struct has VM_UNINITIALIZED @@ -2789,8 +2842,16 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, return addr; fail: - warn_alloc(gfp_mask, NULL, + if (shift > PAGE_SHIFT) { + shift = PAGE_SHIFT; + goto again; + } + + if (!area) { + /* Warn for area allocation, page allocations already warn */ + warn_alloc(gfp_mask, NULL, "vmalloc: allocation failure: %lu bytes", real_size); + } return NULL; } @@ -2825,16 +2886,19 @@ EXPORT_SYMBOL_GPL(__vmalloc_node_range); * Return: pointer to the allocated memory or %NULL on error */ static void *__vmalloc_node(unsigned long size, unsigned long align, - gfp_t gfp_mask, pgprot_t prot, - int node, const void *caller) + gfp_t gfp_mask, pgprot_t prot, unsigned long vm_flags, + int node, const void *caller) { return __vmalloc_node_range(size, align, VMALLOC_START, VMALLOC_END, - gfp_mask, prot, 0, node, caller); + gfp_mask, prot, vm_flags, node, caller); } void *__vmalloc(unsigned long size, gfp_t gfp_mask, pgprot_t prot) { - return __vmalloc_node(size, 1, gfp_mask, prot, NUMA_NO_NODE, + unsigned long vm_flags = 0; + if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL)) + vm_flags |= VM_HUGE_PAGES; + return __vmalloc_node(size, 1, 
gfp_mask, prot, vm_flags, NUMA_NO_NODE, __builtin_return_address(0)); } EXPORT_SYMBOL(__vmalloc); @@ -2842,7 +2906,7 @@ EXPORT_SYMBOL(__vmalloc); static inline void *__vmalloc_node_flags(unsigned long size, int node, gfp_t flags) { - return __vmalloc_node(size, 1, flags, PAGE_KERNEL, + return __vmalloc_node(size, 1, flags, PAGE_KERNEL, VM_HUGE_PAGES, node, __builtin_return_address(0)); } @@ -2850,7 +2914,8 @@ static inline void *__vmalloc_node_flags(unsigned long size, void *__vmalloc_node_flags_caller(unsigned long size, int node, gfp_t flags, void *caller) { - return __vmalloc_node(size, 1, flags, PAGE_KERNEL, node, caller); + return __vmalloc_node(size, 1, flags, PAGE_KERNEL, VM_HUGE_PAGES, + node, caller); } /** @@ -2925,7 +2990,7 @@ EXPORT_SYMBOL(vmalloc_user); */ void *vmalloc_node(unsigned long size, int node) { - return __vmalloc_node(size, 1, GFP_KERNEL, PAGE_KERNEL, + return __vmalloc_node(size, 1, GFP_KERNEL, PAGE_KERNEL, VM_HUGE_PAGES, node, __builtin_return_address(0)); } EXPORT_SYMBOL(vmalloc_node); @@ -3014,7 +3079,7 @@ void *vmalloc_exec(unsigned long size) */ void *vmalloc_32(unsigned long size) { - return __vmalloc_node(size, 1, GFP_VMALLOC32, PAGE_KERNEL, + return __vmalloc_node(size, 1, GFP_VMALLOC32, PAGE_KERNEL, 0, NUMA_NO_NODE, __builtin_return_address(0)); } EXPORT_SYMBOL(vmalloc_32);