From patchwork Tue Aug 4 09:50:30 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Rapoport
X-Patchwork-Id: 11700153
From: Mike Rapoport
To: linux-kernel@vger.kernel.org
Subject: [PATCH v3 1/6] mm: add definition of PMD_PAGE_ORDER
Date: Tue, 4 Aug 2020 12:50:30 +0300
Message-Id: <20200804095035.18778-2-rppt@kernel.org>
In-Reply-To: <20200804095035.18778-1-rppt@kernel.org>
References: <20200804095035.18778-1-rppt@kernel.org>
From: Mike Rapoport

The definition of PMD_PAGE_ORDER, denoting the number of base pages in a
second-level leaf page, is already used by DAX and may be handy in other
cases as well.

Several architectures already define PMD_ORDER as the size of a second-level
page table, so to avoid conflicts with those definitions use the name
PMD_PAGE_ORDER and update DAX accordingly.
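For illustration only (not part of this patch): with typical x86-64 values of
PAGE_SHIFT = 12 and PMD_SHIFT = 21, the new macro evaluates to 9, i.e. one
PMD-size leaf covers 512 base pages (2 MiB). A minimal user-space sketch of
that arithmetic, with the two shift values assumed rather than taken from the
kernel headers:

	#include <stdio.h>

	#define PAGE_SHIFT	12	/* assumed: 4 KiB base pages */
	#define PMD_SHIFT	21	/* assumed: 2 MiB PMD leaf */
	#define PMD_PAGE_ORDER	(PMD_SHIFT - PAGE_SHIFT)

	int main(void)
	{
		printf("PMD_PAGE_ORDER = %d, base pages per PMD leaf = %lu\n",
		       PMD_PAGE_ORDER, 1UL << PMD_PAGE_ORDER);
		return 0;
	}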
Signed-off-by: Mike Rapoport --- fs/dax.c | 10 +++++----- include/linux/pgtable.h | 3 +++ 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 11b16729b86f..b91d8c8dda45 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -50,7 +50,7 @@ static inline unsigned int pe_order(enum page_entry_size pe_size) #define PG_PMD_NR (PMD_SIZE >> PAGE_SHIFT) /* The order of a PMD entry */ -#define PMD_ORDER (PMD_SHIFT - PAGE_SHIFT) +#define PMD_PAGE_ORDER (PMD_SHIFT - PAGE_SHIFT) static wait_queue_head_t wait_table[DAX_WAIT_TABLE_ENTRIES]; @@ -98,7 +98,7 @@ static bool dax_is_locked(void *entry) static unsigned int dax_entry_order(void *entry) { if (xa_to_value(entry) & DAX_PMD) - return PMD_ORDER; + return PMD_PAGE_ORDER; return 0; } @@ -1456,7 +1456,7 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, { struct vm_area_struct *vma = vmf->vma; struct address_space *mapping = vma->vm_file->f_mapping; - XA_STATE_ORDER(xas, &mapping->i_pages, vmf->pgoff, PMD_ORDER); + XA_STATE_ORDER(xas, &mapping->i_pages, vmf->pgoff, PMD_PAGE_ORDER); unsigned long pmd_addr = vmf->address & PMD_MASK; bool write = vmf->flags & FAULT_FLAG_WRITE; bool sync; @@ -1515,7 +1515,7 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, * entry is already in the array, for instance), it will return * VM_FAULT_FALLBACK. */ - entry = grab_mapping_entry(&xas, mapping, PMD_ORDER); + entry = grab_mapping_entry(&xas, mapping, PMD_PAGE_ORDER); if (xa_is_internal(entry)) { result = xa_to_internal(entry); goto fallback; @@ -1681,7 +1681,7 @@ dax_insert_pfn_mkwrite(struct vm_fault *vmf, pfn_t pfn, unsigned int order) if (order == 0) ret = vmf_insert_mixed_mkwrite(vmf->vma, vmf->address, pfn); #ifdef CONFIG_FS_DAX_PMD - else if (order == PMD_ORDER) + else if (order == PMD_PAGE_ORDER) ret = vmf_insert_pfn_pmd(vmf, pfn, FAULT_FLAG_WRITE); #endif else diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 56c1e8eb7bb0..79f8443609e7 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -28,6 +28,9 @@ #define USER_PGTABLES_CEILING 0UL #endif +/* Number of base pages in a second level leaf page */ +#define PMD_PAGE_ORDER (PMD_SHIFT - PAGE_SHIFT) + /* * A page table page can be thought of an array like this: pXd_t[PTRS_PER_PxD] * From patchwork Tue Aug 4 09:50:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Rapoport X-Patchwork-Id: 11700151 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CC29C138C for ; Tue, 4 Aug 2020 09:52:33 +0000 (UTC) Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D68FA206DA for ; Tue, 4 Aug 2020 09:52:33 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="JkPMfHgL"; dkim=fail reason="signature verification failed" (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="gPcJFTGB" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D68FA206DA Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none 
smtp.mailfrom=linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=merlin.20170209; h=Sender:Content-Transfer-Encoding: Content-Type:Cc:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To:Message-Id:Date: Subject:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=8FS2O8PdRpNGolNwS+tzcSG9P/VEFywQhkWY/doQwWY=; b=JkPMfHgL9MRQoTUGjlPCkmXoO 5Br70w2QUug6cFff0ZceNf90z1MMTKeSkhwgtsZnOfiBpCF9wJHyBaKVFpXqJ0mWVr8ESp98Szfqg VgRuZWEUsxQMxePN8An20cYdqXGLb2HJ0JKD82zi5DpMuO4umaZxFNS6iNxgsK7uPk2cxQdO0HCF5 Rvu9YzCCIyd1/kB1Z1L/1S8y7rh2JviAriqO11CW2MsiaGb+vbzCOnXTpO4Rb/MqyACk8dPl4YXEi Z58srdKKFIQegrFQCSp/Y5S83uv6dtliRFGfzxK5NMA41CNFx2dXM87aK7oJmQIXgup7zEysvOjYY BAMYXIZOg==; Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1k2tbC-0001qG-Lu; Tue, 04 Aug 2020 09:51:10 +0000 Received: from mail.kernel.org ([198.145.29.99]) by merlin.infradead.org with esmtps (Exim 4.92.3 #3 (Red Hat Linux)) id 1k2tb9-0001oO-AK; Tue, 04 Aug 2020 09:51:08 +0000 Received: from aquarius.haifa.ibm.com (nesher1.haifa.il.ibm.com [195.110.40.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 4EF2B22B40; Tue, 4 Aug 2020 09:50:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1596534666; bh=vyyX1LP4325hfnYgliYCjSjLjNxzM/CYe+v7FBLTqDc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=gPcJFTGBdcSo7DYfuAeKCt8HfW4DNdvAb4vfmr0YecDhf2FOFp+S1oTaJs9dfb+KC kkNKGdTLhePIw+eLMVci+QkBZ5NJ6RtynqDxIL4D3UbhhGID/Z4UW6yW5sOtNpsJDN J2xMyFEGi0cbT/pK0v/yLK+WQtaMB6FLIU6HCX68= From: Mike Rapoport To: linux-kernel@vger.kernel.org Subject: [PATCH v3 2/6] mmap: make mlock_future_check() global Date: Tue, 4 Aug 2020 12:50:31 +0300 Message-Id: <20200804095035.18778-3-rppt@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200804095035.18778-1-rppt@kernel.org> References: <20200804095035.18778-1-rppt@kernel.org> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20200804_055107_493376_A720D4A9 X-CRM114-Status: GOOD ( 15.42 ) X-Spam-Score: -5.2 (-----) X-Spam-Report: SpamAssassin version 3.4.4 on merlin.infradead.org summary: Content analysis details: (-5.2 points) pts rule name description ---- ---------------------- -------------------------------------------------- -5.0 RCVD_IN_DNSWL_HI RBL: Sender listed at https://www.dnswl.org/, high trust [198.145.29.99 listed in list.dnswl.org] 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record -0.0 SPF_PASS SPF: sender matches SPF record -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's domain -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid -0.1 DKIM_VALID_EF Message has a valid DKIM or DK signature from envelope-from domain -0.0 DKIMWL_WL_HIGH DKIMwl.org - Whitelisted High sender X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Mark Rutland , Peter Zijlstra , Catalin Marinas , Dave Hansen , 
linux-mm@kvack.org, "H. Peter Anvin" , Christopher Lameter , Idan Yaniv , Thomas Gleixner , Elena Reshetova , linux-arch@vger.kernel.org, Tycho Andersen , linux-nvdimm@lists.01.org, Will Deacon , x86@kernel.org, Matthew Wilcox , Mike Rapoport , Ingo Molnar , Michael Kerrisk , Arnd Bergmann , James Bottomley , Borislav Petkov , Alexander Viro , Andy Lutomirski , Paul Walmsley , "Kirill A. Shutemov" , Dan Williams , linux-arm-kernel@lists.infradead.org, linux-api@vger.kernel.org, linux-riscv@lists.infradead.org, Palmer Dabbelt , linux-fsdevel@vger.kernel.org, Andrew Morton , Mike Rapoport Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org From: Mike Rapoport It will be used by the upcoming secret memory implementation. Signed-off-by: Mike Rapoport --- mm/internal.h | 3 +++ mm/mmap.c | 5 ++--- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/mm/internal.h b/mm/internal.h index 9886db20d94f..af0a92f8f6bc 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -349,6 +349,9 @@ static inline void munlock_vma_pages_all(struct vm_area_struct *vma) extern void mlock_vma_page(struct page *page); extern unsigned int munlock_vma_page(struct page *page); +extern int mlock_future_check(struct mm_struct *mm, unsigned long flags, + unsigned long len); + /* * Clear the page's PageMlocked(). This can be useful in a situation where * we want to unconditionally remove a page from the pagecache -- e.g., diff --git a/mm/mmap.c b/mm/mmap.c index 59a4682ebf3f..4dd40a4fedfb 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1310,9 +1310,8 @@ static inline unsigned long round_hint_to_min(unsigned long hint) return hint; } -static inline int mlock_future_check(struct mm_struct *mm, - unsigned long flags, - unsigned long len) +int mlock_future_check(struct mm_struct *mm, unsigned long flags, + unsigned long len) { unsigned long locked, lock_limit; From patchwork Tue Aug 4 09:50:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Rapoport X-Patchwork-Id: 11700157 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 23C08138C for ; Tue, 4 Aug 2020 09:52:39 +0000 (UTC) Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 206F9206DA for ; Tue, 4 Aug 2020 09:52:39 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="J4e2Poz2"; dkim=fail reason="signature verification failed" (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="HgBRvcc6" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 206F9206DA Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=merlin.20170209; h=Sender:Content-Transfer-Encoding: Content-Type:Cc:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To:Message-Id:Date: Subject:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: 
Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=hvuCWOB5hIlHOb5+dg4SWmUCquOVz5fKOFb8493BvN0=; b=J4e2Poz2elU9B2eFBW/YRAPA4 yV49LOMeShlD6KbSQZNIMQslyusPVC4m0CpWgMjgMbasXxWtcsbY4fM6jKWUVrT7KTpsfJhd2O0qc gGQRsT2T4uapKZc4XOQ/CI9ZQIB0HAC2MC3nA/9OFyrdlGbeqKsD0bs96D2FiMr+0T+3Tt/+F8kYd j2Yru/hjIhVoRmfmLLsy4aklHvBIja/kYHAC4rOt57q/Rg+QBTQpm2T1SQCwH+SBVGHPqn0O0gyRV CAcEyugPJsqc+FJcKMosjPBbv7XO67pdjPDBfvJx0EbAFLRsNFCMOWn4OG3nSXmPF1EDUwUMbTbni q4Ak6sA9Q==; Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1k2tbN-0001vm-N4; Tue, 04 Aug 2020 09:51:21 +0000 Received: from mail.kernel.org ([198.145.29.99]) by merlin.infradead.org with esmtps (Exim 4.92.3 #3 (Red Hat Linux)) id 1k2tbI-0001si-FE; Tue, 04 Aug 2020 09:51:18 +0000 Received: from aquarius.haifa.ibm.com (nesher1.haifa.il.ibm.com [195.110.40.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 5217522B45; Tue, 4 Aug 2020 09:51:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1596534676; bh=q+YVMSt7q6Yhx9KStmQ5cdTOPOhLCpUw08ISktRlGu8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HgBRvcc6p0BKkFzlhqqiUF3l6kBrDKDGwx6frjqEuAVoVaw0e98JMOiY9/j/0mInS u5Z70eTw4t0PvL9mg+k9ObMJ9SaxxzRYGifFMQJX66RQMvkIWmwkLsRDv+Ls0/YUvp SDC1OB9H2xO+g4808RNX736DupJo9he7uQGKlr+c= From: Mike Rapoport To: linux-kernel@vger.kernel.org Subject: [PATCH v3 3/6] mm: introduce memfd_secret system call to create "secret" memory areas Date: Tue, 4 Aug 2020 12:50:32 +0300 Message-Id: <20200804095035.18778-4-rppt@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200804095035.18778-1-rppt@kernel.org> References: <20200804095035.18778-1-rppt@kernel.org> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20200804_055116_735536_F25DC757 X-CRM114-Status: GOOD ( 34.70 ) X-Spam-Score: -5.2 (-----) X-Spam-Report: SpamAssassin version 3.4.4 on merlin.infradead.org summary: Content analysis details: (-5.2 points) pts rule name description ---- ---------------------- -------------------------------------------------- -5.0 RCVD_IN_DNSWL_HI RBL: Sender listed at https://www.dnswl.org/, high trust [198.145.29.99 listed in list.dnswl.org] 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record -0.0 SPF_PASS SPF: sender matches SPF record -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's domain -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid -0.1 DKIM_VALID_EF Message has a valid DKIM or DK signature from envelope-from domain -0.0 DKIMWL_WL_HIGH DKIMwl.org - Whitelisted High sender X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Mark Rutland , Peter Zijlstra , Catalin Marinas , Dave Hansen , linux-mm@kvack.org, "H. Peter Anvin" , Christopher Lameter , Idan Yaniv , Thomas Gleixner , Elena Reshetova , linux-arch@vger.kernel.org, Tycho Andersen , linux-nvdimm@lists.01.org, Will Deacon , x86@kernel.org, Matthew Wilcox , Mike Rapoport , Ingo Molnar , Michael Kerrisk , Arnd Bergmann , James Bottomley , Borislav Petkov , Alexander Viro , Andy Lutomirski , Paul Walmsley , "Kirill A. 
Shutemov" , Dan Williams , linux-arm-kernel@lists.infradead.org, linux-api@vger.kernel.org, linux-riscv@lists.infradead.org, Palmer Dabbelt , linux-fsdevel@vger.kernel.org, Andrew Morton , Mike Rapoport Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org From: Mike Rapoport Introduce "memfd_secret" system call with the ability to create memory areas visible only in the context of the owning process and not mapped not only to other processes but in the kernel page tables as well. The user will create a file descriptor using the memfd_secret() system call where flags supplied as a parameter to this system call will define the desired protection mode for the memory associated with that file descriptor. Currently there are two protection modes: * exclusive - the memory area is unmapped from the kernel direct map and it is present only in the page tables of the owning mm. * uncached - the memory area is present only in the page tables of the owning mm and it is mapped there as uncached. For instance, the following example will create an uncached mapping (error handling is omitted): fd = memfd_secret(SECRETMEM_UNCACHED); ftruncate(fd, MAP_SIZE); ptr = mmap(NULL, MAP_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); Signed-off-by: Mike Rapoport --- arch/Kconfig | 7 + arch/x86/Kconfig | 1 + include/uapi/linux/magic.h | 1 + include/uapi/linux/secretmem.h | 9 ++ kernel/sys_ni.c | 2 + mm/Kconfig | 4 + mm/Makefile | 1 + mm/secretmem.c | 271 +++++++++++++++++++++++++++++++++ 8 files changed, 296 insertions(+) create mode 100644 include/uapi/linux/secretmem.h create mode 100644 mm/secretmem.c diff --git a/arch/Kconfig b/arch/Kconfig index 8cc35dc556c7..ba2a4b0594a9 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -979,6 +979,13 @@ config HAVE_SPARSE_SYSCALL_NR entries at 4000, 5000 and 6000 locations. This option turns on syscall related optimizations for a given architecture. +config HAVE_SECRETMEM_UNCACHED + bool + help + An architecture can select this if its semantics of non-cached + mappings can be used to prevent speculative loads and it is + useful for secret protection. 
+ source "kernel/gcov/Kconfig" source "scripts/gcc-plugins/Kconfig" diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 883da0abf779..c235b869b022 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -218,6 +218,7 @@ config X86 select HAVE_UNSTABLE_SCHED_CLOCK select HAVE_USER_RETURN_NOTIFIER select HAVE_GENERIC_VDSO + select HAVE_SECRETMEM_UNCACHED select HOTPLUG_SMT if SMP select IRQ_FORCED_THREADING select NEED_SG_DMA_LENGTH diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h index f3956fc11de6..35687dcb1a42 100644 --- a/include/uapi/linux/magic.h +++ b/include/uapi/linux/magic.h @@ -97,5 +97,6 @@ #define DEVMEM_MAGIC 0x454d444d /* "DMEM" */ #define Z3FOLD_MAGIC 0x33 #define PPC_CMM_MAGIC 0xc7571590 +#define SECRETMEM_MAGIC 0x5345434d /* "SECM" */ #endif /* __LINUX_MAGIC_H__ */ diff --git a/include/uapi/linux/secretmem.h b/include/uapi/linux/secretmem.h new file mode 100644 index 000000000000..cef7a59f7492 --- /dev/null +++ b/include/uapi/linux/secretmem.h @@ -0,0 +1,9 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +#ifndef _UAPI_LINUX_SECRERTMEM_H +#define _UAPI_LINUX_SECRERTMEM_H + +/* secretmem operation modes */ +#define SECRETMEM_EXCLUSIVE 0x1 +#define SECRETMEM_UNCACHED 0x2 + +#endif /* _UAPI_LINUX_SECRERTMEM_H */ diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c index 3b69a560a7ac..fd40e1c083e5 100644 --- a/kernel/sys_ni.c +++ b/kernel/sys_ni.c @@ -349,6 +349,8 @@ COND_SYSCALL(pkey_mprotect); COND_SYSCALL(pkey_alloc); COND_SYSCALL(pkey_free); +/* memfd_secret */ +COND_SYSCALL(memfd_secret); /* * Architecture specific weak syscall entries. diff --git a/mm/Kconfig b/mm/Kconfig index f2104cc0d35c..8378175e72a4 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -872,4 +872,8 @@ config ARCH_HAS_HUGEPD config MAPPING_DIRTY_HELPERS bool +config SECRETMEM + def_bool ARCH_HAS_SET_DIRECT_MAP && !EMBEDDED + select GENERIC_ALLOCATOR + endmenu diff --git a/mm/Makefile b/mm/Makefile index 6e9d46b2efc9..c2aa7a393b73 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -121,3 +121,4 @@ obj-$(CONFIG_MEMFD_CREATE) += memfd.o obj-$(CONFIG_MAPPING_DIRTY_HELPERS) += mapping_dirty_helpers.o obj-$(CONFIG_PTDUMP_CORE) += ptdump.o obj-$(CONFIG_PAGE_REPORTING) += page_reporting.o +obj-$(CONFIG_SECRETMEM) += secretmem.o diff --git a/mm/secretmem.c b/mm/secretmem.c new file mode 100644 index 000000000000..65cd6660991d --- /dev/null +++ b/mm/secretmem.c @@ -0,0 +1,271 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#include + +#include "internal.h" + +#undef pr_fmt +#define pr_fmt(fmt) "secretmem: " fmt + +#ifdef CONFIG_HAVE_SECRETMEM_UNCACHED +#define SECRETMEM_MODE_MASK (SECRETMEM_EXCLUSIVE | SECRETMEM_UNCACHED) +#else +#define SECRETMEM_MODE_MASK (SECRETMEM_EXCLUSIVE) +#endif + +#define SECRETMEM_FLAGS_MASK SECRETMEM_MODE_MASK + +struct secretmem_ctx { + unsigned int mode; +}; + +static struct page *secretmem_alloc_page(gfp_t gfp) +{ + /* + * FIXME: use a cache of large pages to reduce the direct map + * fragmentation + */ + return alloc_page(gfp); +} + +static vm_fault_t secretmem_fault(struct vm_fault *vmf) +{ + struct address_space *mapping = vmf->vma->vm_file->f_mapping; + struct inode *inode = file_inode(vmf->vma->vm_file); + pgoff_t offset = vmf->pgoff; + unsigned long addr; + struct page *page; + int ret = 0; + + if (((loff_t)vmf->pgoff << PAGE_SHIFT) >= i_size_read(inode)) + return vmf_error(-EINVAL); + + page = find_get_entry(mapping, offset); + 
if (!page) { + page = secretmem_alloc_page(vmf->gfp_mask); + if (!page) + return vmf_error(-ENOMEM); + + ret = add_to_page_cache(page, mapping, offset, vmf->gfp_mask); + if (unlikely(ret)) + goto err_put_page; + + ret = set_direct_map_invalid_noflush(page); + if (ret) + goto err_del_page_cache; + + addr = (unsigned long)page_address(page); + flush_tlb_kernel_range(addr, addr + PAGE_SIZE); + + __SetPageUptodate(page); + + ret = VM_FAULT_LOCKED; + } + + vmf->page = page; + return ret; + +err_del_page_cache: + delete_from_page_cache(page); +err_put_page: + put_page(page); + return vmf_error(ret); +} + +static const struct vm_operations_struct secretmem_vm_ops = { + .fault = secretmem_fault, +}; + +static int secretmem_mmap(struct file *file, struct vm_area_struct *vma) +{ + struct secretmem_ctx *ctx = file->private_data; + unsigned long mode = ctx->mode; + unsigned long len = vma->vm_end - vma->vm_start; + + if (!mode) + return -EINVAL; + + if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) == 0) + return -EINVAL; + + if (mlock_future_check(vma->vm_mm, vma->vm_flags | VM_LOCKED, len)) + return -EAGAIN; + + switch (mode) { + case SECRETMEM_UNCACHED: + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); + fallthrough; + case SECRETMEM_EXCLUSIVE: + vma->vm_ops = &secretmem_vm_ops; + break; + default: + return -EINVAL; + } + + vma->vm_flags |= VM_LOCKED; + + return 0; +} + +const struct file_operations secretmem_fops = { + .mmap = secretmem_mmap, +}; + +static bool secretmem_isolate_page(struct page *page, isolate_mode_t mode) +{ + return false; +} + +static int secretmem_migratepage(struct address_space *mapping, + struct page *newpage, struct page *page, + enum migrate_mode mode) +{ + return -EBUSY; +} + +static void secretmem_freepage(struct page *page) +{ + set_direct_map_default_noflush(page); +} + +static const struct address_space_operations secretmem_aops = { + .freepage = secretmem_freepage, + .migratepage = secretmem_migratepage, + .isolate_page = secretmem_isolate_page, +}; + +static struct vfsmount *secretmem_mnt; + +static struct file *secretmem_file_create(unsigned long flags) +{ + struct file *file = ERR_PTR(-ENOMEM); + struct secretmem_ctx *ctx; + struct inode *inode; + + inode = alloc_anon_inode(secretmem_mnt->mnt_sb); + if (IS_ERR(inode)) + return ERR_CAST(inode); + + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); + if (!ctx) + goto err_free_inode; + + file = alloc_file_pseudo(inode, secretmem_mnt, "secretmem", + O_RDWR, &secretmem_fops); + if (IS_ERR(file)) + goto err_free_ctx; + + mapping_set_unevictable(inode->i_mapping); + + inode->i_mapping->private_data = ctx; + inode->i_mapping->a_ops = &secretmem_aops; + + /* pretend we are a normal file with zero size */ + inode->i_mode |= S_IFREG; + inode->i_size = 0; + + file->private_data = ctx; + + ctx->mode = flags & SECRETMEM_MODE_MASK; + + return file; + +err_free_ctx: + kfree(ctx); +err_free_inode: + iput(inode); + return file; +} + +SYSCALL_DEFINE1(memfd_secret, unsigned long, flags) +{ + struct file *file; + unsigned int mode; + int fd, err; + + /* make sure local flags do not confict with global fcntl.h */ + BUILD_BUG_ON(SECRETMEM_FLAGS_MASK & O_CLOEXEC); + + if (flags & ~(SECRETMEM_FLAGS_MASK | O_CLOEXEC)) + return -EINVAL; + + /* modes are mutually exclusive, only one mode bit should be set */ + mode = flags & SECRETMEM_FLAGS_MASK; + if (ffs(mode) != fls(mode)) + return -EINVAL; + + fd = get_unused_fd_flags(flags & O_CLOEXEC); + if (fd < 0) + return fd; + + file = secretmem_file_create(flags); + if (IS_ERR(file)) { + err = 
PTR_ERR(file); + goto err_put_fd; + } + + file->f_flags |= O_LARGEFILE; + + fd_install(fd, file); + return fd; + +err_put_fd: + put_unused_fd(fd); + return err; +} + +static void secretmem_evict_inode(struct inode *inode) +{ + struct secretmem_ctx *ctx = inode->i_private; + + truncate_inode_pages_final(&inode->i_data); + clear_inode(inode); + kfree(ctx); +} + +static const struct super_operations secretmem_super_ops = { + .evict_inode = secretmem_evict_inode, +}; + +static int secretmem_init_fs_context(struct fs_context *fc) +{ + struct pseudo_fs_context *ctx = init_pseudo(fc, SECRETMEM_MAGIC); + + if (!ctx) + return -ENOMEM; + ctx->ops = &secretmem_super_ops; + + return 0; +} + +static struct file_system_type secretmem_fs = { + .name = "secretmem", + .init_fs_context = secretmem_init_fs_context, + .kill_sb = kill_anon_super, +}; + +static int secretmem_init(void) +{ + int ret = 0; + + secretmem_mnt = kern_mount(&secretmem_fs); + if (IS_ERR(secretmem_mnt)) + ret = PTR_ERR(secretmem_mnt); + + return ret; +} +fs_initcall(secretmem_init); From patchwork Tue Aug 4 09:50:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Rapoport X-Patchwork-Id: 11700159 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6D6831392 for ; Tue, 4 Aug 2020 09:52:55 +0000 (UTC) Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 78332206DA for ; Tue, 4 Aug 2020 09:52:55 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="f8hc8/GY"; dkim=fail reason="signature verification failed" (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="DOAz2xnk" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 78332206DA Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=merlin.20170209; h=Sender:Content-Transfer-Encoding: Content-Type:Cc:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To:Message-Id:Date: Subject:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=T06BQ6pEWcV8FGJ+10xNgb0UFtrp9xB+l4/3x/b0phY=; b=f8hc8/GYkRE23S6QGCgzoDKcO gL08UFpCBLilgC+71U2VWccMOY+U7CkSRUDTqej7KE8EggByZFZyWqTjDlYoYQmH6SXLMpWeXqEMM jZqPcpkJ9zx424JeQJAT+blTRtEymUGbdK6DJnizoii/3Ta5I0SDpuFycd3rmy7ROKMuJtGl7J7+r 4ngnXXnETA5b3Qwyy7NQsda4WyfiO3OegIbOs/vXagjvF1RItWDmEPLl3T2OtmXlOpfCAaGE/Y9c6 dwMvL4OmnH8i4B+viZ9716mQSMkIAj4YKDTCMqeqqcaIMITQWtnUMjBT35fHQvMi473W3N1ja7UUI 6rUTkGVuA==; Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1k2tbX-00020w-IX; Tue, 04 Aug 2020 09:51:31 +0000 Received: from mail.kernel.org ([198.145.29.99]) by merlin.infradead.org with esmtps (Exim 4.92.3 #3 (Red Hat Linux)) id 1k2tbR-0001yC-Qo; Tue, 04 Aug 2020 09:51:27 +0000 Received: from aquarius.haifa.ibm.com 
(nesher1.haifa.il.ibm.com [195.110.40.7]) by mail.kernel.org (Postfix) with ESMTPSA; Tue, 4 Aug 2020 09:51:16 +0000 (UTC)
From: Mike Rapoport
To: linux-kernel@vger.kernel.org
Subject: [PATCH v3 4/6] arch, mm: wire up memfd_secret system call where relevant
Date: Tue, 4 Aug 2020 12:50:33 +0300
Message-Id: <20200804095035.18778-5-rppt@kernel.org>
In-Reply-To: <20200804095035.18778-1-rppt@kernel.org>
References: <20200804095035.18778-1-rppt@kernel.org>

From: Mike Rapoport

Wire up the memfd_secret system call on architectures that define
ARCH_HAS_SET_DIRECT_MAP, namely arm64, risc-v and x86.
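For illustration only (not part of this patch): with the syscall wired up and
no libc wrapper yet, user space could invoke it through syscall(2). The sketch
below assumes the x86-64 syscall number 440 and the SECRETMEM_EXCLUSIVE flag
(0x1) introduced earlier in this series; error handling is minimal.

	#include <stdio.h>
	#include <unistd.h>
	#include <sys/syscall.h>

	#ifndef __NR_memfd_secret
	#define __NR_memfd_secret 440	/* x86-64 number added by this series */
	#endif
	#define SECRETMEM_EXCLUSIVE 0x1	/* from include/uapi/linux/secretmem.h */

	int main(void)
	{
		/* returns -1 with ENOSYS on kernels without the syscall */
		long fd = syscall(__NR_memfd_secret, SECRETMEM_EXCLUSIVE);

		if (fd < 0) {
			perror("memfd_secret");
			return 1;
		}
		printf("memfd_secret file descriptor: %ld\n", fd);
		return 0;
	}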
Signed-off-by: Mike Rapoport Acked-by: Palmer Dabbelt Acked-by: Arnd Bergmann --- arch/arm64/include/asm/unistd.h | 2 +- arch/arm64/include/asm/unistd32.h | 2 ++ arch/arm64/include/uapi/asm/unistd.h | 1 + arch/riscv/include/asm/unistd.h | 1 + arch/x86/entry/syscalls/syscall_32.tbl | 1 + arch/x86/entry/syscalls/syscall_64.tbl | 1 + include/linux/syscalls.h | 1 + include/uapi/asm-generic/unistd.h | 7 ++++++- 8 files changed, 14 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h index 3b859596840d..b3b2019f8d16 100644 --- a/arch/arm64/include/asm/unistd.h +++ b/arch/arm64/include/asm/unistd.h @@ -38,7 +38,7 @@ #define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE + 5) #define __ARM_NR_COMPAT_END (__ARM_NR_COMPAT_BASE + 0x800) -#define __NR_compat_syscalls 440 +#define __NR_compat_syscalls 441 #endif #define __ARCH_WANT_SYS_CLONE diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h index 6d95d0c8bf2f..3d9c3a3012db 100644 --- a/arch/arm64/include/asm/unistd32.h +++ b/arch/arm64/include/asm/unistd32.h @@ -885,6 +885,8 @@ __SYSCALL(__NR_openat2, sys_openat2) __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd) #define __NR_faccessat2 439 __SYSCALL(__NR_faccessat2, sys_faccessat2) +#define __NR_memfd_secret 440 +__SYSCALL(__NR_memfd_secret, sys_memfd_secret) /* * Please add new compat syscalls above this comment and update diff --git a/arch/arm64/include/uapi/asm/unistd.h b/arch/arm64/include/uapi/asm/unistd.h index f83a70e07df8..ce2ee8f1e361 100644 --- a/arch/arm64/include/uapi/asm/unistd.h +++ b/arch/arm64/include/uapi/asm/unistd.h @@ -20,5 +20,6 @@ #define __ARCH_WANT_SET_GET_RLIMIT #define __ARCH_WANT_TIME32_SYSCALLS #define __ARCH_WANT_SYS_CLONE3 +#define __ARCH_WANT_MEMFD_SECRET #include diff --git a/arch/riscv/include/asm/unistd.h b/arch/riscv/include/asm/unistd.h index 977ee6181dab..6c316093a1e5 100644 --- a/arch/riscv/include/asm/unistd.h +++ b/arch/riscv/include/asm/unistd.h @@ -9,6 +9,7 @@ */ #define __ARCH_WANT_SYS_CLONE +#define __ARCH_WANT_MEMFD_SECRET #include diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl index d8f8a1a69ed1..6f8b5978053b 100644 --- a/arch/x86/entry/syscalls/syscall_32.tbl +++ b/arch/x86/entry/syscalls/syscall_32.tbl @@ -443,3 +443,4 @@ 437 i386 openat2 sys_openat2 438 i386 pidfd_getfd sys_pidfd_getfd 439 i386 faccessat2 sys_faccessat2 +440 i386 memfd_secret sys_memfd_secret diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl index 78847b32e137..7d3775d1c3d7 100644 --- a/arch/x86/entry/syscalls/syscall_64.tbl +++ b/arch/x86/entry/syscalls/syscall_64.tbl @@ -360,6 +360,7 @@ 437 common openat2 sys_openat2 438 common pidfd_getfd sys_pidfd_getfd 439 common faccessat2 sys_faccessat2 +440 common memfd_secret sys_memfd_secret # # x32-specific system call numbers start at 512 to avoid cache impact diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index b951a87da987..e4d7b30867c6 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -1005,6 +1005,7 @@ asmlinkage long sys_pidfd_send_signal(int pidfd, int sig, siginfo_t __user *info, unsigned int flags); asmlinkage long sys_pidfd_getfd(int pidfd, int fd, unsigned int flags); +asmlinkage long sys_memfd_secret(unsigned long flags); /* * Architecture-specific system calls diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h index f4a01305d9a6..7b288347c5a9 100644 --- a/include/uapi/asm-generic/unistd.h 
+++ b/include/uapi/asm-generic/unistd.h @@ -858,8 +858,13 @@ __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd) #define __NR_faccessat2 439 __SYSCALL(__NR_faccessat2, sys_faccessat2) +#ifdef __ARCH_WANT_MEMFD_SECRET +#define __NR_memfd_secret 440 +__SYSCALL(__NR_memfd_secret, sys_memfd_secret) +#endif + #undef __NR_syscalls -#define __NR_syscalls 440 +#define __NR_syscalls 441 /* * 32 bit systems traditionally used different From patchwork Tue Aug 4 09:50:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Rapoport X-Patchwork-Id: 11700163 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8900A138C for ; Tue, 4 Aug 2020 09:53:06 +0000 (UTC) Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 94800206DA for ; Tue, 4 Aug 2020 09:53:06 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="pd9mACsQ"; dkim=fail reason="signature verification failed" (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="XCtJy4ww" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 94800206DA Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=merlin.20170209; h=Sender:Content-Transfer-Encoding: Content-Type:Cc:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To:Message-Id:Date: Subject:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=uiOLiSJLLcGvysdffxX+/2h61ub+KQ7LP1O5s0YuTOc=; b=pd9mACsQgXp8UH3O9dlEfOvKj /5AQ4N1oaCaSq4P0n0ipkEPf2zetOgNDfc8pB2iR754Na5YE3vnmAIYFnuZi9hBWkqjUE8kBEPHWi EAbOhdUlEFweW6VTu5c0Tb99fgDegjWwdeGyuUV3t/rS7Z09IbhQlwCMDTycu4e4eVZOMo18qZKn8 jDdZ20E+S5nuvoYiOMhtQtsheZvGAsFf9U5pPIxkvwqdK0fuSbgDhGN+zNgPkHsW3ijVR7vHHpRbX ACoCT7H/CV0dUk/OBFfpcBFo/OIan0p7mjaaqagJUMqvxmio55qtFObEsUCzzScO5u2uRfseumoJ7 GySkram7w==; Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1k2tbh-00027i-4o; Tue, 04 Aug 2020 09:51:41 +0000 Received: from mail.kernel.org ([198.145.29.99]) by merlin.infradead.org with esmtps (Exim 4.92.3 #3 (Red Hat Linux)) id 1k2tba-00023X-R7; Tue, 04 Aug 2020 09:51:37 +0000 Received: from aquarius.haifa.ibm.com (nesher1.haifa.il.ibm.com [195.110.40.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id CE62122B40; Tue, 4 Aug 2020 09:51:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1596534694; bh=am7f7I1RHomCs7G6ojJiP5O0iGwPclBUydJw4JV6+EQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=XCtJy4wwSCwHJk7CgnijdNjvgXvkD7f+mtf2hN+2dllyOyAOVogBMM09jQoMOjawR YgQilBBffAfwKv+xgXmyXcC7aSnQz1qU1MWMSauhDSf2IJiS7SX3XmH235YsKFy4UU MqLVNrQ7tw2y2NTyy1seRZIMcvb+Qwiqd+g8/hbE= From: Mike Rapoport To: 
linux-kernel@vger.kernel.org
Subject: [PATCH v3 5/6] mm: secretmem: use PMD-size pages to amortize direct map fragmentation
Date: Tue, 4 Aug 2020 12:50:34 +0300
Message-Id: <20200804095035.18778-6-rppt@kernel.org>
In-Reply-To: <20200804095035.18778-1-rppt@kernel.org>
References: <20200804095035.18778-1-rppt@kernel.org>

From: Mike Rapoport

Removing a PAGE_SIZE page from the direct map every time such a page is
allocated for a secret memory mapping will cause severe fragmentation of the
direct map. This fragmentation can be reduced by using PMD-size pages as a
pool of small pages for secret memory mappings.

Add a gen_pool per secretmem inode and lazily populate this pool with
PMD-size pages.
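A back-of-the-envelope sketch (illustration only, not part of the patch) of
why pooling helps: with the pool, the direct map is split once per PMD-size
chunk rather than once per page. The numbers below assume x86-64 defaults of
4 KiB pages and 2 MiB PMD-size chunks, and an arbitrary workload size.

	#include <stdio.h>

	#define PAGE_SIZE	4096UL			/* assumed */
	#define PMD_SIZE	(2UL * 1024 * 1024)	/* assumed */

	int main(void)
	{
		unsigned long secret_pages = 10000;	/* arbitrary workload */
		unsigned long per_page_splits = secret_pages;
		unsigned long per_pmd_splits =
			(secret_pages * PAGE_SIZE + PMD_SIZE - 1) / PMD_SIZE;

		/* one split per page vs. one split per 512-page chunk */
		printf("direct map splits: %lu -> %lu\n",
		       per_page_splits, per_pmd_splits);
		return 0;
	}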
Signed-off-by: Mike Rapoport --- mm/secretmem.c | 107 ++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 88 insertions(+), 19 deletions(-) diff --git a/mm/secretmem.c b/mm/secretmem.c index 65cd6660991d..e42616785a88 100644 --- a/mm/secretmem.c +++ b/mm/secretmem.c @@ -6,6 +6,7 @@ #include #include #include +#include #include #include #include @@ -30,24 +31,66 @@ #define SECRETMEM_FLAGS_MASK SECRETMEM_MODE_MASK struct secretmem_ctx { + struct gen_pool *pool; unsigned int mode; }; -static struct page *secretmem_alloc_page(gfp_t gfp) +static int secretmem_pool_increase(struct secretmem_ctx *ctx, gfp_t gfp) { - /* - * FIXME: use a cache of large pages to reduce the direct map - * fragmentation - */ - return alloc_page(gfp); + unsigned long nr_pages = (1 << PMD_PAGE_ORDER); + struct gen_pool *pool = ctx->pool; + unsigned long addr; + struct page *page; + int err; + + page = alloc_pages(gfp, PMD_PAGE_ORDER); + if (!page) + return -ENOMEM; + + addr = (unsigned long)page_address(page); + split_page(page, PMD_PAGE_ORDER); + + err = gen_pool_add(pool, addr, PMD_SIZE, NUMA_NO_NODE); + if (err) { + __free_pages(page, PMD_PAGE_ORDER); + return err; + } + + __kernel_map_pages(page, nr_pages, 0); + + return 0; +} + +static struct page *secretmem_alloc_page(struct secretmem_ctx *ctx, + gfp_t gfp) +{ + struct gen_pool *pool = ctx->pool; + unsigned long addr; + struct page *page; + int err; + + if (gen_pool_avail(pool) < PAGE_SIZE) { + err = secretmem_pool_increase(ctx, gfp); + if (err) + return NULL; + } + + addr = gen_pool_alloc(pool, PAGE_SIZE); + if (!addr) + return NULL; + + page = virt_to_page(addr); + get_page(page); + + return page; } static vm_fault_t secretmem_fault(struct vm_fault *vmf) { + struct secretmem_ctx *ctx = vmf->vma->vm_file->private_data; struct address_space *mapping = vmf->vma->vm_file->f_mapping; struct inode *inode = file_inode(vmf->vma->vm_file); pgoff_t offset = vmf->pgoff; - unsigned long addr; struct page *page; int ret = 0; @@ -56,7 +99,7 @@ static vm_fault_t secretmem_fault(struct vm_fault *vmf) page = find_get_entry(mapping, offset); if (!page) { - page = secretmem_alloc_page(vmf->gfp_mask); + page = secretmem_alloc_page(ctx, vmf->gfp_mask); if (!page) return vmf_error(-ENOMEM); @@ -64,14 +107,8 @@ static vm_fault_t secretmem_fault(struct vm_fault *vmf) if (unlikely(ret)) goto err_put_page; - ret = set_direct_map_invalid_noflush(page); - if (ret) - goto err_del_page_cache; - - addr = (unsigned long)page_address(page); - flush_tlb_kernel_range(addr, addr + PAGE_SIZE); - __SetPageUptodate(page); + set_page_private(page, (unsigned long)ctx); ret = VM_FAULT_LOCKED; } @@ -79,8 +116,6 @@ static vm_fault_t secretmem_fault(struct vm_fault *vmf) vmf->page = page; return ret; -err_del_page_cache: - delete_from_page_cache(page); err_put_page: put_page(page); return vmf_error(ret); @@ -139,7 +174,11 @@ static int secretmem_migratepage(struct address_space *mapping, static void secretmem_freepage(struct page *page) { - set_direct_map_default_noflush(page); + unsigned long addr = (unsigned long)page_address(page); + struct secretmem_ctx *ctx = (struct secretmem_ctx *)page_private(page); + struct gen_pool *pool = ctx->pool; + + gen_pool_free(pool, addr, PAGE_SIZE); } static const struct address_space_operations secretmem_aops = { @@ -164,13 +203,18 @@ static struct file *secretmem_file_create(unsigned long flags) if (!ctx) goto err_free_inode; + ctx->pool = gen_pool_create(PAGE_SHIFT, NUMA_NO_NODE); + if (!ctx->pool) + goto err_free_ctx; + file = alloc_file_pseudo(inode, 
secretmem_mnt, "secretmem", O_RDWR, &secretmem_fops); if (IS_ERR(file)) - goto err_free_ctx; + goto err_free_pool; mapping_set_unevictable(inode->i_mapping); + inode->i_private = ctx; inode->i_mapping->private_data = ctx; inode->i_mapping->a_ops = &secretmem_aops; @@ -184,6 +228,8 @@ static struct file *secretmem_file_create(unsigned long flags) return file; +err_free_pool: + gen_pool_destroy(ctx->pool); err_free_ctx: kfree(ctx); err_free_inode: @@ -228,11 +274,34 @@ SYSCALL_DEFINE1(memfd_secret, unsigned long, flags) return err; } +static void secretmem_cleanup_chunk(struct gen_pool *pool, + struct gen_pool_chunk *chunk, void *data) +{ + unsigned long start = chunk->start_addr; + unsigned long end = chunk->end_addr; + unsigned long nr_pages, addr; + + nr_pages = (end - start + 1) / PAGE_SIZE; + __kernel_map_pages(virt_to_page(start), nr_pages, 1); + + for (addr = start; addr < end; addr += PAGE_SIZE) + put_page(virt_to_page(addr)); +} + +static void secretmem_cleanup_pool(struct secretmem_ctx *ctx) +{ + struct gen_pool *pool = ctx->pool; + + gen_pool_for_each_chunk(pool, secretmem_cleanup_chunk, ctx); + gen_pool_destroy(pool); +} + static void secretmem_evict_inode(struct inode *inode) { struct secretmem_ctx *ctx = inode->i_private; truncate_inode_pages_final(&inode->i_data); + secretmem_cleanup_pool(ctx); clear_inode(inode); kfree(ctx); } From patchwork Tue Aug 4 09:50:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Rapoport X-Patchwork-Id: 11700161 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 84D96138C for ; Tue, 4 Aug 2020 09:53:05 +0000 (UTC) Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8FF64206DA for ; Tue, 4 Aug 2020 09:53:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="3X+47Dy5"; dkim=fail reason="signature verification failed" (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="Wb/i9DqC" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8FF64206DA Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=merlin.20170209; h=Sender:Content-Transfer-Encoding: Content-Type:Cc:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To:Message-Id:Date: Subject:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=WzxpAgq/vdgf7ZG07yarZCrFpQyVwdmz5U4Kcw/xAfE=; b=3X+47Dy5Y7dT/8freWY51pJ0m 8bzOIW5xVe+tKA/OLzd9bs1/1AJhYju7Tsg1QbWAb3Xpy6DvQx2Vpxjat+UvrkDbYM62pROt6KFzw JZ5B+JnGgmUxCyvCUTID78iefzg93YxFgtzmqAt3Deu8yQcjk4I6rBs4gP4QGs9vU0cwTqDQBW7rW e7Hqb+6SRH+rahfjr7TrgiCO75WPZbTzvSkbk1sQcnaxBOM8RYArZ8NxeTKfeLMIdiH4mLQRJPPFW 01fktH61Bo53fOJoO6aWXNDgx2NyBeq6h3gXdXfsnzUyvW3d0LDrTCDvKpzKj3syVdnsusygMn+f7 ZbR6tOaIA==; Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with 
esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1k2tbq-0002D8-9P; Tue, 04 Aug 2020 09:51:50 +0000
From: Mike Rapoport
To: linux-kernel@vger.kernel.org
Subject: [PATCH v3 6/6] mm: secretmem: add ability to reserve memory at boot
Date: Tue, 4 Aug 2020 12:50:35 +0300
Message-Id: <20200804095035.18778-7-rppt@kernel.org>
In-Reply-To: <20200804095035.18778-1-rppt@kernel.org>
References: <20200804095035.18778-1-rppt@kernel.org>

From: Mike Rapoport

Taking pages out of the direct map and bringing them back may create
undesired fragmentation and usage of smaller pages in the direct mapping of
the physical memory.
This can be avoided if a significantly large area of the physical memory would be reserved for secretmem purposes at boot time. Add ability to reserve physical memory for secretmem at boot time using "secretmem" kernel parameter and then use that reserved memory as a global pool for secret memory needs. Signed-off-by: Mike Rapoport --- mm/secretmem.c | 134 ++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 126 insertions(+), 8 deletions(-) diff --git a/mm/secretmem.c b/mm/secretmem.c index e42616785a88..0f3e7b30a0a7 100644 --- a/mm/secretmem.c +++ b/mm/secretmem.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include #include @@ -35,6 +36,39 @@ struct secretmem_ctx { unsigned int mode; }; +struct secretmem_pool { + struct gen_pool *pool; + unsigned long reserved_size; + void *reserved; +}; + +static struct secretmem_pool secretmem_pool; + +static struct page *secretmem_alloc_huge_page(gfp_t gfp) +{ + struct gen_pool *pool = secretmem_pool.pool; + unsigned long addr = 0; + struct page *page = NULL; + + if (pool) { + if (gen_pool_avail(pool) < PMD_SIZE) + return NULL; + + addr = gen_pool_alloc(pool, PMD_SIZE); + if (!addr) + return NULL; + + page = virt_to_page(addr); + } else { + page = alloc_pages(gfp, PMD_PAGE_ORDER); + + if (page) + split_page(page, PMD_PAGE_ORDER); + } + + return page; +} + static int secretmem_pool_increase(struct secretmem_ctx *ctx, gfp_t gfp) { unsigned long nr_pages = (1 << PMD_PAGE_ORDER); @@ -43,12 +77,11 @@ static int secretmem_pool_increase(struct secretmem_ctx *ctx, gfp_t gfp) struct page *page; int err; - page = alloc_pages(gfp, PMD_PAGE_ORDER); + page = secretmem_alloc_huge_page(gfp); if (!page) return -ENOMEM; addr = (unsigned long)page_address(page); - split_page(page, PMD_PAGE_ORDER); err = gen_pool_add(pool, addr, PMD_SIZE, NUMA_NO_NODE); if (err) { @@ -274,11 +307,13 @@ SYSCALL_DEFINE1(memfd_secret, unsigned long, flags) return err; } -static void secretmem_cleanup_chunk(struct gen_pool *pool, - struct gen_pool_chunk *chunk, void *data) +static void secretmem_recycle_range(unsigned long start, unsigned long end) +{ + gen_pool_free(secretmem_pool.pool, start, PMD_SIZE); +} + +static void secretmem_release_range(unsigned long start, unsigned long end) { - unsigned long start = chunk->start_addr; - unsigned long end = chunk->end_addr; unsigned long nr_pages, addr; nr_pages = (end - start + 1) / PAGE_SIZE; @@ -288,6 +323,18 @@ static void secretmem_cleanup_chunk(struct gen_pool *pool, put_page(virt_to_page(addr)); } +static void secretmem_cleanup_chunk(struct gen_pool *pool, + struct gen_pool_chunk *chunk, void *data) +{ + unsigned long start = chunk->start_addr; + unsigned long end = chunk->end_addr; + + if (secretmem_pool.pool) + secretmem_recycle_range(start, end); + else + secretmem_release_range(start, end); +} + static void secretmem_cleanup_pool(struct secretmem_ctx *ctx) { struct gen_pool *pool = ctx->pool; @@ -327,14 +374,85 @@ static struct file_system_type secretmem_fs = { .kill_sb = kill_anon_super, }; +static int secretmem_reserved_mem_init(void) +{ + struct gen_pool *pool; + struct page *page; + void *addr; + int err; + + if (!secretmem_pool.reserved) + return 0; + + pool = gen_pool_create(PMD_SHIFT, NUMA_NO_NODE); + if (!pool) + return -ENOMEM; + + err = gen_pool_add(pool, (unsigned long)secretmem_pool.reserved, + secretmem_pool.reserved_size, NUMA_NO_NODE); + if (err) + goto err_destroy_pool; + + for (addr = secretmem_pool.reserved; + addr < secretmem_pool.reserved + secretmem_pool.reserved_size; + addr += 
PAGE_SIZE) { + page = virt_to_page(addr); + __ClearPageReserved(page); + set_page_count(page, 1); + } + + secretmem_pool.pool = pool; + page = virt_to_page(secretmem_pool.reserved); + __kernel_map_pages(page, secretmem_pool.reserved_size / PAGE_SIZE, 0); + return 0; + +err_destroy_pool: + gen_pool_destroy(pool); + return err; +} + static int secretmem_init(void) { - int ret = 0; + int ret; + + ret = secretmem_reserved_mem_init(); + if (ret) + return ret; secretmem_mnt = kern_mount(&secretmem_fs); - if (IS_ERR(secretmem_mnt)) + if (IS_ERR(secretmem_mnt)) { + gen_pool_destroy(secretmem_pool.pool); ret = PTR_ERR(secretmem_mnt); + } return ret; } fs_initcall(secretmem_init); + +static int __init secretmem_setup(char *str) +{ + phys_addr_t align = PMD_SIZE; + unsigned long reserved_size; + void *reserved; + + reserved_size = memparse(str, NULL); + if (!reserved_size) + return 0; + + if (reserved_size * 2 > PUD_SIZE) + align = PUD_SIZE; + + reserved = memblock_alloc(reserved_size, align); + if (!reserved) { + pr_err("failed to reserve %lu bytes\n", secretmem_pool.reserved_size); + return 0; + } + + secretmem_pool.reserved_size = reserved_size; + secretmem_pool.reserved = reserved; + + pr_info("reserved %luM\n", reserved_size >> 20); + + return 1; +} +__setup("secretmem=", secretmem_setup);
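
A usage note (not part of the patch): the size passed to "secretmem=" is
parsed with memparse(), so the usual K/M/G suffixes are accepted. For
example, booting with the kernel command line option

	secretmem=512M

would reserve 512 MiB at boot, aligned to PMD_SIZE (or to PUD_SIZE when the
requested size is larger than half a PUD), and use it as the global pool for
secret memory mappings instead of splitting the direct map at run time.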