From patchwork Tue Dec 6 13:59:19 2022
From: Ryan Roberts
To: Marc Zyngier, Catalin Marinas, Will Deacon, Ard Biesheuvel,
	Suzuki K Poulose, Anshuman Khandual
Cc: Ryan Roberts, James Morse, Alexandru Elisei, Oliver Upton,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	kvmarm@lists.cs.columbia.edu
Subject: [PATCH v1 01/12] arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2]
Date: Tue, 6 Dec 2022 13:59:19 +0000
Message-Id: <20221206135930.3277585-2-ryan.roberts@arm.com>
In-Reply-To: <20221206135930.3277585-1-ryan.roberts@arm.com>
References: <20221206135930.3277585-1-ryan.roberts@arm.com>

From: Anshuman Khandual

PAGE_SIZE support is tested against the possible minimum and maximum values of
its respective ID_AA64MMFR0.TGRAN field, depending on whether the field is
signed or unsigned.

The FEAT_LPA2 implementation, however, needs to be validated for the 4K and
16K page sizes via feature-specific ID_AA64MMFR0.TGRAN values. Hence, add the
FEAT_LPA2-specific ID_AA64MMFR0.TGRAN[2] values per the Arm ARM (0487G.A).

Acked-by: Catalin Marinas
Signed-off-by: Anshuman Khandual
Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/sysreg.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 7d301700d1a9..9ad8172eea58 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -673,10 +673,12 @@
 
 /* id_aa64mmfr0 */
 #define ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MIN	0x0
+#define ID_AA64MMFR0_EL1_TGRAN4_LPA2		ID_AA64MMFR0_EL1_TGRAN4_52_BIT
 #define ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MAX	0x7
 #define ID_AA64MMFR0_EL1_TGRAN64_SUPPORTED_MIN	0x0
 #define ID_AA64MMFR0_EL1_TGRAN64_SUPPORTED_MAX	0x7
 #define ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MIN	0x1
+#define ID_AA64MMFR0_EL1_TGRAN16_LPA2		ID_AA64MMFR0_EL1_TGRAN16_52_BIT
 #define ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MAX	0xf
 
 #define ARM64_MIN_PARANGE_BITS		32
@@ -684,6 +686,7 @@
 #define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_DEFAULT	0x0
 #define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_NONE		0x1
 #define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_MIN		0x2
+#define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_LPA2		0x3
 #define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_MAX		0x7
 
 #ifdef CONFIG_ARM64_PA_BITS_52
@@ -800,11 +803,13 @@
 
 #if defined(CONFIG_ARM64_4K_PAGES)
 #define ID_AA64MMFR0_EL1_TGRAN_SHIFT		ID_AA64MMFR0_EL1_TGRAN4_SHIFT
+#define ID_AA64MMFR0_EL1_TGRAN_LPA2		ID_AA64MMFR0_EL1_TGRAN4_52_BIT
 #define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MIN	ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MIN
 #define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MAX	ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MAX
 #define ID_AA64MMFR0_EL1_TGRAN_2_SHIFT		ID_AA64MMFR0_EL1_TGRAN4_2_SHIFT
 #elif defined(CONFIG_ARM64_16K_PAGES)
 #define ID_AA64MMFR0_EL1_TGRAN_SHIFT		ID_AA64MMFR0_EL1_TGRAN16_SHIFT
+#define ID_AA64MMFR0_EL1_TGRAN_LPA2		ID_AA64MMFR0_EL1_TGRAN16_52_BIT
 #define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MIN	ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MIN
 #define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MAX	ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MAX
 #define ID_AA64MMFR0_EL1_TGRAN_2_SHIFT		ID_AA64MMFR0_EL1_TGRAN16_2_SHIFT
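
As a brief aside on how fields like these get consumed, the sketch below shows
the kind of signed/unsigned min/max comparison the commit message describes.
It is a self-contained illustration only; the helper name and its exact field
handling are assumptions, not the kernel's actual cpufeature code.

#include <stdbool.h>
#include <stdint.h>

/*
 * Illustration only: compare a 4-bit ID_AA64MMFR0 TGRAN field against a
 * supported [min, max] range. TGRAN4/TGRAN64 are signed fields (0xf means
 * "not implemented"), whereas TGRAN16 is unsigned, so the raw value must be
 * interpreted accordingly before the comparison is made.
 */
static bool tgran_supported(uint64_t mmfr0, unsigned int shift,
			    int64_t min, int64_t max, bool is_signed)
{
	int64_t val = (mmfr0 >> shift) & 0xf;

	if (is_signed && (val & 0x8))
		val -= 0x10;	/* sign-extend the 4-bit field */

	return val >= min && val <= max;
}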

From patchwork Tue Dec 6 13:59:20 2022
From: Ryan Roberts
To: Marc Zyngier, Catalin Marinas, Will Deacon, Ard Biesheuvel,
	Suzuki K Poulose, Anshuman Khandual
Cc: Ryan Roberts, James Morse, Alexandru Elisei, Oliver Upton,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	kvmarm@lists.cs.columbia.edu
Subject: [PATCH v1 02/12] arm64/mm: Update tlb invalidation routines for FEAT_LPA2
Date: Tue, 6 Dec 2022 13:59:20 +0000
Message-Id: <20221206135930.3277585-3-ryan.roberts@arm.com>
In-Reply-To: <20221206135930.3277585-1-ryan.roberts@arm.com>
References: <20221206135930.3277585-1-ryan.roberts@arm.com>

FEAT_LPA2 impacts tlb invalidation in two ways. First, the TTL field in the
non-range tlbi instructions can now validly take a 0 value for the 4KB granule
(this is due to the extra level of translation). Second, the BADDR field in
the range tlbi instructions must be aligned to 64KB when LPA2 is in use
(TCR.DS=1). Changes are required for tlbi to continue to operate correctly
when LPA2 is in use.

We solve the first by always adding the level hint if the level is between
[0, 3] (previously anything other than 0 was hinted, which breaks in the new
level -1 case from kvm). When running on non-LPA2 HW, 0 is still safe to hint
as the HW will fall back to non-hinted. We also update kernel code to take
advantage of the new hint for p4d flushing. While we are at it, we replace the
notion of 0 being the non-hinted sentinel with a macro, TLBI_TTL_UNKNOWN. This
means callers won't need updating if/when translation depth increases in
future.

The second problem is trickier. When LPA2 is in use, we need to use the
non-range tlbi instructions to forward align to a 64KB boundary first, then we
can use range-based tlbi from there on, until we have either invalidated all
pages or we have a single page remaining. If the latter, that is done with
non-range tlbi. (Previously we invalidated a single odd page first, but we can
no longer do this because it could wreck our 64KB alignment). When LPA2 is not
in use, we don't need the initial alignment step.
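
To make that flow concrete before the full diff below, here is a condensed
sketch of the loop structure just described. The helpers (tlbi_level(),
tlbi_range(), range_num(), range_pages()) are simplified stand-ins for the
real __tlbi machinery, and SZ_64K/PAGE_SHIFT are assumed from the usual kernel
headers, so treat this as an outline of the algorithm rather than the actual
tlbflush.h code.

static void tlbi_level(unsigned long addr, int level);		/* stand-in */
static void tlbi_range(unsigned long addr, int scale, long num, int level);
static long range_num(unsigned long pages, int scale);
static unsigned long range_pages(long num, int scale);

static void flush_range_sketch(unsigned long start, unsigned long pages,
			       unsigned long stride, int level,
			       bool have_range_ops, bool lpa2)
{
	int scale = 3;
	long num;

	while (pages > 0) {
		/*
		 * Non-range op: used when range ops are unavailable, for a
		 * final single page (range ops cover an even number of
		 * pages), or to forward-align 'start' to 64KB when LPA2 is
		 * in use.
		 */
		if (!have_range_ops || pages == 1 ||
		    (lpa2 && (start & (SZ_64K - 1)))) {
			tlbi_level(start, level);
			start += stride;
			pages -= stride >> PAGE_SHIFT;
			continue;
		}

		/*
		 * Range op: scale runs from 3 down to 0, which (as argued
		 * above) keeps 'start' 64KB aligned until the last
		 * operations.
		 */
		num = range_num(pages, scale);
		if (num >= 0) {
			tlbi_range(start, scale, num, level);
			start += range_pages(num, scale) << PAGE_SHIFT;
			pages -= range_pages(num, scale);
		}
		scale--;
	}
}
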
However, the bigger impact is that we can no longer use the previous method of iterating from smallest to largest 'scale', since this would likely unalign the boundary again for the LPA2 case. So instead we iterate from highest to lowest scale, which guarrantees that we remain 64KB aligned until the last op (at scale=0). The original commit (d1d3aa9 "arm64: tlb: Use the TLBI RANGE feature in arm64") stated this as the reason for incrementing scale: However, in most scenarios, the pages = 1 when flush_tlb_range() is called. Start from scale = 3 or other proper value (such as scale =ilog2(pages)), will incur extra overhead. So increase 'scale' from 0 to maximum, the flush order is exactly opposite to the example. But pages=1 is already special cased by the non-range invalidation path, which will take care of it the first time through the loop (both in the original commit and in my change), so I don't think switching to decrement scale should have any extra performance impact after all. Signed-off-by: Ryan Roberts --- arch/arm64/include/asm/pgtable-prot.h | 6 ++ arch/arm64/include/asm/tlb.h | 15 +++-- arch/arm64/include/asm/tlbflush.h | 83 +++++++++++++++++---------- 3 files changed, 69 insertions(+), 35 deletions(-) diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h index 9b165117a454..308cc02fcdf3 100644 --- a/arch/arm64/include/asm/pgtable-prot.h +++ b/arch/arm64/include/asm/pgtable-prot.h @@ -40,6 +40,12 @@ extern bool arm64_use_ng_mappings; #define PTE_MAYBE_NG (arm64_use_ng_mappings ? PTE_NG : 0) #define PMD_MAYBE_NG (arm64_use_ng_mappings ? PMD_SECT_NG : 0) +/* + * For now the kernel never uses lpa2 for its stage1 tables. But kvm does and + * this hook allows us to update the common tlbi code to handle lpa2. + */ +#define lpa2_is_enabled() false + /* * If we have userspace only BTI we don't want to mark kernel pages * guarded even if the system does support BTI. diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h index c995d1f4594f..3a189c435973 100644 --- a/arch/arm64/include/asm/tlb.h +++ b/arch/arm64/include/asm/tlb.h @@ -22,15 +22,15 @@ static void tlb_flush(struct mmu_gather *tlb); #include /* - * get the tlbi levels in arm64. Default value is 0 if more than one - * of cleared_* is set or neither is set. - * Arm64 doesn't support p4ds now. + * get the tlbi levels in arm64. Default value is TLBI_TTL_UNKNOWN if more than + * one of cleared_* is set or neither is set - this elides the level hinting to + * the hardware. */ static inline int tlb_get_level(struct mmu_gather *tlb) { /* The TTL field is only valid for the leaf entry. */ if (tlb->freed_tables) - return 0; + return TLBI_TTL_UNKNOWN; if (tlb->cleared_ptes && !(tlb->cleared_pmds || tlb->cleared_puds || @@ -47,7 +47,12 @@ static inline int tlb_get_level(struct mmu_gather *tlb) tlb->cleared_p4ds)) return 1; - return 0; + if (tlb->cleared_p4ds && !(tlb->cleared_ptes || + tlb->cleared_pmds || + tlb->cleared_puds)) + return 0; + + return TLBI_TTL_UNKNOWN; } static inline void tlb_flush(struct mmu_gather *tlb) diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h index 412a3b9a3c25..903d95a4bef5 100644 --- a/arch/arm64/include/asm/tlbflush.h +++ b/arch/arm64/include/asm/tlbflush.h @@ -93,19 +93,22 @@ static inline unsigned long get_trans_granule(void) * When ARMv8.4-TTL exists, TLBI operations take an additional hint for * the level at which the invalidation must take place. If the level is * wrong, no invalidation may take place. 
In the case where the level - * cannot be easily determined, a 0 value for the level parameter will - * perform a non-hinted invalidation. + * cannot be easily determined, the value TLBI_TTL_UNKNOWN will perform + * a non-hinted invalidation. Any provided level outside the hint range + * will also cause fall-back to non-hinted invalidation. * * For Stage-2 invalidation, use the level values provided to that effect * in asm/stage2_pgtable.h. */ #define TLBI_TTL_MASK GENMASK_ULL(47, 44) +#define TLBI_TTL_UNKNOWN (-1) + #define __tlbi_level(op, addr, level) do { \ u64 arg = addr; \ \ if (cpus_have_const_cap(ARM64_HAS_ARMv8_4_TTL) && \ - level) { \ + level >= 0 && level <= 3) { \ u64 ttl = level & 3; \ ttl |= get_trans_granule() << 2; \ arg &= ~TLBI_TTL_MASK; \ @@ -132,17 +135,22 @@ static inline unsigned long get_trans_granule(void) * The address range is determined by below formula: * [BADDR, BADDR + (NUM + 1) * 2^(5*SCALE + 1) * PAGESIZE) * + * If LPA2 is in use, BADDR holds addr[52:16]. Else BADDR holds page number. + * See ARM DDI 0487I.a C5.5.21. + * */ -#define __TLBI_VADDR_RANGE(addr, asid, scale, num, ttl) \ - ({ \ - unsigned long __ta = (addr) >> PAGE_SHIFT; \ - __ta &= GENMASK_ULL(36, 0); \ - __ta |= (unsigned long)(ttl) << 37; \ - __ta |= (unsigned long)(num) << 39; \ - __ta |= (unsigned long)(scale) << 44; \ - __ta |= get_trans_granule() << 46; \ - __ta |= (unsigned long)(asid) << 48; \ - __ta; \ +#define __TLBI_VADDR_RANGE(addr, asid, scale, num, ttl, lpa2_ena) \ + ({ \ + unsigned long __addr_shift = (lpa2_ena) ? 16 : PAGE_SHIFT; \ + unsigned long __ttl = (ttl >= 1 && ttl <= 3) ? ttl : 0; \ + unsigned long __ta = (addr) >> __addr_shift; \ + __ta &= GENMASK_ULL(36, 0); \ + __ta |= __ttl << 37; \ + __ta |= (unsigned long)(num) << 39; \ + __ta |= (unsigned long)(scale) << 44; \ + __ta |= get_trans_granule() << 46; \ + __ta |= (unsigned long)(asid) << 48; \ + __ta; \ }) /* These macros are used by the TLBI RANGE feature. */ @@ -215,12 +223,16 @@ static inline unsigned long get_trans_granule(void) * CPUs, ensuring that any walk-cache entries associated with the * translation are also invalidated. * - * __flush_tlb_range(vma, start, end, stride, last_level) + * __flush_tlb_range(vma, start, end, stride, last_level, tlb_level) * Invalidate the virtual-address range '[start, end)' on all * CPUs for the user address space corresponding to 'vma->mm'. * The invalidation operations are issued at a granularity * determined by 'stride' and only affect any walk-cache entries - * if 'last_level' is equal to false. + * if 'last_level' is equal to false. tlb_level is the level at + * which the invalidation must take place. If the level is wrong, + * no invalidation may take place. In the case where the level + * cannot be easily determined, the value TLBI_TTL_UNKNOWN will + * perform a non-hinted invalidation. * * * Finally, take a look at asm/tlb.h to see how tlb_flush() is implemented @@ -284,8 +296,9 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma, int tlb_level) { int num = 0; - int scale = 0; + int scale = 3; unsigned long asid, addr, pages; + bool lpa2_ena = lpa2_is_enabled(); start = round_down(start, stride); end = round_up(end, stride); @@ -309,17 +322,25 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma, /* * When the CPU does not support TLB range operations, flush the TLB - * entries one by one at the granularity of 'stride'. If the TLB - * range ops are supported, then: + * entries one by one at the granularity of 'stride'. 
If the TLB range + * ops are supported, then: + * + * 1. If FEAT_LPA2 is in use, the start address of a range operation + * must be 64KB aligned, so flush pages one by one until the + * alignment is reached using the non-range operations. This step is + * skipped if LPA2 is not in use. * - * 1. If 'pages' is odd, flush the first page through non-range - * operations; + * 2. For remaining pages: the minimum range granularity is decided by + * 'scale', so multiple range TLBI operations may be required. Start + * from scale = 3, flush the corresponding number of pages + * ((num+1)*2^(5*scale+1) starting from 'addr'), then descrease it + * until one or zero pages are left. We must start from highest scale + * to ensure 64KB start alignment is maintained in the LPA2 case. * - * 2. For remaining pages: the minimum range granularity is decided - * by 'scale', so multiple range TLBI operations may be required. - * Start from scale = 0, flush the corresponding number of pages - * ((num+1)*2^(5*scale+1) starting from 'addr'), then increase it - * until no pages left. + * 3. If there is 1 page remaining, flush it through non-range + * operations. Range operations can only span an even number of + * pages. We save this for last to ensure 64KB start alignment is + * maintained for the LPA2 case. * * Note that certain ranges can be represented by either num = 31 and * scale or num = 0 and scale + 1. The loop below favours the latter @@ -327,7 +348,8 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma, */ while (pages > 0) { if (!system_supports_tlb_range() || - pages % 2 == 1) { + pages == 1 || + (lpa2_ena && start != ALIGN(start, SZ_64K))) { addr = __TLBI_VADDR(start, asid); if (last_level) { __tlbi_level(vale1is, addr, tlb_level); @@ -344,7 +366,7 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma, num = __TLBI_RANGE_NUM(pages, scale); if (num >= 0) { addr = __TLBI_VADDR_RANGE(start, asid, scale, - num, tlb_level); + num, tlb_level, lpa2_ena); if (last_level) { __tlbi(rvale1is, addr); __tlbi_user(rvale1is, addr); @@ -355,7 +377,7 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma, start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; pages -= __TLBI_RANGE_PAGES(num, scale); } - scale++; + scale--; } dsb(ish); } @@ -366,9 +388,10 @@ static inline void flush_tlb_range(struct vm_area_struct *vma, /* * We cannot use leaf-only invalidation here, since we may be invalidating * table entries as part of collapsing hugepages or moving page tables. - * Set the tlb_level to 0 because we can not get enough information here. + * Set the tlb_level to TLBI_TTL_UNKNOWN because we can not get enough + * information here. 
 	 */
-	__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
+	__flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
 }
 
 static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)

From patchwork Tue Dec 6 13:59:21 2022
From: Ryan Roberts
To: Marc Zyngier, Catalin Marinas, Will Deacon, Ard Biesheuvel,
	Suzuki K Poulose, Anshuman Khandual
Cc: Ryan Roberts, James Morse, Alexandru Elisei, Oliver Upton,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	kvmarm@lists.cs.columbia.edu
Subject: [PATCH v1 03/12] KVM: arm64: Add new (V)TCR_EL2 field definitions for FEAT_LPA2
Date: Tue, 6 Dec 2022 13:59:21 +0000
Message-Id: <20221206135930.3277585-4-ryan.roberts@arm.com>
In-Reply-To: <20221206135930.3277585-1-ryan.roberts@arm.com>
References: <20221206135930.3277585-1-ryan.roberts@arm.com>

As per the Arm ARM (0487I.a), the (V)TCR_EL2.DS fields control whether 52-bit
input and output addresses are supported on 4K and 16K page size
configurations when FEAT_LPA2 is known to have been implemented. Additionally,
the VTCR_EL2.SL2 field enables encoding of a 5th starting level of
translation, which is required for a 4KB IPA size of 49-52 bits if
concatenated first-level page tables are not used.

Add these field definitions, which will be used by KVM when FEAT_LPA2 is
enabled.

Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/kvm_arm.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index a82f2493a72b..f9619a10d5d9 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -92,6 +92,7 @@
 #define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
 
 /* TCR_EL2 Registers bits */
+#define TCR_EL2_DS		(1UL << 32)
 #define TCR_EL2_RES1		((1U << 31) | (1 << 23))
 #define TCR_EL2_TBI		(1 << 20)
 #define TCR_EL2_PS_SHIFT	16
@@ -106,6 +107,9 @@
 			 TCR_EL2_ORGN0_MASK | TCR_EL2_IRGN0_MASK | TCR_EL2_T0SZ_MASK)
 
 /* VTCR_EL2 Registers bits */
+#define VTCR_EL2_SL2_SHIFT	33
+#define VTCR_EL2_SL2_MASK	(1UL << VTCR_EL2_SL2_SHIFT)
+#define VTCR_EL2_DS		TCR_EL2_DS
 #define VTCR_EL2_RES1		(1U << 31)
 #define VTCR_EL2_HD		(1 << 22)
 #define VTCR_EL2_HA		(1 << 21)
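
For orientation only, here is one hedged sketch of how these new bits might
eventually be folded into a stage-2 VTCR_EL2 value when LPA2 is in use;
vtcr_needs_extra_start_level() is a hypothetical placeholder, and the real
wiring is left to later patches in the series.

static bool vtcr_needs_extra_start_level(u64 vtcr);	/* hypothetical */

static u64 vtcr_apply_lpa2(u64 vtcr)
{
	/* (V)TCR_EL2.DS: allow 52-bit input/output addresses on 4K/16K. */
	vtcr |= VTCR_EL2_DS;

	/*
	 * VTCR_EL2.SL2 extends SL0 so that the extra (5th) starting level
	 * can be encoded; whether it is needed depends on the IPA size and
	 * whether concatenated first-level tables are used.
	 */
	if (vtcr_needs_extra_start_level(vtcr))
		vtcr |= VTCR_EL2_SL2_MASK;

	return vtcr;
}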

From patchwork Tue Dec 6 13:59:22 2022
From: Ryan Roberts
To: Marc Zyngier, Catalin Marinas, Will Deacon, Ard Biesheuvel,
	Suzuki K Poulose, Anshuman Khandual
Cc: Ryan Roberts, James Morse, Alexandru Elisei, Oliver Upton,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	kvmarm@lists.cs.columbia.edu
Subject: [PATCH v1 04/12] KVM: arm64: Plumbing to enable multiple pgtable formats
Date: Tue, 6 Dec 2022 13:59:22 +0000
Message-Id: <20221206135930.3277585-5-ryan.roberts@arm.com>
In-Reply-To: <20221206135930.3277585-1-ryan.roberts@arm.com>
References: <20221206135930.3277585-1-ryan.roberts@arm.com>

FEAT_LPA2 brings support for 52-bit input and output addresses for both stage1
and stage2 translation when using 4KB and 16KB page sizes. The architecture
allows the HW to support FEAT_LPA2 in one or both stages of translation. When
FEAT_LPA2 is enabled for a given stage, it effectively changes the page-table
format; PTE bits change meaning and blocks can be mapped at levels that were
previously not possible.

All of this means that KVM has to support two page-table formats and decide
which to use at runtime, after querying the HW. If FEAT_LPA2 is advertised for
stage1, KVM must choose either the classic format or the lpa2 format for its
hyp stage1 according to some policy; otherwise it must use the classic format.
Independently, if FEAT_LPA2 is advertised for stage2, KVM must decide which
format to use for the vm stage2 tables according to a policy.

As a first step towards enabling FEAT_LPA2, make struct kvm_pgtable accessible
to functions that will need to take different actions depending on the
page-table format. These functions are:

  - kvm_pte_to_phys()
  - kvm_phys_to_pte()
  - kvm_level_supports_block_mapping()
  - hyp_set_prot_attr()
  - stage2_set_prot_attr()

Do this by more consistently passing the struct kvm_pgtable around as the
first parameter of each kvm_pgtable function call. As a result of always
passing it to walker callbacks, we can remove some ad-hoc members from
walker-specific data structures, because those members are accessible through
the struct kvm_pgtable (notably mmu and mm_ops).

No functional changes are intended.
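
Concretely, the conversion applied throughout pgtable.c has the shape below;
some_walker_old()/some_walker_new() are invented names purely to show the
before/after signature, with the real changes following in the diff.

/* Before: walker-private data had to carry mm_ops (or mmu) itself. */
static int some_walker_old(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
			   enum kvm_pgtable_walk_flags flag, void * const arg)
{
	struct kvm_pgtable_mm_ops *mm_ops = arg;	/* ad-hoc plumbing */

	mm_ops->get_page(ptep);
	return 0;
}

/* After: the owning kvm_pgtable is the first argument everywhere. */
static int some_walker_new(struct kvm_pgtable *pgt,
			   u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
			   enum kvm_pgtable_walk_flags flag, void * const arg)
{
	struct kvm_pgtable_mm_ops *mm_ops = pgt->mm_ops;

	mm_ops->get_page(ptep);
	return 0;
}
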
Signed-off-by: Ryan Roberts --- arch/arm64/include/asm/kvm_pgtable.h | 23 ++-- arch/arm64/kvm/hyp/nvhe/mem_protect.c | 5 +- arch/arm64/kvm/hyp/nvhe/setup.c | 8 +- arch/arm64/kvm/hyp/pgtable.c | 181 +++++++++++++------------- 4 files changed, 109 insertions(+), 108 deletions(-) diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index 3252eb50ecfe..2247ed74871a 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -47,16 +47,6 @@ static inline bool kvm_pte_valid(kvm_pte_t pte) return pte & KVM_PTE_VALID; } -static inline u64 kvm_pte_to_phys(kvm_pte_t pte) -{ - u64 pa = pte & KVM_PTE_ADDR_MASK; - - if (PAGE_SHIFT == 16) - pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48; - - return pa; -} - static inline u64 kvm_granule_shift(u32 level) { /* Assumes KVM_PGTABLE_MAX_LEVELS is 4 */ @@ -184,6 +174,16 @@ struct kvm_pgtable { kvm_pgtable_force_pte_cb_t force_pte_cb; }; +static inline u64 kvm_pte_to_phys(struct kvm_pgtable *pgt, kvm_pte_t pte) +{ + u64 pa = pte & KVM_PTE_ADDR_MASK; + + if (PAGE_SHIFT == 16) + pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48; + + return pa; +} + /** * enum kvm_pgtable_walk_flags - Flags to control a depth-first page-table walk. * @KVM_PGTABLE_WALK_LEAF: Visit leaf entries, including invalid @@ -199,7 +199,8 @@ enum kvm_pgtable_walk_flags { KVM_PGTABLE_WALK_TABLE_POST = BIT(2), }; -typedef int (*kvm_pgtable_visitor_fn_t)(u64 addr, u64 end, u32 level, +typedef int (*kvm_pgtable_visitor_fn_t)(struct kvm_pgtable *pgt, + u64 addr, u64 end, u32 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg); diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c index 07f9dc9848ef..6bf54c8daffa 100644 --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c @@ -417,7 +417,8 @@ struct check_walk_data { enum pkvm_page_state (*get_page_state)(kvm_pte_t pte); }; -static int __check_page_state_visitor(u64 addr, u64 end, u32 level, +static int __check_page_state_visitor(struct kvm_pgtable *pgt, + u64 addr, u64 end, u32 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) @@ -425,7 +426,7 @@ static int __check_page_state_visitor(u64 addr, u64 end, u32 level, struct check_walk_data *d = arg; kvm_pte_t pte = *ptep; - if (kvm_pte_valid(pte) && !addr_is_memory(kvm_pte_to_phys(pte))) + if (kvm_pte_valid(pte) && !addr_is_memory(kvm_pte_to_phys(pgt, pte))) return -EINVAL; return d->get_page_state(pte) == d->desired ? 
0 : -EPERM; diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c index e8d4ea2fcfa0..60a6821ae98a 100644 --- a/arch/arm64/kvm/hyp/nvhe/setup.c +++ b/arch/arm64/kvm/hyp/nvhe/setup.c @@ -186,12 +186,13 @@ static void hpool_put_page(void *addr) hyp_put_page(&hpool, addr); } -static int finalize_host_mappings_walker(u64 addr, u64 end, u32 level, +static int finalize_host_mappings_walker(struct kvm_pgtable *pgt, + u64 addr, u64 end, u32 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) { - struct kvm_pgtable_mm_ops *mm_ops = arg; + struct kvm_pgtable_mm_ops *mm_ops = pgt->mm_ops; enum kvm_pgtable_prot prot; enum pkvm_page_state state; kvm_pte_t pte = *ptep; @@ -212,7 +213,7 @@ static int finalize_host_mappings_walker(u64 addr, u64 end, u32 level, if (level != (KVM_PGTABLE_MAX_LEVELS - 1)) return -EINVAL; - phys = kvm_pte_to_phys(pte); + phys = kvm_pte_to_phys(pgt, pte); if (!addr_is_memory(phys)) return -EINVAL; @@ -242,7 +243,6 @@ static int finalize_host_mappings(void) struct kvm_pgtable_walker walker = { .cb = finalize_host_mappings_walker, .flags = KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST, - .arg = pkvm_pgtable.mm_ops, }; int i, ret; diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index cdf8e76b0be1..221e0dafb149 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -59,12 +59,13 @@ struct kvm_pgtable_walk_data { #define KVM_PHYS_INVALID (-1ULL) -static bool kvm_phys_is_valid(u64 phys) +static bool kvm_phys_is_valid(struct kvm_pgtable *pgt, u64 phys) { return phys < BIT(id_aa64mmfr0_parange_to_phys_shift(ID_AA64MMFR0_EL1_PARANGE_MAX)); } -static bool kvm_block_mapping_supported(u64 addr, u64 end, u64 phys, u32 level) +static bool kvm_block_mapping_supported(struct kvm_pgtable *pgt, + u64 addr, u64 end, u64 phys, u32 level) { u64 granule = kvm_granule_size(level); @@ -74,7 +75,7 @@ static bool kvm_block_mapping_supported(u64 addr, u64 end, u64 phys, u32 level) if (granule > (end - addr)) return false; - if (kvm_phys_is_valid(phys) && !IS_ALIGNED(phys, granule)) + if (kvm_phys_is_valid(pgt, phys) && !IS_ALIGNED(phys, granule)) return false; return IS_ALIGNED(addr, granule); @@ -122,7 +123,7 @@ static bool kvm_pte_table(kvm_pte_t pte, u32 level) return FIELD_GET(KVM_PTE_TYPE, pte) == KVM_PTE_TYPE_TABLE; } -static kvm_pte_t kvm_phys_to_pte(u64 pa) +static kvm_pte_t kvm_phys_to_pte(struct kvm_pgtable *pgt, u64 pa) { kvm_pte_t pte = pa & KVM_PTE_ADDR_MASK; @@ -132,9 +133,9 @@ static kvm_pte_t kvm_phys_to_pte(u64 pa) return pte; } -static kvm_pte_t *kvm_pte_follow(kvm_pte_t pte, struct kvm_pgtable_mm_ops *mm_ops) +static kvm_pte_t *kvm_pte_follow(struct kvm_pgtable *pgt, kvm_pte_t pte) { - return mm_ops->phys_to_virt(kvm_pte_to_phys(pte)); + return pgt->mm_ops->phys_to_virt(kvm_pte_to_phys(pgt, pte)); } static void kvm_clear_pte(kvm_pte_t *ptep) @@ -142,10 +143,11 @@ static void kvm_clear_pte(kvm_pte_t *ptep) WRITE_ONCE(*ptep, 0); } -static void kvm_set_table_pte(kvm_pte_t *ptep, kvm_pte_t *childp, - struct kvm_pgtable_mm_ops *mm_ops) +static void kvm_set_table_pte(struct kvm_pgtable *pgt, + kvm_pte_t *ptep, kvm_pte_t *childp) { - kvm_pte_t old = *ptep, pte = kvm_phys_to_pte(mm_ops->virt_to_phys(childp)); + kvm_pte_t old = *ptep; + kvm_pte_t pte = kvm_phys_to_pte(pgt, pgt->mm_ops->virt_to_phys(childp)); pte |= FIELD_PREP(KVM_PTE_TYPE, KVM_PTE_TYPE_TABLE); pte |= KVM_PTE_VALID; @@ -154,9 +156,10 @@ static void kvm_set_table_pte(kvm_pte_t *ptep, kvm_pte_t *childp, smp_store_release(ptep, 
pte); } -static kvm_pte_t kvm_init_valid_leaf_pte(u64 pa, kvm_pte_t attr, u32 level) +static kvm_pte_t kvm_init_valid_leaf_pte(struct kvm_pgtable *pgt, + u64 pa, kvm_pte_t attr, u32 level) { - kvm_pte_t pte = kvm_phys_to_pte(pa); + kvm_pte_t pte = kvm_phys_to_pte(pgt, pa); u64 type = (level == KVM_PGTABLE_MAX_LEVELS - 1) ? KVM_PTE_TYPE_PAGE : KVM_PTE_TYPE_BLOCK; @@ -177,7 +180,8 @@ static int kvm_pgtable_visitor_cb(struct kvm_pgtable_walk_data *data, u64 addr, enum kvm_pgtable_walk_flags flag) { struct kvm_pgtable_walker *walker = data->walker; - return walker->cb(addr, data->end, level, ptep, flag, walker->arg); + return walker->cb(data->pgt, + addr, data->end, level, ptep, flag, walker->arg); } static int __kvm_pgtable_walk(struct kvm_pgtable_walk_data *data, @@ -213,7 +217,7 @@ static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data, goto out; } - childp = kvm_pte_follow(pte, data->pgt->mm_ops); + childp = kvm_pte_follow(data->pgt, pte); ret = __kvm_pgtable_walk(data, childp, level + 1); if (ret) goto out; @@ -292,7 +296,8 @@ struct leaf_walk_data { u32 level; }; -static int leaf_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, +static int leaf_walker(struct kvm_pgtable *pgt, + u64 addr, u64 end, u32 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) { struct leaf_walk_data *data = arg; @@ -329,10 +334,10 @@ int kvm_pgtable_get_leaf(struct kvm_pgtable *pgt, u64 addr, struct hyp_map_data { u64 phys; kvm_pte_t attr; - struct kvm_pgtable_mm_ops *mm_ops; }; -static int hyp_set_prot_attr(enum kvm_pgtable_prot prot, kvm_pte_t *ptep) +static int hyp_set_prot_attr(struct kvm_pgtable *pgt, + enum kvm_pgtable_prot prot, kvm_pte_t *ptep) { bool device = prot & KVM_PGTABLE_PROT_DEVICE; u32 mtype = device ? MT_DEVICE_nGnRE : MT_NORMAL; @@ -383,21 +388,22 @@ enum kvm_pgtable_prot kvm_pgtable_hyp_pte_prot(kvm_pte_t pte) return prot; } -static bool hyp_map_walker_try_leaf(u64 addr, u64 end, u32 level, +static bool hyp_map_walker_try_leaf(struct kvm_pgtable *pgt, + u64 addr, u64 end, u32 level, kvm_pte_t *ptep, struct hyp_map_data *data) { kvm_pte_t new, old = *ptep; u64 granule = kvm_granule_size(level), phys = data->phys; - if (!kvm_block_mapping_supported(addr, end, phys, level)) + if (!kvm_block_mapping_supported(pgt, addr, end, phys, level)) return false; data->phys += granule; - new = kvm_init_valid_leaf_pte(phys, data->attr, level); + new = kvm_init_valid_leaf_pte(pgt, phys, data->attr, level); if (old == new) return true; if (!kvm_pte_valid(old)) - data->mm_ops->get_page(ptep); + pgt->mm_ops->get_page(ptep); else if (WARN_ON((old ^ new) & ~KVM_PTE_LEAF_ATTR_HI_SW)) return false; @@ -405,14 +411,15 @@ static bool hyp_map_walker_try_leaf(u64 addr, u64 end, u32 level, return true; } -static int hyp_map_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, +static int hyp_map_walker(struct kvm_pgtable *pgt, + u64 addr, u64 end, u32 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) { kvm_pte_t *childp; struct hyp_map_data *data = arg; - struct kvm_pgtable_mm_ops *mm_ops = data->mm_ops; + struct kvm_pgtable_mm_ops *mm_ops = pgt->mm_ops; - if (hyp_map_walker_try_leaf(addr, end, level, ptep, arg)) + if (hyp_map_walker_try_leaf(pgt, addr, end, level, ptep, data)) return 0; if (WARN_ON(level == KVM_PGTABLE_MAX_LEVELS - 1)) @@ -422,7 +429,7 @@ static int hyp_map_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, if (!childp) return -ENOMEM; - kvm_set_table_pte(ptep, childp, mm_ops); + kvm_set_table_pte(pgt, ptep, childp); 
mm_ops->get_page(ptep); return 0; } @@ -433,7 +440,6 @@ int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 addr, u64 size, u64 phys, int ret; struct hyp_map_data map_data = { .phys = ALIGN_DOWN(phys, PAGE_SIZE), - .mm_ops = pgt->mm_ops, }; struct kvm_pgtable_walker walker = { .cb = hyp_map_walker, @@ -441,7 +447,7 @@ int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 addr, u64 size, u64 phys, .arg = &map_data, }; - ret = hyp_set_prot_attr(prot, &map_data.attr); + ret = hyp_set_prot_attr(pgt, prot, &map_data.attr); if (ret) return ret; @@ -453,22 +459,22 @@ int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 addr, u64 size, u64 phys, struct hyp_unmap_data { u64 unmapped; - struct kvm_pgtable_mm_ops *mm_ops; }; -static int hyp_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, +static int hyp_unmap_walker(struct kvm_pgtable *pgt, + u64 addr, u64 end, u32 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) { kvm_pte_t pte = *ptep, *childp = NULL; u64 granule = kvm_granule_size(level); struct hyp_unmap_data *data = arg; - struct kvm_pgtable_mm_ops *mm_ops = data->mm_ops; + struct kvm_pgtable_mm_ops *mm_ops = pgt->mm_ops; if (!kvm_pte_valid(pte)) return -EINVAL; if (kvm_pte_table(pte, level)) { - childp = kvm_pte_follow(pte, mm_ops); + childp = kvm_pte_follow(pgt, pte); if (mm_ops->page_count(childp) != 1) return 0; @@ -498,9 +504,7 @@ static int hyp_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, u64 kvm_pgtable_hyp_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size) { - struct hyp_unmap_data unmap_data = { - .mm_ops = pgt->mm_ops, - }; + struct hyp_unmap_data unmap_data = {}; struct kvm_pgtable_walker walker = { .cb = hyp_unmap_walker, .arg = &unmap_data, @@ -532,10 +536,11 @@ int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits, return 0; } -static int hyp_free_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, +static int hyp_free_walker(struct kvm_pgtable *pgt, + u64 addr, u64 end, u32 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) { - struct kvm_pgtable_mm_ops *mm_ops = arg; + struct kvm_pgtable_mm_ops *mm_ops = pgt->mm_ops; kvm_pte_t pte = *ptep; if (!kvm_pte_valid(pte)) @@ -544,7 +549,7 @@ static int hyp_free_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, mm_ops->put_page(ptep); if (kvm_pte_table(pte, level)) - mm_ops->put_page(kvm_pte_follow(pte, mm_ops)); + mm_ops->put_page(kvm_pte_follow(pgt, pte)); return 0; } @@ -554,7 +559,6 @@ void kvm_pgtable_hyp_destroy(struct kvm_pgtable *pgt) struct kvm_pgtable_walker walker = { .cb = hyp_free_walker, .flags = KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST, - .arg = pgt->mm_ops, }; WARN_ON(kvm_pgtable_walk(pgt, 0, BIT(pgt->ia_bits), &walker)); @@ -570,11 +574,8 @@ struct stage2_map_data { kvm_pte_t *anchor; kvm_pte_t *childp; - struct kvm_s2_mmu *mmu; void *memcache; - struct kvm_pgtable_mm_ops *mm_ops; - /* Force mappings to page granularity */ bool force_pte; }; @@ -708,29 +709,30 @@ static bool stage2_pte_executable(kvm_pte_t pte) return !(pte & KVM_PTE_LEAF_ATTR_HI_S2_XN); } -static bool stage2_leaf_mapping_allowed(u64 addr, u64 end, u32 level, +static bool stage2_leaf_mapping_allowed(struct kvm_pgtable *pgt, + u64 addr, u64 end, u32 level, struct stage2_map_data *data) { if (data->force_pte && (level < (KVM_PGTABLE_MAX_LEVELS - 1))) return false; - return kvm_block_mapping_supported(addr, end, data->phys, level); + return kvm_block_mapping_supported(pgt, addr, end, data->phys, level); } -static int stage2_map_walker_try_leaf(u64 addr, 
u64 end, u32 level, +static int stage2_map_walker_try_leaf(struct kvm_pgtable *pgt, + u64 addr, u64 end, u32 level, kvm_pte_t *ptep, struct stage2_map_data *data) { kvm_pte_t new, old = *ptep; u64 granule = kvm_granule_size(level), phys = data->phys; - struct kvm_pgtable *pgt = data->mmu->pgt; - struct kvm_pgtable_mm_ops *mm_ops = data->mm_ops; + struct kvm_pgtable_mm_ops *mm_ops = pgt->mm_ops; - if (!stage2_leaf_mapping_allowed(addr, end, level, data)) + if (!stage2_leaf_mapping_allowed(pgt, addr, end, level, data)) return -E2BIG; - if (kvm_phys_is_valid(phys)) - new = kvm_init_valid_leaf_pte(phys, data->attr, level); + if (kvm_phys_is_valid(pgt, phys)) + new = kvm_init_valid_leaf_pte(pgt, phys, data->attr, level); else new = kvm_init_invalid_leaf_owner(data->owner_id); @@ -744,36 +746,37 @@ static int stage2_map_walker_try_leaf(u64 addr, u64 end, u32 level, if (!stage2_pte_needs_update(old, new)) return -EAGAIN; - stage2_put_pte(ptep, data->mmu, addr, level, mm_ops); + stage2_put_pte(ptep, pgt->mmu, addr, level, mm_ops); } /* Perform CMOs before installation of the guest stage-2 PTE */ if (mm_ops->dcache_clean_inval_poc && stage2_pte_cacheable(pgt, new)) - mm_ops->dcache_clean_inval_poc(kvm_pte_follow(new, mm_ops), + mm_ops->dcache_clean_inval_poc(kvm_pte_follow(pgt, new), granule); if (mm_ops->icache_inval_pou && stage2_pte_executable(new)) - mm_ops->icache_inval_pou(kvm_pte_follow(new, mm_ops), granule); + mm_ops->icache_inval_pou(kvm_pte_follow(pgt, new), granule); smp_store_release(ptep, new); if (stage2_pte_is_counted(new)) mm_ops->get_page(ptep); - if (kvm_phys_is_valid(phys)) + if (kvm_phys_is_valid(pgt, phys)) data->phys += granule; return 0; } -static int stage2_map_walk_table_pre(u64 addr, u64 end, u32 level, +static int stage2_map_walk_table_pre(struct kvm_pgtable *pgt, + u64 addr, u64 end, u32 level, kvm_pte_t *ptep, struct stage2_map_data *data) { if (data->anchor) return 0; - if (!stage2_leaf_mapping_allowed(addr, end, level, data)) + if (!stage2_leaf_mapping_allowed(pgt, addr, end, level, data)) return 0; - data->childp = kvm_pte_follow(*ptep, data->mm_ops); + data->childp = kvm_pte_follow(pgt, *ptep); kvm_clear_pte(ptep); /* @@ -781,15 +784,16 @@ static int stage2_map_walk_table_pre(u64 addr, u64 end, u32 level, * entries below us which would otherwise need invalidating * individually. */ - kvm_call_hyp(__kvm_tlb_flush_vmid, data->mmu); + kvm_call_hyp(__kvm_tlb_flush_vmid, pgt->mmu); data->anchor = ptep; return 0; } -static int stage2_map_walk_leaf(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, +static int stage2_map_walk_leaf(struct kvm_pgtable *pgt, + u64 addr, u64 end, u32 level, kvm_pte_t *ptep, struct stage2_map_data *data) { - struct kvm_pgtable_mm_ops *mm_ops = data->mm_ops; + struct kvm_pgtable_mm_ops *mm_ops = pgt->mm_ops; kvm_pte_t *childp, pte = *ptep; int ret; @@ -800,7 +804,7 @@ static int stage2_map_walk_leaf(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, return 0; } - ret = stage2_map_walker_try_leaf(addr, end, level, ptep, data); + ret = stage2_map_walker_try_leaf(pgt, addr, end, level, ptep, data); if (ret != -E2BIG) return ret; @@ -820,19 +824,20 @@ static int stage2_map_walk_leaf(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, * will be mapped lazily. 
*/ if (stage2_pte_is_counted(pte)) - stage2_put_pte(ptep, data->mmu, addr, level, mm_ops); + stage2_put_pte(ptep, pgt->mmu, addr, level, mm_ops); - kvm_set_table_pte(ptep, childp, mm_ops); + kvm_set_table_pte(pgt, ptep, childp); mm_ops->get_page(ptep); return 0; } -static int stage2_map_walk_table_post(u64 addr, u64 end, u32 level, +static int stage2_map_walk_table_post(struct kvm_pgtable *pgt, + u64 addr, u64 end, u32 level, kvm_pte_t *ptep, struct stage2_map_data *data) { - struct kvm_pgtable_mm_ops *mm_ops = data->mm_ops; + struct kvm_pgtable_mm_ops *mm_ops = pgt->mm_ops; kvm_pte_t *childp; int ret = 0; @@ -843,9 +848,9 @@ static int stage2_map_walk_table_post(u64 addr, u64 end, u32 level, childp = data->childp; data->anchor = NULL; data->childp = NULL; - ret = stage2_map_walk_leaf(addr, end, level, ptep, data); + ret = stage2_map_walk_leaf(pgt, addr, end, level, ptep, data); } else { - childp = kvm_pte_follow(*ptep, mm_ops); + childp = kvm_pte_follow(pgt, *ptep); } mm_ops->put_page(childp); @@ -873,18 +878,19 @@ static int stage2_map_walk_table_post(u64 addr, u64 end, u32 level, * the page-table, installing the block entry when it revisits the anchor * pointer and clearing the anchor to NULL. */ -static int stage2_map_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, +static int stage2_map_walker(struct kvm_pgtable *pgt, + u64 addr, u64 end, u32 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) { struct stage2_map_data *data = arg; switch (flag) { case KVM_PGTABLE_WALK_TABLE_PRE: - return stage2_map_walk_table_pre(addr, end, level, ptep, data); + return stage2_map_walk_table_pre(pgt, addr, end, level, ptep, data); case KVM_PGTABLE_WALK_LEAF: - return stage2_map_walk_leaf(addr, end, level, ptep, data); + return stage2_map_walk_leaf(pgt, addr, end, level, ptep, data); case KVM_PGTABLE_WALK_TABLE_POST: - return stage2_map_walk_table_post(addr, end, level, ptep, data); + return stage2_map_walk_table_post(pgt, addr, end, level, ptep, data); } return -EINVAL; @@ -897,9 +903,7 @@ int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size, int ret; struct stage2_map_data map_data = { .phys = ALIGN_DOWN(phys, PAGE_SIZE), - .mmu = pgt->mmu, .memcache = mc, - .mm_ops = pgt->mm_ops, .force_pte = pgt->force_pte_cb && pgt->force_pte_cb(addr, addr + size, prot), }; struct kvm_pgtable_walker walker = { @@ -928,9 +932,7 @@ int kvm_pgtable_stage2_set_owner(struct kvm_pgtable *pgt, u64 addr, u64 size, int ret; struct stage2_map_data map_data = { .phys = KVM_PHYS_INVALID, - .mmu = pgt->mmu, .memcache = mc, - .mm_ops = pgt->mm_ops, .owner_id = owner_id, .force_pte = true, }; @@ -949,11 +951,11 @@ int kvm_pgtable_stage2_set_owner(struct kvm_pgtable *pgt, u64 addr, u64 size, return ret; } -static int stage2_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, +static int stage2_unmap_walker(struct kvm_pgtable *pgt, + u64 addr, u64 end, u32 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) { - struct kvm_pgtable *pgt = arg; struct kvm_s2_mmu *mmu = pgt->mmu; struct kvm_pgtable_mm_ops *mm_ops = pgt->mm_ops; kvm_pte_t pte = *ptep, *childp = NULL; @@ -968,7 +970,7 @@ static int stage2_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, } if (kvm_pte_table(pte, level)) { - childp = kvm_pte_follow(pte, mm_ops); + childp = kvm_pte_follow(pgt, pte); if (mm_ops->page_count(childp) != 1) return 0; @@ -984,7 +986,7 @@ static int stage2_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, stage2_put_pte(ptep, mmu, addr, level, 
mm_ops); if (need_flush && mm_ops->dcache_clean_inval_poc) - mm_ops->dcache_clean_inval_poc(kvm_pte_follow(pte, mm_ops), + mm_ops->dcache_clean_inval_poc(kvm_pte_follow(pgt, pte), kvm_granule_size(level)); if (childp) @@ -997,7 +999,6 @@ int kvm_pgtable_stage2_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size) { struct kvm_pgtable_walker walker = { .cb = stage2_unmap_walker, - .arg = pgt, .flags = KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST, }; @@ -1009,16 +1010,16 @@ struct stage2_attr_data { kvm_pte_t attr_clr; kvm_pte_t pte; u32 level; - struct kvm_pgtable_mm_ops *mm_ops; }; -static int stage2_attr_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, +static int stage2_attr_walker(struct kvm_pgtable *pgt, + u64 addr, u64 end, u32 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) { kvm_pte_t pte = *ptep; struct stage2_attr_data *data = arg; - struct kvm_pgtable_mm_ops *mm_ops = data->mm_ops; + struct kvm_pgtable_mm_ops *mm_ops = pgt->mm_ops; if (!kvm_pte_valid(pte)) return 0; @@ -1040,7 +1041,7 @@ static int stage2_attr_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, */ if (mm_ops->icache_inval_pou && stage2_pte_executable(pte) && !stage2_pte_executable(*ptep)) - mm_ops->icache_inval_pou(kvm_pte_follow(pte, mm_ops), + mm_ops->icache_inval_pou(kvm_pte_follow(pgt, pte), kvm_granule_size(level)); WRITE_ONCE(*ptep, pte); } @@ -1058,7 +1059,6 @@ static int stage2_update_leaf_attrs(struct kvm_pgtable *pgt, u64 addr, struct stage2_attr_data data = { .attr_set = attr_set & attr_mask, .attr_clr = attr_clr & attr_mask, - .mm_ops = pgt->mm_ops, }; struct kvm_pgtable_walker walker = { .cb = stage2_attr_walker, @@ -1140,11 +1140,11 @@ int kvm_pgtable_stage2_relax_perms(struct kvm_pgtable *pgt, u64 addr, return ret; } -static int stage2_flush_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, +static int stage2_flush_walker(struct kvm_pgtable *pgt, + u64 addr, u64 end, u32 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) { - struct kvm_pgtable *pgt = arg; struct kvm_pgtable_mm_ops *mm_ops = pgt->mm_ops; kvm_pte_t pte = *ptep; @@ -1152,7 +1152,7 @@ static int stage2_flush_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, return 0; if (mm_ops->dcache_clean_inval_poc) - mm_ops->dcache_clean_inval_poc(kvm_pte_follow(pte, mm_ops), + mm_ops->dcache_clean_inval_poc(kvm_pte_follow(pgt, pte), kvm_granule_size(level)); return 0; } @@ -1162,7 +1162,6 @@ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size) struct kvm_pgtable_walker walker = { .cb = stage2_flush_walker, .flags = KVM_PGTABLE_WALK_LEAF, - .arg = pgt, }; if (stage2_has_fwb(pgt)) @@ -1200,11 +1199,12 @@ int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu, return 0; } -static int stage2_free_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, +static int stage2_free_walker(struct kvm_pgtable *pgt, + u64 addr, u64 end, u32 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) { - struct kvm_pgtable_mm_ops *mm_ops = arg; + struct kvm_pgtable_mm_ops *mm_ops = pgt->mm_ops; kvm_pte_t pte = *ptep; if (!stage2_pte_is_counted(pte)) @@ -1213,7 +1213,7 @@ static int stage2_free_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, mm_ops->put_page(ptep); if (kvm_pte_table(pte, level)) - mm_ops->put_page(kvm_pte_follow(pte, mm_ops)); + mm_ops->put_page(kvm_pte_follow(pgt, pte)); return 0; } @@ -1225,7 +1225,6 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt) .cb = stage2_free_walker, .flags = 
KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST,
-		.arg	= pgt->mm_ops,
 	};
 
 	WARN_ON(kvm_pgtable_walk(pgt, 0, BIT(pgt->ia_bits), &walker));

From patchwork Tue Dec 6 13:59:23 2022
From: Ryan Roberts
To: Marc Zyngier, Catalin Marinas, Will Deacon, Ard Biesheuvel,
	Suzuki K Poulose, Anshuman Khandual
Cc: Ryan Roberts, James Morse, Alexandru Elisei, Oliver Upton,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	kvmarm@lists.cs.columbia.edu
Subject: [PATCH v1 05/12] KVM: arm64: Maintain page-table format info in struct kvm_pgtable
Date: Tue, 6 Dec 2022 13:59:23 +0000
Message-Id: <20221206135930.3277585-6-ryan.roberts@arm.com>
In-Reply-To: <20221206135930.3277585-1-ryan.roberts@arm.com>
References: <20221206135930.3277585-1-ryan.roberts@arm.com>

As the next step on the journey to supporting FEAT_LPA2 in KVM, add a flag to
struct kvm_pgtable, which
functions can then use to select the appropriate behavior for either the `classic` or `lpa2` page-table formats. For now, all page-tables remain in the `classic` format. No functional changes are intended. Signed-off-by: Ryan Roberts --- arch/arm64/include/asm/kvm_pgtable.h | 2 ++ arch/arm64/kvm/hyp/pgtable.c | 2 ++ arch/arm64/kvm/mmu.c | 1 + 3 files changed, 5 insertions(+) diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index 2247ed74871a..744e224d964b 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -157,6 +157,7 @@ typedef bool (*kvm_pgtable_force_pte_cb_t)(u64 addr, u64 end, * @start_level: Level at which the page-table walk starts. * @pgd: Pointer to the first top-level entry of the page-table. * @mm_ops: Memory management callbacks. + * @lpa2_ena: Format used for page-table; false->classic, true->lpa2. * @mmu: Stage-2 KVM MMU struct. Unused for stage-1 page-tables. * @flags: Stage-2 page-table flags. * @force_pte_cb: Function that returns true if page level mappings must @@ -167,6 +168,7 @@ struct kvm_pgtable { u32 ia_bits; u32 start_level; kvm_pte_t *pgd; struct kvm_pgtable_mm_ops *mm_ops; + bool lpa2_ena; /* Stage-2 only */ struct kvm_s2_mmu *mmu; diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index 221e0dafb149..c7799cd50af8 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -530,6 +530,7 @@ int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits, pgt->ia_bits = va_bits; pgt->start_level = KVM_PGTABLE_MAX_LEVELS - levels; pgt->mm_ops = mm_ops; + pgt->lpa2_ena = false; pgt->mmu = NULL; pgt->force_pte_cb = NULL; @@ -1190,6 +1191,7 @@ int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu, pgt->ia_bits = ia_bits; pgt->start_level = start_level; pgt->mm_ops = mm_ops; + pgt->lpa2_ena = false; pgt->mmu = mmu; pgt->flags = flags; pgt->force_pte_cb = force_pte_cb; diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 1ef0704420d9..e3fe3e194fd1 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -645,6 +645,7 @@ static int get_user_mapping_size(struct kvm *kvm, u64 addr) .start_level = (KVM_PGTABLE_MAX_LEVELS - CONFIG_PGTABLE_LEVELS), .mm_ops = &kvm_user_mm_ops, + .lpa2_ena = lpa2_is_enabled(), }; kvm_pte_t pte = 0; /* Keep GCC quiet...
*/ u32 level = ~0; From patchwork Tue Dec 6 13:59:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13065850 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C33C8C352A1 for ; Tue, 6 Dec 2022 14:03:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=EeXOJuNn8kGVWrQt0Wu3BumP9ABN+ol8mt3lzskuPr4=; b=3EQpTaoJk1OEQU egA1AzduA6b9G6+Cl29YhDfW+wRQ1gPzsDI5SYgwd6n8LMiwniQeFDCMmE5kIS+/mtXo2BVw7zKnw p7VLUdpBswriSAxD7zUglJFxycGuA8gVthtf+ov4GTbp95VGoDlBsmVFejfQmif2spUQztgvngGej aR6v8huaaOCeEVNsGyGRwSF85PsdD2iyeOkUMgY3SW4bekNkwv98W/wGCy25BwxUieiV/dMyYg21f Y/TvdBkt0/L9+sJs5H3Svi43SnNhR9GG+8MH/XU2kCgnQsHK1xUHvhXv+YNDBYGjPqKoGNg0jmX+g a5aVpHCHhpyEbQ+XylXw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p2YVp-00AASJ-H3; Tue, 06 Dec 2022 14:01:33 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p2YUa-00A9CB-Uh for linux-arm-kernel@lists.infradead.org; Tue, 06 Dec 2022 14:00:23 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 9B20C143D; Tue, 6 Dec 2022 06:00:20 -0800 (PST) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.159]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 428383F73D; Tue, 6 Dec 2022 06:00:12 -0800 (PST) From: Ryan Roberts To: Marc Zyngier , Catalin Marinas , Will Deacon , Ard Biesheuvel , Suzuki K Poulose , Anshuman Khandual Cc: Ryan Roberts , James Morse , Alexandru Elisei , Oliver Upton , linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvmarm@lists.cs.columbia.edu Subject: [PATCH v1 06/12] KVM: arm64: Use LPA2 page-tables for stage2 if HW supports it Date: Tue, 6 Dec 2022 13:59:24 +0000 Message-Id: <20221206135930.3277585-7-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221206135930.3277585-1-ryan.roberts@arm.com> References: <20221206135930.3277585-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221206_060017_146420_6FCF63CD X-CRM114-Status: GOOD ( 20.15 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Implement a simple policy whereby if the HW supports FEAT_LPA2 for the page size we are using, always use LPA2-style page-tables for stage 2, regardless of the VMM-requested IPA size or HW-implemented PA size. 
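Concretely, the whole policy comes down to a single check of the sanitised ID_AA64MMFR0_EL1 value. The sketch below is purely illustrative and simply mirrors the kvm_supports_stage2_lpa2() helper added by this patch; the function name used here is made up for the example and is not new API.

#include <linux/sizes.h>
#include <asm/cpufeature.h>
#include <asm/page.h>
#include <asm/sysreg.h>

static bool stage2_wants_lpa2_format(u64 mmfr0)
{
	/*
	 * TGRAN{4,16}_2 == 0b0011 advertises stage 2 LPA2 support for the
	 * granule in use. 64K granules get 52-bit addressing from FEAT_LPA
	 * instead, so the LPA2 descriptor format is never selected there.
	 */
	unsigned int tgran = cpuid_feature_extract_unsigned_field(mmfr0,
					ID_AA64MMFR0_EL1_TGRAN_2_SHIFT);

	return tgran == ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_LPA2 &&
	       PAGE_SIZE != SZ_64K;
}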
When in use, we can now support up to 52-bit IPA and PA sizes. We use the preparatory work that tracks the page-table format in struct kvm_pgtable and passes the pgt pointer to all kvm_pgtable functions that need to modify their behavior based on the format. Note that FEAT_LPA2 brings support for bigger block mappings (512GB with 4KB, 64GB with 16KB). We explicitly don't enable these in the library because stage2_apply_range() works on batch sizes of the largest used block mapping, and increasing the size of the batch would lead to soft lockups. See commit 5994bc9e05c2 ("KVM: arm64: Limit stage2_apply_range() batch size to largest block"). Signed-off-by: Ryan Roberts --- arch/arm64/include/asm/kvm_pgtable.h | 42 ++++++++++++++++++++----- arch/arm64/kvm/hyp/nvhe/mem_protect.c | 12 +++---- arch/arm64/kvm/hyp/pgtable.c | 45 ++++++++++++++++++++++----- 3 files changed, 78 insertions(+), 21 deletions(-) diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index 744e224d964b..a7fd547dcc71 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -25,12 +25,32 @@ #define KVM_PGTABLE_MIN_BLOCK_LEVEL 2U #endif -static inline u64 kvm_get_parange(u64 mmfr0) +static inline bool kvm_supports_stage2_lpa2(u64 mmfr0) { + unsigned int tgran; + + tgran = cpuid_feature_extract_unsigned_field(mmfr0, + ID_AA64MMFR0_EL1_TGRAN_2_SHIFT); + return (tgran == ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_LPA2 && + PAGE_SIZE != SZ_64K); +} + +static inline u64 kvm_get_parange_max(bool lpa2_ena) +{ + if (lpa2_ena || + (IS_ENABLED(CONFIG_ARM64_PA_BITS_52) && PAGE_SIZE == SZ_64K)) + return ID_AA64MMFR0_EL1_PARANGE_52; + else + return ID_AA64MMFR0_EL1_PARANGE_48; +} + +static inline u64 kvm_get_parange(u64 mmfr0, bool lpa2_ena) +{ + u64 parange_max = kvm_get_parange_max(lpa2_ena); u64 parange = cpuid_feature_extract_unsigned_field(mmfr0, ID_AA64MMFR0_EL1_PARANGE_SHIFT); - if (parange > ID_AA64MMFR0_EL1_PARANGE_MAX) - parange = ID_AA64MMFR0_EL1_PARANGE_MAX; + if (parange > parange_max) + parange = parange_max; return parange; } @@ -41,6 +61,8 @@ typedef u64 kvm_pte_t; #define KVM_PTE_ADDR_MASK GENMASK(47, PAGE_SHIFT) #define KVM_PTE_ADDR_51_48 GENMASK(15, 12) +#define KVM_PTE_ADDR_MASK_LPA2 GENMASK(49, PAGE_SHIFT) +#define KVM_PTE_ADDR_51_50_LPA2 GENMASK(9, 8) static inline bool kvm_pte_valid(kvm_pte_t pte) { @@ -178,10 +200,16 @@ struct kvm_pgtable { static inline u64 kvm_pte_to_phys(struct kvm_pgtable *pgt, kvm_pte_t pte) { - u64 pa = pte & KVM_PTE_ADDR_MASK; + u64 pa; - if (PAGE_SHIFT == 16) - pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48; + if (pgt->lpa2_ena) { + pa = pte & KVM_PTE_ADDR_MASK_LPA2; + pa |= FIELD_GET(KVM_PTE_ADDR_51_50_LPA2, pte) << 50; + } else { + pa = pte & KVM_PTE_ADDR_MASK; + if (PAGE_SHIFT == 16) + pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48; + } return pa; } @@ -287,7 +315,7 @@ u64 kvm_pgtable_hyp_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size); * kvm_get_vtcr() - Helper to construct VTCR_EL2 * @mmfr0: Sanitized value of SYS_ID_AA64MMFR0_EL1 register. * @mmfr1: Sanitized value of SYS_ID_AA64MMFR1_EL1 register. - * @phys_shfit: Value to set in VTCR_EL2.T0SZ. + * @phys_shift: Value to set in VTCR_EL2.T0SZ, or 0 to infer from parange. * * The VTCR value is common across all the physical CPUs on the system.
* We use system wide sanitised values to fill in different fields, diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c index 6bf54c8daffa..43e729694deb 100644 --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c @@ -105,14 +105,12 @@ static int prepare_s2_pool(void *pgt_pool_base) static void prepare_host_vtcr(void) { - u32 parange, phys_shift; - - /* The host stage 2 is id-mapped, so use parange for T0SZ */ - parange = kvm_get_parange(id_aa64mmfr0_el1_sys_val); - phys_shift = id_aa64mmfr0_parange_to_phys_shift(parange); - + /* + * The host stage 2 is id-mapped; passing phys_shift=0 forces parange to + * be used for T0SZ. + */ host_kvm.arch.vtcr = kvm_get_vtcr(id_aa64mmfr0_el1_sys_val, - id_aa64mmfr1_el1_sys_val, phys_shift); + id_aa64mmfr1_el1_sys_val, 0); } static bool host_stage2_force_pte_cb(u64 addr, u64 end, enum kvm_pgtable_prot prot); diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index c7799cd50af8..8ed7353f07bc 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -61,7 +61,10 @@ struct kvm_pgtable_walk_data { static bool kvm_phys_is_valid(struct kvm_pgtable *pgt, u64 phys) { - return phys < BIT(id_aa64mmfr0_parange_to_phys_shift(ID_AA64MMFR0_EL1_PARANGE_MAX)); + u64 parange_max = kvm_get_parange_max(pgt->lpa2_ena); + u8 shift = id_aa64mmfr0_parange_to_phys_shift(parange_max); + + return phys < BIT(shift); } static bool kvm_block_mapping_supported(struct kvm_pgtable *pgt, @@ -125,10 +128,16 @@ static bool kvm_pte_table(kvm_pte_t pte, u32 level) static kvm_pte_t kvm_phys_to_pte(struct kvm_pgtable *pgt, u64 pa) { - kvm_pte_t pte = pa & KVM_PTE_ADDR_MASK; + kvm_pte_t pte; - if (PAGE_SHIFT == 16) - pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48); + if (pgt->lpa2_ena) { + pte = pa & KVM_PTE_ADDR_MASK_LPA2; + pte |= FIELD_PREP(KVM_PTE_ADDR_51_50_LPA2, pa >> 50); + } else { + pte = pa & KVM_PTE_ADDR_MASK; + if (PAGE_SHIFT == 16) + pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48); + } return pte; } @@ -585,8 +594,24 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift) { u64 vtcr = VTCR_EL2_FLAGS; u8 lvls; + u64 parange; + bool lpa2_ena = false; + + /* + * If stage 2 reports that it supports FEAT_LPA2 for our page size, then + * we always use the LPA2 format regardless of IA and OA size. + */ + lpa2_ena = kvm_supports_stage2_lpa2(mmfr0); + + parange = kvm_get_parange(mmfr0, lpa2_ena); - vtcr |= kvm_get_parange(mmfr0) << VTCR_EL2_PS_SHIFT; + /* + * Infer IPA size to be equal to PA size if phys_shift is 0. + */ + if (phys_shift == 0) + phys_shift = id_aa64mmfr0_parange_to_phys_shift(parange); + + vtcr |= parange << VTCR_EL2_PS_SHIFT; vtcr |= VTCR_EL2_T0SZ(phys_shift); /* * Use a minimum 2 level page table to prevent splitting @@ -604,6 +629,9 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift) */ vtcr |= VTCR_EL2_HA; + if (lpa2_ena) + vtcr |= VTCR_EL2_DS; + /* Set the vmid bits */ vtcr |= (get_vmid_bits(mmfr1) == 16) ? 
VTCR_EL2_VS_16BIT : @@ -641,7 +669,9 @@ static int stage2_set_prot_attr(struct kvm_pgtable *pgt, enum kvm_pgtable_prot p if (prot & KVM_PGTABLE_PROT_W) attr |= KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W; - attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh); + if (!pgt->lpa2_ena) + attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh); + attr |= KVM_PTE_LEAF_ATTR_LO_S2_AF; attr |= prot & KVM_PTE_LEAF_ATTR_HI_SW; *ptep = attr; @@ -1182,6 +1212,7 @@ int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu, u32 ia_bits = VTCR_EL2_IPA(vtcr); u32 sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr); u32 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0; + bool lpa2_ena = (vtcr & VTCR_EL2_DS) != 0; pgd_sz = kvm_pgd_pages(ia_bits, start_level) * PAGE_SIZE; pgt->pgd = mm_ops->zalloc_pages_exact(pgd_sz); @@ -1191,7 +1222,7 @@ int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu, pgt->ia_bits = ia_bits; pgt->start_level = start_level; pgt->mm_ops = mm_ops; - pgt->lpa2_ena = false; + pgt->lpa2_ena = lpa2_ena; pgt->mmu = mmu; pgt->flags = flags; pgt->force_pte_cb = force_pte_cb; From patchwork Tue Dec 6 13:59:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13065851 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C71D7C352A1 for ; Tue, 6 Dec 2022 14:03:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=dPlzqvIIErzcfmGkXT2J/5dgeSQ73iGIQdueCN8z3nw=; b=jhXeVtyjwgdoMr X6StUyTXYOcR9+H689BRgWqUWv5rbPbcMIR6Ecg0Ot2qi2T6DZonHOVqQk6nhN4L26290UQE8Jzpt 6gS73r4ovFYSpT6oKILfrT8cpB8yzfynTnJI53bK4Nj3rUP/xxhg+lye8nEJb+3zXfkRCTTeTT4E8 E/gxC+yuc6Nd6PwX+Le7u/8tYtJDB8eBCCiGAdZR0fMuHU95qaJQetGviz9hRZbtUcXDkse1exRod aypnyArMOOVZyY35G+5aMjtA1pILqWP7iFXVFJsQ343Hd67xBHxT1bkdhbWPScbSVqCJHKw40sA5d 2AJAAszw/4LVzJK5krRA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p2YW6-00AAbk-Ig; Tue, 06 Dec 2022 14:01:50 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p2YUa-00A9C8-Uk for linux-arm-kernel@lists.infradead.org; Tue, 06 Dec 2022 14:00:23 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7A063D6E; Tue, 6 Dec 2022 06:00:22 -0800 (PST) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.159]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 216D23F73D; Tue, 6 Dec 2022 06:00:14 -0800 (PST) From: Ryan Roberts To: Marc Zyngier , Catalin Marinas , Will Deacon , Ard Biesheuvel , Suzuki K Poulose , Anshuman Khandual Cc: Ryan Roberts , James Morse , Alexandru Elisei , Oliver Upton , linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, 
kvmarm@lists.cs.columbia.edu Subject: [PATCH v1 07/12] KVM: arm64: Use LPA2 page-tables for hyp stage1 if HW supports it Date: Tue, 6 Dec 2022 13:59:25 +0000 Message-Id: <20221206135930.3277585-8-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221206135930.3277585-1-ryan.roberts@arm.com> References: <20221206135930.3277585-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221206_060017_163843_1194ED35 X-CRM114-Status: GOOD ( 17.09 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Implement a simple policy whereby if the HW supports FEAT_LPA2 for the page size we are using, always use LPA2-style page-tables for hyp stage 1, regardless of the IPA or PA size requirements. When in use, we can now support up to 52-bit IPA and PA sizes. For the protected KVM case, the host creates the initial page-tables using either the lpa2 or `classic` format as determined by what's reported in mmfr0 and also sets the TCR_EL2.DS bit in the params structure. The hypervisor then looks at this DS bit to determine the format that it should use to re-create the page-tables. Signed-off-by: Ryan Roberts --- arch/arm64/include/asm/kvm_pgtable.h | 18 +++++++++++++++++- arch/arm64/kvm/arm.c | 2 ++ arch/arm64/kvm/hyp/nvhe/setup.c | 18 +++++++++++++----- arch/arm64/kvm/hyp/pgtable.c | 7 ++++--- arch/arm64/kvm/mmu.c | 3 ++- 5 files changed, 38 insertions(+), 10 deletions(-) diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index a7fd547dcc71..d6f4dcdd00fd 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -25,6 +25,21 @@ #define KVM_PGTABLE_MIN_BLOCK_LEVEL 2U #endif +static inline bool kvm_supports_hyp_lpa2(void) +{ +#if defined(CONFIG_ARM64_4K_PAGES) || defined(CONFIG_ARM64_16K_PAGES) + u64 mmfr0; + unsigned int tgran; + + mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1); + tgran = cpuid_feature_extract_unsigned_field(mmfr0, + ID_AA64MMFR0_EL1_TGRAN_SHIFT); + return (tgran == ID_AA64MMFR0_EL1_TGRAN_LPA2); +#else + return false; +#endif +} + static inline bool kvm_supports_stage2_lpa2(u64 mmfr0) { unsigned int tgran; @@ -253,11 +268,12 @@ struct kvm_pgtable_walker { * @pgt: Uninitialised page-table structure to initialise. * @va_bits: Maximum virtual address bits. * @mm_ops: Memory management callbacks. + * @lpa2_ena: Whether to use the lpa2 page-table format. * * Return: 0 on success, negative error code on failure. */ int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits, - struct kvm_pgtable_mm_ops *mm_ops); + struct kvm_pgtable_mm_ops *mm_ops, bool lpa2_ena); /** * kvm_pgtable_hyp_destroy() - Destroy an unused hypervisor stage-1 page-table.
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 803055da3ee3..a234c6252c3c 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -1537,6 +1537,8 @@ static void cpu_prepare_hyp_mode(int cpu, u32 hyp_va_bits) tcr = (read_sysreg(tcr_el1) & TCR_EL2_MASK) | TCR_EL2_RES1; tcr &= ~TCR_T0SZ_MASK; tcr |= TCR_T0SZ(hyp_va_bits); + if (kvm_supports_hyp_lpa2()) + tcr |= TCR_EL2_DS; params->tcr_el2 = tcr; params->pgd_pa = kvm_mmu_get_httbr(); diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c index 60a6821ae98a..b44e87b9d168 100644 --- a/arch/arm64/kvm/hyp/nvhe/setup.c +++ b/arch/arm64/kvm/hyp/nvhe/setup.c @@ -56,7 +56,7 @@ static int divide_memory_pool(void *virt, unsigned long size) static int recreate_hyp_mappings(phys_addr_t phys, unsigned long size, unsigned long *per_cpu_base, - u32 hyp_va_bits) + u32 hyp_va_bits, bool lpa2_ena) { void *start, *end, *virt = hyp_phys_to_virt(phys); unsigned long pgt_size = hyp_s1_pgtable_pages() << PAGE_SHIFT; @@ -66,7 +66,7 @@ static int recreate_hyp_mappings(phys_addr_t phys, unsigned long size, /* Recreate the hyp page-table using the early page allocator */ hyp_early_alloc_init(hyp_pgt_base, pgt_size); ret = kvm_pgtable_hyp_init(&pkvm_pgtable, hyp_va_bits, - &hyp_early_alloc_mm_ops); + &hyp_early_alloc_mm_ops, lpa2_ena); if (ret) return ret; @@ -304,10 +304,11 @@ void __noreturn __pkvm_init_finalise(void) int __pkvm_init(phys_addr_t phys, unsigned long size, unsigned long nr_cpus, unsigned long *per_cpu_base, u32 hyp_va_bits) { - struct kvm_nvhe_init_params *params; + struct kvm_nvhe_init_params *params = this_cpu_ptr(&kvm_init_params); void *virt = hyp_phys_to_virt(phys); void (*fn)(phys_addr_t params_pa, void *finalize_fn_va); int ret; + bool lpa2_ena; BUG_ON(kvm_check_pvm_sysreg_table()); @@ -321,14 +322,21 @@ int __pkvm_init(phys_addr_t phys, unsigned long size, unsigned long nr_cpus, if (ret) return ret; - ret = recreate_hyp_mappings(phys, size, per_cpu_base, hyp_va_bits); + /* + * The host has already done the hard work to figure out if LPA2 is + * supported at stage 1 and passed the info in the in the DS bit of the + * TCR. Extract and pass on so that the page-tables are constructed with + * the correct format. 
+ */ + lpa2_ena = (params->tcr_el2 & TCR_EL2_DS) != 0; + ret = recreate_hyp_mappings(phys, size, per_cpu_base, + hyp_va_bits, lpa2_ena); if (ret) return ret; update_nvhe_init_params(); /* Jump in the idmap page to switch to the new page-tables */ - params = this_cpu_ptr(&kvm_init_params); fn = (typeof(fn))__hyp_pa(__pkvm_init_switch_pgd); fn(__hyp_pa(params), __pkvm_init_finalise); diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index 8ed7353f07bc..cde852f91db8 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -369,7 +369,8 @@ static int hyp_set_prot_attr(struct kvm_pgtable *pgt, } attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_AP, ap); - attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_SH, sh); + if (!pgt->lpa2_ena) + attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_SH, sh); attr |= KVM_PTE_LEAF_ATTR_LO_S1_AF; attr |= prot & KVM_PTE_LEAF_ATTR_HI_SW; *ptep = attr; @@ -528,7 +529,7 @@ u64 kvm_pgtable_hyp_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size) } int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits, - struct kvm_pgtable_mm_ops *mm_ops) + struct kvm_pgtable_mm_ops *mm_ops, bool lpa2_ena) { u64 levels = ARM64_HW_PGTABLE_LEVELS(va_bits); @@ -539,7 +540,7 @@ int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits, pgt->ia_bits = va_bits; pgt->start_level = KVM_PGTABLE_MAX_LEVELS - levels; pgt->mm_ops = mm_ops; - pgt->lpa2_ena = false; + pgt->lpa2_ena = lpa2_ena; pgt->mmu = NULL; pgt->force_pte_cb = NULL; diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index e3fe3e194fd1..13e48539f022 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1684,7 +1684,8 @@ int kvm_mmu_init(u32 *hyp_va_bits) goto out; } - err = kvm_pgtable_hyp_init(hyp_pgtable, *hyp_va_bits, &kvm_hyp_mm_ops); + err = kvm_pgtable_hyp_init(hyp_pgtable, *hyp_va_bits, + &kvm_hyp_mm_ops, kvm_supports_hyp_lpa2()); if (err) goto out_free_pgtable; From patchwork Tue Dec 6 13:59:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13065852 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 62FF4C352A1 for ; Tue, 6 Dec 2022 14:03:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=WaX/7Wh2OkxIGZwhRzAkVlrxBlE5InStFEEfl+JDxVY=; b=JhdCeInZhp3toZ +WqyMZPJRCxXAWoXO1yaxd3iZO20G7rK/ces1lqA//MJAssWikGvG9kueTcLKEd2qZlbnRYM2UYeD nc4gMXm/amQWW9fmdUzhaX/uYeDZKD70Y5GczvAl9dAP9uADpm62xerpJCiAIV9KYABSfsXAGpdoq lN87/07T21OZs4pQ4qdmhlm1tgqMa+lxUk2FEoj3jDs/q42UvhIyKlhXqlkeFdMRrfu+zUWsjuXMy 8QBpCQSpgbl/8frM65BBitHxK1rKQyVQEbAUBdww1DoAHs5ZP/cppALa+nEMfHrKmy4NZ8WAfZHCD jiulGWPIU+RGQNJAYnfA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p2YWT-00AAt7-GY; Tue, 06 Dec 2022 14:02:13 +0000 Received: from 
foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p2YUe-00A9En-QX for linux-arm-kernel@lists.infradead.org; Tue, 06 Dec 2022 14:00:23 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 58DE712FC; Tue, 6 Dec 2022 06:00:24 -0800 (PST) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.159]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 002193F73D; Tue, 6 Dec 2022 06:00:15 -0800 (PST) From: Ryan Roberts To: Marc Zyngier , Catalin Marinas , Will Deacon , Ard Biesheuvel , Suzuki K Poulose , Anshuman Khandual Cc: Ryan Roberts , James Morse , Alexandru Elisei , Oliver Upton , linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvmarm@lists.cs.columbia.edu Subject: [PATCH v1 08/12] KVM: arm64: Insert PS field at TCR_EL2 assembly time Date: Tue, 6 Dec 2022 13:59:26 +0000 Message-Id: <20221206135930.3277585-9-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221206135930.3277585-1-ryan.roberts@arm.com> References: <20221206135930.3277585-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221206_060020_931029_C821D133 X-CRM114-Status: GOOD ( 12.33 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org With the addition of LPA2 support in the hypervisor, the PA size supported by the HW must be capped with a runtime decision, rather than simply using a compile-time decision based on PA_BITS. For example, on a system that advertises 52-bit PA but does not support FEAT_LPA2, a 4KB or 16KB kernel compiled with LPA2 support must still limit the PA size to 48 bits. Therefore, move the insertion of the PS field into TCR_EL2 out of __kvm_hyp_init assembly code and instead do it in cpu_prepare_hyp_mode() where the rest of TCR_EL2 is assembled. This allows us to figure out PS with kvm_get_parange(), which has the appropriate logic to ensure the above requirement. (and the PS field of VTCR_EL2 is already populated this way).
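To make the runtime capping concrete, the sketch below shows roughly what the PS computation amounts to once kvm_get_parange()/kvm_get_parange_max() (added earlier in this series) are called from cpu_prepare_hyp_mode(); the helper name is hypothetical and the usual asm/kvm_arm.h, asm/cpufeature.h and asm/sysreg.h definitions are assumed to be in scope.

/* Illustrative only: combines kvm_get_parange_max() and kvm_get_parange() */
static u64 hyp_tcr_ps_field(u64 mmfr0, bool lpa2_ena)
{
	u64 parange_max, parange;

	/*
	 * Without LPA2 (and outside the 64K + FEAT_LPA case), the output
	 * address is capped at 48 bits even if the HW reports 52.
	 */
	if (lpa2_ena ||
	    (IS_ENABLED(CONFIG_ARM64_PA_BITS_52) && PAGE_SIZE == SZ_64K))
		parange_max = ID_AA64MMFR0_EL1_PARANGE_52;
	else
		parange_max = ID_AA64MMFR0_EL1_PARANGE_48;

	parange = cpuid_feature_extract_unsigned_field(mmfr0,
					ID_AA64MMFR0_EL1_PARANGE_SHIFT);
	if (parange > parange_max)
		parange = parange_max;

	return parange << TCR_EL2_PS_SHIFT;
}

So a system whose ID_AA64MMFR0_EL1.PARange advertises 52 bits, but which lacks FEAT_LPA2, ends up with PS encoding 48 bits when running a 4KB or 16KB kernel.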
Signed-off-by: Ryan Roberts --- arch/arm64/kvm/arm.c | 5 ++++- arch/arm64/kvm/hyp/nvhe/hyp-init.S | 4 ---- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index a234c6252c3c..ac30d849a308 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -1522,6 +1522,8 @@ static void cpu_prepare_hyp_mode(int cpu, u32 hyp_va_bits) { struct kvm_nvhe_init_params *params = per_cpu_ptr_nvhe_sym(kvm_init_params, cpu); unsigned long tcr; + bool lpa2_ena = kvm_supports_hyp_lpa2(); + u64 mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1); /* * Calculate the raw per-cpu offset without a translation from the @@ -1537,7 +1539,8 @@ static void cpu_prepare_hyp_mode(int cpu, u32 hyp_va_bits) tcr = (read_sysreg(tcr_el1) & TCR_EL2_MASK) | TCR_EL2_RES1; tcr &= ~TCR_T0SZ_MASK; tcr |= TCR_T0SZ(hyp_va_bits); - if (kvm_supports_hyp_lpa2()) + tcr |= kvm_get_parange(mmfr0, lpa2_ena) << TCR_EL2_PS_SHIFT; + if (lpa2_ena) tcr |= TCR_EL2_DS; params->tcr_el2 = tcr; diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S b/arch/arm64/kvm/hyp/nvhe/hyp-init.S index c953fb4b9a13..3cc6dd2ff253 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S +++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S @@ -108,11 +108,7 @@ alternative_if ARM64_HAS_CNP alternative_else_nop_endif msr ttbr0_el2, x2 - /* - * Set the PS bits in TCR_EL2. - */ ldr x0, [x0, #NVHE_INIT_TCR_EL2] - tcr_compute_pa_size x0, #TCR_EL2_PS_SHIFT, x1, x2 msr tcr_el2, x0 isb From patchwork Tue Dec 6 13:59:27 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13065853 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8447BC352A1 for ; Tue, 6 Dec 2022 14:04:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=3pX2ANXFN4EtsYJQOs20lKUPa3qaNm3bugBA3pIibLU=; b=wm78YPHbtE+BpA Kmo2rneCfIt3ej17SWBrjryrn3bRVDFwa2Iq0gRr1lUWqGZGTqiShnCgiKdDIfyNcUWR67Gvbrhno HfFDmkWLGnFK+z/WyHWSKJYEqNC4pfAuT6G3i/Sqku9+1B0mxnXcPl5257bVwTYGWOFfpy8ysMraw zWsn+MIQIaZOjftiYJEvgccICMQH/v2uVWNExAzQaRh13WLyrU8qOoU5W4g0slYABj8khki4TB0OF ZHLGbUblfOdc8K4CGc+o9a/Q1tFKGmReWRdMdSXSkVIUlA5qE3GiZGE/5YgQq+soE0jBWYhg8UqP0 TIQaMglcPQdo706Jn1hQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p2YWs-00AB8n-Mw; Tue, 06 Dec 2022 14:02:38 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p2YUf-00A9GG-Qd for linux-arm-kernel@lists.infradead.org; Tue, 06 Dec 2022 14:00:26 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 3869223A; Tue, 6 Dec 2022 06:00:26 -0800 (PST) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.159]) by 
usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id D411C3F73D; Tue, 6 Dec 2022 06:00:17 -0800 (PST) From: Ryan Roberts To: Marc Zyngier , Catalin Marinas , Will Deacon , Ard Biesheuvel , Suzuki K Poulose , Anshuman Khandual Cc: Ryan Roberts , James Morse , Alexandru Elisei , Oliver Upton , linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvmarm@lists.cs.columbia.edu Subject: [PATCH v1 09/12] KVM: arm64: Convert translation level parameter to s8 Date: Tue, 6 Dec 2022 13:59:27 +0000 Message-Id: <20221206135930.3277585-10-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221206135930.3277585-1-ryan.roberts@arm.com> References: <20221206135930.3277585-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221206_060022_036460_5F781726 X-CRM114-Status: GOOD ( 22.14 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org With the introduction of FEAT_LPA2, the Arm ARM adds a new level of translation, level -1, so levels can now be in the range [-1;3]. 3 is always the last level and the first level is determined based on the number of VA bits in use. Convert level variables to use a signed type in preparation for supporting this new level -1. Since the last level is always anchored at 3, and the first level varies to suit the number of VA/IPA bits, take the opportunity to replace KVM_PGTABLE_MAX_LEVELS with the 2 macros KVM_PGTABLE_FIRST_LEVEL and KVM_PGTABLE_LAST_LEVEL. This removes the assumption from the code that levels run from 0 to KVM_PGTABLE_MAX_LEVELS - 1, which will soon no longer be true. No behavioral changes intended. 
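As a worked example of where the extra level comes from, the block below restates the start-level computation that kvm_pgtable_hyp_init() already performs with the existing ARM64_HW_PGTABLE_LEVELS() macro (which evaluates to (va_bits - 4) / (PAGE_SHIFT - 3)); the helper name is hypothetical and is only here to show the arithmetic.

/*
 * levels      = ARM64_HW_PGTABLE_LEVELS(va_bits) = (va_bits - 4) / (PAGE_SHIFT - 3)
 * start_level = KVM_PGTABLE_LAST_LEVEL + 1 - levels
 *
 *   4K pages,  48-bit VA: levels = 44/9  = 4  ->  start_level =  0
 *   4K pages,  52-bit VA: levels = 48/9  = 5  ->  start_level = -1  (needs FEAT_LPA2)
 *   16K pages, 52-bit VA: levels = 48/11 = 4  ->  start_level =  0
 *   64K pages, 52-bit VA: levels = 48/13 = 3  ->  start_level =  1
 */
static inline s8 va_bits_to_start_level(u32 va_bits)
{
	return KVM_PGTABLE_LAST_LEVEL + 1 - (s8)ARM64_HW_PGTABLE_LEVELS(va_bits);
}

Hence the need for a signed type: only the 4KB granule with a 52-bit VA/IPA ever reaches level -1, and that configuration is exactly what FEAT_LPA2 enables.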
Signed-off-by: Ryan Roberts --- arch/arm64/include/asm/kvm_emulate.h | 2 +- arch/arm64/include/asm/kvm_pgtable.h | 21 +++--- arch/arm64/include/asm/kvm_pkvm.h | 5 +- arch/arm64/kvm/hyp/nvhe/mem_protect.c | 6 +- arch/arm64/kvm/hyp/nvhe/setup.c | 4 +- arch/arm64/kvm/hyp/pgtable.c | 94 ++++++++++++++------------- arch/arm64/kvm/mmu.c | 11 ++-- 7 files changed, 75 insertions(+), 68 deletions(-) diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index 9bdba47f7e14..270f49e7f29a 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -341,7 +341,7 @@ static __always_inline u8 kvm_vcpu_trap_get_fault_type(const struct kvm_vcpu *vc return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC_TYPE; } -static __always_inline u8 kvm_vcpu_trap_get_fault_level(const struct kvm_vcpu *vcpu) +static __always_inline s8 kvm_vcpu_trap_get_fault_level(const struct kvm_vcpu *vcpu) { return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC_LEVEL; } diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index d6f4dcdd00fd..a282a3d5ddbc 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -11,7 +11,8 @@ #include #include -#define KVM_PGTABLE_MAX_LEVELS 4U +#define KVM_PGTABLE_FIRST_LEVEL 0 +#define KVM_PGTABLE_LAST_LEVEL 3 /* * The largest supported block sizes for KVM (no 52-bit PA support): @@ -20,9 +21,9 @@ * - 64K (level 2): 512MB */ #ifdef CONFIG_ARM64_4K_PAGES -#define KVM_PGTABLE_MIN_BLOCK_LEVEL 1U +#define KVM_PGTABLE_MIN_BLOCK_LEVEL 1 #else -#define KVM_PGTABLE_MIN_BLOCK_LEVEL 2U +#define KVM_PGTABLE_MIN_BLOCK_LEVEL 2 #endif static inline bool kvm_supports_hyp_lpa2(void) @@ -84,18 +85,18 @@ static inline bool kvm_pte_valid(kvm_pte_t pte) return pte & KVM_PTE_VALID; } -static inline u64 kvm_granule_shift(u32 level) +static inline u64 kvm_granule_shift(s8 level) { - /* Assumes KVM_PGTABLE_MAX_LEVELS is 4 */ + /* Assumes KVM_PGTABLE_LAST_LEVEL is 3 */ return ARM64_HW_PGTABLE_LEVEL_SHIFT(level); } -static inline u64 kvm_granule_size(u32 level) +static inline u64 kvm_granule_size(s8 level) { return BIT(kvm_granule_shift(level)); } -static inline bool kvm_level_supports_block_mapping(u32 level) +static inline bool kvm_level_supports_block_mapping(s8 level) { return level >= KVM_PGTABLE_MIN_BLOCK_LEVEL; } @@ -202,7 +203,7 @@ typedef bool (*kvm_pgtable_force_pte_cb_t)(u64 addr, u64 end, */ struct kvm_pgtable { u32 ia_bits; - u32 start_level; + s8 start_level; kvm_pte_t *pgd; struct kvm_pgtable_mm_ops *mm_ops; bool lpa2_ena; @@ -245,7 +246,7 @@ enum kvm_pgtable_walk_flags { }; typedef int (*kvm_pgtable_visitor_fn_t)(struct kvm_pgtable *pgt, - u64 addr, u64 end, u32 level, + u64 addr, u64 end, s8 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg); @@ -581,7 +582,7 @@ int kvm_pgtable_walk(struct kvm_pgtable *pgt, u64 addr, u64 size, * Return: 0 on success, negative error code on failure. 
*/ int kvm_pgtable_get_leaf(struct kvm_pgtable *pgt, u64 addr, - kvm_pte_t *ptep, u32 *level); + kvm_pte_t *ptep, s8 *level); /** * kvm_pgtable_stage2_pte_prot() - Retrieve the protection attributes of a diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h index 9f4ad2a8df59..addcf63cf8d5 100644 --- a/arch/arm64/include/asm/kvm_pkvm.h +++ b/arch/arm64/include/asm/kvm_pkvm.h @@ -16,10 +16,11 @@ extern unsigned int kvm_nvhe_sym(hyp_memblock_nr); static inline unsigned long __hyp_pgtable_max_pages(unsigned long nr_pages) { - unsigned long total = 0, i; + unsigned long total = 0; + int i; /* Provision the worst case scenario */ - for (i = 0; i < KVM_PGTABLE_MAX_LEVELS; i++) { + for (i = KVM_PGTABLE_FIRST_LEVEL; i <= KVM_PGTABLE_LAST_LEVEL; i++) { nr_pages = DIV_ROUND_UP(nr_pages, PTRS_PER_PTE); total += nr_pages; } diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c index 43e729694deb..96a5567a9db3 100644 --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c @@ -281,7 +281,7 @@ static int host_stage2_adjust_range(u64 addr, struct kvm_mem_range *range) { struct kvm_mem_range cur; kvm_pte_t pte; - u32 level; + s8 level; int ret; hyp_assert_lock_held(&host_kvm.lock); @@ -300,7 +300,7 @@ static int host_stage2_adjust_range(u64 addr, struct kvm_mem_range *range) cur.start = ALIGN_DOWN(addr, granule); cur.end = cur.start + granule; level++; - } while ((level < KVM_PGTABLE_MAX_LEVELS) && + } while ((level <= KVM_PGTABLE_LAST_LEVEL) && !(kvm_level_supports_block_mapping(level) && range_included(&cur, range))); @@ -416,7 +416,7 @@ struct check_walk_data { }; static int __check_page_state_visitor(struct kvm_pgtable *pgt, - u64 addr, u64 end, u32 level, + u64 addr, u64 end, s8 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c index b44e87b9d168..0355c53b3530 100644 --- a/arch/arm64/kvm/hyp/nvhe/setup.c +++ b/arch/arm64/kvm/hyp/nvhe/setup.c @@ -187,7 +187,7 @@ static void hpool_put_page(void *addr) } static int finalize_host_mappings_walker(struct kvm_pgtable *pgt, - u64 addr, u64 end, u32 level, + u64 addr, u64 end, s8 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) @@ -210,7 +210,7 @@ static int finalize_host_mappings_walker(struct kvm_pgtable *pgt, if (flag != KVM_PGTABLE_WALK_LEAF) return 0; - if (level != (KVM_PGTABLE_MAX_LEVELS - 1)) + if (level != KVM_PGTABLE_LAST_LEVEL) return -EINVAL; phys = kvm_pte_to_phys(pgt, pte); diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index cde852f91db8..274f839bd0d7 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -68,7 +68,7 @@ static bool kvm_phys_is_valid(struct kvm_pgtable *pgt, u64 phys) } static bool kvm_block_mapping_supported(struct kvm_pgtable *pgt, - u64 addr, u64 end, u64 phys, u32 level) + u64 addr, u64 end, u64 phys, s8 level) { u64 granule = kvm_granule_size(level); @@ -84,7 +84,7 @@ static bool kvm_block_mapping_supported(struct kvm_pgtable *pgt, return IS_ALIGNED(addr, granule); } -static u32 kvm_pgtable_idx(struct kvm_pgtable_walk_data *data, u32 level) +static u32 kvm_pgtable_idx(struct kvm_pgtable_walk_data *data, s8 level) { u64 shift = kvm_granule_shift(level); u64 mask = BIT(PAGE_SHIFT - 3) - 1; @@ -105,7 +105,7 @@ static u32 kvm_pgd_page_idx(struct kvm_pgtable_walk_data *data) return __kvm_pgd_page_idx(data->pgt, data->addr); } -static u32 
kvm_pgd_pages(u32 ia_bits, u32 start_level) +static u32 kvm_pgd_pages(u32 ia_bits, s8 start_level) { struct kvm_pgtable pgt = { .ia_bits = ia_bits, @@ -115,9 +115,9 @@ static u32 kvm_pgd_pages(u32 ia_bits, u32 start_level) return __kvm_pgd_page_idx(&pgt, -1ULL) + 1; } -static bool kvm_pte_table(kvm_pte_t pte, u32 level) +static bool kvm_pte_table(kvm_pte_t pte, s8 level) { - if (level == KVM_PGTABLE_MAX_LEVELS - 1) + if (level == KVM_PGTABLE_LAST_LEVEL) return false; if (!kvm_pte_valid(pte)) @@ -166,11 +166,11 @@ static void kvm_set_table_pte(struct kvm_pgtable *pgt, } static kvm_pte_t kvm_init_valid_leaf_pte(struct kvm_pgtable *pgt, - u64 pa, kvm_pte_t attr, u32 level) + u64 pa, kvm_pte_t attr, s8 level) { kvm_pte_t pte = kvm_phys_to_pte(pgt, pa); - u64 type = (level == KVM_PGTABLE_MAX_LEVELS - 1) ? KVM_PTE_TYPE_PAGE : - KVM_PTE_TYPE_BLOCK; + u64 type = (level == KVM_PGTABLE_LAST_LEVEL) ? KVM_PTE_TYPE_PAGE : + KVM_PTE_TYPE_BLOCK; pte |= attr & (KVM_PTE_LEAF_ATTR_LO | KVM_PTE_LEAF_ATTR_HI); pte |= FIELD_PREP(KVM_PTE_TYPE, type); @@ -185,7 +185,7 @@ static kvm_pte_t kvm_init_invalid_leaf_owner(u8 owner_id) } static int kvm_pgtable_visitor_cb(struct kvm_pgtable_walk_data *data, u64 addr, - u32 level, kvm_pte_t *ptep, + s8 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag) { struct kvm_pgtable_walker *walker = data->walker; @@ -194,10 +194,10 @@ static int kvm_pgtable_visitor_cb(struct kvm_pgtable_walk_data *data, u64 addr, } static int __kvm_pgtable_walk(struct kvm_pgtable_walk_data *data, - kvm_pte_t *pgtable, u32 level); + kvm_pte_t *pgtable, s8 level); static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data, - kvm_pte_t *ptep, u32 level) + kvm_pte_t *ptep, s8 level) { int ret = 0; u64 addr = data->addr; @@ -241,12 +241,12 @@ static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data, } static int __kvm_pgtable_walk(struct kvm_pgtable_walk_data *data, - kvm_pte_t *pgtable, u32 level) + kvm_pte_t *pgtable, s8 level) { u32 idx; int ret = 0; - if (WARN_ON_ONCE(level >= KVM_PGTABLE_MAX_LEVELS)) + if (WARN_ON_ONCE(level > KVM_PGTABLE_LAST_LEVEL)) return -EINVAL; for (idx = kvm_pgtable_idx(data, level); idx < PTRS_PER_PTE; ++idx) { @@ -302,11 +302,11 @@ int kvm_pgtable_walk(struct kvm_pgtable *pgt, u64 addr, u64 size, struct leaf_walk_data { kvm_pte_t pte; - u32 level; + s8 level; }; static int leaf_walker(struct kvm_pgtable *pgt, - u64 addr, u64 end, u32 level, kvm_pte_t *ptep, + u64 addr, u64 end, s8 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) { struct leaf_walk_data *data = arg; @@ -318,7 +318,7 @@ static int leaf_walker(struct kvm_pgtable *pgt, } int kvm_pgtable_get_leaf(struct kvm_pgtable *pgt, u64 addr, - kvm_pte_t *ptep, u32 *level) + kvm_pte_t *ptep, s8 *level) { struct leaf_walk_data data; struct kvm_pgtable_walker walker = { @@ -399,7 +399,7 @@ enum kvm_pgtable_prot kvm_pgtable_hyp_pte_prot(kvm_pte_t pte) } static bool hyp_map_walker_try_leaf(struct kvm_pgtable *pgt, - u64 addr, u64 end, u32 level, + u64 addr, u64 end, s8 level, kvm_pte_t *ptep, struct hyp_map_data *data) { kvm_pte_t new, old = *ptep; @@ -422,7 +422,7 @@ static bool hyp_map_walker_try_leaf(struct kvm_pgtable *pgt, } static int hyp_map_walker(struct kvm_pgtable *pgt, - u64 addr, u64 end, u32 level, kvm_pte_t *ptep, + u64 addr, u64 end, s8 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) { kvm_pte_t *childp; @@ -432,7 +432,7 @@ static int hyp_map_walker(struct kvm_pgtable *pgt, if (hyp_map_walker_try_leaf(pgt, addr, end, 
level, ptep, data)) return 0; - if (WARN_ON(level == KVM_PGTABLE_MAX_LEVELS - 1)) + if (WARN_ON(level == KVM_PGTABLE_LAST_LEVEL)) return -EINVAL; childp = (kvm_pte_t *)mm_ops->zalloc_page(NULL); @@ -472,7 +472,7 @@ struct hyp_unmap_data { }; static int hyp_unmap_walker(struct kvm_pgtable *pgt, - u64 addr, u64 end, u32 level, kvm_pte_t *ptep, + u64 addr, u64 end, s8 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) { kvm_pte_t pte = *ptep, *childp = NULL; @@ -531,14 +531,18 @@ u64 kvm_pgtable_hyp_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size) int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits, struct kvm_pgtable_mm_ops *mm_ops, bool lpa2_ena) { - u64 levels = ARM64_HW_PGTABLE_LEVELS(va_bits); + s8 start_level = KVM_PGTABLE_LAST_LEVEL + 1 - + ARM64_HW_PGTABLE_LEVELS(va_bits); + if (start_level < KVM_PGTABLE_FIRST_LEVEL || + start_level > KVM_PGTABLE_LAST_LEVEL) + return -EINVAL; pgt->pgd = (kvm_pte_t *)mm_ops->zalloc_page(NULL); if (!pgt->pgd) return -ENOMEM; pgt->ia_bits = va_bits; - pgt->start_level = KVM_PGTABLE_MAX_LEVELS - levels; + pgt->start_level = start_level; pgt->mm_ops = mm_ops; pgt->lpa2_ena = lpa2_ena; pgt->mmu = NULL; @@ -548,7 +552,7 @@ int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits, } static int hyp_free_walker(struct kvm_pgtable *pgt, - u64 addr, u64 end, u32 level, kvm_pte_t *ptep, + u64 addr, u64 end, s8 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) { struct kvm_pgtable_mm_ops *mm_ops = pgt->mm_ops; @@ -594,7 +598,7 @@ struct stage2_map_data { u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift) { u64 vtcr = VTCR_EL2_FLAGS; - u8 lvls; + s8 levels; u64 parange; bool lpa2_ena = false; @@ -618,10 +622,10 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift) * Use a minimum 2 level page table to prevent splitting * host PMD huge pages at stage2. 
*/ - lvls = stage2_pgtable_levels(phys_shift); - if (lvls < 2) - lvls = 2; - vtcr |= VTCR_EL2_LVLS_TO_SL0(lvls); + levels = stage2_pgtable_levels(phys_shift); + if (levels < 2) + levels = 2; + vtcr |= VTCR_EL2_LVLS_TO_SL0(levels); /* * Enable the Hardware Access Flag management, unconditionally @@ -716,7 +720,7 @@ static bool stage2_pte_is_counted(kvm_pte_t pte) } static void stage2_put_pte(kvm_pte_t *ptep, struct kvm_s2_mmu *mmu, u64 addr, - u32 level, struct kvm_pgtable_mm_ops *mm_ops) + s8 level, struct kvm_pgtable_mm_ops *mm_ops) { /* * Clear the existing PTE, and perform break-before-make with @@ -742,17 +746,17 @@ static bool stage2_pte_executable(kvm_pte_t pte) } static bool stage2_leaf_mapping_allowed(struct kvm_pgtable *pgt, - u64 addr, u64 end, u32 level, + u64 addr, u64 end, s8 level, struct stage2_map_data *data) { - if (data->force_pte && (level < (KVM_PGTABLE_MAX_LEVELS - 1))) + if (data->force_pte && level < KVM_PGTABLE_LAST_LEVEL) return false; return kvm_block_mapping_supported(pgt, addr, end, data->phys, level); } static int stage2_map_walker_try_leaf(struct kvm_pgtable *pgt, - u64 addr, u64 end, u32 level, + u64 addr, u64 end, s8 level, kvm_pte_t *ptep, struct stage2_map_data *data) { @@ -798,7 +802,7 @@ static int stage2_map_walker_try_leaf(struct kvm_pgtable *pgt, } static int stage2_map_walk_table_pre(struct kvm_pgtable *pgt, - u64 addr, u64 end, u32 level, + u64 addr, u64 end, s8 level, kvm_pte_t *ptep, struct stage2_map_data *data) { @@ -822,7 +826,7 @@ static int stage2_map_walk_table_pre(struct kvm_pgtable *pgt, } static int stage2_map_walk_leaf(struct kvm_pgtable *pgt, - u64 addr, u64 end, u32 level, kvm_pte_t *ptep, + u64 addr, u64 end, s8 level, kvm_pte_t *ptep, struct stage2_map_data *data) { struct kvm_pgtable_mm_ops *mm_ops = pgt->mm_ops; @@ -840,7 +844,7 @@ static int stage2_map_walk_leaf(struct kvm_pgtable *pgt, if (ret != -E2BIG) return ret; - if (WARN_ON(level == KVM_PGTABLE_MAX_LEVELS - 1)) + if (WARN_ON(level == KVM_PGTABLE_LAST_LEVEL)) return -EINVAL; if (!data->memcache) @@ -865,7 +869,7 @@ static int stage2_map_walk_leaf(struct kvm_pgtable *pgt, } static int stage2_map_walk_table_post(struct kvm_pgtable *pgt, - u64 addr, u64 end, u32 level, + u64 addr, u64 end, s8 level, kvm_pte_t *ptep, struct stage2_map_data *data) { @@ -911,7 +915,7 @@ static int stage2_map_walk_table_post(struct kvm_pgtable *pgt, * pointer and clearing the anchor to NULL. 
*/ static int stage2_map_walker(struct kvm_pgtable *pgt, - u64 addr, u64 end, u32 level, kvm_pte_t *ptep, + u64 addr, u64 end, s8 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) { struct stage2_map_data *data = arg; @@ -984,7 +988,7 @@ int kvm_pgtable_stage2_set_owner(struct kvm_pgtable *pgt, u64 addr, u64 size, } static int stage2_unmap_walker(struct kvm_pgtable *pgt, - u64 addr, u64 end, u32 level, kvm_pte_t *ptep, + u64 addr, u64 end, s8 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) { @@ -1041,11 +1045,11 @@ struct stage2_attr_data { kvm_pte_t attr_set; kvm_pte_t attr_clr; kvm_pte_t pte; - u32 level; + s8 level; }; static int stage2_attr_walker(struct kvm_pgtable *pgt, - u64 addr, u64 end, u32 level, kvm_pte_t *ptep, + u64 addr, u64 end, s8 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) { @@ -1084,7 +1088,7 @@ static int stage2_attr_walker(struct kvm_pgtable *pgt, static int stage2_update_leaf_attrs(struct kvm_pgtable *pgt, u64 addr, u64 size, kvm_pte_t attr_set, kvm_pte_t attr_clr, kvm_pte_t *orig_pte, - u32 *level) + s8 *level) { int ret; kvm_pte_t attr_mask = KVM_PTE_LEAF_ATTR_LO | KVM_PTE_LEAF_ATTR_HI; @@ -1151,7 +1155,7 @@ int kvm_pgtable_stage2_relax_perms(struct kvm_pgtable *pgt, u64 addr, enum kvm_pgtable_prot prot) { int ret; - u32 level; + s8 level; kvm_pte_t set = 0, clr = 0; if (prot & KVM_PTE_LEAF_ATTR_HI_SW) @@ -1173,7 +1177,7 @@ int kvm_pgtable_stage2_relax_perms(struct kvm_pgtable *pgt, u64 addr, } static int stage2_flush_walker(struct kvm_pgtable *pgt, - u64 addr, u64 end, u32 level, kvm_pte_t *ptep, + u64 addr, u64 end, s8 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) { @@ -1212,7 +1216,7 @@ int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu, u64 vtcr = mmu->arch->vtcr; u32 ia_bits = VTCR_EL2_IPA(vtcr); u32 sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr); - u32 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0; + s8 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0; bool lpa2_ena = (vtcr & VTCR_EL2_DS) != 0; pgd_sz = kvm_pgd_pages(ia_bits, start_level) * PAGE_SIZE; @@ -1234,7 +1238,7 @@ int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu, } static int stage2_free_walker(struct kvm_pgtable *pgt, - u64 addr, u64 end, u32 level, kvm_pte_t *ptep, + u64 addr, u64 end, s8 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) { diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 13e48539f022..4ce46be3f0a0 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -642,18 +642,19 @@ static int get_user_mapping_size(struct kvm *kvm, u64 addr) struct kvm_pgtable pgt = { .pgd = (kvm_pte_t *)kvm->mm->pgd, .ia_bits = vabits_actual, - .start_level = (KVM_PGTABLE_MAX_LEVELS - - CONFIG_PGTABLE_LEVELS), + .start_level = (KVM_PGTABLE_LAST_LEVEL - + CONFIG_PGTABLE_LEVELS + 1), .mm_ops = &kvm_user_mm_ops, .lpa2_ena = lpa2_is_enabled(), }; kvm_pte_t pte = 0; /* Keep GCC quiet... 
*/ - u32 level = ~0; + s8 level = ~0; int ret; ret = kvm_pgtable_get_leaf(&pgt, addr, &pte, &level); VM_BUG_ON(ret); - VM_BUG_ON(level >= KVM_PGTABLE_MAX_LEVELS); + VM_BUG_ON(level > KVM_PGTABLE_LAST_LEVEL); + VM_BUG_ON(level < KVM_PGTABLE_FIRST_LEVEL); VM_BUG_ON(!(pte & PTE_VALID)); return BIT(ARM64_HW_PGTABLE_LEVEL_SHIFT(level)); @@ -1138,7 +1139,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, kvm_pfn_t pfn; bool logging_active = memslot_is_logging(memslot); bool use_read_lock = false; - unsigned long fault_level = kvm_vcpu_trap_get_fault_level(vcpu); + s8 fault_level = kvm_vcpu_trap_get_fault_level(vcpu); unsigned long vma_pagesize, fault_granule; enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R; struct kvm_pgtable *pgt; From patchwork Tue Dec 6 13:59:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13065856 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D0D5AC4708E for ; Tue, 6 Dec 2022 14:05:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=dYHok+4949g7d87XnDHzzonWiT0Umgsf3UXChnHJfAU=; b=lpCZr1kxhVdWhR LXQv73xkG0t+bRP1ZSXV+uJEszNSNTxsTX8DRxCTSGyure6hOekT9UdzdZnzxGAQXLVGU1blM5nYP uJIr4A/8vncmuBG27e6qduDJprornKoNd5+yUvOSLYOB7dAc9yPohSH+mwzWxJ40PzCVGK6aEusv/ KXkQBqsfWt5PTX4FmnjhrYSCpCcdfxAxEXlOexT3oKKijSvQyRrvnzObgE3K8FNCE6yxWZ382zvTL Sg34v6guw8z9LfgPzNH/wRNXIApgx/MUx3Gdm9EVeOQrDH6nbU3+7XqtG4r6LOC/FgHmipVsb+GLe Gffpll064Z33IHJmhxoA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p2YXK-00ABYt-LF; Tue, 06 Dec 2022 14:03:07 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p2YUf-00A9GK-Rp for linux-arm-kernel@lists.infradead.org; Tue, 06 Dec 2022 14:00:27 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 17BDE113E; Tue, 6 Dec 2022 06:00:28 -0800 (PST) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.159]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id B33A93F73D; Tue, 6 Dec 2022 06:00:19 -0800 (PST) From: Ryan Roberts To: Marc Zyngier , Catalin Marinas , Will Deacon , Ard Biesheuvel , Suzuki K Poulose , Anshuman Khandual Cc: Ryan Roberts , James Morse , Alexandru Elisei , Oliver Upton , linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvmarm@lists.cs.columbia.edu Subject: [PATCH v1 10/12] KVM: arm64: Rework logic to en/decode VTCR_EL2.{SL0, SL2} fields Date: Tue, 6 Dec 2022 13:59:28 +0000 Message-Id: <20221206135930.3277585-11-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221206135930.3277585-1-ryan.roberts@arm.com> References: 
<20221206135930.3277585-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221206_060022_185197_6FDE97E2 X-CRM114-Status: GOOD ( 19.16 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org In order to support 5 level translation, FEAT_LPA2 introduces the 1-bit SL2 field within VTCR_EL2 to extend the existing 2-bit SL0 field. The SL2[0]:SL0[1:0] encodings have no simple algorithmic relationship to the start levels they represent (that I can find, at least), so replace the existing macros with functions that do lookups to encode and decode the values. These new functions no longer make hardcoded assumptions about the maximum level and instead rely on KVM_PGTABLE_FIRST_LEVEL and KVM_PGTABLE_LAST_LEVEL. This is preparatory work for enabling 52-bit IPA for 4KB and 16KB pages with FEAT_LPA2. No functional change intended. Signed-off-by: Ryan Roberts --- arch/arm64/include/asm/kvm_arm.h | 75 ++++++++++++++----------- arch/arm64/include/asm/kvm_pgtable.h | 33 +++++++++++ arch/arm64/include/asm/stage2_pgtable.h | 13 ++++- arch/arm64/kvm/hyp/pgtable.c | 67 +++++++++++++++++++++- 4 files changed, 150 insertions(+), 38 deletions(-) diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index f9619a10d5d9..94bbb05e348f 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -150,58 +150,65 @@ VTCR_EL2_IRGN0_WBWA | VTCR_EL2_RES1) /* - * VTCR_EL2:SL0 indicates the entry level for Stage2 translation. - * Interestingly, it depends on the page size. - * See D.10.2.121, VTCR_EL2, in ARM DDI 0487C.a + * VTCR_EL2.{SL0, SL2} indicates the entry level for Stage2 translation. + * Interestingly, it depends on the page size. 
See D17.2.157, VTCR_EL2, in ARM + * DDI 0487I.a * - * ----------------------------------------- - * | Entry level | 4K | 16K/64K | - * ------------------------------------------ - * | Level: 0 | 2 | - | - * ------------------------------------------ - * | Level: 1 | 1 | 2 | - * ------------------------------------------ - * | Level: 2 | 0 | 1 | - * ------------------------------------------ - * | Level: 3 | - | 0 | - * ------------------------------------------ + * ---------------------------------------------------------- + * | Entry level | 4K | 16K | 64K | + * | | SL2:SL0 | SL2:SL0 | SL2:SL0 | + * ---------------------------------------------------------- + * | Level: -1 | 0b100 | - | - | + * ---------------------------------------------------------- + * | Level: 0 | 0b010 | 0b011 | - | + * ---------------------------------------------------------- + * | Level: 1 | 0b001 | 0b010 | 0b010 | + * ---------------------------------------------------------- + * | Level: 2 | 0b000 | 0b001 | 0b001 | + * ---------------------------------------------------------- + * | Level: 3 | 0b011 | 0b000 | 0b000 | + * ---------------------------------------------------------- * - * The table roughly translates to : - * - * SL0(PAGE_SIZE, Entry_level) = TGRAN_SL0_BASE - Entry_Level - * - * Where TGRAN_SL0_BASE is a magic number depending on the page size: - * TGRAN_SL0_BASE(4K) = 2 - * TGRAN_SL0_BASE(16K) = 3 - * TGRAN_SL0_BASE(64K) = 3 - * provided we take care of ruling out the unsupported cases and - * Entry_Level = 4 - Number_of_levels. + * There is no concise algorithm to convert between the SLx encodings and the + * level numbers, so we implement 2 helpers kvm_vtcr_el2_sl_encode() + * kvm_vtcr_el2_sl_decode() which can convert between the representations. These + * helpers use a concatenated form of SLx: SL2[0]:SL0[1:0] as the 3 LSBs in u8. + * If an invalid input value is provided, VTCR_EL2_SLx_ENC_INVAL is returned. We + * declare the appropriate encoded values here for the compiled in page size. * + * See kvm_pgtable.h for documentation on the helpers. 
*/ +#define VTCR_EL2_SLx_ENC_INVAL 255 + #ifdef CONFIG_ARM64_64K_PAGES #define VTCR_EL2_TGRAN VTCR_EL2_TG0_64K -#define VTCR_EL2_TGRAN_SL0_BASE 3UL +#define VTCR_EL2_SLx_ENC_Lm1 VTCR_EL2_SLx_ENC_INVAL +#define VTCR_EL2_SLx_ENC_L0 VTCR_EL2_SLx_ENC_INVAL +#define VTCR_EL2_SLx_ENC_Lp1 2 +#define VTCR_EL2_SLx_ENC_Lp2 1 +#define VTCR_EL2_SLx_ENC_Lp3 0 #elif defined(CONFIG_ARM64_16K_PAGES) #define VTCR_EL2_TGRAN VTCR_EL2_TG0_16K -#define VTCR_EL2_TGRAN_SL0_BASE 3UL +#define VTCR_EL2_SLx_ENC_Lm1 VTCR_EL2_SLx_ENC_INVAL +#define VTCR_EL2_SLx_ENC_L0 3 +#define VTCR_EL2_SLx_ENC_Lp1 2 +#define VTCR_EL2_SLx_ENC_Lp2 1 +#define VTCR_EL2_SLx_ENC_Lp3 0 #else /* 4K */ #define VTCR_EL2_TGRAN VTCR_EL2_TG0_4K -#define VTCR_EL2_TGRAN_SL0_BASE 2UL +#define VTCR_EL2_SLx_ENC_Lm1 4 +#define VTCR_EL2_SLx_ENC_L0 2 +#define VTCR_EL2_SLx_ENC_Lp1 1 +#define VTCR_EL2_SLx_ENC_Lp2 0 +#define VTCR_EL2_SLx_ENC_Lp3 3 #endif -#define VTCR_EL2_LVLS_TO_SL0(levels) \ - ((VTCR_EL2_TGRAN_SL0_BASE - (4 - (levels))) << VTCR_EL2_SL0_SHIFT) -#define VTCR_EL2_SL0_TO_LVLS(sl0) \ - ((sl0) + 4 - VTCR_EL2_TGRAN_SL0_BASE) -#define VTCR_EL2_LVLS(vtcr) \ - VTCR_EL2_SL0_TO_LVLS(((vtcr) & VTCR_EL2_SL0_MASK) >> VTCR_EL2_SL0_SHIFT) - #define VTCR_EL2_FLAGS (VTCR_EL2_COMMON_BITS | VTCR_EL2_TGRAN) #define VTCR_EL2_IPA(vtcr) (64 - ((vtcr) & VTCR_EL2_T0SZ_MASK)) diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index a282a3d5ddbc..3e0b64052c51 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -328,6 +328,39 @@ int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 addr, u64 size, u64 phys, */ u64 kvm_pgtable_hyp_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size); +/** + * kvm_vtcr_el2_sl_encode() - Helper to encode start level for vtcr_el2. + * @sl_dec: Start level to be encoded. + * + * Takes an unencoded translation start level value and returns it encoded for + * use in the vtcr_el2 register. The returned value has SL0 (a 2-bit field) in bits + * [1:0] and SL2 (a 1-bit field) in bit [2]. The caller is responsible for + * extracting the fields and packing them into the correct locations of vtcr_el2. + * + * Do not call this function with a value that is out of range for the page size + * in operation. A warning will be output if this is detected and the function + * returns VTCR_EL2_SLx_ENC_INVAL. See comment in kvm_arm.h for more info. + * + * Return: 3-bit value containing SL2[0]:SL0[1:0], or VTCR_EL2_SLx_ENC_INVAL. + */ +u8 kvm_vtcr_el2_sl_encode(s8 sl_dec); + +/** + * kvm_vtcr_el2_sl_decode() - Helper to decode start level for vtcr_el2. + * @sl_enc: Start level encoded as SL2[0]:SL0[1:0]. + * + * Takes an encoded translation start level value, as used in the vtcr_el2 + * register, and returns it decoded. See kvm_vtcr_el2_sl_encode() for a description + * of the input encoding. + * + * Do not call this function with a value that is invalid for the page size in + * operation. A warning will be output if this is detected and the function + * returns VTCR_EL2_SLx_ENC_INVAL. See comment in kvm_arm.h for more info. + * + * Return: Decoded start level, or VTCR_EL2_SLx_ENC_INVAL. + */ +s8 kvm_vtcr_el2_sl_decode(u8 sl_enc); + /** * kvm_get_vtcr() - Helper to construct VTCR_EL2 * @mmfr0: Sanitized value of SYS_ID_AA64MMFR0_EL1 register. diff --git a/arch/arm64/include/asm/stage2_pgtable.h b/arch/arm64/include/asm/stage2_pgtable.h index c8dca8ae359c..02c5e04d4958 100644 --- a/arch/arm64/include/asm/stage2_pgtable.h +++ b/arch/arm64/include/asm/stage2_pgtable.h @@ -21,7 +21,18 @@ * (IPA_SHIFT - 4).
*/ #define stage2_pgtable_levels(ipa) ARM64_HW_PGTABLE_LEVELS((ipa) - 4) -#define kvm_stage2_levels(kvm) VTCR_EL2_LVLS(kvm->arch.vtcr) +static inline s8 kvm_stage2_levels(struct kvm *kvm) +{ + u64 vtcr = kvm->arch.vtcr; + u8 slx; + s8 start_level; + + slx = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr); + slx |= FIELD_GET(VTCR_EL2_SL2_MASK, vtcr) << 2; + start_level = kvm_vtcr_el2_sl_decode(slx); + + return KVM_PGTABLE_LAST_LEVEL + 1 - start_level; +} /* * kvm_mmmu_cache_min_pages() is the number of pages required to install diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index 274f839bd0d7..8ebd9aaed2c4 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -595,12 +595,67 @@ struct stage2_map_data { bool force_pte; }; +u8 kvm_vtcr_el2_sl_encode(s8 sl_dec) +{ + u8 sl_enc = VTCR_EL2_SLx_ENC_INVAL; + + BUILD_BUG_ON(KVM_PGTABLE_FIRST_LEVEL < -1); + BUILD_BUG_ON(KVM_PGTABLE_LAST_LEVEL > 3); + + switch (sl_dec) { + case -1: + sl_enc = VTCR_EL2_SLx_ENC_Lm1; + break; + case 0: + sl_enc = VTCR_EL2_SLx_ENC_L0; + break; + case 1: + sl_enc = VTCR_EL2_SLx_ENC_Lp1; + break; + case 2: + sl_enc = VTCR_EL2_SLx_ENC_Lp2; + break; + case 3: + sl_enc = VTCR_EL2_SLx_ENC_Lp3; + break; + } + + WARN_ON_ONCE(sl_enc == VTCR_EL2_SLx_ENC_INVAL); + return sl_enc; +} + +s8 kvm_vtcr_el2_sl_decode(u8 sl_enc) +{ + s8 sl_dec = VTCR_EL2_SLx_ENC_INVAL; + + BUILD_BUG_ON(KVM_PGTABLE_FIRST_LEVEL < -1); + BUILD_BUG_ON(KVM_PGTABLE_LAST_LEVEL > 3); + + if (sl_enc == VTCR_EL2_SLx_ENC_Lm1) + sl_dec = -1; + else if (sl_enc == VTCR_EL2_SLx_ENC_L0) + sl_dec = 0; + else if (sl_enc == VTCR_EL2_SLx_ENC_Lp1) + sl_dec = 1; + else if (sl_enc == VTCR_EL2_SLx_ENC_Lp2) + sl_dec = 2; + else if (sl_enc == VTCR_EL2_SLx_ENC_Lp3) + sl_dec = 3; + + if (WARN_ON_ONCE(sl_dec == VTCR_EL2_SLx_ENC_INVAL || + sl_enc == VTCR_EL2_SLx_ENC_INVAL)) + sl_dec = VTCR_EL2_SLx_ENC_INVAL; + + return sl_dec; +} + u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift) { u64 vtcr = VTCR_EL2_FLAGS; s8 levels; u64 parange; bool lpa2_ena = false; + u8 slx; /* * If stage 2 reports that it supports FEAT_LPA2 for our page size, then @@ -625,7 +680,9 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift) levels = stage2_pgtable_levels(phys_shift); if (levels < 2) levels = 2; - vtcr |= VTCR_EL2_LVLS_TO_SL0(levels); + slx = kvm_vtcr_el2_sl_encode(KVM_PGTABLE_LAST_LEVEL + 1 - levels); + vtcr |= FIELD_PREP(VTCR_EL2_SL0_MASK, slx); + vtcr |= FIELD_PREP(VTCR_EL2_SL2_MASK, slx >> 2); /* * Enable the Hardware Access Flag management, unconditionally @@ -1215,10 +1272,14 @@ int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu, size_t pgd_sz; u64 vtcr = mmu->arch->vtcr; u32 ia_bits = VTCR_EL2_IPA(vtcr); - u32 sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr); - s8 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0; + u8 slx; + s8 start_level; bool lpa2_ena = (vtcr & VTCR_EL2_DS) != 0; + slx = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr); + slx |= FIELD_GET(VTCR_EL2_SL2_MASK, vtcr) << 2; + start_level = kvm_vtcr_el2_sl_decode(slx); + pgd_sz = kvm_pgd_pages(ia_bits, start_level) * PAGE_SIZE; pgt->pgd = mm_ops->zalloc_pages_exact(pgd_sz); if (!pgt->pgd) From patchwork Tue Dec 6 13:59:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13065854 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 
with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 388E3C352A1 for ; Tue, 6 Dec 2022 14:05:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=+XN+KmCgrQqF3p4W99pZZ4rbcELVWcUFFItEu6EWWXc=; b=ComnJcITCC8wyf kKSO2Y+yFhLrmbCIc2RtEKwGo8wIXX9I39wo+fXFgTiiDBGE6wPrSPFlY+AxTfmSRqCUDdolNGpr7 v0muYMcIAZmmPMl7nzb52MjXYfXCBPrw6PGY9TYpZNkODmmSNNP9nmiUIgmtoLMWSMvRr876aK5Nt C5AVHEhYbw4MpXbF3W6B4qe5yDvuFn/4bEyY0oUgYM5w80W1s+WYxvysfQQXk8vrz1Ry+tnZdnWIV Q+ttn8fsQgLku75+8+hixj5FuamQKuBI1UbBrBehvSh5uRCjcNsMnMAGRWnXj7a1Kk6+edwEAzk4y NEB0GwW/egXREY7y1Mxw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p2YXp-00AC4T-Qb; Tue, 06 Dec 2022 14:03:37 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p2YUh-00A9CD-N3 for linux-arm-kernel@lists.infradead.org; Tue, 06 Dec 2022 14:00:28 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id EA629139F; Tue, 6 Dec 2022 06:00:29 -0800 (PST) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.159]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 91E293F73D; Tue, 6 Dec 2022 06:00:21 -0800 (PST) From: Ryan Roberts To: Marc Zyngier , Catalin Marinas , Will Deacon , Ard Biesheuvel , Suzuki K Poulose , Anshuman Khandual Cc: Ryan Roberts , James Morse , Alexandru Elisei , Oliver Upton , linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvmarm@lists.cs.columbia.edu Subject: [PATCH v1 11/12] KVM: arm64: Support up to 5 levels of translation in kvm_pgtable Date: Tue, 6 Dec 2022 13:59:29 +0000 Message-Id: <20221206135930.3277585-12-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221206135930.3277585-1-ryan.roberts@arm.com> References: <20221206135930.3277585-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221206_060023_861187_A9304C4B X-CRM114-Status: GOOD ( 16.02 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org FEAT_LPA2 increases the maximum number of translation levels from 4 to 5 for the 4KB page case, when IA is >48 bits. While we can still use 4 levels for stage2 translation in this case (due to stage2 allowing concatenated page tables for first level lookup), the same kvm_pgtable library is used for the hyp stage1 page tables, and stage1 does not support concatenation. Therefore, modify the library to support up to 5 levels. Previous patches already laid the groundwork for this by refactoring code to work in terms of KVM_PGTABLE_FIRST_LEVEL and KVM_PGTABLE_LAST_LEVEL. So we just need to change these macros.
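As a rough, standalone sketch (not part of the patch) of why only the macro change is needed for 4K pages: the existing ARM64_HW_PGTABLE_LEVELS() arithmetic already yields 5 levels once the input address width exceeds 48 bits, while stage2_pgtable_levels() stays at 4 because stage2 concatenates tables at the first level. The hard-coded PAGE_SHIFT and the main() wrapper below are assumptions made purely for the illustration, not kernel code.

/* Illustration only: level counts for 4K pages, mirroring the kernel macros. */
#include <stdio.h>

#define PAGE_SHIFT			12	/* 4K pages assumed */
#define HW_PGTABLE_LEVELS(va_bits)	(((va_bits) - 4) / (PAGE_SHIFT - 3))
/* stage2 may concatenate tables at the first level, which saves one level */
#define STAGE2_PGTABLE_LEVELS(ipa)	HW_PGTABLE_LEVELS((ipa) - 4)

int main(void)
{
	/* hyp stage1 has no concatenation: a 52-bit VA needs 5 levels (-1..3) */
	printf("stage1 levels for 52-bit VA:  %d\n", HW_PGTABLE_LEVELS(52));
	/* stage2 still fits a 52-bit IPA in 4 levels (0..3) via concatenation */
	printf("stage2 levels for 52-bit IPA: %d\n", STAGE2_PGTABLE_LEVELS(52));
	return 0;
}
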
The hardware sometimes encodes the new level differently from the others: One such place is when reading the level from the FSC field in the ESR_EL2 register. We never expect to see the lowest level (-1) here since the stage 2 page tables always use concatenated tables for first level lookup and therefore only use 4 levels of lookup. So we get away with just adding a comment to explain why we are not being careful about decoding level -1. Signed-off-by: Ryan Roberts --- arch/arm64/include/asm/kvm_emulate.h | 10 ++++++++++ arch/arm64/include/asm/kvm_pgtable.h | 2 +- 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index 270f49e7f29a..6f68febfb214 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -343,6 +343,16 @@ static __always_inline u8 kvm_vcpu_trap_get_fault_type(const struct kvm_vcpu *vc static __always_inline s8 kvm_vcpu_trap_get_fault_level(const struct kvm_vcpu *vcpu) { + /* + * Note: With the introduction of FEAT_LPA2 an extra level of + * translation (level -1) is added. This level (obviously) doesn't + * follow the previous convention of encoding the 4 levels in the 2 LSBs + * of the FSC so this function breaks if the fault is for level -1. + * + * However, stage2 tables always use concatenated tables for first level + * lookup and therefore it is guaranteed that the level will be between + * 0 and 3, and this function continues to work. + */ return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC_LEVEL; } diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index 3e0b64052c51..3655279e6a7d 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -11,7 +11,7 @@ #include #include -#define KVM_PGTABLE_FIRST_LEVEL 0 +#define KVM_PGTABLE_FIRST_LEVEL -1 #define KVM_PGTABLE_LAST_LEVEL 3 /* From patchwork Tue Dec 6 13:59:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13065855 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 73196C352A1 for ; Tue, 6 Dec 2022 14:05:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=r12cOYp4fk27aAKI51gXH5h+u+7DSyjWXYN7tzOUI9w=; b=ZaNC81KSzaDLKF iz9myGd8Hy3iZGxGgXJTgZRJ01awP0ynViVxj1JEzgK/Vku2otm90Z8LVYW8Nfq3Rl3E3WwugKOjs slx0LtOkoqvC2Pk5ijwXig4YExEcQhp5EJz5fLVaGRsJPM5N31THjCVcSMBfYs/m2PaOJJtjl2LaH Ap+eh60cDkXuRWYKwnUN4XRZ5xw8L/9SecgNnUcSyZzGj0FZdUbUgnfDXZNCntGcgmW8GKF+NBDPi Tq9qw1hADocJ5dWDe6HkB5jTR27xSGgrXUuJyxfDB3FTYJWUhDJq2G7p2+OiNs/nu9NmpIKFTLesr Yyv7VF8ZwF+3c1uJhgQw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p2YYO-00ACbB-8o; Tue, 06 Dec 2022 14:04:12 +0000 Received: from 
foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p2YUj-00A9En-Au for linux-arm-kernel@lists.infradead.org; Tue, 06 Dec 2022 14:00:29 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C8E95D6E; Tue, 6 Dec 2022 06:00:31 -0800 (PST) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.159]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 706813F73D; Tue, 6 Dec 2022 06:00:23 -0800 (PST) From: Ryan Roberts To: Marc Zyngier , Catalin Marinas , Will Deacon , Ard Biesheuvel , Suzuki K Poulose , Anshuman Khandual Cc: Ryan Roberts , James Morse , Alexandru Elisei , Oliver Upton , linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvmarm@lists.cs.columbia.edu Subject: [PATCH v1 12/12] KVM: arm64: Allow guests with >48-bit IPA size on FEAT_LPA2 systems Date: Tue, 6 Dec 2022 13:59:30 +0000 Message-Id: <20221206135930.3277585-13-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221206135930.3277585-1-ryan.roberts@arm.com> References: <20221206135930.3277585-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221206_060025_462115_28E25897 X-CRM114-Status: GOOD ( 10.11 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org With all the page-table infrastructure in place, we can finally increase the maximum permissible IPA size to 52 bits on 4KB and 16KB page systems that have FEAT_LPA2. Signed-off-by: Ryan Roberts --- arch/arm64/kvm/reset.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c index 5ae18472205a..548756c3f43c 100644 --- a/arch/arm64/kvm/reset.c +++ b/arch/arm64/kvm/reset.c @@ -118,7 +118,7 @@ static int kvm_vcpu_finalize_sve(struct kvm_vcpu *vcpu) kfree(buf); return ret; } - + vcpu->arch.sve_state = buf; vcpu_set_flag(vcpu, VCPU_SVE_FINALIZED); return 0; @@ -361,12 +361,11 @@ int kvm_set_ipa_limit(void) parange = cpuid_feature_extract_unsigned_field(mmfr0, ID_AA64MMFR0_EL1_PARANGE_SHIFT); /* - * IPA size beyond 48 bits could not be supported - * on either 4K or 16K page size. Hence let's cap - * it to 48 bits, in case it's reported as larger - * on the system. + * IPA size beyond 48 bits for 4K and 16K page sizes is only supported + * when LPA2 is available. So if we have LPA2, enable it, else cap to 48 + * bits, in case it's reported as larger on the system. */ - if (PAGE_SIZE != SZ_64K) + if (!kvm_supports_stage2_lpa2(mmfr0) && PAGE_SIZE != SZ_64K) parange = min(parange, (unsigned int)ID_AA64MMFR0_EL1_PARANGE_48); /*