[v5,07/12] KVM: arm64: Use LPA2 page-tables for stage2 and hyp stage1

Message ID 20231116142931.1675485-8-ryan.roberts@arm.com
State New, archived
Series KVM: arm64: Support FEAT_LPA2 at hyp s1 and vm s2

Commit Message

Ryan Roberts Nov. 16, 2023, 2:29 p.m. UTC
Implement a simple policy whereby, if the HW supports FEAT_LPA2 for the
page size we are using, we always use LPA2-style page-tables for stage 2
and hyp stage 1 (assuming an nvhe hyp), regardless of the VMM-requested
IPA size or HW-implemented PA size. When LPA2 is in use, we can now
support up to 52-bit IPA and PA sizes.

We use the previously created cpu feature, which tracks whether the HW
supports LPA2, to decide whether to use the LPA2 or classic pte format.
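
For reference, these are the two output-address encodings being
selected between, as implemented by kvm_pte_to_phys() and
kvm_phys_to_pte() below. A minimal standalone sketch of the 4KB-granule
case, with a plain bool standing in for system_supports_lpa2() and 12
hardcoded for PAGE_SHIFT:

	#include <stdbool.h>
	#include <stdint.h>

	/* Bits h..l set, inclusive (open-coded GENMASK()). */
	#define MASK64(h, l)	((~0ULL << (l)) & (~0ULL >> (63 - (h))))

	static uint64_t pte_to_phys(uint64_t pte, bool lpa2)
	{
		if (lpa2)	/* PA[49:12] in place, PA[51:50] in pte[9:8] */
			return (pte & MASK64(49, 12)) |
			       (((pte & MASK64(9, 8)) >> 8) << 50);
		/* classic, 4KB granule: PA[47:12] in place, no high bits */
		return pte & MASK64(47, 12);
	}

	int main(void)
	{
		uint64_t pa = (3ULL << 50) | 0xabcde000ULL;
		/* LPA2 encoding: fold PA[51:50] down into pte bits 9:8 */
		uint64_t pte = (pa & MASK64(49, 12)) | (((pa >> 50) & 3) << 8);

		return pte_to_phys(pte, true) == pa ? 0 : 1;	/* round-trips */
	}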

Note that FEAT_LPA2 brings support for bigger block mappings (512GB
with 4KB granules, 64GB with 16KB granules). We explicitly don't enable
these in the library because stage2_apply_range() works on batch sizes
of the largest used block mapping, and increasing the size of the batch
would lead to soft lockups. See commit 5994bc9e05c2 ("KVM: arm64: Limit
stage2_apply_range() batch size to largest block").
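
(For scale: with a 4KB granule, the largest block mapping KVM uses
today is a 1GB level-1 block; enabling the new 512GB level-0 blocks
would make each stage2_apply_range() batch roughly 512 times larger.)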

With the addition of LPA2 support in the hypervisor, the PA size
supported by the HW must be capped with a runtime decision, rather than
simply using a compile-time decision based on PA_BITS. For example, on
a system that advertises 52-bit PA but does not support FEAT_LPA2, a
4KB or 16KB kernel compiled with LPA2 support must still limit the PA
size to 48 bits.
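
(Concretely, per kvm_get_parange() below: if the sanitised
ID_AA64MMFR0_EL1.PARange field reports 52 bits but neither FEAT_LPA2
nor the 64KB-granule 52-bit PA scheme is available, the value used for
the PS fields is clamped to ID_AA64MMFR0_EL1_PARANGE_48.)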

Therefore, move the insertion of the PS field into TCR_EL2 out of the
__kvm_hyp_init assembly code and instead do it in
cpu_prepare_hyp_mode(), where the rest of TCR_EL2 is prepared. This
allows us to figure out PS with kvm_get_parange(), which has the
appropriate logic to enforce the above requirement. (The PS field of
VTCR_EL2 is already populated this way.)

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/kvm_mmu.h     |  2 +-
 arch/arm64/include/asm/kvm_pgtable.h | 47 +++++++++++++++++++++-------
 arch/arm64/kvm/arm.c                 |  5 +++
 arch/arm64/kvm/hyp/nvhe/hyp-init.S   |  4 ---
 arch/arm64/kvm/hyp/pgtable.c         | 15 +++++++--
 5 files changed, 54 insertions(+), 19 deletions(-)

Comments

Oliver Upton Nov. 21, 2023, 8:34 p.m. UTC | #1
On Thu, Nov 16, 2023 at 02:29:26PM +0000, Ryan Roberts wrote:
> [...]
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 31e8d7faed65..f4e4fcb35afc 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -340,7 +340,7 @@ static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu)
>  	return container_of(mmu->arch, struct kvm, arch);
>  }
>  
> -#define kvm_lpa2_is_enabled()		false
> +#define kvm_lpa2_is_enabled()		system_supports_lpa2()

Can we use this predicate consistently throughout the KVM code? Looks
like the rest of this diff is using system_supports_lpa2() directly.
Ryan Roberts Nov. 22, 2023, 1:41 p.m. UTC | #2
On 21/11/2023 20:34, Oliver Upton wrote:
> On Thu, Nov 16, 2023 at 02:29:26PM +0000, Ryan Roberts wrote:
>> [...]
>> -#define kvm_lpa2_is_enabled()		false
>> +#define kvm_lpa2_is_enabled()		system_supports_lpa2()
> 
> Can we use this predicate consistently throughout the KVM code? Looks
> like the rest of this diff is using system_supports_lpa2() directly.

My thinking was that system_supports_lpa2() is an input to KVM's policy
for deciding whether to use LPA2 (currently that policy is trivial: if
the system supports it, KVM uses it, but it doesn't have to stay that
way), while kvm_lpa2_is_enabled() is how KVM exports its policy
decision. So one is an input and the other is an output.
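
As a rough sketch of the distinction (kvm_use_lpa2_pgtables and
kvm_lpa2_policy_init() are made-up names, not anything in this series):

	/* Input: the capability, as reported by the cpufeature code. */
	bool system_supports_lpa2(void);

	/*
	 * KVM's policy decision. Today it trivially equals the input,
	 * but it could diverge (e.g. a command line override to force
	 * LPA2 off while the HW still supports it).
	 */
	static bool kvm_use_lpa2_pgtables;

	void kvm_lpa2_policy_init(void)
	{
		kvm_use_lpa2_pgtables = system_supports_lpa2();
	}

	/* Output: the decision, as consumed by the rest of the kernel. */
	#define kvm_lpa2_is_enabled()	kvm_use_lpa2_pgtables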

It's a lightly held opinion though - I'll make the change if you insist? :)
Marc Zyngier Nov. 22, 2023, 3:21 p.m. UTC | #3
On Wed, 22 Nov 2023 13:41:33 +0000,
Ryan Roberts <ryan.roberts@arm.com> wrote:
> 
> On 21/11/2023 20:34, Oliver Upton wrote:
> > On Thu, Nov 16, 2023 at 02:29:26PM +0000, Ryan Roberts wrote:
> >> [...]
> >> -#define kvm_lpa2_is_enabled()		false
> >> +#define kvm_lpa2_is_enabled()		system_supports_lpa2()
> > 
> > Can we use this predicate consistently throughout the KVM code? Looks
> > like the rest of this diff is using system_supports_lpa2() directly.
> 
> My thinking was that system_supports_lpa2() is an input to KVM's policy to
> decide if it is going to use LPA2 (currently that policy is very simple - if the
> system supports it, then KVM uses it - but it doesn't have to be that way), and
> kvm_lpa2_is_enabled() is how KVM exports its policy decision, so one is an input
> and the other is an output.
> 
> It's a lightly held opinion though - I'll make the change if you insist? :)

<bikeshed>
I personally don't find this dichotomy very useful. It could make
sense if we used the page table walker for S1 outside of KVM, but
that's not the case at the moment.

If there is no plan for such a use case, I'd rather see a single
predicate, making the code a bit more readable.
</bikeshed>

	M.
Ryan Roberts Nov. 24, 2023, 11:49 a.m. UTC | #4
On 22/11/2023 15:21, Marc Zyngier wrote:
> On Wed, 22 Nov 2023 13:41:33 +0000,
> Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>> On 21/11/2023 20:34, Oliver Upton wrote:
>>> On Thu, Nov 16, 2023 at 02:29:26PM +0000, Ryan Roberts wrote:
>>>> [...]
> 
> <bikeshed>
> I personally don't find this dichotomy very useful. It could make
> sense if we used the page table walker for S1 outside of KVM, but
> that's not the case at the moment.
> 
> If there is no plan for such a use case, I'd rather see a single
> predicate, making the code a bit more readable.
> </bikeshed>

OK fair enough. I've made this change for the next rev.

Marc Zyngier Nov. 27, 2023, 9:32 a.m. UTC | #5
On Fri, 24 Nov 2023 11:49:57 +0000,
Ryan Roberts <ryan.roberts@arm.com> wrote:
> 
> OK fair enough. I've made this change for the next rev.

Any chance you could post this new revision shortly? It looks ready to
me, and I would really like this to simmer in -next for a while.

Thanks,

	M.
Ryan Roberts Nov. 27, 2023, 9:43 a.m. UTC | #6
On 27/11/2023 09:32, Marc Zyngier wrote:
> On Fri, 24 Nov 2023 11:49:57 +0000,
> Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>> OK fair enough. I've made this change for the next rev.
> 
> Any chance you could post this new revision shortly? It looks ready to
> me, and I would really like this to simmer in -next for a while.

Yes; I was just rerunning the kvm selftests over the weekend. No new issues
there. But I want to rerun the boot tests too, which I will do this morning.
Assuming that's still good, I'll post later today.

Patch

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 31e8d7faed65..f4e4fcb35afc 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -340,7 +340,7 @@  static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu)
 	return container_of(mmu->arch, struct kvm, arch);
 }
 
-#define kvm_lpa2_is_enabled()		false
+#define kvm_lpa2_is_enabled()		system_supports_lpa2()
 
 #endif /* __ASSEMBLY__ */
 #endif /* __ARM64_KVM_MMU_H__ */
diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index d3e354bb8351..d738c47d8a77 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -25,12 +25,22 @@ 
 #define KVM_PGTABLE_MIN_BLOCK_LEVEL	2U
 #endif
 
+static inline u64 kvm_get_parange_max(void)
+{
+	if (system_supports_lpa2() ||
+	   (IS_ENABLED(CONFIG_ARM64_PA_BITS_52) && PAGE_SHIFT == 16))
+		return ID_AA64MMFR0_EL1_PARANGE_52;
+	else
+		return ID_AA64MMFR0_EL1_PARANGE_48;
+}
+
 static inline u64 kvm_get_parange(u64 mmfr0)
 {
+	u64 parange_max = kvm_get_parange_max();
 	u64 parange = cpuid_feature_extract_unsigned_field(mmfr0,
 				ID_AA64MMFR0_EL1_PARANGE_SHIFT);
-	if (parange > ID_AA64MMFR0_EL1_PARANGE_MAX)
-		parange = ID_AA64MMFR0_EL1_PARANGE_MAX;
+	if (parange > parange_max)
+		parange = parange_max;
 
 	return parange;
 }
@@ -41,6 +51,8 @@  typedef u64 kvm_pte_t;
 
 #define KVM_PTE_ADDR_MASK		GENMASK(47, PAGE_SHIFT)
 #define KVM_PTE_ADDR_51_48		GENMASK(15, 12)
+#define KVM_PTE_ADDR_MASK_LPA2		GENMASK(49, PAGE_SHIFT)
+#define KVM_PTE_ADDR_51_50_LPA2		GENMASK(9, 8)
 
 #define KVM_PHYS_INVALID		(-1ULL)
 
@@ -51,21 +63,34 @@  static inline bool kvm_pte_valid(kvm_pte_t pte)
 
 static inline u64 kvm_pte_to_phys(kvm_pte_t pte)
 {
-	u64 pa = pte & KVM_PTE_ADDR_MASK;
-
-	if (PAGE_SHIFT == 16)
-		pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48;
+	u64 pa;
+
+	if (system_supports_lpa2()) {
+		pa = pte & KVM_PTE_ADDR_MASK_LPA2;
+		pa |= FIELD_GET(KVM_PTE_ADDR_51_50_LPA2, pte) << 50;
+	} else {
+		pa = pte & KVM_PTE_ADDR_MASK;
+		if (PAGE_SHIFT == 16)
+			pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48;
+	}
 
 	return pa;
 }
 
 static inline kvm_pte_t kvm_phys_to_pte(u64 pa)
 {
-	kvm_pte_t pte = pa & KVM_PTE_ADDR_MASK;
-
-	if (PAGE_SHIFT == 16) {
-		pa &= GENMASK(51, 48);
-		pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48);
+	kvm_pte_t pte;
+
+	if (system_supports_lpa2()) {
+		pte = pa & KVM_PTE_ADDR_MASK_LPA2;
+		pa &= GENMASK(51, 50);
+		pte |= FIELD_PREP(KVM_PTE_ADDR_51_50_LPA2, pa >> 50);
+	} else {
+		pte = pa & KVM_PTE_ADDR_MASK;
+		if (PAGE_SHIFT == 16) {
+			pa &= GENMASK(51, 48);
+			pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48);
+		}
 	}
 
 	return pte;
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e5f75f1f1085..082100c582e2 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1837,6 +1837,7 @@  static int kvm_init_vector_slots(void)
 static void __init cpu_prepare_hyp_mode(int cpu, u32 hyp_va_bits)
 {
 	struct kvm_nvhe_init_params *params = per_cpu_ptr_nvhe_sym(kvm_init_params, cpu);
+	u64 mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
 	unsigned long tcr;
 
 	/*
@@ -1859,6 +1860,10 @@  static void __init cpu_prepare_hyp_mode(int cpu, u32 hyp_va_bits)
 	}
 	tcr &= ~TCR_T0SZ_MASK;
 	tcr |= TCR_T0SZ(hyp_va_bits);
+	tcr &= ~TCR_EL2_PS_MASK;
+	tcr |= FIELD_PREP(TCR_EL2_PS_MASK, kvm_get_parange(mmfr0));
+	if (system_supports_lpa2())
+		tcr |= TCR_EL2_DS;
 	params->tcr_el2 = tcr;
 
 	params->pgd_pa = kvm_mmu_get_httbr();
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
index 1cc06e6797bd..f62a7d360285 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
@@ -122,11 +122,7 @@  alternative_if ARM64_HAS_CNP
 alternative_else_nop_endif
 	msr	ttbr0_el2, x2
 
-	/*
-	 * Set the PS bits in TCR_EL2.
-	 */
 	ldr	x0, [x0, #NVHE_INIT_TCR_EL2]
-	tcr_compute_pa_size x0, #TCR_EL2_PS_SHIFT, x1, x2
 	msr	tcr_el2, x0
 
 	isb
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 1966fdee740e..e0cf96bafe4a 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -79,7 +79,10 @@  static bool kvm_pgtable_walk_skip_cmo(const struct kvm_pgtable_visit_ctx *ctx)
 
 static bool kvm_phys_is_valid(u64 phys)
 {
-	return phys < BIT(id_aa64mmfr0_parange_to_phys_shift(ID_AA64MMFR0_EL1_PARANGE_MAX));
+	u64 parange_max = kvm_get_parange_max();
+	u8 shift = id_aa64mmfr0_parange_to_phys_shift(parange_max);
+
+	return phys < BIT(shift);
 }
 
 static bool kvm_block_mapping_supported(const struct kvm_pgtable_visit_ctx *ctx, u64 phys)
@@ -408,7 +411,8 @@  static int hyp_set_prot_attr(enum kvm_pgtable_prot prot, kvm_pte_t *ptep)
 	}
 
 	attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_AP, ap);
-	attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_SH, sh);
+	if (!system_supports_lpa2())
+		attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_SH, sh);
 	attr |= KVM_PTE_LEAF_ATTR_LO_S1_AF;
 	attr |= prot & KVM_PTE_LEAF_ATTR_HI_SW;
 	*ptep = attr;
@@ -654,6 +658,9 @@  u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift)
 		vtcr |= VTCR_EL2_HA;
 #endif /* CONFIG_ARM64_HW_AFDBM */
 
+	if (system_supports_lpa2())
+		vtcr |= VTCR_EL2_DS;
+
 	/* Set the vmid bits */
 	vtcr |= (get_vmid_bits(mmfr1) == 16) ?
 		VTCR_EL2_VS_16BIT :
@@ -711,7 +718,9 @@  static int stage2_set_prot_attr(struct kvm_pgtable *pgt, enum kvm_pgtable_prot p
 	if (prot & KVM_PGTABLE_PROT_W)
 		attr |= KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W;
 
-	attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh);
+	if (!system_supports_lpa2())
+		attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh);
+
 	attr |= KVM_PTE_LEAF_ATTR_LO_S2_AF;
 	attr |= prot & KVM_PTE_LEAF_ATTR_HI_SW;
 	*ptep = attr;