
[1/1] KVM: x86/mmu: Set memory encryption "value", not "mask", in shadow PDPTRs

Message ID 20220608012015.19566-1-yuan.yao@intel.com (mailing list archive)
State New, archived

Commit Message

Yao Yuan June 8, 2022, 1:20 a.m. UTC
Assign shadow_me_value, not shadow_me_mask, to PAE root entries,
a.k.a. shadow PDPTRs, when host memory encryption is supported.  The
"mask" is the set of all possible memory encryption bits, e.g. MKTME
KeyIDs, whereas "value" holds the actual value that needs to be
stuffed into host page tables.

Using shadow_me_mask results in a failed VM-Entry due to setting
reserved PA bits in the PDPTRs, and ultimately causes an OOPS due to
physical addresses with non-zero MKTME bits sending to_shadow_page()
into the weeds:

set kvm_intel.dump_invalid_vmcs=1 to dump internal KVM state.
BUG: unable to handle page fault for address: ffd43f00063049e8
PGD 86dfd8067 P4D 0
Oops: 0000 [#1] PREEMPT SMP
RIP: 0010:mmu_free_root_page+0x3c/0x90 [kvm]
 kvm_mmu_free_roots+0xd1/0x200 [kvm]
 __kvm_mmu_unload+0x29/0x70 [kvm]
 kvm_mmu_unload+0x13/0x20 [kvm]
 kvm_arch_destroy_vm+0x8a/0x190 [kvm]
 kvm_put_kvm+0x197/0x2d0 [kvm]
 kvm_vm_release+0x21/0x30 [kvm]
 __fput+0x8e/0x260
 ____fput+0xe/0x10
 task_work_run+0x6f/0xb0
 do_exit+0x327/0xa90
 do_group_exit+0x35/0xa0
 get_signal+0x911/0x930
 arch_do_signal_or_restart+0x37/0x720
 exit_to_user_mode_prepare+0xb2/0x140
 syscall_exit_to_user_mode+0x16/0x30
 do_syscall_64+0x4e/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: e54f1ff244ac ("KVM: x86/mmu: Add shadow_me_value and repurpose shadow_me_mask")
Signed-off-by: Yuan Yao <yuan.yao@intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/kvm/mmu/mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Paolo Bonzini June 8, 2022, 12:37 p.m. UTC | #1
On 6/8/22 03:20, Yuan Yao wrote:
> [quoted patch trimmed]

Queued, thanks.

Paolo
Yuan Yao June 8, 2022, 10:53 p.m. UTC | #2
On Wed, Jun 08, 2022 at 02:37:27PM +0200, Paolo Bonzini wrote:
> On 6/8/22 03:20, Yuan Yao wrote:
> > [quoted patch trimmed]
>
> Queued, thanks.
>
> Paolo

Thanks Paolo, and thanks again to Sean Christopherson
<seanjc@google.com> for his help with the subject and format of this
patch (previously [PATCH 1/1] KVM: MMU: Fix VM entry failure and OOPS
for shadow page table).


Patch

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index efe5a3dca1e0..6bd144f1e60c 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3411,7 +3411,7 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
 			root = mmu_alloc_root(vcpu, i << (30 - PAGE_SHIFT),
 					      i << 30, PT32_ROOT_LEVEL, true);
 			mmu->pae_root[i] = root | PT_PRESENT_MASK |
-					   shadow_me_mask;
+					   shadow_me_value;
 		}
 		mmu->root.hpa = __pa(mmu->pae_root);
 	} else {