mbox series

[0/7] KVM: TDX SEPT SEAMCALL retry

Message ID 20250113020925.18789-1-yan.y.zhao@intel.com (mailing list archive)
Headers show
Series KVM: TDX SEPT SEAMCALL retry | expand

Message

Yan Zhao Jan. 13, 2025, 2:09 a.m. UTC
This series aims to provide a clean solution to avoid the blind retries in
the previous hack [1] in "TDX MMU Part 2," following the initial
discussions to [2], further discussions in the RFC, and the PUCK [3].

A full analysis of the lock status for each SEAMCALL relevant to KVM is
available at [4].

This series categorizes the SEPT-related SEAMCALLs (used for page
installation and uninstallation) into three groups:

Group 1: tdh_mem_page_add().
       - Invoked only during TD build time.
       - Proposal: Return -EBUSY on TDX_OPERAND_BUSY.
       - Patch 1.

Group 2: tdh_mem_sept_add(), tdh_mem_page_aug().
       - Invoked for TD runtime page installation.
       - Proposal: Retry locally in the TDX EPT violation handler for
         RET_PF_RETRY.
       - Patches 2-3.

Group 3: tdh_mem_range_block(), tdh_mem_track(), tdh_mem_page_remove().
       - Invoked for page uninstallation, with KVM mmu_lock held for write.
       - Proposal: Kick off vCPUs and no vCPU entry on TDX_OPERAND_BUSY.
       - Patch 4.

Patches 5/6/7 are fixup patches:
Patch 5: Return -EBUSY instead of -EAGAIN when tdh_mem_sept_add() is busy.
Patch 6: Remove the retry loop for tdh_phymem_page_wbinvd_hkid().
Patch 7: Warn on force_immediate_exit in tdx_vcpu_run().

Code base: kvm-coco-queue 2f30b837bf7b.
Applies to the tail since the dependence on
commit 8e801e55ba8f ("KVM: TDX: Handle EPT violation/misconfig exit"),

Thanks
Yan

RFC --> v1:
- Split patch 1 in RFC into patches 1,2,3,5, and add new fixup patches 6/7.
- Add contention analysis of tdh_mem_page_add() in patch 1 log.
- Provide justification in patch 2 log and add checks for RET_PF_CONTINUE.
- Use "a per-VM flag wait_for_sept_zap + KVM_REQ_OUTSIDE_GUEST_MODE"
  instead of a arch-specific request to prevent vCPUs from TD entry in patch 4
  (Sean).

RFC: https://lore.kernel.org/all/20241121115139.26338-1-yan.y.zhao@intel.com
[1] https://lore.kernel.org/all/20241112073909.22326-1-yan.y.zhao@intel.com
[2] https://lore.kernel.org/kvm/20240904030751.117579-10-rick.p.edgecombe@intel.com/
[3] https://drive.google.com/drive/folders/1k0qOarKuZXpzRsKDtVeC5Lpl9-amJ6AJ?resourcekey=0-l9uVpVEBC34Uar1ReaqisQ
[4] https://lore.kernel.org/kvm/ZuP5eNXFCljzRgWo@yzhao56-desk.sh.intel.com

Yan Zhao (7):
  KVM: TDX: Return -EBUSY when tdh_mem_page_add() encounters
    TDX_OPERAND_BUSY
  KVM: x86/mmu: Return RET_PF* instead of 1 in kvm_mmu_page_fault()
  KVM: TDX: Retry locally in TDX EPT violation handler on RET_PF_RETRY
  KVM: TDX: Kick off vCPUs when SEAMCALL is busy during TD page removal
  fixup! KVM: TDX: Implement hooks to propagate changes of TDP MMU
    mirror page table
  fixup! KVM: TDX: Implement hooks to propagate changes of TDP MMU
    mirror page table
  fixup! KVM: TDX: Implement TDX vcpu enter/exit path

 arch/x86/kvm/mmu/mmu.c          |  10 ++-
 arch/x86/kvm/mmu/mmu_internal.h |  12 ++-
 arch/x86/kvm/vmx/tdx.c          | 135 ++++++++++++++++++++++++++------
 arch/x86/kvm/vmx/tdx.h          |   7 ++
 4 files changed, 134 insertions(+), 30 deletions(-)

Comments

Edgecombe, Rick P Jan. 14, 2025, 10:27 p.m. UTC | #1
We should probably...

On Mon, 2025-01-13 at 10:09 +0800, Yan Zhao wrote:
>   KVM: TDX: Return -EBUSY when tdh_mem_page_add() encounters
>     TDX_OPERAND_BUSY

Squash this in "KVM: TDX: Add an ioctl to create initial guest
memory" in kvm-coco-queue.

>   KVM: x86/mmu: Return RET_PF* instead of 1 in kvm_mmu_page_fault()

Recommend this to go through kvm/x86 tree separately?

>   KVM: TDX: Retry locally in TDX EPT violation handler on RET_PF_RETRY
>   KVM: TDX: Kick off vCPUs when SEAMCALL is busy during TD page removal

Include these in kvm-coco-queue and drop "x86/virt/tdx: Retry seamcall when
TDX_OPERAND_BUSY with operand SEPT"

>   fixup! KVM: TDX: Implement hooks to propagate changes of TDP MMU
>     mirror page table
>   fixup! KVM: TDX: Implement hooks to propagate changes of TDP MMU
>     mirror page table

Squash these into kvm-coco-queue.

>   fixup! KVM: TDX: Implement TDX vcpu enter/exit path

Add this in the next version of "TDX vcpu enter/exit path".

How does that sound?
Paolo Bonzini Jan. 15, 2025, 4:43 p.m. UTC | #2
On 1/13/25 03:09, Yan Zhao wrote:
> This series aims to provide a clean solution to avoid the blind retries in
> the previous hack [1] in "TDX MMU Part 2," following the initial
> discussions to [2], further discussions in the RFC, and the PUCK [3].
> 
> A full analysis of the lock status for each SEAMCALL relevant to KVM is
> available at [4].
> 
> This series categorizes the SEPT-related SEAMCALLs (used for page
> installation and uninstallation) into three groups:
> 
> Group 1: tdh_mem_page_add().
>         - Invoked only during TD build time.
>         - Proposal: Return -EBUSY on TDX_OPERAND_BUSY.
>         - Patch 1.
> 
> Group 2: tdh_mem_sept_add(), tdh_mem_page_aug().
>         - Invoked for TD runtime page installation.
>         - Proposal: Retry locally in the TDX EPT violation handler for
>           RET_PF_RETRY.
>         - Patches 2-3.
> 
> Group 3: tdh_mem_range_block(), tdh_mem_track(), tdh_mem_page_remove().
>         - Invoked for page uninstallation, with KVM mmu_lock held for write.
>         - Proposal: Kick off vCPUs and no vCPU entry on TDX_OPERAND_BUSY.
>         - Patch 4.
> 
> Patches 5/6/7 are fixup patches:
> Patch 5: Return -EBUSY instead of -EAGAIN when tdh_mem_sept_add() is busy.
> Patch 6: Remove the retry loop for tdh_phymem_page_wbinvd_hkid().
> Patch 7: Warn on force_immediate_exit in tdx_vcpu_run().
> 
> Code base: kvm-coco-queue 2f30b837bf7b.
> Applies to the tail since the dependence on
> commit 8e801e55ba8f ("KVM: TDX: Handle EPT violation/misconfig exit"),
> 
> Thanks
> Yan
> 
> RFC --> v1:
> - Split patch 1 in RFC into patches 1,2,3,5, and add new fixup patches 6/7.
> - Add contention analysis of tdh_mem_page_add() in patch 1 log.
> - Provide justification in patch 2 log and add checks for RET_PF_CONTINUE.
> - Use "a per-VM flag wait_for_sept_zap + KVM_REQ_OUTSIDE_GUEST_MODE"
>    instead of a arch-specific request to prevent vCPUs from TD entry in patch 4
>    (Sean).
> 
> RFC: https://lore.kernel.org/all/20241121115139.26338-1-yan.y.zhao@intel.com
> [1] https://lore.kernel.org/all/20241112073909.22326-1-yan.y.zhao@intel.com
> [2] https://lore.kernel.org/kvm/20240904030751.117579-10-rick.p.edgecombe@intel.com/
> [3] https://drive.google.com/drive/folders/1k0qOarKuZXpzRsKDtVeC5Lpl9-amJ6AJ?resourcekey=0-l9uVpVEBC34Uar1ReaqisQ
> [4] https://lore.kernel.org/kvm/ZuP5eNXFCljzRgWo@yzhao56-desk.sh.intel.com

Thanks, I applied this to kvm-coco-queue and patch 2 to kvm/queue.  It 
is spread all over the branch to make the dependencies clearer, so 
here's some ideas on how to include these.

Patches 6 and 7 should be squashed into the respective bases, as they 
have essentially no functional change.

For the rest, patch 1 can be treated as a fixup too, and I have two 
proposals.

First possibility, separate series:
* patches 1+5 are merged into a single patch.
* patches 3+4 become two more patches in this separate series

Second possibility, squash everything:
* patches 1+5 are squashed into the respective bases
* patches 3+4 are included in the EPT violation series

On the PUCK call I said that I prefer the first, mostly to keep track of 
who needs to handle TDX_OPERAND_BUSY, but if it makes it easier for Yan 
then feel free to go for the second.

Paolo

> Yan Zhao (7):
>    KVM: TDX: Return -EBUSY when tdh_mem_page_add() encounters
>      TDX_OPERAND_BUSY
>    KVM: x86/mmu: Return RET_PF* instead of 1 in kvm_mmu_page_fault()
>    KVM: TDX: Retry locally in TDX EPT violation handler on RET_PF_RETRY
>    KVM: TDX: Kick off vCPUs when SEAMCALL is busy during TD page removal
>    fixup! KVM: TDX: Implement hooks to propagate changes of TDP MMU
>      mirror page table
>    fixup! KVM: TDX: Implement hooks to propagate changes of TDP MMU
>      mirror page table
>    fixup! KVM: TDX: Implement TDX vcpu enter/exit path
> 
>   arch/x86/kvm/mmu/mmu.c          |  10 ++-
>   arch/x86/kvm/mmu/mmu_internal.h |  12 ++-
>   arch/x86/kvm/vmx/tdx.c          | 135 ++++++++++++++++++++++++++------
>   arch/x86/kvm/vmx/tdx.h          |   7 ++
>   4 files changed, 134 insertions(+), 30 deletions(-)
>
Yan Zhao Jan. 16, 2025, 12:52 a.m. UTC | #3
On Wed, Jan 15, 2025 at 05:43:51PM +0100, Paolo Bonzini wrote:
> On 1/13/25 03:09, Yan Zhao wrote:
> > This series aims to provide a clean solution to avoid the blind retries in
> > the previous hack [1] in "TDX MMU Part 2," following the initial
> > discussions to [2], further discussions in the RFC, and the PUCK [3].
> > 
> > A full analysis of the lock status for each SEAMCALL relevant to KVM is
> > available at [4].
> > 
> > This series categorizes the SEPT-related SEAMCALLs (used for page
> > installation and uninstallation) into three groups:
> > 
> > Group 1: tdh_mem_page_add().
> >         - Invoked only during TD build time.
> >         - Proposal: Return -EBUSY on TDX_OPERAND_BUSY.
> >         - Patch 1.
> > 
> > Group 2: tdh_mem_sept_add(), tdh_mem_page_aug().
> >         - Invoked for TD runtime page installation.
> >         - Proposal: Retry locally in the TDX EPT violation handler for
> >           RET_PF_RETRY.
> >         - Patches 2-3.
> > 
> > Group 3: tdh_mem_range_block(), tdh_mem_track(), tdh_mem_page_remove().
> >         - Invoked for page uninstallation, with KVM mmu_lock held for write.
> >         - Proposal: Kick off vCPUs and no vCPU entry on TDX_OPERAND_BUSY.
> >         - Patch 4.
> > 
> > Patches 5/6/7 are fixup patches:
> > Patch 5: Return -EBUSY instead of -EAGAIN when tdh_mem_sept_add() is busy.
> > Patch 6: Remove the retry loop for tdh_phymem_page_wbinvd_hkid().
> > Patch 7: Warn on force_immediate_exit in tdx_vcpu_run().
> > 
> > Code base: kvm-coco-queue 2f30b837bf7b.
> > Applies to the tail since the dependence on
> > commit 8e801e55ba8f ("KVM: TDX: Handle EPT violation/misconfig exit"),
> > 
> > Thanks
> > Yan
> > 
> > RFC --> v1:
> > - Split patch 1 in RFC into patches 1,2,3,5, and add new fixup patches 6/7.
> > - Add contention analysis of tdh_mem_page_add() in patch 1 log.
> > - Provide justification in patch 2 log and add checks for RET_PF_CONTINUE.
> > - Use "a per-VM flag wait_for_sept_zap + KVM_REQ_OUTSIDE_GUEST_MODE"
> >    instead of a arch-specific request to prevent vCPUs from TD entry in patch 4
> >    (Sean).
> > 
> > RFC: https://lore.kernel.org/all/20241121115139.26338-1-yan.y.zhao@intel.com
> > [1] https://lore.kernel.org/all/20241112073909.22326-1-yan.y.zhao@intel.com
> > [2] https://lore.kernel.org/kvm/20240904030751.117579-10-rick.p.edgecombe@intel.com/
> > [3] https://drive.google.com/drive/folders/1k0qOarKuZXpzRsKDtVeC5Lpl9-amJ6AJ?resourcekey=0-l9uVpVEBC34Uar1ReaqisQ
> > [4] https://lore.kernel.org/kvm/ZuP5eNXFCljzRgWo@yzhao56-desk.sh.intel.com
> 
> Thanks, I applied this to kvm-coco-queue and patch 2 to kvm/queue.  It is
> spread all over the branch to make the dependencies clearer, so here's some
> ideas on how to include these.
> 
> Patches 6 and 7 should be squashed into the respective bases, as they have
> essentially no functional change.
> 
> For the rest, patch 1 can be treated as a fixup too, and I have two
> proposals.
> 
> First possibility, separate series:
> * patches 1+5 are merged into a single patch.
> * patches 3+4 become two more patches in this separate series
> 
> Second possibility, squash everything:
> * patches 1+5 are squashed into the respective bases
> * patches 3+4 are included in the EPT violation series
> 
> On the PUCK call I said that I prefer the first, mostly to keep track of who
> needs to handle TDX_OPERAND_BUSY, but if it makes it easier for Yan then
> feel free to go for the second.
Thank you, Paolo.

I'm ok with either way.

For the first, hmm, a bad thing is that though
tdh_mem_sept_add()/tdh_mem_page_aug()/tdh_mem_sept_add() all need to handle
TDX_OPERAND_BUSY, the one for tdh_mem_page_aug() has already been squashed
into the MMU part 2.

If you like, maybe I can extract the one for tdh_mem_page_aug() and merge it
with 1+5.

Then since 3 depends on EPT violation, for the first, 3+4 can also be included
in the EPT violation series, right?

> 
> Paolo
> 
> > Yan Zhao (7):
> >    KVM: TDX: Return -EBUSY when tdh_mem_page_add() encounters
> >      TDX_OPERAND_BUSY
> >    KVM: x86/mmu: Return RET_PF* instead of 1 in kvm_mmu_page_fault()
> >    KVM: TDX: Retry locally in TDX EPT violation handler on RET_PF_RETRY
> >    KVM: TDX: Kick off vCPUs when SEAMCALL is busy during TD page removal
> >    fixup! KVM: TDX: Implement hooks to propagate changes of TDP MMU
> >      mirror page table
> >    fixup! KVM: TDX: Implement hooks to propagate changes of TDP MMU
> >      mirror page table
> >    fixup! KVM: TDX: Implement TDX vcpu enter/exit path
> > 
> >   arch/x86/kvm/mmu/mmu.c          |  10 ++-
> >   arch/x86/kvm/mmu/mmu_internal.h |  12 ++-
> >   arch/x86/kvm/vmx/tdx.c          | 135 ++++++++++++++++++++++++++------
> >   arch/x86/kvm/vmx/tdx.h          |   7 ++
> >   4 files changed, 134 insertions(+), 30 deletions(-)
> > 
> 
>
Paolo Bonzini Jan. 16, 2025, 11:07 a.m. UTC | #4
On Thu, Jan 16, 2025 at 1:53 AM Yan Zhao <yan.y.zhao@intel.com> wrote:
> For the first, hmm, a bad thing is that though
> tdh_mem_sept_add()/tdh_mem_page_aug()/tdh_mem_sept_add() all need to handle
> TDX_OPERAND_BUSY, the one for tdh_mem_page_aug() has already been squashed
> into the MMU part 2.
>
> If you like, maybe I can extract the one for tdh_mem_page_aug() and merge it
> with 1+5.

That works for me, but if it's easier for you to merge the fixups in
the respective base patches that's okay too.

Paolo
Yan Zhao Jan. 17, 2025, 9:52 a.m. UTC | #5
On Thu, Jan 16, 2025 at 12:07:00PM +0100, Paolo Bonzini wrote:
> On Thu, Jan 16, 2025 at 1:53 AM Yan Zhao <yan.y.zhao@intel.com> wrote:
> > For the first, hmm, a bad thing is that though
> > tdh_mem_sept_add()/tdh_mem_page_aug()/tdh_mem_sept_add() all need to handle
> > TDX_OPERAND_BUSY, the one for tdh_mem_page_aug() has already been squashed
> > into the MMU part 2.
> >
> > If you like, maybe I can extract the one for tdh_mem_page_aug() and merge it
> > with 1+5.
> 
> That works for me, but if it's easier for you to merge the fixups in
> the respective base patches that's okay too.
Then I'll merge them into the respective base patches to save effort and make
the latter cleaner :)

Thank you, Paolo!