[v5,6/6] x86/kvm: Add kexec support for SEV Live Migration.

Message ID 8fce27b8477073b9c7750f7cfc0c68f7ebd3a97d.1624978790.git.ashish.kalra@amd.com
State New, archived

Commit Message

Kalra, Ashish June 29, 2021, 3:14 p.m. UTC
From: Ashish Kalra <ashish.kalra@amd.com>

Reset the host's shared pages list related to kernel
specific page encryption status settings before we load a
new kernel by kexec. We cannot reset the complete
shared pages list here as we need to retain the
UEFI/OVMF firmware specific settings.

The host's shared pages list is maintained for the
guest to keep track of all unencrypted guest memory regions,
therefore we need to explicitly mark all shared pages as
encrypted again before rebooting into the new guest kernel.

Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 arch/x86/kernel/kvm.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)
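
The reset described above can be sketched in userspace C. This is only an illustration under stated assumptions: the `sketch_` types, the two-value type enum, and `sketch_mark_encrypted()` are hypothetical stand-ins for the kernel's e820 table and for the `KVM_HC_MAP_GPA_RANGE` hypercall, and nothing here runs inside a guest. It shows the two things the patch does per entry: skip anything that is not RAM, and round the entry size up to whole 4K pages.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_PAGE_SIZE 4096ULL
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Illustrative stand-ins for the kernel's e820 entry types. */
enum sketch_e820_type { SKETCH_TYPE_RAM = 1, SKETCH_TYPE_RESERVED = 2 };

struct sketch_e820_entry {
	uint64_t addr;
	uint64_t size;
	enum sketch_e820_type type;
};

/* Running total of pages reported as encrypted, for inspection. */
static uint64_t sketch_pages_marked;

/*
 * Hypothetical stub: in the patch this is a kvm_hypercall3() call with
 * the ENCRYPTED and 4K page size flags. Here it only logs and counts.
 */
static void sketch_mark_encrypted(uint64_t gpa, uint64_t nr_pages)
{
	printf("mark encrypted: gpa=0x%llx nr_pages=%llu\n",
	       (unsigned long long)gpa, (unsigned long long)nr_pages);
	sketch_pages_marked += nr_pages;
}

/*
 * Walk the table, skip non-RAM entries, and report each RAM range as
 * encrypted. Returns the number of ranges reported.
 */
static unsigned int
sketch_reset_shared_pages(const struct sketch_e820_entry *table,
			  unsigned int nr_entries)
{
	unsigned int i, reported = 0;

	for (i = 0; i < nr_entries; i++) {
		const struct sketch_e820_entry *entry = &table[i];
		uint64_t nr_pages;

		if (entry->type != SKETCH_TYPE_RAM)
			continue;

		/* A partial trailing page still counts as one full page. */
		nr_pages = SKETCH_DIV_ROUND_UP(entry->size, SKETCH_PAGE_SIZE);
		sketch_mark_encrypted(entry->addr, nr_pages);
		reported++;
	}

	return reported;
}
```

Calling `sketch_reset_shared_pages()` from a small `main()` with a table holding two RAM entries and one reserved entry reports only the two RAM ranges. Firmware-owned (non-RAM) ranges are left untouched, which mirrors the commit message's point that the UEFI/OVMF settings must be retained.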

Comments

Steve Rutherford Aug. 5, 2021, 1:31 a.m. UTC | #1
On Tue, Jun 29, 2021 at 8:14 AM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>
> From: Ashish Kalra <ashish.kalra@amd.com>
>
> Reset the host's shared pages list related to kernel
> specific page encryption status settings before we load a
> new kernel by kexec. We cannot reset the complete
> shared pages list here as we need to retain the
> UEFI/OVMF firmware specific settings.
>
> The host's shared pages list is maintained for the
> guest to keep track of all unencrypted guest memory regions,
> therefore we need to explicitly mark all shared pages as
> encrypted again before rebooting into the new guest kernel.
>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  arch/x86/kernel/kvm.c | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
>
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index a014c9bb5066..a55712ee58a1 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -869,10 +869,35 @@ static void __init kvm_init_platform(void)
>         if (sev_active() &&
>             kvm_para_has_feature(KVM_FEATURE_MIGRATION_CONTROL)) {
>                 unsigned long nr_pages;
> +               int i;
>
>                 pv_ops.mmu.notify_page_enc_status_changed =
>                         kvm_sev_hc_page_enc_status;
>
> +               /*
> +                * Reset the host's shared pages list related to kernel
> +                * specific page encryption status settings before we load a
> +                * new kernel by kexec. Reset the page encryption status
> +                * during early boot instead of just before kexec to avoid SMP
> +                * races during kvm_pv_guest_cpu_reboot().
> +                * NOTE: We cannot reset the complete shared pages list
> +                * here as we need to retain the UEFI/OVMF firmware
> +                * specific settings.
> +                */
> +
> +               for (i = 0; i < e820_table->nr_entries; i++) {
> +                       struct e820_entry *entry = &e820_table->entries[i];
> +
> +                       if (entry->type != E820_TYPE_RAM)
> +                               continue;
> +
> +                       nr_pages = DIV_ROUND_UP(entry->size, PAGE_SIZE);
> +
> +                       kvm_hypercall3(KVM_HC_MAP_GPA_RANGE, entry->addr,
> +                                      nr_pages,
> +                                      KVM_MAP_GPA_RANGE_ENCRYPTED | KVM_MAP_GPA_RANGE_PAGE_SZ_4K);
> +               }
> +
>                 /*
>                  * Ensure that _bss_decrypted section is marked as decrypted in the
>                  * shared pages list.
> --
> 2.17.1
>

Re-reading things, I've convinced myself that kexec is fine. I was
previously concerned with KEXEC_PRESERVE_CONTEXT. In particular, since
the guest does not re-issue the encryption status hypercalls after it
jumps back, the host will be out of the loop and will continue
believing the guest is in the same state as it was while running the
intermediate kernel. As a result, it would have an inaccurate list of
which pages are shared/private after kexecing and jumping back.

The bit that I neglected is that the new kernel (just like the
original kernel) will disable live migration before jumping back (if
it enabled live migration at all). And the original kernel will never
re-enable it. This is sub-optimal, since a VM that previously
supported migration can reach a state where it stops supporting live
migration. But that's unavoidable. Kexecing a kernel that does not
support live migration does the same thing.
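
The KEXEC_PRESERVE_CONTEXT concern above can be illustrated with a toy model. This is a sketch under loud assumptions: the `sketch_` names, the eight-page address space, and the full reset on boot are all hypothetical simplifications (the real patch keeps the firmware's ranges), and no real KVM interface is used. The host's view only changes when a kernel issues a (simulated) hypercall, so after jumping back the original kernel's own view and the host's view disagree.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_NR_PAGES 8

/* One party's belief about which pages are shared (unencrypted). */
struct sketch_view {
	bool shared[SKETCH_NR_PAGES];
};

/* The host's shared pages list, updated only via simulated hypercalls. */
static struct sketch_view sketch_host;

/* A kernel changes a page's status and tells the host about it. */
static void sketch_hc_set(struct sketch_view *guest, int pfn, bool shared)
{
	guest->shared[pfn] = shared;
	sketch_host.shared[pfn] = shared;
}

/* Early-boot reset: the freshly booted kernel marks everything encrypted. */
static void sketch_boot_reset(void)
{
	memset(&sketch_host, 0, sizeof(sketch_host));
}

/* Count pages where the guest kernel and the host disagree. */
static int sketch_mismatches(const struct sketch_view *guest)
{
	int pfn, n = 0;

	for (pfn = 0; pfn < SKETCH_NR_PAGES; pfn++)
		if (guest->shared[pfn] != sketch_host.shared[pfn])
			n++;
	return n;
}

static int sketch_preserve_context_demo(void)
{
	struct sketch_view original = { { false } };
	struct sketch_view intermediate = { { false } };

	memset(&sketch_host, 0, sizeof(sketch_host));

	/* Original kernel shares two pages and reports them. */
	sketch_hc_set(&original, 2, true);
	sketch_hc_set(&original, 3, true);

	/* kexec: the intermediate kernel resets, then shares its own page. */
	sketch_boot_reset();
	sketch_hc_set(&intermediate, 5, true);

	/*
	 * Jump back: the original kernel resumes with its old view and
	 * issues no hypercalls, so the host still holds the intermediate
	 * kernel's state.
	 */
	return sketch_mismatches(&original);
}
```

In this model `sketch_preserve_context_demo()` finds three disagreeing pages: two that the original kernel still treats as shared and one that only the host still lists as shared. That stale list is why migration being disabled before the jump back, as described above, matters.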

This looks good to me.

Reviewed-by: Steve Rutherford <srutherford@google.com>

Steve Rutherford Aug. 17, 2021, 10:50 p.m. UTC | #2
I believe this entire series has been reviewed. Is there any appetite
to queue these for 5.15?
They may need to be resent, since I'm not sure there is a single patch
series that contains all the patches.

--Steve

Kalra, Ashish Aug. 18, 2021, 12:04 a.m. UTC | #3
Hello Steve,

On Tue, Aug 17, 2021 at 03:50:22PM -0700, Steve Rutherford wrote:
> I believe this entire series has been reviewed. Is there any appetite
> to queue these for 5.15?
> They may need to be resent, since I'm not sure there is a single patch
> series that contains all the patches.
> 

I believe that there are a couple or more patches still pending an ACK.

Patch #1 of this series, which basically inverts the KVM hypercall from
VMCALL to VMMCALL, probably needs an ack from either Sean or Paolo.

While Patch #4 was ack'ed by Boris, it still has a dependency on Patch
#1.

Patches #5 and #6 need to be ack'ed by Boris and, again, have a
dependency on Patch #1.

I am more than happy to send this guest kernel and guest API patch-set
for live migration again.

Please note that this guest kernel and guest API patch-set, which
essentially consists of guest hypercall invocations, is needed for both
in-guest migration and PSP-based migration, as the hypercalls are used
for tracking guest page encryption status.

Thanks,
Ashish

Patch

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index a014c9bb5066..a55712ee58a1 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -869,10 +869,35 @@  static void __init kvm_init_platform(void)
 	if (sev_active() &&
 	    kvm_para_has_feature(KVM_FEATURE_MIGRATION_CONTROL)) {
 		unsigned long nr_pages;
+		int i;
 
 		pv_ops.mmu.notify_page_enc_status_changed =
 			kvm_sev_hc_page_enc_status;
 
+		/*
+		 * Reset the host's shared pages list related to kernel
+		 * specific page encryption status settings before we load a
+		 * new kernel by kexec. Reset the page encryption status
+		 * during early boot instead of just before kexec to avoid SMP
+		 * races during kvm_pv_guest_cpu_reboot().
+		 * NOTE: We cannot reset the complete shared pages list
+		 * here as we need to retain the UEFI/OVMF firmware
+		 * specific settings.
+		 */
+
+		for (i = 0; i < e820_table->nr_entries; i++) {
+			struct e820_entry *entry = &e820_table->entries[i];
+
+			if (entry->type != E820_TYPE_RAM)
+				continue;
+
+			nr_pages = DIV_ROUND_UP(entry->size, PAGE_SIZE);
+
+			kvm_hypercall3(KVM_HC_MAP_GPA_RANGE, entry->addr,
+				       nr_pages,
+				       KVM_MAP_GPA_RANGE_ENCRYPTED | KVM_MAP_GPA_RANGE_PAGE_SZ_4K);
+		}
+
 		/*
 		 * Ensure that _bss_decrypted section is marked as decrypted in the
 		 * shared pages list.