Message ID | 20211122154714.xaoxok3fpk5bgznz@linutronix.de (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be brought up again. | expand |
On 22/11/21 16:47, Sebastian Andrzej Siewior wrote: > From: "Longpeng(Mike)" <longpeng2@huawei.com> > > A CPU will not show up in virtualized environment which includes an > Enclave. The VM splits its resources into a primary VM and a Enclave > VM. While the Enclave is active, the hypervisor will ignore all requests > to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state. > The kernel will wait up to ten seconds for CPU to show up > (do_boot_cpu()) and then rollback the hotplug state back to > CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is > set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage. > For my own education, this is talking about *host* CPU hotplug, right? > After the Enclave VM terminates, the primary VM can bring up the CPU > again. > > Allow to bring up the CPU if it is in the CPU_UP_PREPARE state. > > [bigeasy: Rewrite commit description.] > > Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > Link: https://lore.kernel.org/r/20210901051143.2752-1-longpeng2@huawei.com > --- > > For XEN: this changes the behaviour as it allows to invoke > cpu_initialize_context() again should it have have earlier. I *think* > this is okay and would to bring up the CPU again should the memory > allocation in cpu_initialize_context() fail. Virt stuff notwithstanding, that looks OK to me. Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> That said, AFAICT only powerpc makes actual use of the state being set to CPU_UP_PREPARE, it looks to be needless bookkeeping for everyone else (and there's archs that seem happy using only CPU_DEAD / CPU_POST_DEAD). > > kernel/smpboot.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/kernel/smpboot.c b/kernel/smpboot.c > index f6bc0bc8a2aab..34958d7fe2c1c 100644 > --- a/kernel/smpboot.c > +++ b/kernel/smpboot.c > @@ -392,6 +392,13 @@ int cpu_check_up_prepare(int cpu) > */ > return -EAGAIN; > > + case CPU_UP_PREPARE: > + /* > + * Timeout while waiting for the CPU to show up. Allow to try > + * again later. > + */ > + return 0; > + > default: > > /* Should not happen. Famous last words. */ > -- > 2.33.1
Tested-by: Dongli Zhang <dongli.zhang@oracle.com> The bug fixed by commit 53fafdbb8b21 ("KVM: x86: switch KVMCLOCK base to monotonic raw clock") may leave the cpu_hotplug_state at CPU_UP_PREPARE. As a result, to online this CPU again (even after removal) is always failed. I have tested that this patch works well to workaround the issue, by introducing either a mdeley(11000) or while(1); to start_secondary(). That is, to online the same CPU again is successful even after initial do_boot_cpu() failure. 1. add mdelay(11000) or while(1); to the start_secondary(). 2. to online CPU is failed at do_boot_cpu(). 3. to online CPU again is failed without this patch. # echo 1 > /sys/devices/system/cpu/cpu4/online -su: echo: write error: Input/output error 4. to online CPU again is successful with this patch. Thank you very much! Dongli Zhang On 11/22/21 7:47 AM, Sebastian Andrzej Siewior wrote: > From: "Longpeng(Mike)" <longpeng2@huawei.com> > > A CPU will not show up in virtualized environment which includes an > Enclave. The VM splits its resources into a primary VM and a Enclave > VM. While the Enclave is active, the hypervisor will ignore all requests > to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state. > The kernel will wait up to ten seconds for CPU to show up > (do_boot_cpu()) and then rollback the hotplug state back to > CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is > set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage. > > After the Enclave VM terminates, the primary VM can bring up the CPU > again. > > Allow to bring up the CPU if it is in the CPU_UP_PREPARE state. > > [bigeasy: Rewrite commit description.] > > Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > Link: https://urldefense.com/v3/__https://lore.kernel.org/r/20210901051143.2752-1-longpeng2@huawei.com__;!!ACWV5N9M2RV99hQ!d4sCCXMQV7ekFwpd21vo1_9K-m5h4VZ-gE8Z62PLL58DT4VJ6StH57TR_KpBdbwhBE0$ > --- > > For XEN: this changes the behaviour as it allows to invoke > cpu_initialize_context() again should it have have earlier. I *think* > this is okay and would to bring up the CPU again should the memory > allocation in cpu_initialize_context() fail. > > kernel/smpboot.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/kernel/smpboot.c b/kernel/smpboot.c > index f6bc0bc8a2aab..34958d7fe2c1c 100644 > --- a/kernel/smpboot.c > +++ b/kernel/smpboot.c > @@ -392,6 +392,13 @@ int cpu_check_up_prepare(int cpu) > */ > return -EAGAIN; > > + case CPU_UP_PREPARE: > + /* > + * Timeout while waiting for the CPU to show up. Allow to try > + * again later. > + */ > + return 0; > + > default: > > /* Should not happen. Famous last words. */ >
> -----Original Message----- > From: Dongli Zhang [mailto:dongli.zhang@oracle.com] > Sent: Wednesday, November 24, 2021 5:22 AM > To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>; Longpeng (Mike, Cloud > Infrastructure Service Product Dept.) <longpeng2@huawei.com> > Cc: linux-kernel@vger.kernel.org; Gonglei (Arei) <arei.gonglei@huawei.com>; > x86@kernel.org; xen-devel@lists.xenproject.org; Peter Zijlstra > <peterz@infradead.org>; Ingo Molnar <mingo@kernel.org>; Valentin Schneider > <valentin.schneider@arm.com>; Boris Ostrovsky <boris.ostrovsky@oracle.com>; > Juergen Gross <jgross@suse.com>; Stefano Stabellini <sstabellini@kernel.org>; > Thomas Gleixner <tglx@linutronix.de>; Ingo Molnar <mingo@redhat.com>; Borislav > Petkov <bp@alien8.de>; Dave Hansen <dave.hansen@linux.intel.com>; H. Peter > Anvin <hpa@zytor.com> > Subject: Re: [PATCH] cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be > brought up again. > > Tested-by: Dongli Zhang <dongli.zhang@oracle.com> > > > The bug fixed by commit 53fafdbb8b21 ("KVM: x86: switch KVMCLOCK base to > monotonic raw clock") may leave the cpu_hotplug_state at CPU_UP_PREPARE. As a > result, to online this CPU again (even after removal) is always failed. > > I have tested that this patch works well to workaround the issue, by introducing > either a mdeley(11000) or while(1); to start_secondary(). That is, to online > the > same CPU again is successful even after initial do_boot_cpu() failure. > > 1. add mdelay(11000) or while(1); to the start_secondary(). > > 2. to online CPU is failed at do_boot_cpu(). > Thanks for your testing :) Does the cpu4 spin in wait_for_master_cpu() in your case ? > 3. to online CPU again is failed without this patch. > > # echo 1 > /sys/devices/system/cpu/cpu4/online > -su: echo: write error: Input/output error > > 4. to online CPU again is successful with this patch. > > Thank you very much! > > Dongli Zhang > > On 11/22/21 7:47 AM, Sebastian Andrzej Siewior wrote: > > From: "Longpeng(Mike)" <longpeng2@huawei.com> > > > > A CPU will not show up in virtualized environment which includes an > > Enclave. The VM splits its resources into a primary VM and a Enclave > > VM. While the Enclave is active, the hypervisor will ignore all requests > > to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state. > > The kernel will wait up to ten seconds for CPU to show up > > (do_boot_cpu()) and then rollback the hotplug state back to > > CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is > > set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage. > > > > After the Enclave VM terminates, the primary VM can bring up the CPU > > again. > > > > Allow to bring up the CPU if it is in the CPU_UP_PREPARE state. > > > > [bigeasy: Rewrite commit description.] > > > > Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> > > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > > Link: > https://urldefense.com/v3/__https://lore.kernel.org/r/20210901051143.2752-1 > -longpeng2@huawei.com__;!!ACWV5N9M2RV99hQ!d4sCCXMQV7ekFwpd21vo1_9K-m5h4VZ-g > E8Z62PLL58DT4VJ6StH57TR_KpBdbwhBE0$ > > --- > > > > For XEN: this changes the behaviour as it allows to invoke > > cpu_initialize_context() again should it have have earlier. I *think* > > this is okay and would to bring up the CPU again should the memory > > allocation in cpu_initialize_context() fail. > > > > kernel/smpboot.c | 7 +++++++ > > 1 file changed, 7 insertions(+) > > > > diff --git a/kernel/smpboot.c b/kernel/smpboot.c > > index f6bc0bc8a2aab..34958d7fe2c1c 100644 > > --- a/kernel/smpboot.c > > +++ b/kernel/smpboot.c > > @@ -392,6 +392,13 @@ int cpu_check_up_prepare(int cpu) > > */ > > return -EAGAIN; > > > > + case CPU_UP_PREPARE: > > + /* > > + * Timeout while waiting for the CPU to show up. Allow to try > > + * again later. > > + */ > > + return 0; > > + > > default: > > > > /* Should not happen. Famous last words. */ > >
> -----Original Message----- > From: Valentin Schneider [mailto:valentin.schneider@arm.com] > Sent: Wednesday, November 24, 2021 2:14 AM > To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>; Longpeng (Mike, Cloud > Infrastructure Service Product Dept.) <longpeng2@huawei.com> > Cc: linux-kernel@vger.kernel.org; Gonglei (Arei) <arei.gonglei@huawei.com>; > x86@kernel.org; xen-devel@lists.xenproject.org; Peter Zijlstra > <peterz@infradead.org>; Ingo Molnar <mingo@kernel.org>; Boris Ostrovsky > <boris.ostrovsky@oracle.com>; Juergen Gross <jgross@suse.com>; Stefano > Stabellini <sstabellini@kernel.org>; Thomas Gleixner <tglx@linutronix.de>; > Ingo Molnar <mingo@redhat.com>; Borislav Petkov <bp@alien8.de>; Dave Hansen > <dave.hansen@linux.intel.com>; H. Peter Anvin <hpa@zytor.com> > Subject: Re: [PATCH] cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be > brought up again. > > On 22/11/21 16:47, Sebastian Andrzej Siewior wrote: > > From: "Longpeng(Mike)" <longpeng2@huawei.com> > > > > A CPU will not show up in virtualized environment which includes an > > Enclave. The VM splits its resources into a primary VM and a Enclave > > VM. While the Enclave is active, the hypervisor will ignore all requests > > to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state. > > The kernel will wait up to ten seconds for CPU to show up > > (do_boot_cpu()) and then rollback the hotplug state back to > > CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is > > set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage. > > > > For my own education, this is talking about *host* CPU hotplug, right? > It's about the *guest* CPU hotplug. 1. Users in Primary VM: Split out vcpuX (offline from Primary VM) for Enclave VM 2. Hypervisor: Set a flag for vcpuX, all requests from Primary VM to bring up vcpuX will be ignore. 3. Users in Primary VM: echo 1 > .../vcpuX/online would fail and leave the CPU state of vcpuX in CPU_UP_PREPARE. 4. Users in Primary VM terminate the Enclave VM: Hypervisor should clear the flag (set in step 2) of vcpuX, so the vcpuX can continue to receive requests. 5. Users in Primary VM: Try to online the vcpuX again (expect success), but it's always failed. > > After the Enclave VM terminates, the primary VM can bring up the CPU > > again. > > > > Allow to bring up the CPU if it is in the CPU_UP_PREPARE state. > > > > [bigeasy: Rewrite commit description.] > > > > Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> > > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > > Link: https://lore.kernel.org/r/20210901051143.2752-1-longpeng2@huawei.com > > --- > > > > For XEN: this changes the behaviour as it allows to invoke > > cpu_initialize_context() again should it have have earlier. I *think* > > this is okay and would to bring up the CPU again should the memory > > allocation in cpu_initialize_context() fail. > > Virt stuff notwithstanding, that looks OK to me. > Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> > > That said, AFAICT only powerpc makes actual use of the state being set to > CPU_UP_PREPARE, it looks to be needless bookkeeping for everyone else (and > there's archs that seem happy using only CPU_DEAD / CPU_POST_DEAD). > > > > > kernel/smpboot.c | 7 +++++++ > > 1 file changed, 7 insertions(+) > > > > diff --git a/kernel/smpboot.c b/kernel/smpboot.c > > index f6bc0bc8a2aab..34958d7fe2c1c 100644 > > --- a/kernel/smpboot.c > > +++ b/kernel/smpboot.c > > @@ -392,6 +392,13 @@ int cpu_check_up_prepare(int cpu) > > */ > > return -EAGAIN; > > > > + case CPU_UP_PREPARE: > > + /* > > + * Timeout while waiting for the CPU to show up. Allow to try > > + * again later. > > + */ > > + return 0; > > + > > default: > > > > /* Should not happen. Famous last words. */ > > -- > > 2.33.1
On 11/23/21 3:50 PM, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote: > > >> -----Original Message----- >> From: Dongli Zhang [mailto:dongli.zhang@oracle.com] >> Sent: Wednesday, November 24, 2021 5:22 AM >> To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>; Longpeng (Mike, Cloud >> Infrastructure Service Product Dept.) <longpeng2@huawei.com> >> Cc: linux-kernel@vger.kernel.org; Gonglei (Arei) <arei.gonglei@huawei.com>; >> x86@kernel.org; xen-devel@lists.xenproject.org; Peter Zijlstra >> <peterz@infradead.org>; Ingo Molnar <mingo@kernel.org>; Valentin Schneider >> <valentin.schneider@arm.com>; Boris Ostrovsky <boris.ostrovsky@oracle.com>; >> Juergen Gross <jgross@suse.com>; Stefano Stabellini <sstabellini@kernel.org>; >> Thomas Gleixner <tglx@linutronix.de>; Ingo Molnar <mingo@redhat.com>; Borislav >> Petkov <bp@alien8.de>; Dave Hansen <dave.hansen@linux.intel.com>; H. Peter >> Anvin <hpa@zytor.com> >> Subject: Re: [PATCH] cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be >> brought up again. >> >> Tested-by: Dongli Zhang <dongli.zhang@oracle.com> >> >> >> The bug fixed by commit 53fafdbb8b21 ("KVM: x86: switch KVMCLOCK base to >> monotonic raw clock") may leave the cpu_hotplug_state at CPU_UP_PREPARE. As a >> result, to online this CPU again (even after removal) is always failed. >> >> I have tested that this patch works well to workaround the issue, by introducing >> either a mdeley(11000) or while(1); to start_secondary(). That is, to online >> the >> same CPU again is successful even after initial do_boot_cpu() failure. >> >> 1. add mdelay(11000) or while(1); to the start_secondary(). >> >> 2. to online CPU is failed at do_boot_cpu(). >> > > Thanks for your testing :) > > Does the cpu4 spin in wait_for_master_cpu() in your case ? I did two tests. TEST 1. I added "mdelay(11000);" as the first line in start_secondary(). Once the issue was encountered, the RIP of CPU=4 was ffffffff8c242021 (from QEMU's "info registers -a") which was in the range of wait_for_master_cpu(). # cat /proc/kallsyms | grep ffffffff8c2420 ffffffff8c242010 t wait_for_master_cpu ffffffff8c242030 T load_fixmap_gdt ffffffff8c242060 T native_write_cr4 ffffffff8c2420c0 T cr4_init TEST 2. I added "while(true);" as the first line in start_secondary(). Once the issue was encountered, the RIP of CPU=4 was ffffffff91654c0a (from QEMU's "info registers -a") which was in the range of start_secondary(). # cat /proc/kallsyms | grep ffffffff91654c0 ffffffff91654c00 t start_secondary Dongli Zhang > >> 3. to online CPU again is failed without this patch. >> >> # echo 1 > /sys/devices/system/cpu/cpu4/online >> -su: echo: write error: Input/output error >> >> 4. to online CPU again is successful with this patch. >> >> Thank you very much! >> >> Dongli Zhang >> >> On 11/22/21 7:47 AM, Sebastian Andrzej Siewior wrote: >>> From: "Longpeng(Mike)" <longpeng2@huawei.com> >>> >>> A CPU will not show up in virtualized environment which includes an >>> Enclave. The VM splits its resources into a primary VM and a Enclave >>> VM. While the Enclave is active, the hypervisor will ignore all requests >>> to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state. >>> The kernel will wait up to ten seconds for CPU to show up >>> (do_boot_cpu()) and then rollback the hotplug state back to >>> CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is >>> set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage. >>> >>> After the Enclave VM terminates, the primary VM can bring up the CPU >>> again. >>> >>> Allow to bring up the CPU if it is in the CPU_UP_PREPARE state. >>> >>> [bigeasy: Rewrite commit description.] >>> >>> Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> >>> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> >>> Link: >> https://urldefense.com/v3/__https://lore.kernel.org/r/20210901051143.2752-1 >> -longpeng2@huawei.com__;!!ACWV5N9M2RV99hQ!d4sCCXMQV7ekFwpd21vo1_9K-m5h4VZ-g >> E8Z62PLL58DT4VJ6StH57TR_KpBdbwhBE0$ >>> --- >>> >>> For XEN: this changes the behaviour as it allows to invoke >>> cpu_initialize_context() again should it have have earlier. I *think* >>> this is okay and would to bring up the CPU again should the memory >>> allocation in cpu_initialize_context() fail. >>> >>> kernel/smpboot.c | 7 +++++++ >>> 1 file changed, 7 insertions(+) >>> >>> diff --git a/kernel/smpboot.c b/kernel/smpboot.c >>> index f6bc0bc8a2aab..34958d7fe2c1c 100644 >>> --- a/kernel/smpboot.c >>> +++ b/kernel/smpboot.c >>> @@ -392,6 +392,13 @@ int cpu_check_up_prepare(int cpu) >>> */ >>> return -EAGAIN; >>> >>> + case CPU_UP_PREPARE: >>> + /* >>> + * Timeout while waiting for the CPU to show up. Allow to try >>> + * again later. >>> + */ >>> + return 0; >>> + >>> default: >>> >>> /* Should not happen. Famous last words. */ >>>
Hi, > -----Original Message----- > From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of > Sebastian Andrzej Siewior > Sent: Monday, November 22, 2021 11:47 PM > To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.) > <longpeng2@huawei.com> > Cc: linux-kernel@vger.kernel.org; Gonglei (Arei) <arei.gonglei@huawei.com>; > x86@kernel.org; xen-devel@lists.xenproject.org; Peter Zijlstra > <peterz@infradead.org>; Ingo Molnar <mingo@kernel.org>; Valentin > Schneider <Valentin.Schneider@arm.com>; Boris Ostrovsky > <boris.ostrovsky@oracle.com>; Juergen Gross <jgross@suse.com>; Stefano > Stabellini <sstabellini@kernel.org>; Thomas Gleixner <tglx@linutronix.de>; > Ingo Molnar <mingo@redhat.com>; Borislav Petkov <bp@alien8.de>; Dave > Hansen <dave.hansen@linux.intel.com>; H. Peter Anvin <hpa@zytor.com> > Subject: [PATCH] cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be > brought up again. > > From: "Longpeng(Mike)" <longpeng2@huawei.com> > > A CPU will not show up in virtualized environment which includes an > Enclave. The VM splits its resources into a primary VM and a Enclave > VM. While the Enclave is active, the hypervisor will ignore all requests > to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state. > The kernel will wait up to ten seconds for CPU to show up > (do_boot_cpu()) and then rollback the hotplug state back to > CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is > set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage. > > After the Enclave VM terminates, the primary VM can bring up the CPU > again. > > Allow to bring up the CPU if it is in the CPU_UP_PREPARE state. > > [bigeasy: Rewrite commit description.] > > Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > Link: https://lore.kernel.org/r/20210901051143.2752-1- > longpeng2@huawei.com > --- > > For XEN: this changes the behaviour as it allows to invoke > cpu_initialize_context() again should it have have earlier. I *think* > this is okay and would to bring up the CPU again should the memory > allocation in cpu_initialize_context() fail. > > kernel/smpboot.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/kernel/smpboot.c b/kernel/smpboot.c > index f6bc0bc8a2aab..34958d7fe2c1c 100644 > --- a/kernel/smpboot.c > +++ b/kernel/smpboot.c > @@ -392,6 +392,13 @@ int cpu_check_up_prepare(int cpu) > */ > return -EAGAIN; > > + case CPU_UP_PREPARE: > + /* > + * Timeout while waiting for the CPU to show up. Allow to try > + * again later. > + */ > + return 0; > + > default: > > /* Should not happen. Famous last words. */ > -- > 2.33.1 > Reviewed-by: Henry Wang <Henry.Wang@arm.com> Kind regards, Henry
On 24/11/21 00:19, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote: >> -----Original Message----- >> From: Valentin Schneider [mailto:valentin.schneider@arm.com] >> For my own education, this is talking about *host* CPU hotplug, right? >> > > It's about the *guest* CPU hotplug. > > 1. Users in Primary VM: > Split out vcpuX (offline from Primary VM) for Enclave VM > > 2. Hypervisor: > Set a flag for vcpuX, all requests from Primary VM to bring up vcpuX > will be ignore. > > 3. Users in Primary VM: > echo 1 > .../vcpuX/online would fail and leave the CPU state of vcpuX > in CPU_UP_PREPARE. > > 4. Users in Primary VM terminate the Enclave VM: > Hypervisor should clear the flag (set in step 2) of vcpuX, so the vcpuX > can continue to receive requests. > > 5. Users in Primary VM: > Try to online the vcpuX again (expect success), but it's always failed. > If I followed the rabbit hole in the right direction, this is about: ff8a4d3e3a99 ("nitro_enclaves: Add logic for setting an enclave vCPU") So there's a 1:1 CPU:vCPU mapping and an Enclave carves a chunk out of the Primary VM... Thanks for the detailed explanation!
On Mon, Nov 22 2021 at 16:47, Sebastian Andrzej Siewior wrote: > From: "Longpeng(Mike)" <longpeng2@huawei.com> > > A CPU will not show up in virtualized environment which includes an > Enclave. The VM splits its resources into a primary VM and a Enclave > VM. While the Enclave is active, the hypervisor will ignore all requests > to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state. > The kernel will wait up to ten seconds for CPU to show up > (do_boot_cpu()) and then rollback the hotplug state back to > CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is > set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage. > > After the Enclave VM terminates, the primary VM can bring up the CPU > again. > > Allow to bring up the CPU if it is in the CPU_UP_PREPARE state. > > [bigeasy: Rewrite commit description.] > > Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > Link: https://lore.kernel.org/r/20210901051143.2752-1-longpeng2@huawei.com > --- > > For XEN: this changes the behaviour as it allows to invoke > cpu_initialize_context() again should it have have earlier. I *think* > this is okay and would to bring up the CPU again should the memory > allocation in cpu_initialize_context() fail. Any comment from XEN folks? Thanks, tglx
On 11/24/21 5:54 PM, Thomas Gleixner wrote: > On Mon, Nov 22 2021 at 16:47, Sebastian Andrzej Siewior wrote: >> From: "Longpeng(Mike)" <longpeng2@huawei.com> >> >> A CPU will not show up in virtualized environment which includes an >> Enclave. The VM splits its resources into a primary VM and a Enclave >> VM. While the Enclave is active, the hypervisor will ignore all requests >> to bring up a CPU and this CPU will remain in CPU_UP_PREPARE state. >> The kernel will wait up to ten seconds for CPU to show up >> (do_boot_cpu()) and then rollback the hotplug state back to >> CPUHP_OFFLINE leaving the CPU state in CPU_UP_PREPARE. The CPU state is >> set back to CPUHP_TEARDOWN_CPU during the CPU_POST_DEAD stage. >> >> After the Enclave VM terminates, the primary VM can bring up the CPU >> again. >> >> Allow to bring up the CPU if it is in the CPU_UP_PREPARE state. >> >> [bigeasy: Rewrite commit description.] >> >> Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> >> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> >> Link: https://lore.kernel.org/r/20210901051143.2752-1-longpeng2@huawei.com >> --- >> >> For XEN: this changes the behaviour as it allows to invoke >> cpu_initialize_context() again should it have have earlier. I *think* >> this is okay and would to bring up the CPU again should the memory >> allocation in cpu_initialize_context() fail. > Any comment from XEN folks? If memory allocation in cpu_initialize_context() fails we will not be able to bring up the VCPU because xen_cpu_initialized_map bit at the top of that routine will already have been set. We will BUG in xen_pv_cpu_up() on second (presumably successful) attempt because nothing for that VCPU would be initialized. This can in principle be fixed by moving allocation to the top of the routine and freeing context if the bit in the bitmap is already set. Having said that, allocation really should not fail: for PV guests we first bring max number of VCPUs up and then offline them down to however many need to run. I think if we fail allocation during boot we are going to have a really bad day anyway. -boris
On 2021-11-24 21:17:34 [-0500], Boris Ostrovsky wrote: > > On 11/24/21 5:54 PM, Thomas Gleixner wrote: > > Any comment from XEN folks? > > > If memory allocation in cpu_initialize_context() fails we will not be > able to bring up the VCPU because xen_cpu_initialized_map bit at the > top of that routine will already have been set. We will BUG in > xen_pv_cpu_up() on second (presumably successful) attempt because > nothing for that VCPU would be initialized. This can in principle be > fixed by moving allocation to the top of the routine and freeing > context if the bit in the bitmap is already set. > > > Having said that, allocation really should not fail: for PV guests we > first bring max number of VCPUs up and then offline them down to > however many need to run. I think if we fail allocation during boot we > are going to have a really bad day anyway. > So can we keep the patch as-is or are some changes needed? > -boris Sebastian
On 12/6/21 6:25 AM, Sebastian Andrzej Siewior wrote: > On 2021-11-24 21:17:34 [-0500], Boris Ostrovsky wrote: >> On 11/24/21 5:54 PM, Thomas Gleixner wrote: >>> Any comment from XEN folks? >> >> If memory allocation in cpu_initialize_context() fails we will not be >> able to bring up the VCPU because xen_cpu_initialized_map bit at the >> top of that routine will already have been set. We will BUG in >> xen_pv_cpu_up() on second (presumably successful) attempt because >> nothing for that VCPU would be initialized. This can in principle be >> fixed by moving allocation to the top of the routine and freeing >> context if the bit in the bitmap is already set. >> >> >> Having said that, allocation really should not fail: for PV guests we >> first bring max number of VCPUs up and then offline them down to >> however many need to run. I think if we fail allocation during boot we >> are going to have a really bad day anyway. >> > So can we keep the patch as-is or are some changes needed? I think for the sake of completeness we could add diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c index 6a8f3b53ab83..86368fcef466 100644 --- a/arch/x86/xen/smp_pv.c +++ b/arch/x86/xen/smp_pv.c @@ -277,8 +277,11 @@ cpu_initialize_context(unsigned int cpu, struct task_struct *idle) return 0; ctxt = kzalloc(sizeof(*ctxt), GFP_KERNEL); - if (ctxt == NULL) + if (ctxt == NULL) { + cpumask_clear_cpu(cpu, xen_cpu_initialized_map); + cpumask_clear_cpu(cpu, cpu_callout_mask); return -ENOMEM; + } gdt = get_cpu_gdt_rw(cpu); -boris
diff --git a/kernel/smpboot.c b/kernel/smpboot.c index f6bc0bc8a2aab..34958d7fe2c1c 100644 --- a/kernel/smpboot.c +++ b/kernel/smpboot.c @@ -392,6 +392,13 @@ int cpu_check_up_prepare(int cpu) */ return -EAGAIN; + case CPU_UP_PREPARE: + /* + * Timeout while waiting for the CPU to show up. Allow to try + * again later. + */ + return 0; + default: /* Should not happen. Famous last words. */