diff mbox

[7/8] arm64/kexec: Add checks for KVM

Message ID a3353cf2ac47f49c20bbbb88405301cbaa359805.1421449714.git.geoff@infradead.org (mailing list archive)
State New, archived
Headers show

Commit Message

Geoff Levand Jan. 17, 2015, 12:23 a.m. UTC
Add runtime checks that fail the arm64 kexec syscall for situations that would
result in system instability do to problems in the KVM kernel support.
These checks should be removed when the KVM problems are resolved fixed.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/kernel/machine_kexec.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Mark Rutland Jan. 26, 2015, 7:19 p.m. UTC | #1
On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
> Add runtime checks that fail the arm64 kexec syscall for situations that would
> result in system instability do to problems in the KVM kernel support.
> These checks should be removed when the KVM problems are resolved fixed.
> 
> Signed-off-by: Geoff Levand <geoff@infradead.org>
> ---
>  arch/arm64/kernel/machine_kexec.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> index 3d84759..a36459d 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -16,6 +16,9 @@
>  #include <asm/cacheflush.h>
>  #include <asm/system_misc.h>
>  
> +/* TODO: Remove this include when KVM can support a kexec reboot. */
> +#include <asm/virt.h>
> +
>  /* Global variables for the relocate_kernel routine. */
>  extern const unsigned char relocate_new_kernel[];
>  extern const unsigned long relocate_new_kernel_size;
> @@ -100,6 +103,13 @@ int machine_kexec_prepare(struct kimage *image)
>  
>  	kexec_image_info(image);
>  
> +	/* TODO: Remove this message when KVM can support a kexec reboot. */
> +	if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) {
> +		pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n",
> +			__func__);
> +		return -ENOSYS;
> +	}

If you really don't want to implement KVM teardown, surely this should
be at the start of the series, so we don't have a point in the middle
where things may explode in this case?

Mark.
Christoffer Dall Jan. 26, 2015, 8:39 p.m. UTC | #2
On Mon, Jan 26, 2015 at 8:19 PM, Mark Rutland <mark.rutland@arm.com> wrote:
> On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
>> Add runtime checks that fail the arm64 kexec syscall for situations that would
>> result in system instability do to problems in the KVM kernel support.
>> These checks should be removed when the KVM problems are resolved fixed.
>>
>> Signed-off-by: Geoff Levand <geoff@infradead.org>
>> ---
>>  arch/arm64/kernel/machine_kexec.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
>> index 3d84759..a36459d 100644
>> --- a/arch/arm64/kernel/machine_kexec.c
>> +++ b/arch/arm64/kernel/machine_kexec.c
>> @@ -16,6 +16,9 @@
>>  #include <asm/cacheflush.h>
>>  #include <asm/system_misc.h>
>>
>> +/* TODO: Remove this include when KVM can support a kexec reboot. */
>> +#include <asm/virt.h>
>> +
>>  /* Global variables for the relocate_kernel routine. */
>>  extern const unsigned char relocate_new_kernel[];
>>  extern const unsigned long relocate_new_kernel_size;
>> @@ -100,6 +103,13 @@ int machine_kexec_prepare(struct kimage *image)
>>
>>       kexec_image_info(image);
>>
>> +     /* TODO: Remove this message when KVM can support a kexec reboot. */
>> +     if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) {
>> +             pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n",
>> +                     __func__);
>> +             return -ENOSYS;
>> +     }
>
> If you really don't want to implement KVM teardown, surely this should
> be at the start of the series, so we don't have a point in the middle
> where things may explode in this case?
>
So this caters to support systems that don't support KVM (don't boot
in EL2) but is configured with both KVM and KEXEC?

Why not just make the kexec config dependent on !KVM ?

-Christoffer
Geoff Levand Jan. 26, 2015, 8:58 p.m. UTC | #3
On Mon, 2015-01-26 at 21:39 +0100, Christoffer Dall wrote:
> On Mon, Jan 26, 2015 at 8:19 PM, Mark Rutland <mark.rutland@arm.com> wrote:
> > On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
> >> Add runtime checks that fail the arm64 kexec syscall for situations that would
> >> result in system instability do to problems in the KVM kernel support.
> >> These checks should be removed when the KVM problems are resolved fixed.
> >>
> >> Signed-off-by: Geoff Levand <geoff@infradead.org>
> >> ---
> >>  arch/arm64/kernel/machine_kexec.c | 10 ++++++++++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> >> index 3d84759..a36459d 100644
> >> --- a/arch/arm64/kernel/machine_kexec.c
> >> +++ b/arch/arm64/kernel/machine_kexec.c
> >> @@ -16,6 +16,9 @@
> >>  #include <asm/cacheflush.h>
> >>  #include <asm/system_misc.h>
> >>
> >> +/* TODO: Remove this include when KVM can support a kexec reboot. */
> >> +#include <asm/virt.h>
> >> +
> >>  /* Global variables for the relocate_kernel routine. */
> >>  extern const unsigned char relocate_new_kernel[];
> >>  extern const unsigned long relocate_new_kernel_size;
> >> @@ -100,6 +103,13 @@ int machine_kexec_prepare(struct kimage *image)
> >>
> >>       kexec_image_info(image);
> >>
> >> +     /* TODO: Remove this message when KVM can support a kexec reboot. */
> >> +     if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) {
> >> +             pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n",
> >> +                     __func__);
> >> +             return -ENOSYS;
> >> +     }
> >
> > If you really don't want to implement KVM teardown, surely this should
> > be at the start of the series, so we don't have a point in the middle
> > where things may explode in this case?
> >
> So this caters to support systems that don't support KVM (don't boot
> in EL2) but is configured with both KVM and KEXEC?
> 
> Why not just make the kexec config dependent on !KVM ?

Sure, that would work.  I put it this way so we can get build testing of
kexec since the arm64 defconfig has KVM set.

-Geoff
Geoff Levand Jan. 26, 2015, 9 p.m. UTC | #4
Hi Mark,

On Mon, 2015-01-26 at 19:19 +0000, Mark Rutland wrote:
> On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
> > Add runtime checks that fail the arm64 kexec syscall for situations that would
> > result in system instability do to problems in the KVM kernel support.
> > These checks should be removed when the KVM problems are resolved fixed.
> > 
> > Signed-off-by: Geoff Levand <geoff@infradead.org>
> > ---
> >  arch/arm64/kernel/machine_kexec.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> > index 3d84759..a36459d 100644
> > --- a/arch/arm64/kernel/machine_kexec.c
> > +++ b/arch/arm64/kernel/machine_kexec.c
> > @@ -16,6 +16,9 @@
> >  #include <asm/cacheflush.h>
> >  #include <asm/system_misc.h>
> >  
> > +/* TODO: Remove this include when KVM can support a kexec reboot. */
> > +#include <asm/virt.h>
> > +
> >  /* Global variables for the relocate_kernel routine. */
> >  extern const unsigned char relocate_new_kernel[];
> >  extern const unsigned long relocate_new_kernel_size;
> > @@ -100,6 +103,13 @@ int machine_kexec_prepare(struct kimage *image)
> >  
> >  	kexec_image_info(image);
> >  
> > +	/* TODO: Remove this message when KVM can support a kexec reboot. */
> > +	if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) {
> > +		pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n",
> > +			__func__);
> > +		return -ENOSYS;
> > +	}
> 
> If you really don't want to implement KVM teardown, surely this should
> be at the start of the series, so we don't have a point in the middle
> where things may explode in this case?

Yes, you're right.  I'm hoping we can get the KVM fix done soon so this
is not needed.

-Geoff
AKASHI Takahiro Jan. 29, 2015, 9:36 a.m. UTC | #5
Hello,

On 01/27/2015 04:19 AM, Mark Rutland wrote:
> On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
>> Add runtime checks that fail the arm64 kexec syscall for situations that would
>> result in system instability do to problems in the KVM kernel support.
>> These checks should be removed when the KVM problems are resolved fixed.
>>
>> Signed-off-by: Geoff Levand <geoff@infradead.org>
>> ---
>>   arch/arm64/kernel/machine_kexec.c | 10 ++++++++++
>>   1 file changed, 10 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
>> index 3d84759..a36459d 100644
>> --- a/arch/arm64/kernel/machine_kexec.c
>> +++ b/arch/arm64/kernel/machine_kexec.c
>> @@ -16,6 +16,9 @@
>>   #include <asm/cacheflush.h>
>>   #include <asm/system_misc.h>
>>
>> +/* TODO: Remove this include when KVM can support a kexec reboot. */
>> +#include <asm/virt.h>
>> +
>>   /* Global variables for the relocate_kernel routine. */
>>   extern const unsigned char relocate_new_kernel[];
>>   extern const unsigned long relocate_new_kernel_size;
>> @@ -100,6 +103,13 @@ int machine_kexec_prepare(struct kimage *image)
>>
>>   	kexec_image_info(image);
>>
>> +	/* TODO: Remove this message when KVM can support a kexec reboot. */
>> +	if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) {
>> +		pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n",
>> +			__func__);
>> +		return -ENOSYS;
>> +	}
>
> If you really don't want to implement KVM teardown, surely this should
> be at the start of the series, so we don't have a point in the middle
> where things may explode in this case?

I'm going to fix this KVM issue (teardown) in cooperation with Geoff.

Looking into kvm init code, kvm_arch_init() in particular,
I guess that teardown function (kvm_arch_exit()) should do
   (reverting kvm_timer_hyp_init() per cpu)
- stop arch timer

   (reverting cpu_init_hyp_mode() per cpu)
- flush TLB
- jump into identical mapping (using boot_hyp_pgd?)
- disable MMU?
- restore vbar_el2 to __hyp_stub_vectors (or NULL?)

   (reverting kvm_mmu_init())
- Do we need to free page tables and a bounce(trampoline) page?

Is this good enough for safely shutting down kvm?
Do I miss anything essential, or can I skip anyhing?

I really appreciate your comments.
-Takahiro AKASHI

> Mark.
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
AKASHI Takahiro Jan. 29, 2015, 9:57 a.m. UTC | #6
Hello,

On 01/27/2015 04:19 AM, Mark Rutland wrote:
> On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
>> Add runtime checks that fail the arm64 kexec syscall for situations that would
>> result in system instability do to problems in the KVM kernel support.
>> These checks should be removed when the KVM problems are resolved fixed.
>>
>> Signed-off-by: Geoff Levand <geoff@infradead.org>
>> ---
>>   arch/arm64/kernel/machine_kexec.c | 10 ++++++++++
>>   1 file changed, 10 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
>> index 3d84759..a36459d 100644
>> --- a/arch/arm64/kernel/machine_kexec.c
>> +++ b/arch/arm64/kernel/machine_kexec.c
>> @@ -16,6 +16,9 @@
>>   #include <asm/cacheflush.h>
>>   #include <asm/system_misc.h>
>>
>> +/* TODO: Remove this include when KVM can support a kexec reboot. */
>> +#include <asm/virt.h>
>> +
>>   /* Global variables for the relocate_kernel routine. */
>>   extern const unsigned char relocate_new_kernel[];
>>   extern const unsigned long relocate_new_kernel_size;
>> @@ -100,6 +103,13 @@ int machine_kexec_prepare(struct kimage *image)
>>
>>   	kexec_image_info(image);
>>
>> +	/* TODO: Remove this message when KVM can support a kexec reboot. */
>> +	if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) {
>> +		pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n",
>> +			__func__);
>> +		return -ENOSYS;
>> +	}
>
> If you really don't want to implement KVM teardown, surely this should
> be at the start of the series, so we don't have a point in the middle
> where things may explode in this case?

I'm going to fix this KVM issue (teardown) in cooperation with Geoff.

Looking into kvm init code, kvm_arch_init() in particular,
I guess that teardown function (kvm_arch_exit()) should do
   (reverting kvm_timer_hyp_init() per cpu)
- stop arch timer

   (reverting cpu_init_hyp_mode() per cpu)
- flush TLB
- jump into identical mapping (using boot_hyp_pgd?)
- disable MMU?
- restore vbar_el2 to __hyp_stub_vectors (or NULL?)

   (reverting kvm_mmu_init())
- Do we need to free page tables and a bounce(trampoline) page?

Is this good enough for safely shutting down kvm?
Do I miss anything essential, or can I skip anyhing?

I really appreciate your comments.
-Takahiro AKASHI

> Mark.
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
Marc Zyngier Jan. 29, 2015, 10:59 a.m. UTC | #7
On 29/01/15 09:57, AKASHI Takahiro wrote:
> Hello,
> 
> On 01/27/2015 04:19 AM, Mark Rutland wrote:
>> On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
>>> Add runtime checks that fail the arm64 kexec syscall for situations that would
>>> result in system instability do to problems in the KVM kernel support.
>>> These checks should be removed when the KVM problems are resolved fixed.
>>>
>>> Signed-off-by: Geoff Levand <geoff@infradead.org>
>>> ---
>>>   arch/arm64/kernel/machine_kexec.c | 10 ++++++++++
>>>   1 file changed, 10 insertions(+)
>>>
>>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
>>> index 3d84759..a36459d 100644
>>> --- a/arch/arm64/kernel/machine_kexec.c
>>> +++ b/arch/arm64/kernel/machine_kexec.c
>>> @@ -16,6 +16,9 @@
>>>   #include <asm/cacheflush.h>
>>>   #include <asm/system_misc.h>
>>>
>>> +/* TODO: Remove this include when KVM can support a kexec reboot. */
>>> +#include <asm/virt.h>
>>> +
>>>   /* Global variables for the relocate_kernel routine. */
>>>   extern const unsigned char relocate_new_kernel[];
>>>   extern const unsigned long relocate_new_kernel_size;
>>> @@ -100,6 +103,13 @@ int machine_kexec_prepare(struct kimage *image)
>>>
>>>   	kexec_image_info(image);
>>>
>>> +	/* TODO: Remove this message when KVM can support a kexec reboot. */
>>> +	if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) {
>>> +		pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n",
>>> +			__func__);
>>> +		return -ENOSYS;
>>> +	}
>>
>> If you really don't want to implement KVM teardown, surely this should
>> be at the start of the series, so we don't have a point in the middle
>> where things may explode in this case?
> 
> I'm going to fix this KVM issue (teardown) in cooperation with Geoff.
> 
> Looking into kvm init code, kvm_arch_init() in particular,
> I guess that teardown function (kvm_arch_exit()) should do
>    (reverting kvm_timer_hyp_init() per cpu)
> - stop arch timer

No need, there shouldn't be any guest running at that point.

>    (reverting cpu_init_hyp_mode() per cpu)
> - flush TLB
> - jump into identical mapping (using boot_hyp_pgd?)
> - disable MMU?

Yes, and for that, you need to go back to an idmap

> - restore vbar_el2 to __hyp_stub_vectors (or NULL?)

It doesn't matter, you want to stay in HYP for the next kernel.

>    (reverting kvm_mmu_init())
> - Do we need to free page tables and a bounce(trampoline) page?

I don't think that's useful, you've killed the kernel already.

> Is this good enough for safely shutting down kvm?
> Do I miss anything essential, or can I skip anyhing?

I've outlined the steps a couple of days ago there:
http://www.spinics.net/lists/arm-kernel/msg395177.html

and the outline above should give you a nice list of things to look at.
All in all, it is pretty straightforward, provided that you're careful
enough about (commit 5a677ce044f has most of the information, just read
it backward... ;-).

Thanks,

	M.
Mark Rutland Jan. 29, 2015, 6:47 p.m. UTC | #8
On Thu, Jan 29, 2015 at 10:59:45AM +0000, Marc Zyngier wrote:
> 
> On 29/01/15 09:57, AKASHI Takahiro wrote:
> > Hello,
> > 
> > On 01/27/2015 04:19 AM, Mark Rutland wrote:
> >> On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
> >>> Add runtime checks that fail the arm64 kexec syscall for situations that would
> >>> result in system instability do to problems in the KVM kernel support.
> >>> These checks should be removed when the KVM problems are resolved fixed.
> >>>
> >>> Signed-off-by: Geoff Levand <geoff@infradead.org>
> >>> ---
> >>>   arch/arm64/kernel/machine_kexec.c | 10 ++++++++++
> >>>   1 file changed, 10 insertions(+)
> >>>
> >>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> >>> index 3d84759..a36459d 100644
> >>> --- a/arch/arm64/kernel/machine_kexec.c
> >>> +++ b/arch/arm64/kernel/machine_kexec.c
> >>> @@ -16,6 +16,9 @@
> >>>   #include <asm/cacheflush.h>
> >>>   #include <asm/system_misc.h>
> >>>
> >>> +/* TODO: Remove this include when KVM can support a kexec reboot. */
> >>> +#include <asm/virt.h>
> >>> +
> >>>   /* Global variables for the relocate_kernel routine. */
> >>>   extern const unsigned char relocate_new_kernel[];
> >>>   extern const unsigned long relocate_new_kernel_size;
> >>> @@ -100,6 +103,13 @@ int machine_kexec_prepare(struct kimage *image)
> >>>
> >>>   	kexec_image_info(image);
> >>>
> >>> +	/* TODO: Remove this message when KVM can support a kexec reboot. */
> >>> +	if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) {
> >>> +		pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n",
> >>> +			__func__);
> >>> +		return -ENOSYS;
> >>> +	}
> >>
> >> If you really don't want to implement KVM teardown, surely this should
> >> be at the start of the series, so we don't have a point in the middle
> >> where things may explode in this case?
> > 
> > I'm going to fix this KVM issue (teardown) in cooperation with Geoff.
> > 
> > Looking into kvm init code, kvm_arch_init() in particular,
> > I guess that teardown function (kvm_arch_exit()) should do
> >    (reverting kvm_timer_hyp_init() per cpu)
> > - stop arch timer
> 
> No need, there shouldn't be any guest running at that point.
> 
> >    (reverting cpu_init_hyp_mode() per cpu)
> > - flush TLB
> > - jump into identical mapping (using boot_hyp_pgd?)
> > - disable MMU?
> 
> Yes, and for that, you need to go back to an idmap
> 
> > - restore vbar_el2 to __hyp_stub_vectors (or NULL?)
> 
> It doesn't matter, you want to stay in HYP for the next kernel.

Well, it depends on how the teardown is implemented. I'd imagined we'd
have KVM tear itself down (restoring the hyp stub) as part of shut down.
Later kexec/soft_restart would call the stub to get back to EL2.

That way kexec doesn't need to know anything about KVM, and it has one
path that works regardless of whether KVM is compiled into the kernel.

Mark.
AKASHI Takahiro Jan. 30, 2015, 6:10 a.m. UTC | #9
Hello Marc, Mark

Thank you for your useful comments.
I need to look at Marc's commit(5a677ce044f) more carefully
about trampoline code.

On 01/30/2015 03:47 AM, Mark Rutland wrote:
> On Thu, Jan 29, 2015 at 10:59:45AM +0000, Marc Zyngier wrote:
>>
>> On 29/01/15 09:57, AKASHI Takahiro wrote:
>>> Hello,
>>>
>>> On 01/27/2015 04:19 AM, Mark Rutland wrote:
>>>> On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
>>>>> Add runtime checks that fail the arm64 kexec syscall for situations that would
>>>>> result in system instability do to problems in the KVM kernel support.
>>>>> These checks should be removed when the KVM problems are resolved fixed.
>>>>>
>>>>> Signed-off-by: Geoff Levand <geoff@infradead.org>
>>>>> ---
>>>>>    arch/arm64/kernel/machine_kexec.c | 10 ++++++++++
>>>>>    1 file changed, 10 insertions(+)
>>>>>
>>>>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
>>>>> index 3d84759..a36459d 100644
>>>>> --- a/arch/arm64/kernel/machine_kexec.c
>>>>> +++ b/arch/arm64/kernel/machine_kexec.c
>>>>> @@ -16,6 +16,9 @@
>>>>>    #include <asm/cacheflush.h>
>>>>>    #include <asm/system_misc.h>
>>>>>
>>>>> +/* TODO: Remove this include when KVM can support a kexec reboot. */
>>>>> +#include <asm/virt.h>
>>>>> +
>>>>>    /* Global variables for the relocate_kernel routine. */
>>>>>    extern const unsigned char relocate_new_kernel[];
>>>>>    extern const unsigned long relocate_new_kernel_size;
>>>>> @@ -100,6 +103,13 @@ int machine_kexec_prepare(struct kimage *image)
>>>>>
>>>>>    	kexec_image_info(image);
>>>>>
>>>>> +	/* TODO: Remove this message when KVM can support a kexec reboot. */
>>>>> +	if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) {
>>>>> +		pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n",
>>>>> +			__func__);
>>>>> +		return -ENOSYS;
>>>>> +	}
>>>>
>>>> If you really don't want to implement KVM teardown, surely this should
>>>> be at the start of the series, so we don't have a point in the middle
>>>> where things may explode in this case?
>>>
>>> I'm going to fix this KVM issue (teardown) in cooperation with Geoff.
>>>
>>> Looking into kvm init code, kvm_arch_init() in particular,
>>> I guess that teardown function (kvm_arch_exit()) should do
>>>     (reverting kvm_timer_hyp_init() per cpu)
>>> - stop arch timer
>>
>> No need, there shouldn't be any guest running at that point.
>>
>>>     (reverting cpu_init_hyp_mode() per cpu)
>>> - flush TLB
>>> - jump into identical mapping (using boot_hyp_pgd?)
>>> - disable MMU?
>>
>> Yes, and for that, you need to go back to an idmap
>>
>>> - restore vbar_el2 to __hyp_stub_vectors (or NULL?)
>>
>> It doesn't matter, you want to stay in HYP for the next kernel.
>
> Well, it depends on how the teardown is implemented. I'd imagined we'd
> have KVM tear itself down (restoring the hyp stub) as part of shut down.
> Later kexec/soft_restart would call the stub to get back to EL2.
>
> That way kexec doesn't need to know anything about KVM, and it has one
> path that works regardless of whether KVM is compiled into the kernel.

Initially, I thought that we would define kvm_arch_exit() and call it
somewhere in the middle of kexec path (no idea yet).
But Geoff suggested me to implement a new hvc call, HVC_CPU_SHUTDOWN(??),
and make it called via cpu_notifier(CPU_DYING_FROZEN) initiated by
machine_shutdown() from kernel_kexec().
(As you know, a hook is already there for cpu online in kvm/arm.c.)

Is this the right place to put teardown function?
(I'm not sure if this hook will be called also on a boot cpu.)

Thanks,
-Takahiro AKASHI



> Mark.
>
Mark Rutland Jan. 30, 2015, 12:14 p.m. UTC | #10
On Fri, Jan 30, 2015 at 06:10:53AM +0000, AKASHI Takahiro wrote:
> Hello Marc, Mark
> 
> Thank you for your useful comments.
> I need to look at Marc's commit(5a677ce044f) more carefully
> about trampoline code.
> 
> On 01/30/2015 03:47 AM, Mark Rutland wrote:
> > On Thu, Jan 29, 2015 at 10:59:45AM +0000, Marc Zyngier wrote:
> >>
> >> On 29/01/15 09:57, AKASHI Takahiro wrote:
> >>> Hello,
> >>>
> >>> On 01/27/2015 04:19 AM, Mark Rutland wrote:
> >>>> On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
> >>>>> Add runtime checks that fail the arm64 kexec syscall for situations that would
> >>>>> result in system instability do to problems in the KVM kernel support.
> >>>>> These checks should be removed when the KVM problems are resolved fixed.
> >>>>>
> >>>>> Signed-off-by: Geoff Levand <geoff@infradead.org>
> >>>>> ---
> >>>>>    arch/arm64/kernel/machine_kexec.c | 10 ++++++++++
> >>>>>    1 file changed, 10 insertions(+)
> >>>>>
> >>>>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> >>>>> index 3d84759..a36459d 100644
> >>>>> --- a/arch/arm64/kernel/machine_kexec.c
> >>>>> +++ b/arch/arm64/kernel/machine_kexec.c
> >>>>> @@ -16,6 +16,9 @@
> >>>>>    #include <asm/cacheflush.h>
> >>>>>    #include <asm/system_misc.h>
> >>>>>
> >>>>> +/* TODO: Remove this include when KVM can support a kexec reboot. */
> >>>>> +#include <asm/virt.h>
> >>>>> +
> >>>>>    /* Global variables for the relocate_kernel routine. */
> >>>>>    extern const unsigned char relocate_new_kernel[];
> >>>>>    extern const unsigned long relocate_new_kernel_size;
> >>>>> @@ -100,6 +103,13 @@ int machine_kexec_prepare(struct kimage *image)
> >>>>>
> >>>>>    	kexec_image_info(image);
> >>>>>
> >>>>> +	/* TODO: Remove this message when KVM can support a kexec reboot. */
> >>>>> +	if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) {
> >>>>> +		pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n",
> >>>>> +			__func__);
> >>>>> +		return -ENOSYS;
> >>>>> +	}
> >>>>
> >>>> If you really don't want to implement KVM teardown, surely this should
> >>>> be at the start of the series, so we don't have a point in the middle
> >>>> where things may explode in this case?
> >>>
> >>> I'm going to fix this KVM issue (teardown) in cooperation with Geoff.
> >>>
> >>> Looking into kvm init code, kvm_arch_init() in particular,
> >>> I guess that teardown function (kvm_arch_exit()) should do
> >>>     (reverting kvm_timer_hyp_init() per cpu)
> >>> - stop arch timer
> >>
> >> No need, there shouldn't be any guest running at that point.
> >>
> >>>     (reverting cpu_init_hyp_mode() per cpu)
> >>> - flush TLB
> >>> - jump into identical mapping (using boot_hyp_pgd?)
> >>> - disable MMU?
> >>
> >> Yes, and for that, you need to go back to an idmap
> >>
> >>> - restore vbar_el2 to __hyp_stub_vectors (or NULL?)
> >>
> >> It doesn't matter, you want to stay in HYP for the next kernel.
> >
> > Well, it depends on how the teardown is implemented. I'd imagined we'd
> > have KVM tear itself down (restoring the hyp stub) as part of shut down.
> > Later kexec/soft_restart would call the stub to get back to EL2.
> >
> > That way kexec doesn't need to know anything about KVM, and it has one
> > path that works regardless of whether KVM is compiled into the kernel.
> 
> Initially, I thought that we would define kvm_arch_exit() and call it
> somewhere in the middle of kexec path (no idea yet).
> But Geoff suggested me to implement a new hvc call, HVC_CPU_SHUTDOWN(??),
> and make it called via cpu_notifier(CPU_DYING_FROZEN) initiated by
> machine_shutdown() from kernel_kexec().
> (As you know, a hook is already there for cpu online in kvm/arm.c.)

I think it would make far more sense to have the KVM teardown entirely
contained within KVM (e.g. in kvm_arch_exit). That way there's no need
for an interdependency between KVM and kexec, and we'd be a step closer
to KVM as a module.

I'm not keen on having KVM teardown in the middle of the kexec path.

Mark.
Geoff Levand Jan. 30, 2015, 7:48 p.m. UTC | #11
Hi Takahiro.

On Fri, 2015-01-30 at 15:10 +0900, AKASHI Takahiro wrote:
> Initially, I thought that we would define kvm_arch_exit() and call it
> somewhere in the middle of kexec path (no idea yet).
> But Geoff suggested me to implement a new hvc call, HVC_CPU_SHUTDOWN(??),
> and make it called via cpu_notifier(CPU_DYING_FROZEN) initiated by
> machine_shutdown() from kernel_kexec().

As an initial implementation we can hook into the CPU_DYING_FROZEN
notifier sent to hyp_init_cpu_notify().  The longer term solution
should use kvm_arch_hardware_enable() and kvm_arch_hardware_disable().

The calls to cpu_notifier(CPU_DYING_FROZEN) are part of cpu hot
plug, and independent of kexec.  If someone were to add spin-table
cpu un-plug, then it would be used for that also.  It seems we should
be able to test without kexec by using cpu hot plug.

To tear down KVM you need to get back to hyp mode, and hence
the need for HVC_CPU_SHUTDOWN.  The sequence I envisioned would
be like this:

cpu_notifier(CPU_DYING_FROZEN)
 -> kvm_cpu_shutdown() 
    prepare for hvc
    -> HVC_CPU_SHUTDOWN
       now in hyp mode, do KVM tear down, restore default exception vectors

Once the default exception vectors are restored soft_restart()
can then execute the cpu_reset routine in EL2.

Some notes are here for those with access:  https://cards.linaro.org/browse/KWG-611

-Geoff
AKASHI Takahiro Feb. 2, 2015, 8:18 a.m. UTC | #12
Geoff,

On 01/31/2015 04:48 AM, Geoff Levand wrote:
> Hi Takahiro.
>
> On Fri, 2015-01-30 at 15:10 +0900, AKASHI Takahiro wrote:
>> Initially, I thought that we would define kvm_arch_exit() and call it
>> somewhere in the middle of kexec path (no idea yet).
>> But Geoff suggested me to implement a new hvc call, HVC_CPU_SHUTDOWN(??),
>> and make it called via cpu_notifier(CPU_DYING_FROZEN) initiated by
>> machine_shutdown() from kernel_kexec().
>
> As an initial implementation we can hook into the CPU_DYING_FROZEN
> notifier sent to hyp_init_cpu_notify().  The longer term solution
> should use kvm_arch_hardware_enable() and kvm_arch_hardware_disable().

Are these two different approaches? I mean,
kexec will initiate cpu hotplug:
kernel_exec() -> machine_shutdown() -> disable_nonboot_cpu()
    -> _cpu_down() -> cpu_notify_nofail(CPU_DEAD|...)

On the other hand, kvm already has a hook into kvm_arch_hardware_disable():
   (ignoring kvm_usage_count here)
kvm_cpu_hotplug(CPU_DYING) -> hardware_disable()
    -> hardware_disable_nolock() -> kvm_arch_hardware_disable()

So it seems that we don't have to add a new hook at hyp_init_cpu_notify()
if kvm_arch_hardware_disable() is properly implemented.
disable_nonboot_cpu() will not inovke cpu hotplug on *boot* cpu, and
we should handle it in a separate way though.

Do I misunderstand anything here?


-Takahiro AKASHI

> The calls to cpu_notifier(CPU_DYING_FROZEN) are part of cpu hot
> plug, and independent of kexec.  If someone were to add spin-table
> cpu un-plug, then it would be used for that also.  It seems we should
> be able to test without kexec by using cpu hot plug.
>
> To tear down KVM you need to get back to hyp mode, and hence
> the need for HVC_CPU_SHUTDOWN.  The sequence I envisioned would
> be like this:
>
> cpu_notifier(CPU_DYING_FROZEN)
>   -> kvm_cpu_shutdown()
>      prepare for hvc
>      -> HVC_CPU_SHUTDOWN
>         now in hyp mode, do KVM tear down, restore default exception vectors
>
> Once the default exception vectors are restored soft_restart()
> can then execute the cpu_reset routine in EL2.
>
> Some notes are here for those with access:  https://cards.linaro.org/browse/KWG-611
>
> -Geoff
>
Geoff Levand Feb. 6, 2015, 12:11 a.m. UTC | #13
Hi Takahiro,

On Mon, 2015-02-02 at 17:18 +0900, AKASHI Takahiro wrote:
> On 01/31/2015 04:48 AM, Geoff Levand wrote:
> > As an initial implementation we can hook into the CPU_DYING_FROZEN
> > notifier sent to hyp_init_cpu_notify().  The longer term solution
> > should use kvm_arch_hardware_enable() and kvm_arch_hardware_disable().
> 
> Are these two different approaches? 

Yes, these are two different solutions,  One initial work-around, and a
more involved proper solution.  Hooking into the CPU_DYING_FROZEN
notifier would be a initial fix.  The proper solution would be to move
the KVM setup to kvm_arch_hardware_enable(), and the shutdown to
kvm_arch_hardware_disable().


> kernel_exec() -> machine_shutdown() -> disable_nonboot_cpu()
>     -> _cpu_down() -> cpu_notify_nofail(CPU_DEAD|...)
> 
> On the other hand, kvm already has a hook into kvm_arch_hardware_disable():
>    (ignoring kvm_usage_count here)
> kvm_cpu_hotplug(CPU_DYING) -> hardware_disable()
>     -> hardware_disable_nolock() -> kvm_arch_hardware_disable()
> 
> So it seems that we don't have to add a new hook at hyp_init_cpu_notify()
> if kvm_arch_hardware_disable() is properly implemented.

Yes, that is correct.  But, as above, you would also need to update the
KVM startup to use kvm_arch_hardware_enable().

> disable_nonboot_cpu() will not inovke cpu hotplug on *boot* cpu, and
> we should handle it in a separate way though.

IIRC, the secondary cpus go through PSCI on shutdown, and that path
is working OK.  Maybe I am mistaken though.

The primary cpu shutdown (hyp stubs restored) is what is missing.  The
primary cpu goes through cpu_soft_restart(), and that is what is
currently failing.

-Geoff
AKASHI Takahiro Feb. 6, 2015, 4:18 a.m. UTC | #14
On 02/06/2015 09:11 AM, Geoff Levand wrote:
> Hi Takahiro,
>
> On Mon, 2015-02-02 at 17:18 +0900, AKASHI Takahiro wrote:
>> On 01/31/2015 04:48 AM, Geoff Levand wrote:
>>> As an initial implementation we can hook into the CPU_DYING_FROZEN
>>> notifier sent to hyp_init_cpu_notify().  The longer term solution
>>> should use kvm_arch_hardware_enable() and kvm_arch_hardware_disable().
>>
>> Are these two different approaches?
>
> Yes, these are two different solutions,  One initial work-around, and a
> more involved proper solution.  Hooking into the CPU_DYING_FROZEN
> notifier would be a initial fix.  The proper solution would be to move
> the KVM setup to kvm_arch_hardware_enable(), and the shutdown to
> kvm_arch_hardware_disable().
>
>
>> kernel_exec() -> machine_shutdown() -> disable_nonboot_cpu()
>>      -> _cpu_down() -> cpu_notify_nofail(CPU_DEAD|...)
>>
>> On the other hand, kvm already has a hook into kvm_arch_hardware_disable():
>>     (ignoring kvm_usage_count here)
>> kvm_cpu_hotplug(CPU_DYING) -> hardware_disable()
>>      -> hardware_disable_nolock() -> kvm_arch_hardware_disable()
>>
>> So it seems that we don't have to add a new hook at hyp_init_cpu_notify()
>> if kvm_arch_hardware_disable() is properly implemented.
>
> Yes, that is correct.  But, as above, you would also need to update the
> KVM startup to use kvm_arch_hardware_enable().
>
>> disable_nonboot_cpu() will not inovke cpu hotplug on *boot* cpu, and
>> we should handle it in a separate way though.
>
> IIRC, the secondary cpus go through PSCI on shutdown, and that path
> is working OK.  Maybe I am mistaken though.

If so, why should we add a hook at hyp_init_cpu_notify() as initial work-around?

> The primary cpu shutdown (hyp stubs restored) is what is missing.  The
> primary cpu goes through cpu_soft_restart(), and that is what is
> currently failing.

Yeah, we will call teardown function manually in soft_restart();

-Takahiro AKASHI
>
> -Geoff
>
Geoff Levand Feb. 6, 2015, 7:06 a.m. UTC | #15
On Fri, 2015-02-06 at 13:18 +0900, AKASHI Takahiro wrote:
> On 02/06/2015 09:11 AM, Geoff Levand wrote:
> > Hi Takahiro,
> >
> > On Mon, 2015-02-02 at 17:18 +0900, AKASHI Takahiro wrote:
> >> On 01/31/2015 04:48 AM, Geoff Levand wrote:
> >>> As an initial implementation we can hook into the CPU_DYING_FROZEN
> >>> notifier sent to hyp_init_cpu_notify().  The longer term solution
> >>> should use kvm_arch_hardware_enable() and kvm_arch_hardware_disable().
> >>
> >> Are these two different approaches?
> >
> > Yes, these are two different solutions,  One initial work-around, and a
> > more involved proper solution.  Hooking into the CPU_DYING_FROZEN
> > notifier would be a initial fix.  The proper solution would be to move
> > the KVM setup to kvm_arch_hardware_enable(), and the shutdown to
> > kvm_arch_hardware_disable().
> >
> >
> >> kernel_exec() -> machine_shutdown() -> disable_nonboot_cpu()
> >>      -> _cpu_down() -> cpu_notify_nofail(CPU_DEAD|...)
> >>
> >> On the other hand, kvm already has a hook into kvm_arch_hardware_disable():
> >>     (ignoring kvm_usage_count here)
> >> kvm_cpu_hotplug(CPU_DYING) -> hardware_disable()
> >>      -> hardware_disable_nolock() -> kvm_arch_hardware_disable()
> >>
> >> So it seems that we don't have to add a new hook at hyp_init_cpu_notify()
> >> if kvm_arch_hardware_disable() is properly implemented.
> >
> > Yes, that is correct.  But, as above, you would also need to update the
> > KVM startup to use kvm_arch_hardware_enable().
> >
> >> disable_nonboot_cpu() will not inovke cpu hotplug on *boot* cpu, and
> >> we should handle it in a separate way though.
> >
> > IIRC, the secondary cpus go through PSCI on shutdown, and that path
> > is working OK.  Maybe I am mistaken though.
> 
> If so, why should we add a hook at hyp_init_cpu_notify() as initial work-around?

To tear down KVM on the primary cpu.

> > The primary cpu shutdown (hyp stubs restored) is what is missing.  The
> > primary cpu goes through cpu_soft_restart(), and that is what is
> > currently failing.
> 
> Yeah, we will call teardown function manually in soft_restart();

I think the KVM tear down should happen independent of soft_restart().
When soft_restart() is called, KVM should have already been torn down.

-Geoff
diff mbox

Patch

diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index 3d84759..a36459d 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -16,6 +16,9 @@ 
 #include <asm/cacheflush.h>
 #include <asm/system_misc.h>
 
+/* TODO: Remove this include when KVM can support a kexec reboot. */
+#include <asm/virt.h>
+
 /* Global variables for the relocate_kernel routine. */
 extern const unsigned char relocate_new_kernel[];
 extern const unsigned long relocate_new_kernel_size;
@@ -100,6 +103,13 @@  int machine_kexec_prepare(struct kimage *image)
 
 	kexec_image_info(image);
 
+	/* TODO: Remove this message when KVM can support a kexec reboot. */
+	if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) {
+		pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n",
+			__func__);
+		return -ENOSYS;
+	}
+
 	return 0;
 }