[XEN,5/9] x86/smp: call x2apic_ap_setup() earlier

Message ID 7c13554e60cc76516922992b7faf911b91f99a2a.1699982111.git.krystian.hebel@3mdeb.com (mailing list archive)
State New, archived
Series x86: parallelize AP bring-up during boot

Commit Message

Krystian Hebel Nov. 14, 2023, 5:50 p.m. UTC
It used to be called from smp_callin(), however BUG_ON() was invoked on
multiple occasions before that. It may end up calling machine_restart()
which tries to get APIC ID for CPU running this code. If BSP detected
that x2APIC is enabled, get_apic_id() will try to use it for all CPUs.
Enabling x2APIC on secondary CPUs earlier protects against an endless
loop of #GP exceptions caused by attempts to read IA32_X2APIC_APICID
MSR while x2APIC is disabled in IA32_APIC_BASE.

Signed-off-by: Krystian Hebel <krystian.hebel@3mdeb.com>
---
 xen/arch/x86/smpboot.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
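
The failure mode described in the commit message can be illustrated with a small stand-alone model. This is only a sketch, not Xen code: every model_* name, the bsp_x2apic_enabled flag and the fault cap are hypothetical and exist purely for illustration. The one architectural fact relied upon is that the x2APIC MSRs raise #GP unless the EXTD bit (bit 10) is set in IA32_APIC_BASE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not Xen code: names only mirror the commit message. */
#define APIC_BASE_EXTD (1u << 10) /* x2APIC mode enable in IA32_APIC_BASE */
#define APIC_BASE_EN   (1u << 11) /* APIC global enable in IA32_APIC_BASE */

struct model_cpu {
    uint64_t apic_base; /* modelled IA32_APIC_BASE of this CPU */
    uint32_t apic_id;   /* modelled APIC ID of this CPU */
};

/* Global decision taken once by the BSP and then applied to all CPUs. */
static bool bsp_x2apic_enabled;

/* Models RDMSR of IA32_X2APIC_APICID; returning false stands for #GP. */
static bool model_read_x2apic_id(const struct model_cpu *cpu, uint32_t *id)
{
    if ( !(cpu->apic_base & APIC_BASE_EXTD) )
        return false; /* x2APIC MSRs are inaccessible in xAPIC mode */
    *id = cpu->apic_id;
    return true;
}

/* Models get_apic_id(): the access method follows the global flag only. */
static bool model_get_apic_id(const struct model_cpu *cpu, uint32_t *id)
{
    if ( bsp_x2apic_enabled )
        return model_read_x2apic_id(cpu, id);
    *id = cpu->apic_id; /* the xAPIC path never faults in this model */
    return true;
}

/* Models x2apic_ap_setup(): switch this CPU to the mode the BSP chose. */
static void model_x2apic_ap_setup(struct model_cpu *cpu)
{
    if ( bsp_x2apic_enabled )
        cpu->apic_base |= APIC_BASE_EN | APIC_BASE_EXTD;
}

/*
 * Models a crash path that needs the APIC ID. Each fault re-enters the same
 * path, so without a cap the loop would never terminate.
 */
static unsigned int model_crash_path_faults(const struct model_cpu *cpu,
                                            unsigned int cap)
{
    unsigned int faults = 0;
    uint32_t id;

    assert(cap > 0);
    while ( faults < cap && !model_get_apic_id(cpu, &id) )
        faults++;
    return faults;
}
```

In this model, an AP that is still in xAPIC mode while bsp_x2apic_enabled is set faults on every attempt until the cap is reached, whereas calling model_x2apic_ap_setup() first lets the crash path obtain the ID without faulting.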

Comments

Jan Beulich Feb. 7, 2024, 5:02 p.m. UTC | #1
On 14.11.2023 18:50, Krystian Hebel wrote:
> It used to be called from smp_callin(), however BUG_ON() was invoked on
> multiple occasions before that. It may end up calling machine_restart()
> which tries to get APIC ID for CPU running this code. If BSP detected
> that x2APIC is enabled, get_apic_id() will try to use it for all CPUs.
> Enabling x2APIC on secondary CPUs earlier protects against an endless
> loop of #GP exceptions caused by attempts to read IA32_X2APIC_APICID
> MSR while x2APIC is disabled in IA32_APIC_BASE.
> 
> Signed-off-by: Krystian Hebel <krystian.hebel@3mdeb.com>
> ---
>  xen/arch/x86/smpboot.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
> index 8ae65ab1769f..a3895dafa267 100644
> --- a/xen/arch/x86/smpboot.c
> +++ b/xen/arch/x86/smpboot.c
> @@ -184,7 +184,6 @@ static void smp_callin(void)
>       * update until we finish. We are free to set up this CPU: first the APIC.
>       */
>      Dprintk("CALLIN, before setup_local_APIC().\n");
> -    x2apic_ap_setup();
>      setup_local_APIC(false);
>  
>      /* Save our processor parameters. */
> @@ -351,6 +350,14 @@ void start_secondary(void *unused)
>      get_cpu_info()->xen_cr3 = 0;
>      get_cpu_info()->pv_cr3 = 0;
>  
> +    /*
> +     * BUG_ON() used in load_system_tables() and later code may end up calling
> +     * machine_restart() which tries to get APIC ID for CPU running this code.
> +     * If BSP detected that x2APIC is enabled, get_apic_id() will try to use it
> +     * for _all_ CPUs. Enable x2APIC on secondary CPUs now so we won't end up
> +     * with endless #GP loop.
> +     */
> +    x2apic_ap_setup();
>      load_system_tables();

While I find the argument convincing, I seem to recall that there was a
firm plan to have load_system_tables() as early as possible. Andrew?

Jan
Krystian Hebel March 12, 2024, 4:02 p.m. UTC | #2
On 7.02.2024 18:02, Jan Beulich wrote:
> On 14.11.2023 18:50, Krystian Hebel wrote:
>> It used to be called from smp_callin(), however BUG_ON() was invoked on
>> multiple occasions before that. It may end up calling machine_restart()
>> which tries to get APIC ID for CPU running this code. If BSP detected
>> that x2APIC is enabled, get_apic_id() will try to use it for all CPUs.
>> Enabling x2APIC on secondary CPUs earlier protects against an endless
>> loop of #GP exceptions caused by attempts to read IA32_X2APIC_APICID
>> MSR while x2APIC is disabled in IA32_APIC_BASE.
>>
>> Signed-off-by: Krystian Hebel <krystian.hebel@3mdeb.com>
>> ---
>>   xen/arch/x86/smpboot.c | 9 ++++++++-
>>   1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
>> index 8ae65ab1769f..a3895dafa267 100644
>> --- a/xen/arch/x86/smpboot.c
>> +++ b/xen/arch/x86/smpboot.c
>> @@ -184,7 +184,6 @@ static void smp_callin(void)
>>        * update until we finish. We are free to set up this CPU: first the APIC.
>>        */
>>       Dprintk("CALLIN, before setup_local_APIC().\n");
>> -    x2apic_ap_setup();
>>       setup_local_APIC(false);
>>   
>>       /* Save our processor parameters. */
>> @@ -351,6 +350,14 @@ void start_secondary(void *unused)
>>       get_cpu_info()->xen_cr3 = 0;
>>       get_cpu_info()->pv_cr3 = 0;
>>   
>> +    /*
>> +     * BUG_ON() used in load_system_tables() and later code may end up calling
>> +     * machine_restart() which tries to get APIC ID for CPU running this code.
>> +     * If BSP detected that x2APIC is enabled, get_apic_id() will try to use it
>> +     * for _all_ CPUs. Enable x2APIC on secondary CPUs now so we won't end up
>> +     * with endless #GP loop.
>> +     */
>> +    x2apic_ap_setup();
>>       load_system_tables();
> While I find the argument convincing, I seem to recall that there was a
> firm plan to have load_system_tables() as early as possible. Andrew?
This is where the code failed for me during testing. How about moving
x2apic_ap_setup() into load_system_tables(), just before BUG_ON? Or maybe
move those BUG_ON one level higher, after load_system_tables() returns?
Either way some code will end up in a place it doesn't belong, but I'd
argue that BUG_ON is only useful if it itself doesn't crash.
>
> Jan
Jan Beulich March 13, 2024, 1:05 p.m. UTC | #3
On 12.03.2024 17:02, Krystian Hebel wrote:
> 
> On 7.02.2024 18:02, Jan Beulich wrote:
>> On 14.11.2023 18:50, Krystian Hebel wrote:
>>> It used to be called from smp_callin(), however BUG_ON() was invoked on
>>> multiple occasions before that. It may end up calling machine_restart()
>>> which tries to get APIC ID for CPU running this code. If BSP detected
>>> that x2APIC is enabled, get_apic_id() will try to use it for all CPUs.
>>> Enabling x2APIC on secondary CPUs earlier protects against an endless
>>> loop of #GP exceptions caused by attempts to read IA32_X2APIC_APICID
>>> MSR while x2APIC is disabled in IA32_APIC_BASE.
>>>
>>> Signed-off-by: Krystian Hebel <krystian.hebel@3mdeb.com>
>>> ---
>>>   xen/arch/x86/smpboot.c | 9 ++++++++-
>>>   1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
>>> index 8ae65ab1769f..a3895dafa267 100644
>>> --- a/xen/arch/x86/smpboot.c
>>> +++ b/xen/arch/x86/smpboot.c
>>> @@ -184,7 +184,6 @@ static void smp_callin(void)
>>>        * update until we finish. We are free to set up this CPU: first the APIC.
>>>        */
>>>       Dprintk("CALLIN, before setup_local_APIC().\n");
>>> -    x2apic_ap_setup();
>>>       setup_local_APIC(false);
>>>   
>>>       /* Save our processor parameters. */
>>> @@ -351,6 +350,14 @@ void start_secondary(void *unused)
>>>       get_cpu_info()->xen_cr3 = 0;
>>>       get_cpu_info()->pv_cr3 = 0;
>>>   
>>> +    /*
>>> +     * BUG_ON() used in load_system_tables() and later code may end up calling
>>> +     * machine_restart() which tries to get APIC ID for CPU running this code.
>>> +     * If BSP detected that x2APIC is enabled, get_apic_id() will try to use it
>>> +     * for _all_ CPUs. Enable x2APIC on secondary CPUs now so we won't end up
>>> +     * with endless #GP loop.
>>> +     */
>>> +    x2apic_ap_setup();
>>>       load_system_tables();
>> While I find the argument convincing, I seem to recall that there was a
>> firm plan to have load_system_tables() as early as possible. Andrew?
> This is where the code failed for me during testing. How about moving
> x2apic_ap_setup() into load_system_tables(),

How does a call to x2apic_ap_setup() fit in a function named
load_system_tables()?

> just before BUG_ON? Or maybe
> move those BUG_ON one level higher, after load_system_tables() returns?

But they're there for a reason.

> Either way some code will end up in a place it doesn't belong, but I'd
> argue that BUG_ON is only useful if it itself doesn't crash.

I guess I don't understand this: That BUG_ON() is already guarded by a
system_state check, to prevent it uselessly hanging the system.

In any event - besides you still wanting to get input from Andrew, it
ought to be clear that anything unusual / unexpected will require extra
justification in the description.

Jan
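
The guard mentioned in the last reply can be pictured with the following sketch. It is a hypothetical model rather than the actual contents of load_system_tables(): the state names, the model_* helpers and the use of a stack alignment test as the checked condition are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not Xen code: state names are illustrative only. */
enum model_state { MODEL_EARLY_BOOT, MODEL_BOOT, MODEL_ACTIVE };

/* Assumed example condition: the address is not 16-byte aligned. */
static bool model_stack_misaligned(uintptr_t stack_bottom)
{
    return (stack_bottom & 0xf) != 0;
}

/*
 * Returns true when the guarded check would fire. While the modelled system
 * is in its earliest state the check is skipped, so a failure cannot lead
 * into a crash path that is not usable yet.
 */
static bool model_guarded_bug(enum model_state state, bool condition)
{
    return state != MODEL_EARLY_BOOT && condition;
}
```

With these definitions a misaligned address is ignored in MODEL_EARLY_BOOT and reported in the later states. Whether the state in which secondary CPUs start is covered by the real guard is exactly the point under discussion in the thread.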
Patch

diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 8ae65ab1769f..a3895dafa267 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -184,7 +184,6 @@  static void smp_callin(void)
      * update until we finish. We are free to set up this CPU: first the APIC.
      */
     Dprintk("CALLIN, before setup_local_APIC().\n");
-    x2apic_ap_setup();
     setup_local_APIC(false);
 
     /* Save our processor parameters. */
@@ -351,6 +350,14 @@  void start_secondary(void *unused)
     get_cpu_info()->xen_cr3 = 0;
     get_cpu_info()->pv_cr3 = 0;
 
+    /*
+     * BUG_ON() used in load_system_tables() and later code may end up calling
+     * machine_restart() which tries to get APIC ID for CPU running this code.
+     * If BSP detected that x2APIC is enabled, get_apic_id() will try to use it
+     * for _all_ CPUs. Enable x2APIC on secondary CPUs now so we won't end up
+     * with endless #GP loop.
+     */
+    x2apic_ap_setup();
     load_system_tables();
 
     /* Full exception support from here on in. */