Message ID | 7c13554e60cc76516922992b7faf911b91f99a2a.1699982111.git.krystian.hebel@3mdeb.com (mailing list archive) |
---|---|
State | New, archived |
Series | x86: parallelize AP bring-up during boot |
On 14.11.2023 18:50, Krystian Hebel wrote:
> It used to be called from smp_callin(), however BUG_ON() was invoked on
> multiple occasions before that. It may end up calling machine_restart()
> which tries to get APIC ID for CPU running this code. If BSP detected
> that x2APIC is enabled, get_apic_id() will try to use it for all CPUs.
> Enabling x2APIC on secondary CPUs earlier protects against an endless
> loop of #GP exceptions caused by attempts to read IA32_X2APIC_APICID
> MSR while x2APIC is disabled in IA32_APIC_BASE.
>
> Signed-off-by: Krystian Hebel <krystian.hebel@3mdeb.com>
> ---
>  xen/arch/x86/smpboot.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
> index 8ae65ab1769f..a3895dafa267 100644
> --- a/xen/arch/x86/smpboot.c
> +++ b/xen/arch/x86/smpboot.c
> @@ -184,7 +184,6 @@ static void smp_callin(void)
>       * update until we finish. We are free to set up this CPU: first the APIC.
>       */
>      Dprintk("CALLIN, before setup_local_APIC().\n");
> -    x2apic_ap_setup();
>      setup_local_APIC(false);
>
>      /* Save our processor parameters. */
> @@ -351,6 +350,14 @@ void start_secondary(void *unused)
>      get_cpu_info()->xen_cr3 = 0;
>      get_cpu_info()->pv_cr3 = 0;
>
> +    /*
> +     * BUG_ON() used in load_system_tables() and later code may end up calling
> +     * machine_restart() which tries to get APIC ID for CPU running this code.
> +     * If BSP detected that x2APIC is enabled, get_apic_id() will try to use it
> +     * for _all_ CPUs. Enable x2APIC on secondary CPUs now so we won't end up
> +     * with endless #GP loop.
> +     */
> +    x2apic_ap_setup();
>      load_system_tables();

While I find the argument convincing, I seem to recall that there was a
firm plan to have load_system_tables() as early as possible. Andrew?

Jan
On 7.02.2024 18:02, Jan Beulich wrote:
> On 14.11.2023 18:50, Krystian Hebel wrote:
>> It used to be called from smp_callin(), however BUG_ON() was invoked on
>> multiple occasions before that. It may end up calling machine_restart()
>> which tries to get APIC ID for CPU running this code. If BSP detected
>> that x2APIC is enabled, get_apic_id() will try to use it for all CPUs.
>> Enabling x2APIC on secondary CPUs earlier protects against an endless
>> loop of #GP exceptions caused by attempts to read IA32_X2APIC_APICID
>> MSR while x2APIC is disabled in IA32_APIC_BASE.
>>
>> Signed-off-by: Krystian Hebel <krystian.hebel@3mdeb.com>
>> ---
>>  xen/arch/x86/smpboot.c | 9 ++++++++-
>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
>> index 8ae65ab1769f..a3895dafa267 100644
>> --- a/xen/arch/x86/smpboot.c
>> +++ b/xen/arch/x86/smpboot.c
>> @@ -184,7 +184,6 @@ static void smp_callin(void)
>>       * update until we finish. We are free to set up this CPU: first the APIC.
>>       */
>>      Dprintk("CALLIN, before setup_local_APIC().\n");
>> -    x2apic_ap_setup();
>>      setup_local_APIC(false);
>>
>>      /* Save our processor parameters. */
>> @@ -351,6 +350,14 @@ void start_secondary(void *unused)
>>      get_cpu_info()->xen_cr3 = 0;
>>      get_cpu_info()->pv_cr3 = 0;
>>
>> +    /*
>> +     * BUG_ON() used in load_system_tables() and later code may end up calling
>> +     * machine_restart() which tries to get APIC ID for CPU running this code.
>> +     * If BSP detected that x2APIC is enabled, get_apic_id() will try to use it
>> +     * for _all_ CPUs. Enable x2APIC on secondary CPUs now so we won't end up
>> +     * with endless #GP loop.
>> +     */
>> +    x2apic_ap_setup();
>>      load_system_tables();
> While I find the argument convincing, I seem to recall that there was a
> firm plan to have load_system_tables() as early as possible. Andrew?

This is where the code failed for me during testing. How about moving
x2apic_ap_setup() into load_system_tables(), just before BUG_ON? Or maybe
move those BUG_ON one level higher, after load_system_tables() returns?
Either way some code will end up in place it doesn't belong, but I'd
argue that BUG_ON is only useful if it itself doesn't crash.

>
> Jan
On 12.03.2024 17:02, Krystian Hebel wrote:
>
> On 7.02.2024 18:02, Jan Beulich wrote:
>> On 14.11.2023 18:50, Krystian Hebel wrote:
>>> It used to be called from smp_callin(), however BUG_ON() was invoked on
>>> multiple occasions before that. It may end up calling machine_restart()
>>> which tries to get APIC ID for CPU running this code. If BSP detected
>>> that x2APIC is enabled, get_apic_id() will try to use it for all CPUs.
>>> Enabling x2APIC on secondary CPUs earlier protects against an endless
>>> loop of #GP exceptions caused by attempts to read IA32_X2APIC_APICID
>>> MSR while x2APIC is disabled in IA32_APIC_BASE.
>>>
>>> Signed-off-by: Krystian Hebel <krystian.hebel@3mdeb.com>
>>> ---
>>>  xen/arch/x86/smpboot.c | 9 ++++++++-
>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
>>> index 8ae65ab1769f..a3895dafa267 100644
>>> --- a/xen/arch/x86/smpboot.c
>>> +++ b/xen/arch/x86/smpboot.c
>>> @@ -184,7 +184,6 @@ static void smp_callin(void)
>>>       * update until we finish. We are free to set up this CPU: first the APIC.
>>>       */
>>>      Dprintk("CALLIN, before setup_local_APIC().\n");
>>> -    x2apic_ap_setup();
>>>      setup_local_APIC(false);
>>>
>>>      /* Save our processor parameters. */
>>> @@ -351,6 +350,14 @@ void start_secondary(void *unused)
>>>      get_cpu_info()->xen_cr3 = 0;
>>>      get_cpu_info()->pv_cr3 = 0;
>>>
>>> +    /*
>>> +     * BUG_ON() used in load_system_tables() and later code may end up calling
>>> +     * machine_restart() which tries to get APIC ID for CPU running this code.
>>> +     * If BSP detected that x2APIC is enabled, get_apic_id() will try to use it
>>> +     * for _all_ CPUs. Enable x2APIC on secondary CPUs now so we won't end up
>>> +     * with endless #GP loop.
>>> +     */
>>> +    x2apic_ap_setup();
>>>      load_system_tables();
>> While I find the argument convincing, I seem to recall that there was a
>> firm plan to have load_system_tables() as early as possible. Andrew?
> This is where the code failed for me during testing. How about moving
> x2apic_ap_setup() into load_system_tables(),

How does a call to x2apic_ap_setup() fit in a function named
load_system_tables()?

> just before BUG_ON? Or maybe
> move those BUG_ON one level higher, after load_system_tables() returns?

But they're there for a reason.

> Either way some code will end up in place it doesn't belong, but I'd
> argue that BUG_ON is only useful if it itself doesn't crash.

I guess I don't understand this: That BUG_ON() is already guarded by a
system_state check, to prevent it uselessly hanging the system.

In any event - besides you still wanting to get input from Andrew, it
ought to be clear that anything unusual / unexpected will require extra
justification in the description.

Jan
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 8ae65ab1769f..a3895dafa267 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -184,7 +184,6 @@ static void smp_callin(void)
      * update until we finish. We are free to set up this CPU: first the APIC.
      */
     Dprintk("CALLIN, before setup_local_APIC().\n");
-    x2apic_ap_setup();
     setup_local_APIC(false);
 
     /* Save our processor parameters. */
@@ -351,6 +350,14 @@ void start_secondary(void *unused)
     get_cpu_info()->xen_cr3 = 0;
     get_cpu_info()->pv_cr3 = 0;
 
+    /*
+     * BUG_ON() used in load_system_tables() and later code may end up calling
+     * machine_restart() which tries to get APIC ID for CPU running this code.
+     * If BSP detected that x2APIC is enabled, get_apic_id() will try to use it
+     * for _all_ CPUs. Enable x2APIC on secondary CPUs now so we won't end up
+     * with endless #GP loop.
+     */
+    x2apic_ap_setup();
     load_system_tables();
 
     /* Full exception support from here on in. */
It used to be called from smp_callin(), however BUG_ON() was invoked on
multiple occasions before that. It may end up calling machine_restart()
which tries to get APIC ID for CPU running this code. If BSP detected
that x2APIC is enabled, get_apic_id() will try to use it for all CPUs.
Enabling x2APIC on secondary CPUs earlier protects against an endless
loop of #GP exceptions caused by attempts to read IA32_X2APIC_APICID
MSR while x2APIC is disabled in IA32_APIC_BASE.

Signed-off-by: Krystian Hebel <krystian.hebel@3mdeb.com>
---
 xen/arch/x86/smpboot.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
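
For readers following the thread, the failure mode described in the commit message can be illustrated with a small user-space model. This is only a sketch and not Xen code: every `model_*` name is hypothetical and made up for illustration. The MSR indices and the IA32_APIC_BASE bit positions are the architectural values documented in the Intel SDM. A real RDMSR of IA32_X2APIC_APICID on a CPU that is not in x2APIC mode raises #GP; the model reports that case by returning false.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural constants (Intel SDM). */
#define MSR_IA32_APIC_BASE      0x0000001bu
#define MSR_IA32_X2APIC_APICID  0x00000802u
#define APIC_BASE_EXTD          (UINT64_C(1) << 10) /* x2APIC mode enable */
#define APIC_BASE_ENABLE        (UINT64_C(1) << 11) /* APIC global enable */

/* Hypothetical per-CPU state for this model; not a Xen structure. */
struct model_cpu {
    uint64_t apic_base; /* modelled value of IA32_APIC_BASE */
    uint32_t apic_id;   /* ID the x2APIC ID MSR would return */
};

/* A CPU is in x2APIC mode only when both EN and EXTD are set. */
bool model_in_x2apic_mode(const struct model_cpu *c)
{
    uint64_t both = APIC_BASE_ENABLE | APIC_BASE_EXTD;

    return (c->apic_base & both) == both;
}

/*
 * Model of reading IA32_X2APIC_APICID. Returns false where real hardware
 * would raise #GP, i.e. when the CPU is not in x2APIC mode.
 */
bool model_rdmsr_x2apic_id(const struct model_cpu *c, uint32_t *id)
{
    if ( !model_in_x2apic_mode(c) )
        return false;

    *id = c->apic_id;
    return true;
}

/* Model of what x2apic_ap_setup() achieves: switch this CPU to x2APIC mode. */
void model_x2apic_ap_setup(struct model_cpu *c)
{
    c->apic_base |= APIC_BASE_ENABLE | APIC_BASE_EXTD;
}
```

In this model, an AP that starts with only `APIC_BASE_ENABLE` set fails `model_rdmsr_x2apic_id()` until `model_x2apic_ap_setup()` has run. That mirrors the ordering the patch is concerned with: if the BSP has decided that all CPUs use the x2APIC path, any code on the AP that asks for its APIC ID before the mode switch takes the faulting path.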