| Message ID | 1566177928-19114-16-git-send-email-chao.gao@intel.com (mailing list archive) |
|---|---|
| State | Superseded |
| Series | improve late microcode loading |
On 19/08/2019 02:25, Chao Gao wrote:
> Register an NMI callback. This callback busy-loops on threads
> which are waiting for loading completion. Control threads send an
> NMI to slave threads to prevent NMI acceptance during ucode loading.
>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
> Changes in v9:
>  - control threads send NMI to all other threads. Slave threads will
>    stay in the NMI handler to prevent NMI acceptance during ucode
>    loading. Note that a self-NMI is invalid according to the SDM.

To me this looks like a half-measure: why keep only slave threads in
the NMI handler, when master threads can update the microcode from
inside the NMI handler as well?

You mention that self-nmi is invalid, but Xen has self_nmi() which is
used for apply_alternatives() during boot, so can be trusted to work.

I experimented a bit with the following approach: after loading_state
becomes LOADING_CALLIN, each cpu issues a self_nmi() and rendezvouses
via cpu_callin_map into LOADING_ENTER to do a ucode update directly
in the NMI handler. And it seems to work.

A separate question is about the safety of this approach: can we be
sure that a ucode update would not reset the status of the NMI latch?
I.e. can it cause another NMI to be delivered while Xen already
handles one?

>  - s/rep_nop/cpu_relax
>  - remove the debug message in microcode_nmi_callback(). Printing a
>    debug message would take a long time and the control thread may
>    time out.
>  - rebase and fix conflicts

Thanks,
Sergey
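For readers following the thread, a minimal sketch of the rendezvous Sergey describes — not his actual patch (that appears later in the thread), with error and timeout handling omitted. `loading_state`, `cpu_callin_map`, `self_nmi()` and `cpu_relax()` are names used in the thread; `is_first_sibling()` and `do_apply()` are hypothetical stand-ins for the sibling-mask check and `microcode_ops->apply_microcode()`:

```c
/* Rendezvous sketch (simplified): every CPU traps itself into the NMI
 * handler via self_nmi() and the update happens there. */
static enum {
    LOADING_PREPARE, LOADING_CALLIN, LOADING_ENTER, LOADING_EXIT
} loading_state = LOADING_PREPARE;
static cpumask_t cpu_callin_map;

static void rendezvous(void)                /* runs on every CPU */
{
    while ( loading_state != LOADING_CALLIN )
        cpu_relax();
    self_nmi();                             /* continue in the NMI handler */
}

static int nmi_cb(const struct cpu_user_regs *regs, int cpu)
{
    cpumask_set_cpu(cpu, &cpu_callin_map);  /* control CPU counts call-ins  */

    while ( loading_state == LOADING_CALLIN )
        cpu_relax();                        /* ...then flips state to ENTER */

    if ( loading_state == LOADING_ENTER && is_first_sibling(cpu) )
        do_apply();                         /* one ucode load per core      */

    while ( loading_state != LOADING_EXIT )
        cpu_relax();
    return 0;
}
```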
On Fri, Aug 23, 2019 at 09:46:37AM +0100, Sergey Dyasli wrote:
> On 19/08/2019 02:25, Chao Gao wrote:
>> Register an NMI callback. This callback busy-loops on threads
>> which are waiting for loading completion. Control threads send an
>> NMI to slave threads to prevent NMI acceptance during ucode loading.
>>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> ---
>> Changes in v9:
>>  - control threads send NMI to all other threads. Slave threads will
>>    stay in the NMI handler to prevent NMI acceptance during ucode
>>    loading. Note that a self-NMI is invalid according to the SDM.
>
> To me this looks like a half-measure: why keep only slave threads in
> the NMI handler, when master threads can update the microcode from
> inside the NMI handler as well?

No special reason. The issue we want to address is that slave threads
might go off to handle an NMI and access MSRs while the master thread
is loading ucode. So we only keep slave threads in the NMI handler.

> You mention that self-nmi is invalid, but Xen has self_nmi() which is
> used for apply_alternatives() during boot, so can be trusted to work.

Sorry, I meant using the "self" shorthand to send a self-NMI. I tried
the self shorthand but got an APIC error. And I agree that it is
better to make the slave thread call self_nmi() itself.

> I experimented a bit with the following approach: after loading_state
> becomes LOADING_CALLIN, each cpu issues a self_nmi() and rendezvouses
> via cpu_callin_map into LOADING_ENTER to do a ucode update directly
> in the NMI handler. And it seems to work.
>
> A separate question is about the safety of this approach: can we be
> sure that a ucode update would not reset the status of the NMI latch?
> I.e. can it cause another NMI to be delivered while Xen already
> handles one?

Ashok, what's your opinion on Sergey's approach and his concern?

Thanks,
Chao
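To illustrate the hazard Chao describes (a sketch, not code from the series): while the master thread of a core writes `MSR_IA32_UCODE_WRITE` (MSR 0x79) to trigger the load, its hyperthread sibling shares the microcode being replaced, so any concurrent MSR access on the sibling — for instance from an NMI handler — races with the update; `ucode_data` and `val` are illustrative placeholders:

```c
/* Core n, thread 0 (master): triggers the microcode load. */
wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)ucode_data);

/* Core n, thread 1 (slave), concurrently, inside an NMI handler: */
uint64_t val;
rdmsrl(MSR_IA32_APERF, val);   /* unsafe while the sibling loads ucode */
```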
On Mon, Aug 26, 2019 at 04:07:59PM +0800, Chao Gao wrote:
> On Fri, Aug 23, 2019 at 09:46:37AM +0100, Sergey Dyasli wrote:
>> On 19/08/2019 02:25, Chao Gao wrote:
>>> Register an NMI callback. This callback busy-loops on threads
>>> which are waiting for loading completion. Control threads send an
>>> NMI to slave threads to prevent NMI acceptance during ucode loading.
>>>
>>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>>> ---
>>> Changes in v9:
>>>  - control threads send NMI to all other threads. Slave threads will
>>>    stay in the NMI handler to prevent NMI acceptance during ucode
>>>    loading. Note that a self-NMI is invalid according to the SDM.
>>
>> To me this looks like a half-measure: why keep only slave threads in
>> the NMI handler, when master threads can update the microcode from
>> inside the NMI handler as well?
>
> No special reason. The issue we want to address is that slave threads
> might go off to handle an NMI and access MSRs while the master thread
> is loading ucode. So we only keep slave threads in the NMI handler.
>
>> You mention that self-nmi is invalid, but Xen has self_nmi() which is
>> used for apply_alternatives() during boot, so can be trusted to work.
>
> Sorry, I meant using the "self" shorthand to send a self-NMI. I tried
> the self shorthand but got an APIC error. And I agree that it is
> better to make the slave thread call self_nmi() itself.
>
>> I experimented a bit with the following approach: after loading_state
>> becomes LOADING_CALLIN, each cpu issues a self_nmi() and rendezvouses
>> via cpu_callin_map into LOADING_ENTER to do a ucode update directly
>> in the NMI handler. And it seems to work.
>>
>> A separate question is about the safety of this approach: can we be
>> sure that a ucode update would not reset the status of the NMI latch?
>> I.e. can it cause another NMI to be delivered while Xen already
>> handles one?
>
> Ashok, what's your opinion on Sergey's approach and his concern?

Hi Sergey,

I talked with Ashok. We think your approach is better. I will follow
your approach in v10. It would be very helpful if you posted your
patch so that I can just rebase it onto the other patches.

Thanks,
Chao
On 27/08/2019 05:52, Chao Gao wrote:
> On Mon, Aug 26, 2019 at 04:07:59PM +0800, Chao Gao wrote:
>> On Fri, Aug 23, 2019 at 09:46:37AM +0100, Sergey Dyasli wrote:
>>> On 19/08/2019 02:25, Chao Gao wrote:
>>>> Register an NMI callback. This callback busy-loops on threads
>>>> which are waiting for loading completion. Control threads send an
>>>> NMI to slave threads to prevent NMI acceptance during ucode loading.
>>>>
>>>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>>>> ---
>>>> Changes in v9:
>>>>  - control threads send NMI to all other threads. Slave threads will
>>>>    stay in the NMI handler to prevent NMI acceptance during ucode
>>>>    loading. Note that a self-NMI is invalid according to the SDM.
>>>
>>> To me this looks like a half-measure: why keep only slave threads in
>>> the NMI handler, when master threads can update the microcode from
>>> inside the NMI handler as well?
>>
>> No special reason. The issue we want to address is that slave threads
>> might go off to handle an NMI and access MSRs while the master thread
>> is loading ucode. So we only keep slave threads in the NMI handler.
>>
>>> You mention that self-nmi is invalid, but Xen has self_nmi() which is
>>> used for apply_alternatives() during boot, so can be trusted to work.
>>
>> Sorry, I meant using the "self" shorthand to send a self-NMI. I tried
>> the self shorthand but got an APIC error. And I agree that it is
>> better to make the slave thread call self_nmi() itself.
>>
>>> I experimented a bit with the following approach: after loading_state
>>> becomes LOADING_CALLIN, each cpu issues a self_nmi() and rendezvouses
>>> via cpu_callin_map into LOADING_ENTER to do a ucode update directly
>>> in the NMI handler. And it seems to work.
>>>
>>> A separate question is about the safety of this approach: can we be
>>> sure that a ucode update would not reset the status of the NMI latch?
>>> I.e. can it cause another NMI to be delivered while Xen already
>>> handles one?
>>
>> Ashok, what's your opinion on Sergey's approach and his concern?
>
> Hi Sergey,
>
> I talked with Ashok. We think your approach is better. I will follow
> your approach in v10. It would be very helpful if you posted your
> patch so that I can just rebase it onto the other patches.

Sure thing. The code below is my first attempt at improving the
original patch. It could benefit from some further refactoring.

```diff
---
 xen/arch/x86/microcode.c | 108 ++++++++++++++++++++++++++++-----------
 1 file changed, 79 insertions(+), 29 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 91f9e811f8..ba2363406f 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -36,8 +36,10 @@
 #include <xen/earlycpio.h>
 #include <xen/watchdog.h>
 
+#include <asm/apic.h>
 #include <asm/delay.h>
 #include <asm/msr.h>
+#include <asm/nmi.h>
 #include <asm/processor.h>
 #include <asm/setup.h>
 #include <asm/microcode.h>
@@ -232,6 +234,7 @@ DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
  */
 static cpumask_t cpu_callin_map;
 static atomic_t cpu_out, cpu_updated;
+struct microcode_patch *nmi_patch;
 
 /*
  * Return a patch that covers current CPU. If there are multiple patches,
@@ -337,15 +340,25 @@ static int microcode_update_cpu(const struct microcode_patch *patch)
     return err;
 }
 
+static void slave_thread_work(void)
+{
+    /* Do nothing, just wait */
+    while ( loading_state != LOADING_EXIT )
+        cpu_relax();
+}
+
 static int slave_thread_fn(void)
 {
-    unsigned int cpu = smp_processor_id();
     unsigned int master = cpumask_first(this_cpu(cpu_sibling_mask));
 
     while ( loading_state != LOADING_CALLIN )
+    {
+        if ( loading_state == LOADING_EXIT )
+            return 0;
         cpu_relax();
+    }
 
-    cpumask_set_cpu(cpu, &cpu_callin_map);
+    self_nmi();
 
     while ( loading_state != LOADING_EXIT )
         cpu_relax();
@@ -356,30 +369,35 @@ static int slave_thread_fn(void)
     return 0;
 }
 
-static int master_thread_fn(const struct microcode_patch *patch)
+static void master_thread_work(void)
 {
-    unsigned int cpu = smp_processor_id();
-    int ret = 0;
-
-    while ( loading_state != LOADING_CALLIN )
-        cpu_relax();
-
-    cpumask_set_cpu(cpu, &cpu_callin_map);
+    int ret;
 
     while ( loading_state != LOADING_ENTER )
+    {
+        if ( loading_state == LOADING_EXIT )
+            return;
         cpu_relax();
+    }
 
-    /*
-     * If an error happened, control thread would set 'loading_state'
-     * to LOADING_EXIT. Don't perform ucode loading for this case
-     */
-    if ( loading_state == LOADING_EXIT )
-        return ret;
-
-    ret = microcode_ops->apply_microcode(patch);
+    ret = microcode_ops->apply_microcode(nmi_patch);
     if ( !ret )
         atomic_inc(&cpu_updated);
     atomic_inc(&cpu_out);
+}
+
+static int master_thread_fn(const struct microcode_patch *patch)
+{
+    int ret = 0;
+
+    while ( loading_state != LOADING_CALLIN )
+    {
+        if ( loading_state == LOADING_EXIT )
+            return ret;
+        cpu_relax();
+    }
+
+    self_nmi();
 
     while ( loading_state != LOADING_EXIT )
         cpu_relax();
@@ -387,35 +405,40 @@ static int master_thread_fn(const struct microcode_patch *patch)
     return ret;
 }
 
-static int control_thread_fn(const struct microcode_patch *patch)
+static void control_thread_work(void)
 {
-    unsigned int cpu = smp_processor_id(), done;
-    unsigned long tick;
     int ret;
 
-    /* Allow threads to call in */
-    loading_state = LOADING_CALLIN;
-    smp_mb();
-
-    cpumask_set_cpu(cpu, &cpu_callin_map);
-
     /* Waiting for all threads calling in */
     ret = wait_for_condition(wait_cpu_callin,
                              (void *)(unsigned long)num_online_cpus(),
                              MICROCODE_CALLIN_TIMEOUT_US);
     if ( ret )
     {
         loading_state = LOADING_EXIT;
-        return ret;
+        return;
     }
 
     /* Let master threads load the given ucode update */
     loading_state = LOADING_ENTER;
     smp_mb();
 
-    ret = microcode_ops->apply_microcode(patch);
+    ret = microcode_ops->apply_microcode(nmi_patch);
     if ( !ret )
         atomic_inc(&cpu_updated);
     atomic_inc(&cpu_out);
+}
+
+static int control_thread_fn(const struct microcode_patch *patch)
+{
+    unsigned int done;
+    unsigned long tick;
+    int ret;
+
+    /* Allow threads to call in */
+    loading_state = LOADING_CALLIN;
+    smp_mb();
+
+    self_nmi();
 
     tick = rdtsc_ordered();
     /* Waiting for master threads finishing update */
@@ -481,12 +504,35 @@ static int do_microcode_update(void *patch)
     return ret;
 }
 
+static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
+{
+    unsigned int master = cpumask_first(this_cpu(cpu_sibling_mask));
+    unsigned int controller = cpumask_first(&cpu_online_map);
+
+    /* System-generated NMI, will be ignored */
+    if ( loading_state == LOADING_PREPARE )
+        return 0;
+
+    ASSERT(loading_state == LOADING_CALLIN);
+    cpumask_set_cpu(cpu, &cpu_callin_map);
+
+    if ( cpu == controller )
+        control_thread_work();
+    else if ( cpu == master )
+        master_thread_work();
+    else
+        slave_thread_work();
+
+    return 0;
+}
+
 int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
 {
     int ret;
     void *buffer;
     unsigned int cpu, updated;
     struct microcode_patch *patch;
+    nmi_callback_t *saved_nmi_callback;
 
     if ( len != (uint32_t)len )
         return -E2BIG;
@@ -551,6 +597,9 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
      * watchdog timeout.
      */
     watchdog_disable();
+
+    nmi_patch = patch;
+    saved_nmi_callback = set_nmi_callback(microcode_nmi_callback);
 
     /*
      * Late loading dance. Why the heavy-handed stop_machine effort?
      *
@@ -563,6 +612,7 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
      * conservative and good.
      */
     ret = stop_machine_run(do_microcode_update, patch, NR_CPUS);
+    set_nmi_callback(saved_nmi_callback);
     watchdog_enable();
 
     updated = atomic_read(&cpu_updated);
```
On 27.08.2019 06:52, Chao Gao wrote:
> On Mon, Aug 26, 2019 at 04:07:59PM +0800, Chao Gao wrote:
>> On Fri, Aug 23, 2019 at 09:46:37AM +0100, Sergey Dyasli wrote:
>>> On 19/08/2019 02:25, Chao Gao wrote:
>>>> Register an NMI callback. This callback busy-loops on threads
>>>> which are waiting for loading completion. Control threads send an
>>>> NMI to slave threads to prevent NMI acceptance during ucode loading.
>>>>
>>>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>>>> ---
>>>> Changes in v9:
>>>>  - control threads send NMI to all other threads. Slave threads will
>>>>    stay in the NMI handler to prevent NMI acceptance during ucode
>>>>    loading. Note that a self-NMI is invalid according to the SDM.
>>>
>>> To me this looks like a half-measure: why keep only slave threads in
>>> the NMI handler, when master threads can update the microcode from
>>> inside the NMI handler as well?
>>
>> No special reason. The issue we want to address is that slave threads
>> might go off to handle an NMI and access MSRs while the master thread
>> is loading ucode. So we only keep slave threads in the NMI handler.
>>
>>> You mention that self-nmi is invalid, but Xen has self_nmi() which is
>>> used for apply_alternatives() during boot, so can be trusted to work.
>>
>> Sorry, I meant using the "self" shorthand to send a self-NMI. I tried
>> the self shorthand but got an APIC error. And I agree that it is
>> better to make the slave thread call self_nmi() itself.
>>
>>> I experimented a bit with the following approach: after loading_state
>>> becomes LOADING_CALLIN, each cpu issues a self_nmi() and rendezvouses
>>> via cpu_callin_map into LOADING_ENTER to do a ucode update directly
>>> in the NMI handler. And it seems to work.
>>>
>>> A separate question is about the safety of this approach: can we be
>>> sure that a ucode update would not reset the status of the NMI latch?
>>> I.e. can it cause another NMI to be delivered while Xen already
>>> handles one?
>>
>> Ashok, what's your opinion on Sergey's approach and his concern?
>
> I talked with Ashok. We think your approach is better. I will follow
> your approach in v10. It would be very helpful if you posted your
> patch so that I can just rebase it onto the other patches.

Doing the actual ucode update inside an NMI handler seems rather risky
to me. Even if Ashok confirmed it would not be an issue on past and
current Intel CPUs - what about future ones, or ones from other
vendors?

Jan
On 19.08.2019 03:25, Chao Gao wrote:
> @@ -481,12 +478,28 @@ static int do_microcode_update(void *patch)
>      return ret;
>  }
>  
> +static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
> +{
> +    /* The first thread of a core is to load an update. Don't block it. */
> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) ||
> +         loading_state != LOADING_CALLIN )
> +        return 0;
> +
> +    cpumask_set_cpu(cpu, &cpu_callin_map);
> +
> +    while ( loading_state != LOADING_EXIT )
> +        cpu_relax();
> +
> +    return 0;
> +}

By returning 0 you tell do_nmi() to continue processing the NMI.
Since you can't tell whether a non-IPI NMI has surfaced at about
the same time this is generally the right thing imo, but how do
you prevent unknown_nmi_error() from getting entered when do_nmi()
ends up setting handle_unknown to true? (The question is mostly
rhetorical, but there's a disconnect between do_nmi() checking
"cpu == 0" and the control thread running on
cpumask_first(&cpu_online_map), i.e. you introduce a well hidden
dependency on CPU 0 never going offline. IOW my request is to at
least make this less well hidden, such that it can be noticed if
and when someone endeavors to remove said limitation.)

Jan
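For context, a simplified paraphrase of the do_nmi() flow Jan is referring to (not the verbatim Xen source — see xen/arch/x86/traps.c for the authoritative code):

```c
void do_nmi(const struct cpu_user_regs *regs)
{
    unsigned int cpu = smp_processor_id();
    bool handle_unknown = false;

    /* A registered callback returning nonzero swallows the NMI... */
    if ( nmi_callback(regs, cpu) )
        return;

    /* ...while returning 0 lets normal NMI processing continue. */
    if ( !nmi_watchdog /* or the watchdog does not claim this NMI */ )
        handle_unknown = true;

    /* Only CPU 0 receives NMIs from the platform. */
    if ( cpu == 0 )
    {
        unsigned char reason = inb(0x61);     /* NMI reason port */

        /* SERR/IOCHK processing elided... */
        if ( !(reason & 0xc0) && handle_unknown )
            unknown_nmi_error(regs, reason);  /* the path Jan worries about */
    }
}
```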
On Thu, Aug 29, 2019 at 02:22:47PM +0200, Jan Beulich wrote:
> On 19.08.2019 03:25, Chao Gao wrote:
>> @@ -481,12 +478,28 @@ static int do_microcode_update(void *patch)
>>      return ret;
>>  }
>>  
>> +static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
>> +{
>> +    /* The first thread of a core is to load an update. Don't block it. */
>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) ||
>> +         loading_state != LOADING_CALLIN )
>> +        return 0;
>> +
>> +    cpumask_set_cpu(cpu, &cpu_callin_map);
>> +
>> +    while ( loading_state != LOADING_EXIT )
>> +        cpu_relax();
>> +
>> +    return 0;
>> +}
>
> By returning 0 you tell do_nmi() to continue processing the NMI.
> Since you can't tell whether a non-IPI NMI has surfaced at about
> the same time this is generally the right thing imo, but how do
> you prevent unknown_nmi_error() from getting entered when do_nmi()
> ends up setting handle_unknown to true? (The question is mostly
> rhetorical, but there's a disconnect between do_nmi() checking
> "cpu == 0" and the control thread running on
> cpumask_first(&cpu_online_map), i.e. you introduce a well hidden
> dependency on CPU 0 never going offline. IOW my request is to at
> least make this less well hidden, such that it can be noticed if
> and when someone endeavors to remove said limitation.)

It seems the issue is that we cannot send an IPI NMI to the BSP,
otherwise unknown_nmi_error() would be triggered. And loading ucode
after rendezvousing all CPUs in the NMI handler expects all CPUs to
receive an IPI NMI. So this approach always has this issue.

Considering that self_nmi() is called in another place as well, could
we provide a way to temporarily suppress or (force-)ignore unknown NMI
errors?

Thanks,
Chao
On Thu, Aug 29, 2019 at 02:11:10PM +0200, Jan Beulich wrote:
> On 27.08.2019 06:52, Chao Gao wrote:
>> On Mon, Aug 26, 2019 at 04:07:59PM +0800, Chao Gao wrote:
>>> On Fri, Aug 23, 2019 at 09:46:37AM +0100, Sergey Dyasli wrote:
>>>> On 19/08/2019 02:25, Chao Gao wrote:
>>>>> Register an NMI callback. This callback busy-loops on threads
>>>>> which are waiting for loading completion. Control threads send an
>>>>> NMI to slave threads to prevent NMI acceptance during ucode loading.
>>>>>
>>>>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>>>>> ---
>>>>> Changes in v9:
>>>>>  - control threads send NMI to all other threads. Slave threads
>>>>>    will stay in the NMI handler to prevent NMI acceptance during
>>>>>    ucode loading. Note that a self-NMI is invalid according to
>>>>>    the SDM.
>>>>
>>>> To me this looks like a half-measure: why keep only slave threads
>>>> in the NMI handler, when master threads can update the microcode
>>>> from inside the NMI handler as well?
>>>
>>> No special reason. The issue we want to address is that slave
>>> threads might go off to handle an NMI and access MSRs while the
>>> master thread is loading ucode. So we only keep slave threads in
>>> the NMI handler.
>>>
>>>> You mention that self-nmi is invalid, but Xen has self_nmi() which
>>>> is used for apply_alternatives() during boot, so can be trusted
>>>> to work.
>>>
>>> Sorry, I meant using the "self" shorthand to send a self-NMI. I
>>> tried the self shorthand but got an APIC error. And I agree that it
>>> is better to make the slave thread call self_nmi() itself.
>>>
>>>> I experimented a bit with the following approach: after
>>>> loading_state becomes LOADING_CALLIN, each cpu issues a self_nmi()
>>>> and rendezvouses via cpu_callin_map into LOADING_ENTER to do a
>>>> ucode update directly in the NMI handler. And it seems to work.
>>>>
>>>> A separate question is about the safety of this approach: can we
>>>> be sure that a ucode update would not reset the status of the NMI
>>>> latch? I.e. can it cause another NMI to be delivered while Xen
>>>> already handles one?
>>>
>>> Ashok, what's your opinion on Sergey's approach and his concern?
>>
>> I talked with Ashok. We think your approach is better. I will follow
>> your approach in v10. It would be very helpful if you posted your
>> patch so that I can just rebase it onto the other patches.
>
> Doing the actual ucode update inside an NMI handler seems rather
> risky to me. Even if Ashok confirmed it would not be an issue on past
> and current Intel CPUs - what about future ones, or ones from other
> vendors?

I will confirm these points with Ashok.

Thanks,
Chao
On 30.08.2019 08:33, Chao Gao wrote:
> On Thu, Aug 29, 2019 at 02:22:47PM +0200, Jan Beulich wrote:
>> On 19.08.2019 03:25, Chao Gao wrote:
>>> +static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
>>> +{
>>> +    /* The first thread of a core is to load an update. Don't block it. */
>>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) ||
>>> +         loading_state != LOADING_CALLIN )
>>> +        return 0;
>>> +
>>> +    cpumask_set_cpu(cpu, &cpu_callin_map);
>>> +
>>> +    while ( loading_state != LOADING_EXIT )
>>> +        cpu_relax();
>>> +
>>> +    return 0;
>>> +}
>>
>> By returning 0 you tell do_nmi() to continue processing the NMI.
>> Since you can't tell whether a non-IPI NMI has surfaced at about
>> the same time this is generally the right thing imo, but how do
>> you prevent unknown_nmi_error() from getting entered when do_nmi()
>> ends up setting handle_unknown to true? (The question is mostly
>> rhetorical, but there's a disconnect between do_nmi() checking
>> "cpu == 0" and the control thread running on
>> cpumask_first(&cpu_online_map), i.e. you introduce a well hidden
>> dependency on CPU 0 never going offline. IOW my request is to at
>> least make this less well hidden, such that it can be noticed if
>> and when someone endeavors to remove said limitation.)
>
> It seems the issue is that we cannot send an IPI NMI to the BSP,
> otherwise unknown_nmi_error() would be triggered. And loading ucode
> after rendezvousing all CPUs in the NMI handler expects all CPUs to
> receive an IPI NMI. So this approach always has this issue.

Not really, I don't think: if both sides agreed (explicitly!) on which
CPU leads this effort, then it would be clear that the one CPU handling
NMIs coming from the platform should not be sent an NMI, and hence it
should be this one to lead the effort.

FAOD - my remark really was because of the new hidden(!) dependency you
introduce on CPU 0 always being this "special" CPU. I don't expect you
to change the code, but I'd like you to make the currently hidden
dependency explicit.

> Considering that self_nmi() is called in another place as well, could
> we provide a way to temporarily suppress or (force-)ignore unknown
> NMI errors?

I'm afraid any attempt at doing so will leave room for missing an
actual (platform) NMI.

Jan
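One way to make the dependency explicit, as Jan asks — a hypothetical sketch, not what was eventually committed:

```c
static int control_thread_fn(const struct microcode_patch *patch)
{
    unsigned int cpu = smp_processor_id();

    /*
     * The control thread keeps handling platform NMIs and is never sent
     * an IPI NMI, so it must run on the CPU on which do_nmi() examines
     * the NMI reason port.  Today that is hard-coded as CPU 0; assert
     * it so the assumption breaks loudly if CPU 0 ever becomes capable
     * of going offline.
     */
    ASSERT(cpu == cpumask_first(&cpu_online_map) && cpu == 0);

    smp_send_nmi_allbutself();   /* everyone except the NMI-reason CPU */
    /* ... remainder of the control flow as in the v9 patch ... */
    return 0;
}
```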
On Fri, Aug 30, 2019 at 02:35:06PM +0800, Chao Gao wrote:
> On Thu, Aug 29, 2019 at 02:11:10PM +0200, Jan Beulich wrote:
>> On 27.08.2019 06:52, Chao Gao wrote:
>>> On Mon, Aug 26, 2019 at 04:07:59PM +0800, Chao Gao wrote:
>>>> On Fri, Aug 23, 2019 at 09:46:37AM +0100, Sergey Dyasli wrote:
>>>>> On 19/08/2019 02:25, Chao Gao wrote:
>>>>>> Register an NMI callback. This callback busy-loops on threads
>>>>>> which are waiting for loading completion. Control threads send
>>>>>> an NMI to slave threads to prevent NMI acceptance during ucode
>>>>>> loading.
>>>>>>
>>>>>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>>>>>> ---
>>>>>> Changes in v9:
>>>>>>  - control threads send NMI to all other threads. Slave threads
>>>>>>    will stay in the NMI handler to prevent NMI acceptance during
>>>>>>    ucode loading. Note that a self-NMI is invalid according to
>>>>>>    the SDM.
>>>>>
>>>>> To me this looks like a half-measure: why keep only slave threads
>>>>> in the NMI handler, when master threads can update the microcode
>>>>> from inside the NMI handler as well?
>>>>
>>>> No special reason. The issue we want to address is that slave
>>>> threads might go off to handle an NMI and access MSRs while the
>>>> master thread is loading ucode. So we only keep slave threads in
>>>> the NMI handler.
>>>>
>>>>> You mention that self-nmi is invalid, but Xen has self_nmi()
>>>>> which is used for apply_alternatives() during boot, so can be
>>>>> trusted to work.
>>>>
>>>> Sorry, I meant using the "self" shorthand to send a self-NMI. I
>>>> tried the self shorthand but got an APIC error. And I agree that
>>>> it is better to make the slave thread call self_nmi() itself.
>>>>
>>>>> I experimented a bit with the following approach: after
>>>>> loading_state becomes LOADING_CALLIN, each cpu issues a
>>>>> self_nmi() and rendezvouses via cpu_callin_map into LOADING_ENTER
>>>>> to do a ucode update directly in the NMI handler. And it seems
>>>>> to work.
>>>>>
>>>>> A separate question is about the safety of this approach: can we
>>>>> be sure that a ucode update would not reset the status of the NMI
>>>>> latch? I.e. can it cause another NMI to be delivered while Xen
>>>>> already handles one?
>>>>
>>>> Ashok, what's your opinion on Sergey's approach and his concern?
>>>
>>> I talked with Ashok. We think your approach is better. I will
>>> follow your approach in v10. It would be very helpful if you posted
>>> your patch so that I can just rebase it onto the other patches.
>>
>> Doing the actual ucode update inside an NMI handler seems rather
>> risky to me. Even if Ashok confirmed it would not be an issue on
>> past and current Intel CPUs - what about future ones, or ones from
>> other vendors?

The Intel SDM doesn't say that loading ucode isn't allowed inside an
NMI handler, so it is allowed implicitly. If future CPUs cannot load
ucode in an NMI handler, the SDM should document it, and at that time
we can move ucode loading out of the NMI handler for new CPUs. As to
AMD, if someone objects to this approach, let's use this approach for
Intel only.

Thanks,
Chao
On 09.09.2019 07:52, Chao Gao wrote:
> On Fri, Aug 30, 2019 at 02:35:06PM +0800, Chao Gao wrote:
>>> Doing the actual ucode update inside an NMI handler seems rather
>>> risky to me. Even if Ashok confirmed it would not be an issue on
>>> past and current Intel CPUs - what about future ones, or ones from
>>> other vendors?
>
> The Intel SDM doesn't say that loading ucode isn't allowed inside an
> NMI handler, so it is allowed implicitly.

Well, if the SDM were complete and correct everywhere else, I'd agree
to such an interpretation / implication.

> If future CPUs cannot load ucode in an NMI handler, the SDM should
> document it, and at that time we can move ucode loading out of the
> NMI handler for new CPUs. As to AMD, if someone objects to this
> approach, let's use this approach for Intel only.

Getting a definitive statement may turn out difficult. But I guess if
you support both approaches anyway, having a command line option to
override the default behavior wouldn't be much more than a 1-line
addition?

Jan
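Such an override could look like the sketch below (the parameter name `ucode-in-nmi` is invented here for illustration; the actual name and plumbing would be settled during review). Xen's `boolean_param()` registers a true/false command line option:

```c
/* Opt-out knob: load ucode in the NMI handler unless disabled. */
static bool __read_mostly opt_ucode_in_nmi = true;
boolean_param("ucode-in-nmi", opt_ucode_in_nmi);

static int master_thread_fn(const struct microcode_patch *patch)
{
    if ( opt_ucode_in_nmi )
    {
        self_nmi();              /* the update runs in the NMI handler */
        return 0;
    }

    /* Fallback: update outside the NMI handler, as in v9. */
    return microcode_ops->apply_microcode(patch);
}
```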
Register an NMI callback. This callback busy-loops on threads which are
waiting for loading completion. Control threads send an NMI to slave
threads to prevent NMI acceptance during ucode loading.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
Changes in v9:
 - control threads send NMI to all other threads. Slave threads will
   stay in the NMI handler to prevent NMI acceptance during ucode
   loading. Note that a self-NMI is invalid according to the SDM.
 - s/rep_nop/cpu_relax
 - remove the debug message in microcode_nmi_callback(). Printing a
   debug message would take a long time and the control thread may
   time out.
 - rebase and fix conflicts

Changes in v8:
 - new
---
 xen/arch/x86/microcode.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)
```diff
diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 91f9e81..d943835 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -38,6 +38,7 @@
 
 #include <asm/delay.h>
 #include <asm/msr.h>
+#include <asm/nmi.h>
 #include <asm/processor.h>
 #include <asm/setup.h>
 #include <asm/microcode.h>
@@ -339,14 +340,8 @@ static int microcode_update_cpu(const struct microcode_patch *patch)
 
 static int slave_thread_fn(void)
 {
-    unsigned int cpu = smp_processor_id();
     unsigned int master = cpumask_first(this_cpu(cpu_sibling_mask));
 
-    while ( loading_state != LOADING_CALLIN )
-        cpu_relax();
-
-    cpumask_set_cpu(cpu, &cpu_callin_map);
-
     while ( loading_state != LOADING_EXIT )
         cpu_relax();
 
@@ -399,6 +394,8 @@ static int control_thread_fn(const struct microcode_patch *patch)
 
     cpumask_set_cpu(cpu, &cpu_callin_map);
 
+    smp_send_nmi_allbutself();
+
     /* Waiting for all threads calling in */
     ret = wait_for_condition(wait_cpu_callin,
                              (void *)(unsigned long)num_online_cpus(),
@@ -481,12 +478,28 @@ static int do_microcode_update(void *patch)
     return ret;
 }
 
+static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
+{
+    /* The first thread of a core is to load an update. Don't block it. */
+    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) ||
+         loading_state != LOADING_CALLIN )
+        return 0;
+
+    cpumask_set_cpu(cpu, &cpu_callin_map);
+
+    while ( loading_state != LOADING_EXIT )
+        cpu_relax();
+
+    return 0;
+}
+
 int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
 {
     int ret;
     void *buffer;
     unsigned int cpu, updated;
     struct microcode_patch *patch;
+    nmi_callback_t *saved_nmi_callback;
 
     if ( len != (uint32_t)len )
         return -E2BIG;
@@ -551,6 +564,8 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
      * watchdog timeout.
      */
     watchdog_disable();
+
+    saved_nmi_callback = set_nmi_callback(microcode_nmi_callback);
 
     /*
      * Late loading dance. Why the heavy-handed stop_machine effort?
@@ -563,6 +578,7 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
      * conservative and good.
      */
     ret = stop_machine_run(do_microcode_update, patch, NR_CPUS);
+    set_nmi_callback(saved_nmi_callback);
     watchdog_enable();
 
     updated = atomic_read(&cpu_updated);
```