Message ID | 20240111201129.4010175-1-andrew.cooper3@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v1-alt] xen/livepatch: Make check_for_livepatch_work() faster in the common case | expand |
On 11.01.2024 21:11, Andrew Cooper wrote: > When livepatching is enabled, this function is used all the time. Really do > check the fastpath first, and annotate it likely() as this is the right answer > 100% of the time (to many significant figures). This cuts out 3 pointer > dereferences in the "nothing to do path". > > However, GCC still needs some help to persuade it not to set the full stack > frame (6 spilled registers, 3 slots of locals) even on the fastpath. > > Create a new check_for_livepatch_work() with the fastpath only, and make the > "new" do_livepatch_work() noinline. This causes the fastpath to need no stack > frame, making it faster still. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
diff --git a/xen/common/livepatch.c b/xen/common/livepatch.c index 1209fea2566c..2c4b84382798 100644 --- a/xen/common/livepatch.c +++ b/xen/common/livepatch.c @@ -1693,7 +1693,7 @@ static int livepatch_spin(atomic_t *counter, s_time_t timeout, * The main function which manages the work of quiescing the system and * patching code. */ -void check_for_livepatch_work(void) +static void noinline do_livepatch_work(void) { #define ACTION(x) [LIVEPATCH_ACTION_##x] = #x static const char *const names[] = { @@ -1711,10 +1711,6 @@ void check_for_livepatch_work(void) !is_idle_domain(current->sched_unit->domain) ) return; - /* Fast path: no work to do. */ - if ( !per_cpu(work_to_do, cpu ) ) - return; - smp_rmb(); /* In case we aborted, other CPUs can skip right away. */ if ( !livepatch_work.do_work ) @@ -1864,6 +1860,17 @@ void check_for_livepatch_work(void) } } +void check_for_livepatch_work(void) +{ + unsigned int cpu = smp_processor_id(); + + /* Fast path: no work to do. */ + if ( likely(!per_cpu(work_to_do, cpu)) ) + return; + + do_livepatch_work(); +} + /* * Only allow dependent payload is applied on top of the correct * build-id.
When livepatching is enabled, this function is used all the time. Really do check the fastpath first, and annotate it likely() as this is the right answer 100% of the time (to many significant figures). This cuts out 3 pointer dereferences in the "nothing to do path". However, GCC still needs some help to persuade it not to set the full stack frame (6 spilled registers, 3 slots of locals) even on the fastpath. Create a new check_for_livepatch_work() with the fastpath only, and make the "new" do_livepatch_work() noinline. This causes the fastpath to need no stack frame, making it faster still. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> CC: Ross Lagerwall <ross.lagerwall@citrix.com> CC: Jan Beulich <JBeulich@suse.com> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Wei Liu <wl@xen.org> v1-alt: * Manually split the functions. Experimenting with __attribute__((cold)) was disappointing. Vs this patch, it creates an extra check_for_livepatch_work.cold function(and section) which is just `jmp do_livepatch_work`. --- xen/common/livepatch.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-)