Message ID | 20190914064217.4877-1-jgross@suse.com (mailing list archive) |
---|---|
State | Superseded |
Series | [v2] xen/sched: rework and rename vcpu_force_reschedule() |
On 14.09.2019 08:42, Juergen Gross wrote:
> vcpu_force_reschedule() is only used for modifying the periodic timer
> of a vcpu. Forcing a vcpu to give up the physical cpu for that purpose
> is kind of brutal.
>
> So instead of doing the reschedule dance just operate on the timer
> directly. By protecting periodic timer modifications against concurrent
> timer activation via a per-vcpu lock it is even no longer required to
> bother the target vcpu at all for updating its timer.
>
> Rename the function to vcpu_set_periodic_timer() as this now reflects
> the functionality.
>
> Signed-off-by: Juergen Gross <jgross@suse.com>

I continue to be unhappy about there being no word at all about ...

> @@ -724,24 +725,6 @@ static void vcpu_migrate_finish(struct vcpu *v)
>          vcpu_wake(v);
>  }
>
> -/*
> - * Force a VCPU through a deschedule/reschedule path.
> - * For example, using this when setting the periodic timer period means that
> - * most periodic-timer state need only be touched from within the scheduler
> - * which can thus be done without need for synchronisation.
> - */
> -void vcpu_force_reschedule(struct vcpu *v)

... the originally intended synchronization-free handling. Forcing
the vCPU through the scheduler may seem harsh (and quite some
overhead), yes, but I don't think the above was written (and
decided) without consideration. One effect of this can be seen by
you ...

> +void vcpu_set_periodic_timer(struct vcpu *v, s_time_t value)
> +{
> +    spin_lock(&v->periodic_timer_lock);
> +
> +    stop_timer(&v->periodic_timer);

... introducing a new stop_timer() here, i.e. one which doesn't
replace any existing call. The implication is that, unlike before,
the periodic timer may now briefly not run despite being supposed
to - after all, it has so far been active whenever a vCPU was
running.

Then again, looking at the involved code paths yet again, I wonder
whether this has been working right at all: There's an early exit
from schedule() when prev == next, which bypasses
vcpu_periodic_timer_work(). And I can't spot anything on the
vcpu_force_reschedule() path which would guarantee that this
shortcut is not taken.

Jan
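The shortcut Jan refers to can be shown in isolation. The following is a minimal, self-contained toy model, not Xen source: toy_schedule, toy_vcpu and the printf bodies are illustrative stand-ins, and only the control-flow shape (early exit before the timer handling) is taken from the discussion above.

/*
 * Toy model of the concern: schedule()'s prev == next shortcut returns
 * before the periodic-timer handling further down, so a pass through
 * the scheduler only reprograms the timer on an actual context switch.
 */
#include <stdio.h>

struct toy_vcpu { const char *name; };

static void stop_timer(struct toy_vcpu *v)
{ printf("stop periodic timer of %s\n", v->name); }

static void vcpu_periodic_timer_work(struct toy_vcpu *v)
{ printf("(re)arm periodic timer of %s\n", v->name); }

static void toy_schedule(struct toy_vcpu *prev, struct toy_vcpu *next)
{
    if ( prev == next )
    {
        /* Early exit: the timer work below is bypassed. */
        printf("prev == next: continue running %s\n", prev->name);
        return;
    }

    stop_timer(prev);                 /* deschedule side */
    /* ... context switch machinery would run here ... */
    vcpu_periodic_timer_work(next);   /* only reached on a real switch */
}

int main(void)
{
    struct toy_vcpu a = { "vcpu-A" }, b = { "vcpu-B" };

    toy_schedule(&a, &a);   /* shortcut: timer not touched */
    toy_schedule(&a, &b);   /* full path: timer reprogrammed */
    return 0;
}

Run as-is, the first call prints only the "continue running" line; neither stop_timer() nor vcpu_periodic_timer_work() is reached, which is exactly the gap Jan is asking about.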
On 16.09.19 11:20, Jan Beulich wrote:
> On 14.09.2019 08:42, Juergen Gross wrote:
>> vcpu_force_reschedule() is only used for modifying the periodic timer
>> of a vcpu. Forcing a vcpu to give up the physical cpu for that purpose
>> is kind of brutal.
>>
>> So instead of doing the reschedule dance just operate on the timer
>> directly. By protecting periodic timer modifications against concurrent
>> timer activation via a per-vcpu lock it is even no longer required to
>> bother the target vcpu at all for updating its timer.
>>
>> Rename the function to vcpu_set_periodic_timer() as this now reflects
>> the functionality.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>
> I continue to be unhappy about there being no word at all about ...
>
>> @@ -724,24 +725,6 @@ static void vcpu_migrate_finish(struct vcpu *v)
>>          vcpu_wake(v);
>>  }
>>
>> -/*
>> - * Force a VCPU through a deschedule/reschedule path.
>> - * For example, using this when setting the periodic timer period means that
>> - * most periodic-timer state need only be touched from within the scheduler
>> - * which can thus be done without need for synchronisation.
>> - */
>> -void vcpu_force_reschedule(struct vcpu *v)
>
> ... the originally intended synchronization-free handling. Forcing
> the vCPU through the scheduler may seem harsh (and quite some
> overhead), yes, but I don't think the above was written (and
> decided) without consideration. One effect of this can be seen by
> you ...
>
>> +void vcpu_set_periodic_timer(struct vcpu *v, s_time_t value)
>> +{
>> +    spin_lock(&v->periodic_timer_lock);
>> +
>> +    stop_timer(&v->periodic_timer);
>
> ... introducing a new stop_timer() here, i.e. one which doesn't
> replace any existing call. The implication is that, unlike before,
> the periodic timer may now briefly not run despite being supposed
> to - after all, it has so far been active whenever a vCPU was
> running.
>
> Then again, looking at the involved code paths yet again, I wonder
> whether this has been working right at all: There's an early exit
> from schedule() when prev == next, which bypasses
> vcpu_periodic_timer_work(). And I can't spot anything on the
> vcpu_force_reschedule() path which would guarantee that this
> shortcut is not taken.

First, the current "synchronization-free" handling doesn't exist.
The synchronization is just hidden in the calls of vcpu_migrate_*(),
and it is done via the scheduler lock.

Yes, I'm adding a stop_timer(), but the related stop_timer() call in
the old code was in schedule(). So statically you are right, but
dynamically there is no new stop_timer() call involved.

And last: the case prev == next would not occur today, as the migrate
flag being set in vcpu->pause_flags would cause the vcpu to be taken
away from the cpu. So it is working today, but setting the periodic
timer requires two scheduling events in case the target vcpu is
currently running.


Juergen
On 16.09.2019 14:49, Juergen Gross wrote:
> On 16.09.19 11:20, Jan Beulich wrote:
>> On 14.09.2019 08:42, Juergen Gross wrote:
>>> vcpu_force_reschedule() is only used for modifying the periodic timer
>>> of a vcpu. Forcing a vcpu to give up the physical cpu for that purpose
>>> is kind of brutal.
>>>
>>> So instead of doing the reschedule dance just operate on the timer
>>> directly. By protecting periodic timer modifications against concurrent
>>> timer activation via a per-vcpu lock it is even no longer required to
>>> bother the target vcpu at all for updating its timer.
>>>
>>> Rename the function to vcpu_set_periodic_timer() as this now reflects
>>> the functionality.
>>>
>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>
>> I continue to be unhappy about there being no word at all about ...
>>
>>> @@ -724,24 +725,6 @@ static void vcpu_migrate_finish(struct vcpu *v)
>>>          vcpu_wake(v);
>>>  }
>>>
>>> -/*
>>> - * Force a VCPU through a deschedule/reschedule path.
>>> - * For example, using this when setting the periodic timer period means that
>>> - * most periodic-timer state need only be touched from within the scheduler
>>> - * which can thus be done without need for synchronisation.
>>> - */
>>> -void vcpu_force_reschedule(struct vcpu *v)
>>
>> ... the originally intended synchronization-free handling. Forcing
>> the vCPU through the scheduler may seem harsh (and quite some
>> overhead), yes, but I don't think the above was written (and
>> decided) without consideration. One effect of this can be seen by
>> you ...
>>
>>> +void vcpu_set_periodic_timer(struct vcpu *v, s_time_t value)
>>> +{
>>> +    spin_lock(&v->periodic_timer_lock);
>>> +
>>> +    stop_timer(&v->periodic_timer);
>>
>> ... introducing a new stop_timer() here, i.e. one which doesn't
>> replace any existing call. The implication is that, unlike before,
>> the periodic timer may now briefly not run despite being supposed
>> to - after all, it has so far been active whenever a vCPU was
>> running.
>>
>> Then again, looking at the involved code paths yet again, I wonder
>> whether this has been working right at all: There's an early exit
>> from schedule() when prev == next, which bypasses
>> vcpu_periodic_timer_work(). And I can't spot anything on the
>> vcpu_force_reschedule() path which would guarantee that this
>> shortcut is not taken.
>
> First, the current "synchronization-free" handling doesn't exist.
> The synchronization is just hidden in the calls of vcpu_migrate_*(),
> and it is done via the scheduler lock.

Sure, but the scheduler lock needs to be taken during scheduling
of the vCPU anyway. There was no "extra" synchronization involved.

> Yes, I'm adding a stop_timer(), but the related stop_timer() call in
> the old code was in schedule(). So statically you are right, but
> dynamically there is no new stop_timer() call involved.

I did specifically check that my comment is not just about the
"static" part (as you call it). As said - there was no stop_timer()
before behind a running vCPU's back. This is what worries me.

> And last: the case prev == next would not occur today, as the migrate
> flag being set in vcpu->pause_flags would cause the vcpu to be taken
> away from the cpu. So it is working today, but setting the periodic
> timer requires two scheduling events in case the target vcpu is
> currently running.

I'm not going to claim I fully understood the code when looking at
it in the morning, but I couldn't find the place(s) guaranteeing
that by the time the migration of the vCPU is over it wouldn't be
runnable again right away, and hence potentially be re-chosen as the
vCPU to run on the pCPU it was running on before.

Jan
On 16.09.19 16:39, Jan Beulich wrote:
> On 16.09.2019 14:49, Juergen Gross wrote:
>> On 16.09.19 11:20, Jan Beulich wrote:
>>> On 14.09.2019 08:42, Juergen Gross wrote:
>>>> vcpu_force_reschedule() is only used for modifying the periodic timer
>>>> of a vcpu. Forcing a vcpu to give up the physical cpu for that purpose
>>>> is kind of brutal.
>>>>
>>>> So instead of doing the reschedule dance just operate on the timer
>>>> directly. By protecting periodic timer modifications against concurrent
>>>> timer activation via a per-vcpu lock it is even no longer required to
>>>> bother the target vcpu at all for updating its timer.
>>>>
>>>> Rename the function to vcpu_set_periodic_timer() as this now reflects
>>>> the functionality.
>>>>
>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>
>>> I continue to be unhappy about there being no word at all about ...
>>>
>>>> @@ -724,24 +725,6 @@ static void vcpu_migrate_finish(struct vcpu *v)
>>>>          vcpu_wake(v);
>>>>  }
>>>>
>>>> -/*
>>>> - * Force a VCPU through a deschedule/reschedule path.
>>>> - * For example, using this when setting the periodic timer period means that
>>>> - * most periodic-timer state need only be touched from within the scheduler
>>>> - * which can thus be done without need for synchronisation.
>>>> - */
>>>> -void vcpu_force_reschedule(struct vcpu *v)
>>>
>>> ... the originally intended synchronization-free handling. Forcing
>>> the vCPU through the scheduler may seem harsh (and quite some
>>> overhead), yes, but I don't think the above was written (and
>>> decided) without consideration. One effect of this can be seen by
>>> you ...
>>>
>>>> +void vcpu_set_periodic_timer(struct vcpu *v, s_time_t value)
>>>> +{
>>>> +    spin_lock(&v->periodic_timer_lock);
>>>> +
>>>> +    stop_timer(&v->periodic_timer);
>>>
>>> ... introducing a new stop_timer() here, i.e. one which doesn't
>>> replace any existing call. The implication is that, unlike before,
>>> the periodic timer may now briefly not run despite being supposed
>>> to - after all, it has so far been active whenever a vCPU was
>>> running.
>>>
>>> Then again, looking at the involved code paths yet again, I wonder
>>> whether this has been working right at all: There's an early exit
>>> from schedule() when prev == next, which bypasses
>>> vcpu_periodic_timer_work(). And I can't spot anything on the
>>> vcpu_force_reschedule() path which would guarantee that this
>>> shortcut is not taken.
>>
>> First, the current "synchronization-free" handling doesn't exist.
>> The synchronization is just hidden in the calls of vcpu_migrate_*(),
>> and it is done via the scheduler lock.
>
> Sure, but the scheduler lock needs to be taken during scheduling
> of the vCPU anyway. There was no "extra" synchronization involved.

Of course there was. The scheduling path was forced to happen, which
resulted in the extra synchronization (twice in fact, once for
de-scheduling and once for scheduling in again).

>> Yes, I'm adding a stop_timer(), but the related stop_timer() call in
>> the old code was in schedule(). So statically you are right, but
>> dynamically there is no new stop_timer() call involved.
>
> I did specifically check that my comment is not just about the
> "static" part (as you call it). As said - there was no stop_timer()
> before behind a running vCPU's back. This is what worries me.

There is a stop_timer() for the periodic timer happening each time
the vcpu is de-scheduled (look into schedule()). And as setting the
periodic timer today results in a de-scheduling of the vcpu, the
stop_timer() is happening. I assume here that the de-scheduling
would not have happened without setting the timer, of course.

In case the vcpu is not running when the timer is being set, my
patch will call stop_timer() for a timer which is already stopped.

>> And last: the case prev == next would not occur today, as the migrate
>> flag being set in vcpu->pause_flags would cause the vcpu to be taken
>> away from the cpu. So it is working today, but setting the periodic
>> timer requires two scheduling events in case the target vcpu is
>> currently running.
>
> I'm not going to claim I fully understood the code when looking at
> it in the morning, but I couldn't find the place(s) guaranteeing
> that by the time the migration of the vCPU is over it wouldn't be
> runnable again right away, and hence potentially be re-chosen as the
> vCPU to run on the pCPU it was running on before.

vcpu_migrate_start() is setting _VPF_migrating in the vcpu's
pause_flags and then initiates a scheduling event via
vcpu_sleep_nosync(). vcpu_migrate_finish() will only reset the flag
if the vcpu is not running (so either it wasn't running before, or
the scheduling event has already happened). If it is still running
it will not reset the flag; this will only be done via
context_saved(), which is called when the vcpu has been
de-scheduled. The vcpu can only be selected to run again when the
migrating flag is not set.


Juergen
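For reference, a heavily simplified sketch of the handshake Juergen describes; this paraphrases his description rather than quoting xen/common/schedule.c verbatim (the real functions also handle CPU selection and scheduler locking, which is elided here):

/* Sketch only - structure taken from the explanation above. */
static void vcpu_migrate_start(struct vcpu *v)
{
    set_bit(_VPF_migrating, &v->pause_flags); /* vcpu may not be picked */
    vcpu_sleep_nosync(v);                     /* trigger a scheduling event */
}

static void vcpu_migrate_finish(struct vcpu *v)
{
    /*
     * Clear the flag only if the vcpu is no longer running; for a still
     * running vcpu this is deferred to context_saved(), i.e. until the
     * vcpu really has been de-scheduled.  Only once the flag is clear
     * can the vcpu be selected to run again.
     */
    if ( !v->is_running &&
         test_and_clear_bit(_VPF_migrating, &v->pause_flags) )
        vcpu_wake(v);
}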
On Sat, 2019-09-14 at 08:42 +0200, Juergen Gross wrote:
> vcpu_force_reschedule() is only used for modifying the periodic timer
> of a vcpu. Forcing a vcpu to give up the physical cpu for that purpose
> is kind of brutal.
>
> So instead of doing the reschedule dance just operate on the timer
> directly. By protecting periodic timer modifications against concurrent
> timer activation via a per-vcpu lock it is even no longer required to
> bother the target vcpu at all for updating its timer.
>
> Rename the function to vcpu_set_periodic_timer() as this now reflects
> the functionality.
>
Personally, I'm rather happy to see the code which was doing that very
weird "let's go through rescheduling" dance going away. I, FWIW, never
understood why periodic timer handling was implemented that way (and
looking back at the relevant changelogs does not help). The code, as it
results after applying this patch, is a lot better and easier to
understand.

Performance and scalability wise, I don't have benchmarks for this
specific patch (but the ones I did included it, as it back then was
part of the core-scheduling series), but I agree with Juergen. I.e.,
I think the patch is either neutral or, if it does something, it
improves things.

Furthermore, the periodic timer is *not* used any longer (and hasn't
been for quite some time/kernel versions). Basically, all we do with
the periodic timer is to disable it during boot. At least for Linux,
but I think this is the case for FreeBSD too. So, even if the patch
had a negative impact (which, again, I don't think is the case), we
probably wouldn't see it.

On these grounds (and, of course, on the one that I've looked at the
code and think it's correct), for the scheduling part:

> Signed-off-by: Juergen Gross <jgross@suse.com>
>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>

Then, if some words about the outcome of the discussion in this
thread were added, e.g., a mention of the fact that the old code
wasn't really lockless and that the new code is a lot more
straightforward, it'd be even better. But my Rev-by stands, with or
without this.

Regards
diff --git a/xen/arch/x86/pv/shim.c b/xen/arch/x86/pv/shim.c
index 324ca27f93..5edbcd9ac5 100644
--- a/xen/arch/x86/pv/shim.c
+++ b/xen/arch/x86/pv/shim.c
@@ -410,7 +410,7 @@ int pv_shim_shutdown(uint8_t reason)
         unmap_vcpu_info(v);
 
         /* Reset the periodic timer to the default value. */
-        v->periodic_period = MILLISECS(10);
+        vcpu_set_periodic_timer(v, MILLISECS(10));
         /* Stop the singleshot timer. */
         stop_timer(&v->singleshot_timer);
 
@@ -419,8 +419,6 @@ int pv_shim_shutdown(uint8_t reason)
 
         if ( v != current )
             vcpu_unpause_by_systemcontroller(v);
-        else
-            vcpu_force_reschedule(v);
     }
 
     return 0;
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 9a48b2504b..0cff749bbe 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1494,15 +1494,13 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( set.period_ns > STIME_DELTA_MAX )
             return -EINVAL;
 
-        v->periodic_period = set.period_ns;
-        vcpu_force_reschedule(v);
+        vcpu_set_periodic_timer(v, set.period_ns);
 
         break;
     }
 
     case VCPUOP_stop_periodic_timer:
-        v->periodic_period = 0;
-        vcpu_force_reschedule(v);
+        vcpu_set_periodic_timer(v, 0);
         break;
 
     case VCPUOP_set_singleshot_timer:
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index fdeec10c3b..13b5ffc7cf 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -312,6 +312,7 @@ int sched_init_vcpu(struct vcpu *v, unsigned int processor)
     v->processor = processor;
 
     /* Initialise the per-vcpu timers. */
+    spin_lock_init(&v->periodic_timer_lock);
     init_timer(&v->periodic_timer, vcpu_periodic_timer_fn,
                v, v->processor);
     init_timer(&v->singleshot_timer, vcpu_singleshot_timer_fn,
@@ -724,24 +725,6 @@ static void vcpu_migrate_finish(struct vcpu *v)
         vcpu_wake(v);
 }
 
-/*
- * Force a VCPU through a deschedule/reschedule path.
- * For example, using this when setting the periodic timer period means that
- * most periodic-timer state need only be touched from within the scheduler
- * which can thus be done without need for synchronisation.
- */
-void vcpu_force_reschedule(struct vcpu *v)
-{
-    spinlock_t *lock = vcpu_schedule_lock_irq(v);
-
-    if ( v->is_running )
-        vcpu_migrate_start(v);
-
-    vcpu_schedule_unlock_irq(lock, v);
-
-    vcpu_migrate_finish(v);
-}
-
 void restore_vcpu_affinity(struct domain *d)
 {
     unsigned int cpu = smp_processor_id();
@@ -1458,14 +1441,11 @@ long sched_adjust_global(struct xen_sysctl_scheduler_op *op)
     return rc;
 }
 
-static void vcpu_periodic_timer_work(struct vcpu *v)
+static void vcpu_periodic_timer_work_locked(struct vcpu *v)
 {
     s_time_t now;
     s_time_t periodic_next_event;
 
-    if ( v->periodic_period == 0 )
-        return;
-
     now = NOW();
     periodic_next_event = v->periodic_last_event + v->periodic_period;
 
@@ -1476,10 +1456,37 @@ static void vcpu_periodic_timer_work(struct vcpu *v)
         periodic_next_event = now + v->periodic_period;
     }
 
-    migrate_timer(&v->periodic_timer, smp_processor_id());
+    migrate_timer(&v->periodic_timer, v->processor);
     set_timer(&v->periodic_timer, periodic_next_event);
 }
 
+static void vcpu_periodic_timer_work(struct vcpu *v)
+{
+    if ( v->periodic_period == 0 )
+        return;
+
+    spin_lock(&v->periodic_timer_lock);
+    if ( v->periodic_period )
+        vcpu_periodic_timer_work_locked(v);
+    spin_unlock(&v->periodic_timer_lock);
+}
+
+/*
+ * Set the periodic timer of a vcpu.
+ */
+void vcpu_set_periodic_timer(struct vcpu *v, s_time_t value)
+{
+    spin_lock(&v->periodic_timer_lock);
+
+    stop_timer(&v->periodic_timer);
+
+    v->periodic_period = value;
+    if ( value )
+        vcpu_periodic_timer_work_locked(v);
+
+    spin_unlock(&v->periodic_timer_lock);
+}
+
 /*
  * The main function
  * - deschedule the current domain (scheduler independent).
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index e3601c1935..40097ff334 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -153,6 +153,7 @@ struct vcpu
     struct vcpu     *next_in_list;
 
+    spinlock_t       periodic_timer_lock;
     s_time_t         periodic_period;
     s_time_t         periodic_last_event;
     struct timer     periodic_timer;
     struct timer     singleshot_timer;
@@ -864,7 +865,7 @@ struct scheduler *scheduler_get_default(void);
 struct scheduler *scheduler_alloc(unsigned int sched_id, int *perr);
 void scheduler_free(struct scheduler *sched);
 int schedule_cpu_switch(unsigned int cpu, struct cpupool *c);
-void vcpu_force_reschedule(struct vcpu *v);
+void vcpu_set_periodic_timer(struct vcpu *v, s_time_t value);
 int cpu_disable_scheduler(unsigned int cpu);
 /* We need it in dom0_setup_vcpu */
 void sched_set_affinity(struct vcpu *v, const cpumask_t *hard,
vcpu_force_reschedule() is only used for modifying the periodic timer
of a vcpu. Forcing a vcpu to give up the physical cpu for that purpose
is kind of brutal.

So instead of doing the reschedule dance just operate on the timer
directly. By protecting periodic timer modifications against concurrent
timer activation via a per-vcpu lock it is even no longer required to
bother the target vcpu at all for updating its timer.

Rename the function to vcpu_set_periodic_timer() as this now reflects
the functionality.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
- Carved out from my core scheduling series
- Reworked to avoid deadlock when 2 vcpus are trying to modify each
  other's periodic timers, addressing all comments by Jan Beulich

V2:
- test periodic_period again in vcpu_periodic_timer_work() when lock
  obtained (Jan Beulich)
---
 xen/arch/x86/pv/shim.c  |  4 +---
 xen/common/domain.c     |  6 ++----
 xen/common/schedule.c   | 53 ++++++++++++++++++++++++++++---------------------
 xen/include/xen/sched.h |  3 ++-
 4 files changed, 35 insertions(+), 31 deletions(-)
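The deadlock note in the changelog can be illustrated in isolation: because vcpu_set_periodic_timer() only ever takes the target vcpu's own lock, no code path holds two per-vcpu timer locks at once, so two vcpus setting each other's timers concurrently cannot deadlock. Below is a stand-alone toy demonstration (plain C with pthreads, compile with -pthread); toy_vcpu and toy_set_periodic_timer are hypothetical stand-ins modelling only the locking pattern, not Xen code.

/*
 * Toy model of the per-vcpu timer-lock pattern: each "vcpu" has its
 * own lock, and a timer update takes only the *target* vcpu's lock.
 * No nesting of these locks means no lock-ordering deadlock, even
 * when two vcpus update each other's timers at the same time.
 */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

struct toy_vcpu {
    pthread_mutex_t periodic_timer_lock; /* stands in for spinlock_t */
    int64_t periodic_period;             /* stands in for s_time_t   */
};

static struct toy_vcpu vcpus[2];

static void toy_set_periodic_timer(struct toy_vcpu *v, int64_t value)
{
    pthread_mutex_lock(&v->periodic_timer_lock);
    /* stop_timer() / rearm would happen here in the real code */
    v->periodic_period = value;
    pthread_mutex_unlock(&v->periodic_timer_lock);
    /* Only one lock was held; no nesting, hence no ordering issue. */
}

static void *worker(void *arg)
{
    int self = (int)(intptr_t)arg;

    /* Each thread repeatedly sets the *other* vcpu's timer. */
    for ( int i = 0; i < 1000000; i++ )
        toy_set_periodic_timer(&vcpus[!self], i);
    return NULL;
}

int main(void)
{
    pthread_t t[2];

    for ( int i = 0; i < 2; i++ )
        pthread_mutex_init(&vcpus[i].periodic_timer_lock, NULL);
    for ( int i = 0; i < 2; i++ )
        pthread_create(&t[i], NULL, worker, (void *)(intptr_t)i);
    for ( int i = 0; i < 2; i++ )
        pthread_join(t[i], NULL);

    printf("done without deadlock: %lld %lld\n",
           (long long)vcpus[0].periodic_period,
           (long long)vcpus[1].periodic_period);
    return 0;
}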