| Message ID | 20190914085251.18816-34-jgross@suse.com (mailing list archive) |
|---|---|
| State | Superseded |
| Series | xen: add core scheduling support |
On 14.09.2019 10:52, Juergen Gross wrote:
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -71,6 +71,7 @@ static void poll_timer_fn(void *data);
>  /* This is global for now so that private implementations can reach it */
>  DEFINE_PER_CPU(struct scheduler *, scheduler);
>  DEFINE_PER_CPU_READ_MOSTLY(struct sched_resource *, sched_res);
> +static DEFINE_PER_CPU_READ_MOSTLY(unsigned int, sched_res_idx);

It's of course not very helpful that this variable (right here) doesn't
ever get written to, and hence one can't judge where / how this is to
be done (without going to look elsewhere). But I guess that calculation
can't be moved into this patch (accepting that it would always yield
zero for now)?

> @@ -135,6 +136,12 @@ static struct scheduler sched_idle_ops = {
>      .switch_sched = sched_idle_switch_sched,
>  };
>
> +static inline struct vcpu *sched_unit2vcpu_cpu(struct sched_unit *unit,

const (on the parameter)?

> +                                               unsigned int cpu)
> +{
> +    return unit->domain->vcpu[unit->unit_id + per_cpu(sched_res_idx, cpu)];
> +}
> +
>  static inline struct scheduler *dom_scheduler(const struct domain *d)
>  {
>      if ( likely(d->cpupool != NULL) )
> @@ -2008,7 +2015,7 @@ static void sched_slave(void)
>
>      pcpu_schedule_unlock_irq(lock, cpu);
>
> -    sched_context_switch(vprev, next->vcpu_list, now);
> +    sched_context_switch(vprev, sched_unit2vcpu_cpu(next, cpu), now);
>  }

By the end of the series this will be

    sched_context_switch(vprev, sched_unit2vcpu_cpu(next, cpu),
                         is_idle_unit(next) && !is_idle_unit(prev), now);

clarifying (I think) that "next" can indeed be an idle unit. I'm having
difficulty seeing how this can produce the correct result in all three
cases
- all vCPU-s in the unit belong to a real domain
- all vCPU-s in the unit belong to the idle domain
- there's a mix of "real" and (filler) "idle" vCPU-s
My main question is what "next" is in the last of the 3 cases above.
Considering that scheduling happens in terms of units, I'd expect it to
be a real domain's unit, yet then I don't see how you'd get at the
correct (idle) vCPU when calling sched_unit2vcpu_cpu(next, cpu).

Jan
On 24.09.19 12:05, Jan Beulich wrote:
> On 14.09.2019 10:52, Juergen Gross wrote:
>> --- a/xen/common/schedule.c
>> +++ b/xen/common/schedule.c
>> @@ -71,6 +71,7 @@ static void poll_timer_fn(void *data);
>>  /* This is global for now so that private implementations can reach it */
>>  DEFINE_PER_CPU(struct scheduler *, scheduler);
>>  DEFINE_PER_CPU_READ_MOSTLY(struct sched_resource *, sched_res);
>> +static DEFINE_PER_CPU_READ_MOSTLY(unsigned int, sched_res_idx);
>
> It's of course not very helpful that this variable (right here) doesn't
> ever get written to, and hence one can't judge where / how this is to
> be done (without going to look elsewhere). But I guess that calculation
> can't be moved into this patch (accepting that it would always yield
> zero for now)?

Not easily. I can set sched_res_idx to 0 explicitly when initializing
a new cpu if you like that better.

>> @@ -135,6 +136,12 @@ static struct scheduler sched_idle_ops = {
>>      .switch_sched = sched_idle_switch_sched,
>>  };
>>
>> +static inline struct vcpu *sched_unit2vcpu_cpu(struct sched_unit *unit,
>
> const (on the parameter)?

Okay.

>> +                                               unsigned int cpu)
>> +{
>> +    return unit->domain->vcpu[unit->unit_id + per_cpu(sched_res_idx, cpu)];
>> +}
>> +
>>  static inline struct scheduler *dom_scheduler(const struct domain *d)
>>  {
>>      if ( likely(d->cpupool != NULL) )
>> @@ -2008,7 +2015,7 @@ static void sched_slave(void)
>>
>>      pcpu_schedule_unlock_irq(lock, cpu);
>>
>> -    sched_context_switch(vprev, next->vcpu_list, now);
>> +    sched_context_switch(vprev, sched_unit2vcpu_cpu(next, cpu), now);
>>  }
>
> By the end of the series this will be
>
>     sched_context_switch(vprev, sched_unit2vcpu_cpu(next, cpu),
>                          is_idle_unit(next) && !is_idle_unit(prev), now);
>
> clarifying (I think) that "next" can indeed be an idle unit. I'm having
> difficulty seeing how this can produce the correct result in all three
> cases
> - all vCPU-s in the unit belong to a real domain
> - all vCPU-s in the unit belong to the idle domain
> - there's a mix of "real" and (filler) "idle" vCPU-s
> My main question is what "next" is in the last of the 3 cases above.

next points to the real domain's unit.

> Considering that scheduling happens in terms of units, I'd expect it to
> be a real domain's unit, yet then I don't see how you'd get at the
> correct (idle) vCPU when calling sched_unit2vcpu_cpu(next, cpu).

What is the problem?

sched_unit2vcpu_cpu() will first get the real domain's vcpu for that
cpu (if existing) and replace it by the idle vcpu of that cpu if the
real domain's vcpu is either not existing or won't be running. The main
trick is that we use idle_vcpu[cpu] (which is existing today already).


Juergen
On 24.09.2019 17:20, Jürgen Groß wrote:
> On 24.09.19 12:05, Jan Beulich wrote:
>> On 14.09.2019 10:52, Juergen Gross wrote:
>>> --- a/xen/common/schedule.c
>>> +++ b/xen/common/schedule.c
>>> @@ -71,6 +71,7 @@ static void poll_timer_fn(void *data);
>>>  /* This is global for now so that private implementations can reach it */
>>>  DEFINE_PER_CPU(struct scheduler *, scheduler);
>>>  DEFINE_PER_CPU_READ_MOSTLY(struct sched_resource *, sched_res);
>>> +static DEFINE_PER_CPU_READ_MOSTLY(unsigned int, sched_res_idx);
>>
>> It's of course not very helpful that this variable (right here) doesn't
>> ever get written to, and hence one can't judge where / how this is to
>> be done (without going to look elsewhere). But I guess that calculation
>> can't be moved into this patch (accepting that it would always yield
>> zero for now)?
>
> Not easily. I can set sched_res_idx to 0 explicitly when initializing
> a new cpu if you like that better.

No, not really. Then better leave it as is.

>>> @@ -2008,7 +2015,7 @@ static void sched_slave(void)
>>>
>>>      pcpu_schedule_unlock_irq(lock, cpu);
>>>
>>> -    sched_context_switch(vprev, next->vcpu_list, now);
>>> +    sched_context_switch(vprev, sched_unit2vcpu_cpu(next, cpu), now);
>>>  }
>>
>> By the end of the series this will be
>>
>>     sched_context_switch(vprev, sched_unit2vcpu_cpu(next, cpu),
>>                          is_idle_unit(next) && !is_idle_unit(prev), now);
>>
>> clarifying (I think) that "next" can indeed be an idle unit. I'm having
>> difficulty seeing how this can produce the correct result in all three
>> cases
>> - all vCPU-s in the unit belong to a real domain
>> - all vCPU-s in the unit belong to the idle domain
>> - there's a mix of "real" and (filler) "idle" vCPU-s
>> My main question is what "next" is in the last of the 3 cases above.
>
> next points to the real domain's unit.
>
>> Considering that scheduling happens in terms of units, I'd expect it to
>> be a real domain's unit, yet then I don't see how you'd get at the
>> correct (idle) vCPU when calling sched_unit2vcpu_cpu(next, cpu).
>
> What is the problem?
>
> sched_unit2vcpu_cpu() will first get the real domain's vcpu for that
> cpu (if existing) and replace it by the idle vcpu of that cpu if the
> real domain's vcpu is either not existing or won't be running. The main
> trick is that we use idle_vcpu[cpu] (which is existing today already).

Yeah, I guess this has become more clear by looking at the next patch.
Without seeing the final shape sched_unit2vcpu_cpu() will take (i.e.
including the trick you talk about) it wasn't really clear to me what
is (supposed to be) going on here. (Makes me wonder once again whether
that function couldn't have taken its final shape right away. But
anyway.)

Jan
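To illustrate the behaviour Juergen describes, here is a minimal sketch of the shape sched_unit2vcpu_cpu() could take later in the series. It is not the code of this patch: the bounds check against max_vcpus and the new_state / RUNSTATE_running test are assumptions derived from the explanation above (new_state only appears in later patches of the series).

    static inline struct vcpu *sched_unit2vcpu_cpu(const struct sched_unit *unit,
                                                   unsigned int cpu)
    {
        unsigned int idx = unit->unit_id + per_cpu(sched_res_idx, cpu);
        const struct domain *d = unit->domain;

        /* Pick the real domain's vcpu for this cpu, if there is one... */
        if ( idx < d->max_vcpus && d->vcpu[idx] != NULL )
        {
            struct vcpu *v = d->vcpu[idx];

            /* ... and only if it is actually going to run (assumed field). */
            if ( v->new_state == RUNSTATE_running )
                return v;
        }

        /* Otherwise fall back to the per-cpu idle vcpu. */
        return idle_vcpu[cpu];
    }

With such a shape, a "real" unit containing a filler idle vCPU for this cpu simply resolves to idle_vcpu[cpu], covering the mixed case Jan asks about.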
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 5e34008ca8..246ad38c7d 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -71,6 +71,7 @@ static void poll_timer_fn(void *data);
 /* This is global for now so that private implementations can reach it */
 DEFINE_PER_CPU(struct scheduler *, scheduler);
 DEFINE_PER_CPU_READ_MOSTLY(struct sched_resource *, sched_res);
+static DEFINE_PER_CPU_READ_MOSTLY(unsigned int, sched_res_idx);
 
 /* Scratch space for cpumasks. */
 DEFINE_PER_CPU(cpumask_t, cpumask_scratch);
@@ -135,6 +136,12 @@ static struct scheduler sched_idle_ops = {
     .switch_sched = sched_idle_switch_sched,
 };
 
+static inline struct vcpu *sched_unit2vcpu_cpu(struct sched_unit *unit,
+                                               unsigned int cpu)
+{
+    return unit->domain->vcpu[unit->unit_id + per_cpu(sched_res_idx, cpu)];
+}
+
 static inline struct scheduler *dom_scheduler(const struct domain *d)
 {
     if ( likely(d->cpupool != NULL) )
@@ -2008,7 +2015,7 @@ static void sched_slave(void)
 
     pcpu_schedule_unlock_irq(lock, cpu);
 
-    sched_context_switch(vprev, next->vcpu_list, now);
+    sched_context_switch(vprev, sched_unit2vcpu_cpu(next, cpu), now);
 }
 
 /*
@@ -2067,7 +2074,7 @@ static void schedule(void)
 
     pcpu_schedule_unlock_irq(lock, cpu);
 
-    vnext = next->vcpu_list;
+    vnext = sched_unit2vcpu_cpu(next, cpu);
 
     sched_context_switch(vprev, vnext, now);
 }
Add a percpu variable holding the index of the cpu in the current
sched_resource structure. This index is used to get the correct vcpu
of a sched_unit on a specific cpu.

For now this index will be zero for all cpus, but with core scheduling
it will be possible to have higher values, too.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
RFC V2: new patch (carved out from RFC V1 patch 49)
---
 xen/common/schedule.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
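For context on how non-zero values could arise once a sched_resource spans more than one cpu (e.g. all sibling threads of a core), a rough sketch is given below. The helper name and the way the resource's cpu mask is obtained are purely hypothetical and not part of this patch:

    /* Hypothetical helper: record cpu's position within its sched_resource. */
    static void sched_res_idx_set(unsigned int cpu, const cpumask_t *res_cpus)
    {
        unsigned int idx = 0, i;

        /* Count how many cpus of the resource precede "cpu". */
        for_each_cpu ( i, res_cpus )
        {
            if ( i == cpu )
                break;
            idx++;
        }

        per_cpu(sched_res_idx, cpu) = idx;
    }

With one cpu per sched_resource (the state after this patch) the index stays zero; with two sibling threads per resource the second sibling would get index 1, so unit_id + sched_res_idx selects that sibling's own vcpu within the unit.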