Message ID | 148969985491.18518.5789656764002800021.stgit@Palanthas.fritz.box (mailing list archive) |
---|---|
State | New, archived |
On 16/03/17 22:30, Dario Faggioli wrote:
> Within context_saved(), we call the context_saved hook,
> and we use VCPU2OP() to determine from what scheduler.
> VCPU2OP uses DOM2OP, which uses d->cpupool, which is
> NULL when d is the idle domain. And in that case,
> DOM2OP just returns ops, the scheduler of cpupool0.
>
> Therefore, if:
> - cpupool0's scheduler defines context_saved (like
>   Credit2 and RTDS do),
> - we are not in cpupool0 (i.e., our scheduler is
>   not ops),
> - we are context switching from idle,
>
> we call VCPU2OP(idle_vcpu), which means
> DOM2OP(idle->cpupool), which is ops.
>
> Therefore, we both:
> - check if context_saved is defined in the wrong
>   scheduler;
> - if yes, call the wrong one.
>
> When using Credit2 at boot, and also Credit2 in
> the other cpupool, this is wrong but innocuous,
> because it only involves the idle vcpus.
>
> When using Credit2 at boot, and Credit1 in the
> other cpupool, this is *totally* wrong, and
> it's by chance it does not explode!
>
> When using Credit2 and other schedulers I'm
> developing, I hit the following assert (in
> sched_credit2.c, on a CPU inside a cpupool that
> does not use Credit2):
>
> csched2_context_saved()
> {
>   ...
>   ASSERT(!vcpu_on_runq(svc));
>   ...
> }
>
> Fix this by taking care, in VCPU2OP, of the case
> when the vcpu is an idle one.
>
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

Reviewed-by: Juergen Gross <jgross@suse.com>

... with or without the remark below addressed.

> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Juergen Gross <jgross@suse.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> ---
> Cc-ing Jan, as this should be backported at least to 4.8, but, IMO, as far back as
> possible.
> ---
>  xen/common/schedule.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index 223a120..d12f346 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -78,7 +78,19 @@ static struct scheduler __read_mostly ops;
>           : (typeof((opsptr)->fn(opsptr, ##__VA_ARGS__)))0 )
>
>  #define DOM2OP(_d) (((_d)->cpupool == NULL) ? &ops : ((_d)->cpupool->sched))
> -#define VCPU2OP(_v) (DOM2OP((_v)->domain))
> +static inline struct scheduler* VCPU2OP(const struct vcpu *v)

Rename, e.g. get_scheduler() ?

> +{
> +    struct domain *d = v->domain;
> +
> +    if ( likely(d->cpupool != NULL) )
> +        return d->cpupool->sched;
> +
> +    /* v->processor never changes for idle vcpus, so using it here is safe */
> +    if ( likely(is_idle_domain(d)) )
> +        return per_cpu(scheduler, v->processor);
> +    else
> +        return &ops;
> +}
>  #define VCPU2ONLINE(_v) cpupool_domain_cpumask((_v)->domain)
>
>  static inline void trace_runstate_change(struct vcpu *v, int new_state)

Juergen
>>> On 16.03.17 at 22:30, <dario.faggioli@citrix.com> wrote:
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -78,7 +78,19 @@ static struct scheduler __read_mostly ops;
>           : (typeof((opsptr)->fn(opsptr, ##__VA_ARGS__)))0 )
>
>  #define DOM2OP(_d) (((_d)->cpupool == NULL) ? &ops : ((_d)->cpupool->sched))
> -#define VCPU2OP(_v) (DOM2OP((_v)->domain))
> +static inline struct scheduler* VCPU2OP(const struct vcpu *v)

The * and blank in the return type want to switch positions.

> +{
> +    struct domain *d = v->domain;
> +
> +    if ( likely(d->cpupool != NULL) )
> +        return d->cpupool->sched;
> +
> +    /* v->processor never changes for idle vcpus, so using it here is safe */
> +    if ( likely(is_idle_domain(d)) )
> +        return per_cpu(scheduler, v->processor);
> +    else
> +        return &ops;

Having read through the description, I don't think I can conclude
why using &ops here is correct (or at least benign). And even if
this was explained in the description, I think a brief comment
would be rather desirable here (the more that it having been
&ops implicitly was wrong before as per the description).

Jan
On Fri, 2017-03-17 at 06:28 +0100, Juergen Gross wrote:
> On 16/03/17 22:30, Dario Faggioli wrote:
> > [..]
> > Fix this by taking care, in VCPU2OP, of the case
> > when the vcpu is an idle one.
> >
> > Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
>
> Reviewed-by: Juergen Gross <jgross@suse.com>
>
Thanks. I won't apply this tag to v2 though, as I'm changing the patch
a bit, as a result of Jan's comment.

> ... with or without the remark below addressed.
>
About this...

> > --- a/xen/common/schedule.c
> > +++ b/xen/common/schedule.c
> > @@ -78,7 +78,19 @@ static struct scheduler __read_mostly ops;
> >           : (typeof((opsptr)->fn(opsptr, ##__VA_ARGS__)))0 )
> >
> >  #define DOM2OP(_d) (((_d)->cpupool == NULL) ? &ops : ((_d)->cpupool->sched))
> > -#define VCPU2OP(_v) (DOM2OP((_v)->domain))
> > +static inline struct scheduler* VCPU2OP(const struct vcpu *v)
>
> Rename, e.g. get_scheduler() ?
>
I indeed like the idea, and will do that. But in another patch (which
I'll include when posting v2).

In fact, since I want this to be backported, I don't want to change too
much code around, making future backports more difficult.

Regards,
Dario
On Fri, 2017-03-17 at 01:42 -0600, Jan Beulich wrote:
> > > > On 16.03.17 at 22:30, <dario.faggioli@citrix.com> wrote:
> > +{
> > +    struct domain *d = v->domain;
> > +
> > +    if ( likely(d->cpupool != NULL) )
> > +        return d->cpupool->sched;
> > +
> > +    /* v->processor never changes for idle vcpus, so using it here is safe */
> > +    if ( likely(is_idle_domain(d)) )
> > +        return per_cpu(scheduler, v->processor);
> > +    else
> > +        return &ops;
>
> Having read through the description, I don't think I can conclude
> why using &ops here is correct (or at least benign). And even if
> this was explained in the description, I think a brief comment
> would be rather desirable here (the more that it having been
> &ops implicitly was wrong before as per the description).
>
Well, looking at all the callers, basically we never return ops.

I guess I was being overly conservative about keeping the structure
of the original code somewhat intact.

I'll send v2 with both a comment and an ASSERT() to make things
clearer.

Thanks,
Dario
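For readers following the thread, here is a minimal sketch of what a "comment plus ASSERT()" variant of the helper could look like. It is modelled outside Xen with stand-in types and is not the actual v2 patch. The names `vcpu_scheduler`, `cpu_sched`, `is_idle` and the two-CPU layout are hypothetical; the standard `assert()` stands in for Xen's `ASSERT()`, and the array stands in for `per_cpu(scheduler, cpu)`.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types: hypothetical simplifications, not Xen's real definitions. */
struct scheduler { const char *name; };
struct cpupool   { struct scheduler *sched; };
struct domain    { struct cpupool *cpupool; int is_idle; };
struct vcpu      { struct domain *domain; unsigned int processor; };

#define NR_CPUS 2

static struct scheduler credit2 = { "credit2" };
static struct scheduler credit1 = { "credit" };

/* Stand-in for per_cpu(scheduler, cpu): CPU 0 runs Credit2, CPU 1 Credit1. */
static struct scheduler *cpu_sched[NR_CPUS] = { &credit2, &credit1 };

static struct scheduler *vcpu_scheduler(const struct vcpu *v)
{
    const struct domain *d = v->domain;

    if ( d->cpupool != NULL )
        return d->cpupool->sched;

    /*
     * Assumption taken from the reply above: no caller reaches this point
     * with a non-idle domain, so a NULL cpupool implies an idle vcpu.
     * Idle vcpus never change v->processor, so the scheduler of the CPU
     * they sit on is the right one.
     */
    assert(d->is_idle);
    return cpu_sched[v->processor];
}

/* Sample objects: a guest in a Credit1 pool and the idle domain. */
static struct cpupool pool1 = { &credit1 };
static struct domain guest_dom = { &pool1, 0 };
static struct domain idle_dom = { NULL, 1 };
static struct vcpu guest_vcpu = { &guest_dom, 0 };
static struct vcpu idle_vcpu1 = { &idle_dom, 1 };
```

With these objects, `vcpu_scheduler(&idle_vcpu1)` yields the scheduler of CPU 1 rather than a global default, and a non-idle domain without a cpupool would trip the assertion instead of silently getting `&ops`.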
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 223a120..d12f346 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -78,7 +78,19 @@ static struct scheduler __read_mostly ops;
          : (typeof((opsptr)->fn(opsptr, ##__VA_ARGS__)))0 )

 #define DOM2OP(_d) (((_d)->cpupool == NULL) ? &ops : ((_d)->cpupool->sched))
-#define VCPU2OP(_v) (DOM2OP((_v)->domain))
+static inline struct scheduler* VCPU2OP(const struct vcpu *v)
+{
+    struct domain *d = v->domain;
+
+    if ( likely(d->cpupool != NULL) )
+        return d->cpupool->sched;
+
+    /* v->processor never changes for idle vcpus, so using it here is safe */
+    if ( likely(is_idle_domain(d)) )
+        return per_cpu(scheduler, v->processor);
+    else
+        return &ops;
+}
 #define VCPU2ONLINE(_v) cpupool_domain_cpumask((_v)->domain)

 static inline void trace_runstate_change(struct vcpu *v, int new_state)
Within context_saved(), we call the context_saved hook,
and we use VCPU2OP() to determine from what scheduler.
VCPU2OP uses DOM2OP, which uses d->cpupool, which is
NULL when d is the idle domain. And in that case,
DOM2OP just returns ops, the scheduler of cpupool0.

Therefore, if:
- cpupool0's scheduler defines context_saved (like
  Credit2 and RTDS do),
- we are not in cpupool0 (i.e., our scheduler is
  not ops),
- we are context switching from idle,

we call VCPU2OP(idle_vcpu), which means
DOM2OP(idle->cpupool), which is ops.

Therefore, we both:
- check if context_saved is defined in the wrong
  scheduler;
- if yes, call the wrong one.

When using Credit2 at boot, and also Credit2 in
the other cpupool, this is wrong but innocuous,
because it only involves the idle vcpus.

When using Credit2 at boot, and Credit1 in the
other cpupool, this is *totally* wrong, and
it's by chance it does not explode!

When using Credit2 and other schedulers I'm
developing, I hit the following assert (in
sched_credit2.c, on a CPU inside a cpupool that
does not use Credit2):

csched2_context_saved()
{
  ...
  ASSERT(!vcpu_on_runq(svc));
  ...
}

Fix this by taking care, in VCPU2OP, of the case
when the vcpu is an idle one.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Jan Beulich <jbeulich@suse.com>
---
Cc-ing Jan, as this should be backported at least to 4.8, but, IMO, as far back as
possible.
---
 xen/common/schedule.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)
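The scenario in the description can be reproduced in miniature. The sketch below models the old macro and the patched helper side by side using stand-in types, so it compiles outside Xen. Everything except the lookup logic itself is an assumption made for illustration: the simplified structs, the `is_idle` flag, the `per_cpu_scheduler` array (standing in for `per_cpu(scheduler, cpu)`), and a layout where Credit2 is the boot scheduler on CPU 0 and Credit1 runs CPU 1 in a second cpupool.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types: hypothetical simplifications, not Xen's real definitions. */
struct scheduler { const char *name; };
struct cpupool   { struct scheduler *sched; };
struct domain    { struct cpupool *cpupool; int is_idle; };
struct vcpu      { struct domain *domain; unsigned int processor; };

#define NR_CPUS 2

/* "ops" is the boot scheduler, i.e. the scheduler of cpupool0 (Credit2 here). */
static struct scheduler ops = { "credit2" };
static struct scheduler credit1_sched = { "credit" };

/* CPU 0 stays in cpupool0; CPU 1 belongs to a second pool running Credit1. */
static struct scheduler *per_cpu_scheduler[NR_CPUS] = { &ops, &credit1_sched };

/* Old behaviour: VCPU2OP(v) == DOM2OP(v->domain). */
static struct scheduler *vcpu2op_old(const struct vcpu *v)
{
    const struct domain *d = v->domain;

    return (d->cpupool == NULL) ? &ops : d->cpupool->sched;
}

/* Patched behaviour: idle vcpus use the scheduler of the CPU they sit on. */
static struct scheduler *vcpu2op_new(const struct vcpu *v)
{
    const struct domain *d = v->domain;

    if ( d->cpupool != NULL )
        return d->cpupool->sched;

    if ( d->is_idle )
        return per_cpu_scheduler[v->processor];

    return &ops;
}

/* The idle domain has no cpupool; this is its vcpu pinned to CPU 1. */
static struct domain idle_domain = { NULL, 1 };
static struct vcpu idle_vcpu_cpu1 = { &idle_domain, 1 };
```

In this model, `vcpu2op_old(&idle_vcpu_cpu1)` returns `&ops` (Credit2) even though CPU 1 is scheduled by Credit1, which is the mismatch that leads to the wrong `context_saved` hook being checked and called. `vcpu2op_new(&idle_vcpu_cpu1)` returns `&credit1_sched`.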