xen: recalculate per-cpupool credits when updating timeslice

Message ID 1454062908-32013-1-git-send-email-jgross@suse.com (mailing list archive)
State New, archived

Commit Message

Jürgen Groß Jan. 29, 2016, 10:21 a.m. UTC
When modifying the timeslice of the credit scheduler in a cpupool, the
cpupool-global credit value (n_cpus * credits_per_tslice) isn't
recalculated. This will lead to wrong scheduling decisions later.

Do the recalculation when updating the timeslice.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/common/sched_credit.c | 7 +++++++
 1 file changed, 7 insertions(+)
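
[For context: prv->credit is the cpupool-wide pot of credits that the
credit scheduler's accounting splits among active vCPUs each timeslice,
so a stale value skews every domain's share until the next timeslice
change. The standalone demo below is illustrative only, not Xen code;
just the formula from the patch, assuming CSCHED_CREDITS_PER_MSEC's
value of 10 and the default 30ms timeslice.]

    #include <stdio.h>

    /* Assumed constant, matching sched_credit.c in the targeted tree. */
    #define CSCHED_CREDITS_PER_MSEC 10

    int main(void)
    {
        unsigned int ncpus = 4;
        unsigned int tslice_ms = 30;   /* default credit timeslice */
        int credits_per_tslice = CSCHED_CREDITS_PER_MSEC * tslice_ms;
        int credit = credits_per_tslice * ncpus;   /* computed once, at setup */

        /* An admin shrinks the timeslice. Before the patch only
         * credits_per_tslice is refreshed; credit keeps the old value. */
        tslice_ms = 5;
        credits_per_tslice = CSCHED_CREDITS_PER_MSEC * tslice_ms;

        printf("stale credit: %d, should be: %d\n",
               credit, credits_per_tslice * (int)ncpus);   /* 1200 vs 200 */
        return 0;
    }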

Comments

Jan Beulich Jan. 29, 2016, 10:46 a.m. UTC | #1
>>> On 29.01.16 at 11:21, <JGross@suse.com> wrote:
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -1086,12 +1086,19 @@ csched_dom_cntl(
>  static inline void
>  __csched_set_tslice(struct csched_private *prv, unsigned timeslice)
>  {
> +    unsigned long flags;
> +
> +    spin_lock_irqsave(&prv->lock, flags);
> +
>      prv->tslice_ms = timeslice;
>      prv->ticks_per_tslice = CSCHED_TICKS_PER_TSLICE;
>      if ( prv->tslice_ms < prv->ticks_per_tslice )
>          prv->ticks_per_tslice = 1;
>      prv->tick_period_us = prv->tslice_ms * 1000 / prv->ticks_per_tslice;
>      prv->credits_per_tslice = CSCHED_CREDITS_PER_MSEC * prv->tslice_ms;
> +    prv->credit = prv->credits_per_tslice * prv->ncpus;
> +
> +    spin_unlock_irqrestore(&prv->lock, flags);
>  }

The added locking, which has no reason given for it in the description
at all, puzzles me: I can see it being needed (and having been
missing) when called from csched_sys_cntl(), but it's not clear to
me why it would be needed when called from csched_init(). Yet
csched_sys_cntl() subsequently also updates prv->ratelimit_us,
and hence the lock would perhaps better be taken there?

Jan
Jürgen Groß Jan. 29, 2016, 10:59 a.m. UTC | #2
On 29/01/16 11:46, Jan Beulich wrote:
>>>> On 29.01.16 at 11:21, <JGross@suse.com> wrote:
>> --- a/xen/common/sched_credit.c
>> +++ b/xen/common/sched_credit.c
>> @@ -1086,12 +1086,19 @@ csched_dom_cntl(
>>  static inline void
>>  __csched_set_tslice(struct csched_private *prv, unsigned timeslice)
>>  {
>> +    unsigned long flags;
>> +
>> +    spin_lock_irqsave(&prv->lock, flags);
>> +
>>      prv->tslice_ms = timeslice;
>>      prv->ticks_per_tslice = CSCHED_TICKS_PER_TSLICE;
>>      if ( prv->tslice_ms < prv->ticks_per_tslice )
>>          prv->ticks_per_tslice = 1;
>>      prv->tick_period_us = prv->tslice_ms * 1000 / prv->ticks_per_tslice;
>>      prv->credits_per_tslice = CSCHED_CREDITS_PER_MSEC * prv->tslice_ms;
>> +    prv->credit = prv->credits_per_tslice * prv->ncpus;
>> +
>> +    spin_unlock_irqrestore(&prv->lock, flags);
>>  }
> 
> The added locking, which has no reason given for it in the description
> at all, puzzles me: I can see it being needed (and having been
> missing) when called from csched_sys_cntl(), but it's not clear to
> me why it would be needed when called from csched_init(). Yet
> csched_sys_cntl() subsequently also updates prv->ratelimit_us,
> and hence the lock would perhaps better be taken there?

The locking is needed to protect against csched_alloc_pdata() and
csched_free_pdata(). prv->credit could be permanently wrong
without the lock, while prv->ratelimit_us can't be modified
concurrently in a wrong way (it could be modified by two concurrent
calls of csched_sys_cntl(), but even with locking one of the two
calls would be the winner; the same applies to the case with no lock).

OTOH I don't mind moving the lock to csched_sys_cntl(). Dario,
George, any preferences?


Juergen
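
[To make the race concrete: csched_alloc_pdata() updates prv->credit
and prv->ncpus under prv->lock when a pCPU joins the pool. Below is a
compilable sketch of the two competing paths; field and function names
follow sched_credit.c, but the struct is trimmed to the fields involved
and the lock appears only in comments.]

    #define CSCHED_CREDITS_PER_MSEC 10   /* as in sched_credit.c */

    struct csched_private {
        /* the real struct has a spinlock and many more fields */
        unsigned int ncpus;
        int credit;
        int credits_per_tslice;
    };

    /* CPU A: a pCPU joins the pool -- runs under prv->lock */
    static void alloc_pdata_sketch(struct csched_private *prv)
    {
        prv->credit += prv->credits_per_tslice;   /* (1) */
        prv->ncpus++;                             /* (2) */
    }

    /* CPU B: sysctl changes the timeslice -- without the patch, no lock */
    static void set_tslice_sketch(struct csched_private *prv,
                                  unsigned int tslice_ms)
    {
        prv->credits_per_tslice = CSCHED_CREDITS_PER_MSEC * tslice_ms;
        prv->credit = prv->credits_per_tslice * prv->ncpus;   /* (3) */
    }

    /* If (3) reads prv->ncpus between (1) and (2), it computes prv->credit
     * from the old CPU count and overwrites (1)'s contribution. Nothing
     * recomputes prv->credit until the next timeslice change, so the error
     * persists -- the "permanently wrong" case above, and the reason for
     * the spin_lock_irqsave() in the patch. */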
Alan Robinson Feb. 2, 2016, 7:55 a.m. UTC | #3
On Fri, Jan 29, 2016 at 11:21:48AM +0100, Juergen Gross wrote:
> When modifying the timeslice of the credit scheduler in a cpupool, the
> cpupool-global credit value (n_cpus * credits_per_tslice) isn't
> recalculated. This will lead to wrong scheduling decisions later.
> 
> Do the recalculation when updating the timeslice.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>

We had noticed that the quotas were not working after the timeslice
was changed in a mono-cpupool running two guests doing 'for loops'.
We applied Juergen's patch to a SuSE xen-4.4.3_02-26.2 rpm; the patch
applied with an offset of 6 lines. The guests now observe the quotas
when the timeslice is changed.

Tested-by: Alan.Robinson <alan.robinson@ts.fujitsu.com>
Dario Faggioli Feb. 2, 2016, 9:53 a.m. UTC | #4
On Fri, 2016-01-29 at 11:59 +0100, Juergen Gross wrote:
> On 29/01/16 11:46, Jan Beulich wrote:
> > > > > On 29.01.16 at 11:21, <JGross@suse.com> wrote:
> > > --- a/xen/common/sched_credit.c
> > > +++ b/xen/common/sched_credit.c
> > > @@ -1086,12 +1086,19 @@ csched_dom_cntl(
> > >  static inline void
> > >  __csched_set_tslice(struct csched_private *prv, unsigned timeslice)
> > >  {
> > > +    unsigned long flags;
> > > +
> > > +    spin_lock_irqsave(&prv->lock, flags);
> > > +
> > >      prv->tslice_ms = timeslice;
> > >      prv->ticks_per_tslice = CSCHED_TICKS_PER_TSLICE;
> > >      if ( prv->tslice_ms < prv->ticks_per_tslice )
> > >          prv->ticks_per_tslice = 1;
> > >      prv->tick_period_us = prv->tslice_ms * 1000 / prv->ticks_per_tslice;
> > >      prv->credits_per_tslice = CSCHED_CREDITS_PER_MSEC * prv->tslice_ms;
> > > +    prv->credit = prv->credits_per_tslice * prv->ncpus;
> > > +
> > > +    spin_unlock_irqrestore(&prv->lock, flags);
> > >  }
> > 
> > The added locking, which has no reason given for it in the description
> > at all, puzzles me: I can see it being needed (and having been
> > missing) when called from csched_sys_cntl(), but it's not clear to
> > me why it would be needed when called from csched_init(). Yet
> > csched_sys_cntl() subsequently also updates prv->ratelimit_us,
> > and hence the lock would perhaps better be taken there?
> 
> The locking is needed to protect against csched_alloc_pdata() and
> csched_free_pdata(). prv->credit could be permanently wrong
> without the lock, while prv->ratelimit_us can't be modified
> concurrently in a wrong way (it could be modified by two concurrent
> calls of csched_sys_cntl(), but even with locking one of the two
> calls would be the winner; the same applies to the case with no lock).
> 
> OTOH I don't mind moving the lock to csched_sys_cntl(). Dario,
> George, any preferences?
> 
Yes, I think having the lock in csched_sys_cntl() would be preferable.

In any case, since the lack of locking and lack of recalculation look
like two pretty independent existing bugs to me, can we have either:
 a. two patches;
 b. one patch but with both the issues described in the changelog.

My preference going to a.

Thanks and Regards,
Dario
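
[For reference, the alternative Jan raised and Dario seconds would look
roughly like this in csched_sys_cntl()'s putinfo path. This is a sketch
of the shape only, with validation and surrounding code elided; it is
not the posted hunk, and the params name is assumed.]

        spin_lock_irqsave(&prv->lock, flags);
        __csched_set_tslice(prv, params->tslice_ms);  /* now itself lock-free */
        prv->ratelimit_us = params->ratelimit_us;
        spin_unlock_irqrestore(&prv->lock, flags);

[This keeps the csched_init() call path free of a lock it doesn't need
while still covering the prv->ratelimit_us update Jan points at.]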
Jürgen Groß Feb. 2, 2016, 10:35 a.m. UTC | #5
On 02/02/16 10:53, Dario Faggioli wrote:
> On Fri, 2016-01-29 at 11:59 +0100, Juergen Gross wrote:
>> On 29/01/16 11:46, Jan Beulich wrote:
>>>>>> On 29.01.16 at 11:21, <JGross@suse.com> wrote:
>>>> --- a/xen/common/sched_credit.c
>>>> +++ b/xen/common/sched_credit.c
>>>> @@ -1086,12 +1086,19 @@ csched_dom_cntl(
>>>>  static inline void
>>>>  __csched_set_tslice(struct csched_private *prv, unsigned timeslice)
>>>>  {
>>>> +    unsigned long flags;
>>>> +
>>>> +    spin_lock_irqsave(&prv->lock, flags);
>>>> +
>>>>      prv->tslice_ms = timeslice;
>>>>      prv->ticks_per_tslice = CSCHED_TICKS_PER_TSLICE;
>>>>      if ( prv->tslice_ms < prv->ticks_per_tslice )
>>>>          prv->ticks_per_tslice = 1;
>>>>      prv->tick_period_us = prv->tslice_ms * 1000 / prv->ticks_per_tslice;
>>>>      prv->credits_per_tslice = CSCHED_CREDITS_PER_MSEC * prv->tslice_ms;
>>>> +    prv->credit = prv->credits_per_tslice * prv->ncpus;
>>>> +
>>>> +    spin_unlock_irqrestore(&prv->lock, flags);
>>>>  }
>>>
>>> The added locking, which has no reason given for it in the description
>>> at all, puzzles me: I can see it being needed (and having been
>>> missing) when called from csched_sys_cntl(), but it's not clear to
>>> me why it would be needed when called from csched_init(). Yet
>>> csched_sys_cntl() subsequently also updates prv->ratelimit_us,
>>> and hence the lock would perhaps better be taken there?
>>
>> The locking is needed to protect against csched_alloc_pdata() and
>> csched_free_pdata(). prv->credit could be permanently wrong
>> without the lock, while prv->ratelimit_us can't be modified
>> concurrently in a wrong way (it could be modified by two concurrent
>> calls of csched_sys_cntl(), but even with locking one of the two
>> calls would be the winner; the same applies to the case with no lock).
>>
>> OTOH I don't mind moving the lock to csched_sys_cntl(). Dario,
>> George, any preferences?
>>
> Yes, I think having the lock in csched_sys_cntl() would be preferable.
> 
> In any case, since the lack of locking and lack of recalculation look
> like two pretty independent existing bugs to me, can we have either:
>  a. two patches;
>  b. one patch but with both the issues described in the changelog.
> 
> My preference going to a.

Without setting prv->credit the lock isn't necessary. In case of a
race, domain weights wouldn't be honored correctly for just one
timeslice, and I doubt this would be noticeable at all.

OTOH I don't mind splitting the patch into two, I have to respin
anyway.


Juergen
Dario Faggioli Feb. 2, 2016, 11:33 a.m. UTC | #6
On Tue, 2016-02-02 at 11:35 +0100, Juergen Gross wrote:
> On 02/02/16 10:53, Dario Faggioli wrote:
> > 
> > In any case, since the lack of locking and lack of recalculation
> > look
> > like two pretty independent existing bugs to me, can we have
> > either:
> >  a. two patches;
> >  b. one patch but with both the issues described in the changelog.
> > 
> > My preference going to a.
> 
> Without setting prv->credit the lock isn't necessary. In case of a
> race, domain weights wouldn't be honored correctly for just one
> timeslice, and I doubt this would be noticeable at all.
> 
Ah, yes, I see what you mean now!!

> OTOH I don't mind splitting the patch into two, I have to respin
> anyway.
> 
Well, no, given your explanation above, to which I agree (sorry for
not seeing this before), keeping it as just one patch would actually
be better IMO...

But then, please, explain in the changelog that the recalculation would
introduce a race, and hence you also need to lock.

Thanks and regards,
Dario

Patch

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 03fb2c2..912511e 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -1086,12 +1086,19 @@ csched_dom_cntl(
 static inline void
 __csched_set_tslice(struct csched_private *prv, unsigned timeslice)
 {
+    unsigned long flags;
+
+    spin_lock_irqsave(&prv->lock, flags);
+
     prv->tslice_ms = timeslice;
     prv->ticks_per_tslice = CSCHED_TICKS_PER_TSLICE;
     if ( prv->tslice_ms < prv->ticks_per_tslice )
         prv->ticks_per_tslice = 1;
     prv->tick_period_us = prv->tslice_ms * 1000 / prv->ticks_per_tslice;
     prv->credits_per_tslice = CSCHED_CREDITS_PER_MSEC * prv->tslice_ms;
+    prv->credit = prv->credits_per_tslice * prv->ncpus;
+
+    spin_unlock_irqrestore(&prv->lock, flags);
 }
 
 static int