Message ID | 1497609841.30417.5.camel@citrix.com (mailing list archive)
---|---
State | New, archived
>>> On 16.06.17 at 12:44, <dario.faggioli@citrix.com> wrote:
> On Fri, 2017-06-16 at 02:54 -0600, Jan Beulich wrote:
>> > > > On 14.06.17 at 18:53, <dario.faggioli@citrix.com> wrote:
>> >
>> > --- a/xen/arch/x86/domain.c
>> > +++ b/xen/arch/x86/domain.c
>> > @@ -112,12 +112,18 @@ static void play_dead(void)
>> >
>> >  static void idle_loop(void)
>> >  {
>> > +    unsigned int cpu = smp_processor_id();
>> > +
>> >      for ( ; ; )
>> >      {
>> > -        if ( cpu_is_offline(smp_processor_id()) )
>> > +        if ( cpu_is_offline(cpu) )
>> >              play_dead();
>> > -        (*pm_idle)();
>> > -        do_tasklet();
>> > +
>> > +        /* Are we here for running vcpu context tasklets, or for idling? */
>> > +        if ( unlikely(tasklet_work_to_do(cpu)) )
>>
>> I'm not really sure about the "unlikely()" here.
>>
> It's basically already there, without this patch, at the very beginning
> of do_tasklet():
>
>     if ( likely(*work_to_do != (TASKLET_enqueued|TASKLET_scheduled)) )
>         return;
>
> which is exactly the check that I moved into tasklet_work_to_do(), and,
> as you can see, it has the likely.
>
> So, I fundamentally kept it for consistency with the old code. I
> actually think it does make sense, but I don't have a very strong
> opinion about it.

Okay then.

>> >      unsigned long *work_to_do = &per_cpu(tasklet_work_to_do, cpu);
>> >      struct list_head *list = &per_cpu(tasklet_list, cpu);
>> >
>> > -    /*
>> > -     * Work must be enqueued *and* scheduled. Otherwise there is no work to
>> > -     * do, and/or scheduler needs to run to update idle vcpu priority.
>> > -     */
>> > -    if ( likely(*work_to_do != (TASKLET_enqueued|TASKLET_scheduled)) )
>> > -        return;
>>
>> Perhaps it also wouldn't hurt to convert this to an ASSERT() too.
>>
> Nope, I can't. It's a best effort check, and *work_to_do (which is
> per_cpu(tasklet_work_to_do, cpu)) can change, and the assert may fail.

How that? TASKLET_enqueued can only be cleared by do_tasklet()
itself, and I don't think nesting invocations of the function can or
should occur. TASKLET_scheduled is only being cleared when
schedule() observes that bit set without TASKLET_enqueued also
set. IOW there may be races in the setting of the bits, but since we
expect the caller to have done this check already, I think an
ASSERT() would be quite fine here.

>> > --- a/xen/include/xen/tasklet.h
>> > +++ b/xen/include/xen/tasklet.h
>> > @@ -40,9 +40,19 @@ DECLARE_PER_CPU(unsigned long, tasklet_work_to_do);
>> >  #define TASKLET_enqueued   (1ul << _TASKLET_enqueued)
>> >  #define TASKLET_scheduled  (1ul << _TASKLET_scheduled)
>> >
>> > +static inline bool tasklet_work_to_do(unsigned int cpu)
>> > +{
>> > +    /*
>> > +     * Work must be enqueued *and* scheduled. Otherwise there is no work to
>> > +     * do, and/or scheduler needs to run to update idle vcpu priority.
>> > +     */
>> > +    return per_cpu(tasklet_work_to_do, cpu) == (TASKLET_enqueued|
>> > +                                                TASKLET_scheduled);
>> > +}
>>
>> Wouldn't cpu_is_haltable() then also better use this new function?
>>
> Mmm... Perhaps. It's certainly less code churn.
>
> ARM code would then contain two invocations of cpu_is_haltable() (the
> first happens with IRQs enabled, so a second one with IRQs disabled is
> necessary). But that is *exactly* the same thing we do on x86 (they're
> just in different functions in that case).
>
> So, I reworked the patch according to these suggestions, and you can
> look at it below.

I'm confused: you've added further uses of cpu_is_haltable()
where the cheaper tasklet_work_to_do() would do. That's sort of
the opposite of what I've suggested.

Of course that suggestion of mine was more than a 1:1 replacement -
the implied question was whether cpu_is_haltable() simply checking
tasklet_work_to_do to be non-zero isn't too lax (and wouldn't better
be !tasklet_work_to_do()).

Jan
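[Concretely, the two options Jan is weighing differ only in the last
conjunct of cpu_is_haltable(). The sketch below is reconstructed from
the checks Dario's diagrams walk through further down, not copied from
the tree; cpu_is_haltable_alt is a hypothetical name for Jan's implied
variant.]

    /* As it stands: any tasklet work bit set blocks halting. */
    #define cpu_is_haltable(cpu)                    \
        (!softirq_pending(cpu) &&                   \
         cpu_online(cpu) &&                         \
         !per_cpu(tasklet_work_to_do, cpu))

    /*
     * Jan's implied alternative: only block halting when work is both
     * enqueued *and* scheduled, by reusing the new helper.
     */
    #define cpu_is_haltable_alt(cpu)                \
        (!softirq_pending(cpu) &&                   \
         cpu_online(cpu) &&                         \
         !tasklet_work_to_do(cpu))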
On Fri, 2017-06-16 at 05:47 -0600, Jan Beulich wrote:
> > On Fri, 2017-06-16 at 02:54 -0600, Jan Beulich wrote:
> > > > > > On 14.06.17 at 18:53, <dario.faggioli@citrix.com> wrote:
> > > >      unsigned long *work_to_do = &per_cpu(tasklet_work_to_do, cpu);
> > > >      struct list_head *list = &per_cpu(tasklet_list, cpu);
> > > >
> > > > -    /*
> > > > -     * Work must be enqueued *and* scheduled. Otherwise there is no work to
> > > > -     * do, and/or scheduler needs to run to update idle vcpu priority.
> > > > -     */
> > > > -    if ( likely(*work_to_do != (TASKLET_enqueued|TASKLET_scheduled)) )
> > > > -        return;
> > >
> > > Perhaps it also wouldn't hurt to convert this to an ASSERT() too.
> > >
> > Nope, I can't. It's a best effort check, and *work_to_do (which is
> > per_cpu(tasklet_work_to_do, cpu)) can change, and the assert may
> > fail.
>
> How that? TASKLET_enqueued can only be cleared by do_tasklet()
> itself, and I don't think nesting invocations of the function can or
> should occur. TASKLET_scheduled is only being cleared when
> schedule() observes that bit set without TASKLET_enqueued also
> set. IOW there may be races in the setting of the bits, but since we
> expect the caller to have done this check already, I think an
> ASSERT() would be quite fine here.
>
Ok, makes sense. I will add the ASSERT() (with something like what you
wrote here as a comment).

> > > Wouldn't cpu_is_haltable() then also better use this new
> > > function?
> > >
> > Mmm... Perhaps. It's certainly less code churn.
> >
> > ARM code would then contain two invocations of cpu_is_haltable()
> > (the first happens with IRQs enabled, so a second one with IRQs
> > disabled is necessary). But that is *exactly* the same thing we do
> > on x86 (they're just in different functions in that case).
> >
> > So, I reworked the patch according to these suggestions, and you
> > can look at it below.
>
> I'm confused: you've added further uses of cpu_is_haltable()
> where the cheaper tasklet_work_to_do() would do.
>
Indeed. Sorry! Fact is, I read your comment backwards, i.e., as if you
were saying something like "wouldn't cpu_is_haltable() better be used
here, instead of this new function?"

And it's not that your wording was ambiguous; it's me that, apparently,
can't read English! :-/

I'll rework the patch again...

> Of course that suggestion of mine was more than a 1:1 replacement -
> the implied question was whether cpu_is_haltable() simply checking
> tasklet_work_to_do to be non-zero isn't too lax (and wouldn't better
> be !tasklet_work_to_do()).
>
Now, to try to answer the real question... Let's assume that, on cpu x,
we are about to check cpu_is_haltable(), while, on cpu y,
tasklet_schedule_on_cpu(x) is being called, and manages to set
_TASKLET_enqueued in *work_to_do. I.e., in current code:

      CPU x                            CPU y
        |                                |
  cpu_is_haltable(x)             tasklet_schedule_on_cpu(x)
   !softirq_pending(x) == true    tasklet_enqueue()
   cpu_online(x) == true           test_and_set(TASKLET_enqueued,
        |                                       work_to_do)
   !work_to_do == FALSE

So we don't go to sleep, and we stay in the idle loop for the next
iteration. At that point, CPU y will have raised SCHEDULE_SOFTIRQ on x,
schedule() (still on x) will set TASKLET_scheduled, and we'll call
do_tasklet().

Basically, right now, we risk spinning for the time that passes between
TASKLET_enqueued being set and SCHEDULE_SOFTIRQ being raised and
reaching cpu x. This should be a very short window, and, considering
how the TASKLET_* flags are handled, this looks like the correct
behavior to me.

If we use !tasklet_work_to_do() in cpu_is_haltable():

      CPU x                            CPU y
        |                                |
  cpu_is_haltable(x)             tasklet_schedule_on_cpu(x)
   !softirq_pending(x) == true    tasklet_enqueue()
   cpu_online(x) == true           test_and_set(TASKLET_enqueued,
        |                                       work_to_do)
   !(work_to_do ==
     TASKLET_enqueued+TASKLET_scheduled) == TRUE

Which means we'd go to sleep... only to (most likely) be woken up very
soon by SCHEDULE_SOFTIRQ being thrown at us (cpu x) by cpu y.

Am I overlooking anything? And is this (this time) what you were
asking?

Assuming the answers are 'no' and 'yes', I think I'd leave
cpu_is_haltable() as it is (perhaps adding a brief comment on why it's
ok/better to check work_to_do directly, instead of calling
tasklet_work_to_do()).

Sorry again for the misunderstanding,
Dario
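[The respin Dario commits to above would, per the agreement with Jan,
replace the early return in do_tasklet() with an assertion along these
lines. This is a sketch of the agreed change, with Jan's reasoning as
the comment, not the patch actually posted later.]

    --- a/xen/common/tasklet.c
    +++ b/xen/common/tasklet.c
    @@ ... @@ void do_tasklet(void)
    -    /*
    -     * Work must be enqueued *and* scheduled. Otherwise there is no work to
    -     * do, and/or scheduler needs to run to update idle vcpu priority.
    -     */
    -    if ( likely(*work_to_do != (TASKLET_enqueued|TASKLET_scheduled)) )
    -        return;
    +    /*
    +     * The caller is expected to have checked that there is work to do
    +     * before calling us. Asserting that here is safe: TASKLET_enqueued
    +     * is only cleared by do_tasklet() itself (which does not nest), and
    +     * TASKLET_scheduled is only cleared by schedule() when it observes
    +     * that bit set without TASKLET_enqueued.
    +     */
    +    ASSERT(tasklet_work_to_do(cpu));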
>>> On 16.06.17 at 15:34, <dario.faggioli@citrix.com> wrote:
> Assuming the answers are 'no' and 'yes', I think I'd leave
> cpu_is_haltable() as it is

Agreed, your analysis was quite helpful.

Jan
On Fri, 16 Jun 2017, Dario Faggioli wrote:
> On Fri, 2017-06-16 at 02:54 -0600, Jan Beulich wrote:
> > > > > On 14.06.17 at 18:53, <dario.faggioli@citrix.com> wrote:
> > >
> > > --- a/xen/arch/x86/domain.c
> > > +++ b/xen/arch/x86/domain.c
> > > @@ -112,12 +112,18 @@ static void play_dead(void)
> > >
> > >  static void idle_loop(void)
> > >  {
> > > +    unsigned int cpu = smp_processor_id();
> > > +
> > >      for ( ; ; )
> > >      {
> > > -        if ( cpu_is_offline(smp_processor_id()) )
> > > +        if ( cpu_is_offline(cpu) )
> > >              play_dead();
> > > -        (*pm_idle)();
> > > -        do_tasklet();
> > > +
> > > +        /* Are we here for running vcpu context tasklets, or for idling? */
> > > +        if ( unlikely(tasklet_work_to_do(cpu)) )
> >
> > I'm not really sure about the "unlikely()" here.
> >
> It's basically already there, without this patch, at the very beginning
> of do_tasklet():
>
>     if ( likely(*work_to_do != (TASKLET_enqueued|TASKLET_scheduled)) )
>         return;
>
> which is exactly the check that I moved into tasklet_work_to_do(), and,
> as you can see, it has the likely.
>
> So, I fundamentally kept it for consistency with the old code. I
> actually think it does make sense, but I don't have a very strong
> opinion about it.
>
> > > +            do_tasklet(cpu);
> > > +        else
> > > +            (*pm_idle)();
> >
> > Please take the opportunity and drop the pointless parentheses
> > and indirection.
> >
> Ok.
>
> > > --- a/xen/common/tasklet.c
> > > +++ b/xen/common/tasklet.c
> > > @@ -104,19 +104,11 @@ static void do_tasklet_work(unsigned int cpu, struct list_head *list)
> > >  }
> > >
> > >  /* VCPU context work */
> > > -void do_tasklet(void)
> > > +void do_tasklet(unsigned int cpu)
> > >  {
> > > -    unsigned int cpu = smp_processor_id();
> >
> > I'm not convinced it is a good idea to have the caller pass in the
> > CPU number.
> >
> Yes, I know. I couldn't make up my mind about it either. I guess I can
> get rid of this aspect of the patch.
>
> > >      unsigned long *work_to_do = &per_cpu(tasklet_work_to_do, cpu);
> > >      struct list_head *list = &per_cpu(tasklet_list, cpu);
> > >
> > > -    /*
> > > -     * Work must be enqueued *and* scheduled. Otherwise there is no work to
> > > -     * do, and/or scheduler needs to run to update idle vcpu priority.
> > > -     */
> > > -    if ( likely(*work_to_do != (TASKLET_enqueued|TASKLET_scheduled)) )
> > > -        return;
> >
> > Perhaps it also wouldn't hurt to convert this to an ASSERT() too.
> >
> Nope, I can't. It's a best effort check, and *work_to_do (which is
> per_cpu(tasklet_work_to_do, cpu)) can change, and the assert may fail.
>
> The code is, of course, safe, because, if we think that there's work
> but there's not, the list of pending tasklets will be empty (and we
> check that after taking the tasklet lock).
>
> > > --- a/xen/include/xen/tasklet.h
> > > +++ b/xen/include/xen/tasklet.h
> > > @@ -40,9 +40,19 @@ DECLARE_PER_CPU(unsigned long, tasklet_work_to_do);
> > >  #define TASKLET_enqueued   (1ul << _TASKLET_enqueued)
> > >  #define TASKLET_scheduled  (1ul << _TASKLET_scheduled)
> > >
> > > +static inline bool tasklet_work_to_do(unsigned int cpu)
> > > +{
> > > +    /*
> > > +     * Work must be enqueued *and* scheduled. Otherwise there is no work to
> > > +     * do, and/or scheduler needs to run to update idle vcpu priority.
> > > +     */
> > > +    return per_cpu(tasklet_work_to_do, cpu) == (TASKLET_enqueued|
> > > +                                                TASKLET_scheduled);
> > > +}
> >
> > Wouldn't cpu_is_haltable() then also better use this new function?
> >
> Mmm... Perhaps. It's certainly less code churn.
>
> ARM code would then contain two invocations of cpu_is_haltable() (the
> first happens with IRQs enabled, so a second one with IRQs disabled is
> necessary). But that is *exactly* the same thing we do on x86 (they're
> just in different functions in that case).
>
> So, I reworked the patch according to these suggestions, and you can
> look at it below.
>
> If you like it better, I'm ok re-submitting it properly in this shape.
> Other thoughts, anyone else?
>
> Thanks and Regards,
> Dario
> ---
> NOTE that, since we call do_tasklet() after having checked
> cpu_is_haltable(), the if in there is not likely any longer.
> ---
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 76310ed..86cd612 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -41,20 +41,28 @@ DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
>
>  void idle_loop(void)
>  {
> +    unsigned int cpu = smp_processor_id();
> +
>      for ( ; ; )
>      {
> -        if ( cpu_is_offline(smp_processor_id()) )
> +        if ( cpu_is_offline(cpu) )
>              stop_cpu();
>
> -        local_irq_disable();
> -        if ( cpu_is_haltable(smp_processor_id()) )
> +        /* Are we here for running vcpu context tasklets, or for idling? */
> +        if ( cpu_is_haltable(cpu) )
>          {
> -            dsb(sy);
> -            wfi();
> +            local_irq_disable();
> +            /* We need to check again, with IRQ disabled */
> +            if ( cpu_is_haltable(cpu) )
> +            {
> +                dsb(sy);
> +                wfi();
> +            }
> +            local_irq_enable();
>          }
> -        local_irq_enable();
> +        else
> +            do_tasklet();
>
> -        do_tasklet();
>          do_softirq();

Are you sure you want to check cpu_is_haltable() twice? It doesn't
make sense to me.
On Fri, 2017-06-16 at 10:41 -0700, Stefano Stabellini wrote:
> On Fri, 16 Jun 2017, Dario Faggioli wrote:
> >
> > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> > index 76310ed..86cd612 100644
> > --- a/xen/arch/arm/domain.c
> > +++ b/xen/arch/arm/domain.c
> > @@ -41,20 +41,28 @@ DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
> >
> >  void idle_loop(void)
> >  {
> > +    unsigned int cpu = smp_processor_id();
> > +
> >      for ( ; ; )
> >      {
> > -        if ( cpu_is_offline(smp_processor_id()) )
> > +        if ( cpu_is_offline(cpu) )
> >              stop_cpu();
> >
> > -        local_irq_disable();
> > -        if ( cpu_is_haltable(smp_processor_id()) )
> > +        /* Are we here for running vcpu context tasklets, or for idling? */
> > +        if ( cpu_is_haltable(cpu) )
> >          {
> > -            dsb(sy);
> > -            wfi();
> > +            local_irq_disable();
> > +            /* We need to check again, with IRQ disabled */
> > +            if ( cpu_is_haltable(cpu) )
> > +            {
> > +                dsb(sy);
> > +                wfi();
> > +            }
> > +            local_irq_enable();
> >          }
> > -        local_irq_enable();
> > +        else
> > +            do_tasklet();
> >
> > -        do_tasklet();
> >          do_softirq();
>
> Are you sure you want to check cpu_is_haltable() twice? It doesn't
> make sense to me.
>
It's because IRQs are still enabled during the first check, so it has
to be redone with them disabled.

But anyway, discard this patch. I'll go back to (a slightly modified
version of) the first one I sent, which defines a tasklet-specific
helper function. I'll send it on Monday.

Regards,
Dario
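[Spelled out, the pattern Dario is defending is the usual
check/recheck idiom around an idle instruction. Below is the quoted
ARM hunk in plain form; the comments are editorial annotations, not
part of the patch.]

    if ( cpu_is_haltable(cpu) )       /* cheap first check, IRQs still on */
    {
        local_irq_disable();
        /*
         * Recheck with IRQs off: between the first check and
         * local_irq_disable(), another CPU may have raised work for us
         * (e.g. tasklet_schedule_on_cpu() followed by SCHEDULE_SOFTIRQ).
         * With IRQs disabled no new wakeup can slip in unnoticed, and a
         * pending interrupt still terminates the wfi() below, so no
         * wakeup is ever lost.
         */
        if ( cpu_is_haltable(cpu) )
        {
            dsb(sy);                  /* barrier before going to sleep */
            wfi();                    /* wait for interrupt */
        }
        local_irq_enable();
    }
    else
        do_tasklet();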
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 76310ed..86cd612 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -41,20 +41,28 @@ DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
 
 void idle_loop(void)
 {
+    unsigned int cpu = smp_processor_id();
+
     for ( ; ; )
     {
-        if ( cpu_is_offline(smp_processor_id()) )
+        if ( cpu_is_offline(cpu) )
             stop_cpu();
 
-        local_irq_disable();
-        if ( cpu_is_haltable(smp_processor_id()) )
+        /* Are we here for running vcpu context tasklets, or for idling? */
+        if ( cpu_is_haltable(cpu) )
         {
-            dsb(sy);
-            wfi();
+            local_irq_disable();
+            /* We need to check again, with IRQ disabled */
+            if ( cpu_is_haltable(cpu) )
+            {
+                dsb(sy);
+                wfi();
+            }
+            local_irq_enable();
         }
-        local_irq_enable();
+        else
+            do_tasklet();
 
-        do_tasklet();
         do_softirq();
         /*
          * We MUST be last (or before dsb, wfi). Otherwise after we get the
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 49388f4..c520fdd 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -112,12 +112,18 @@ static void play_dead(void)
 
 static void idle_loop(void)
 {
+    unsigned int cpu = smp_processor_id();
+
     for ( ; ; )
     {
-        if ( cpu_is_offline(smp_processor_id()) )
+        if ( cpu_is_offline(cpu) )
             play_dead();
-        (*pm_idle)();
-        do_tasklet();
+
+        /* Are we here for idling, or for running vcpu context tasklets? */
+        if ( cpu_is_haltable(cpu) )
+            pm_idle();
+        else
+            do_tasklet();
         do_softirq();
         /*
          * We MUST be last (or before pm_idle). Otherwise after we get the
diff --git a/xen/common/tasklet.c b/xen/common/tasklet.c
index 365a777..ebdce12 100644
--- a/xen/common/tasklet.c
+++ b/xen/common/tasklet.c
@@ -114,7 +114,7 @@ void do_tasklet(void)
      * Work must be enqueued *and* scheduled. Otherwise there is no work to
      * do, and/or scheduler needs to run to update idle vcpu priority.
      */
-    if ( likely(*work_to_do != (TASKLET_enqueued|TASKLET_scheduled)) )
+    if ( *work_to_do != (TASKLET_enqueued|TASKLET_scheduled) )
         return;
 
     spin_lock_irq(&tasklet_lock);
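[Putting the thread's conclusions together - keep the unlikely(), drop
the (*pm_idle)() parentheses and indirection, don't pass cpu into
do_tasklet(), and assert inside do_tasklet() - the respin Dario
promises for Monday would plausibly shape the x86 idle loop as below.
This is a reconstruction from the review comments, not the actual v2
posting.]

    static void idle_loop(void)
    {
        unsigned int cpu = smp_processor_id();

        for ( ; ; )
        {
            if ( cpu_is_offline(cpu) )
                play_dead();

            /* Are we here for running vcpu context tasklets, or for idling? */
            if ( unlikely(tasklet_work_to_do(cpu)) )
                do_tasklet();   /* now ASSERT()s that work is there */
            else
                pm_idle();      /* indirection dropped, per review */
            do_softirq();
        }
    }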