diff mbox series

[1/3] xen/sched: allow rcu work to happen when syncing cpus in core scheduling

Message ID 20200430151559.1464-2-jgross@suse.com (mailing list archive)
State Superseded
Headers show
Series xen: Fix some bugs in scheduling | expand

Commit Message

Jürgen Groß April 30, 2020, 3:15 p.m. UTC
With RCU barriers moved from tasklets to normal RCU processing cpu
offlining in core scheduling might deadlock due to cpu synchronization
required by RCU processing and core scheduling concurrently.

Fix that by bailing out from core scheduling synchronization in case
of pending RCU work. Additionally the RCU softirq is now required to
be of higher priority than the scheduling softirqs in order to do
RCU processing before entering the scheduler again, as bailing out from
the core scheduling synchronization requires to raise another softirq
SCHED_SLAVE, which would bypass RCU processing again.

Reported-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Tested-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/common/sched/core.c   | 10 +++++++---
 xen/include/xen/softirq.h |  2 +-
 2 files changed, 8 insertions(+), 4 deletions(-)

Comments

Dario Faggioli May 7, 2020, 6:34 p.m. UTC | #1
On Thu, 2020-04-30 at 17:15 +0200, Juergen Gross wrote:
> With RCU barriers moved from tasklets to normal RCU processing cpu
> offlining in core scheduling might deadlock due to cpu
> synchronization
> required by RCU processing and core scheduling concurrently.
> 
> Fix that by bailing out from core scheduling synchronization in case
> of pending RCU work. Additionally the RCU softirq is now required to
> be of higher priority than the scheduling softirqs in order to do
> RCU processing before entering the scheduler again, as bailing out
> from
> the core scheduling synchronization requires to raise another softirq
> SCHED_SLAVE, which would bypass RCU processing again.
> 
> Reported-by: Sergey Dyasli <sergey.dyasli@citrix.com>
> Tested-by: Sergey Dyasli <sergey.dyasli@citrix.com>
> Signed-off-by: Juergen Gross <jgross@suse.com>
>
In general, I'm fine with this patch and it can have my:

Acked-by: Dario Faggioli <dfaggioli@suse.com>

I'd ask for one thing, but that doesn't affect the ack, as it's not
"my" code. :-)

> diff --git a/xen/include/xen/softirq.h b/xen/include/xen/softirq.h
> index b4724f5c8b..1f6c4783da 100644
> --- a/xen/include/xen/softirq.h
> +++ b/xen/include/xen/softirq.h
> @@ -4,10 +4,10 @@
>  /* Low-latency softirqs come first in the following list. */
>  enum {
>      TIMER_SOFTIRQ = 0,
> +    RCU_SOFTIRQ,
>      SCHED_SLAVE_SOFTIRQ,
>      SCHEDULE_SOFTIRQ,
>      NEW_TLBFLUSH_CLOCK_PERIOD_SOFTIRQ,
> -    RCU_SOFTIRQ,
>      TASKLET_SOFTIRQ,
>      NR_COMMON_SOFTIRQS
>  };
>
So, until now, it was kind of intuitive (at least, it was to me :-) )
that the TIMER_SOFTIRQ, we want it first, and the SCHEDULE one right
after it. And the comment above the enum ("Low-latency softirqs come
first in the following list"), although brief, is effective.

With the introduction of SCHED_SLAVE, things became slightly more
complex, but it still is not too far a reach to figure out the fact
that we want it to be above SCHEDULE, and the reasons for that.

Now that we're moving RCU from (almost) the very bottom to up here, I
think we need some more info, there in the code. Sure all the bits and
pieces are there in the changelogs, but I think it would be rather
helpful to have them easily available to people trying to understand or
modifying this code, e.g., with a comment.

I was also thinking that, even better than a comment, would be a
(build?) BUG_ON if RCU has no smaller value than SCHED_SLAVE and SLAVE.
Not here, of course, but maybe close to some piece of code that relies
on this assumption. Something that, if I tomorrow put the SCHED* ones
on top again, would catch my attention and tell me that I either take
care of that code path too, or I can't do it.

However, I'm not sure whether, e.g., the other hunk of this patch would
be a suitable place for something like this. And I can't, out of the
top of my head, think of a really good place for where to put it.
Therefore, I'm "only" asking for the comment... but if you (or others)
have ideas, that'd be cool. :-)

Thanks and Regards
Jürgen Groß May 8, 2020, 5:54 a.m. UTC | #2
On 07.05.20 20:34, Dario Faggioli wrote:
> On Thu, 2020-04-30 at 17:15 +0200, Juergen Gross wrote:
>> With RCU barriers moved from tasklets to normal RCU processing cpu
>> offlining in core scheduling might deadlock due to cpu
>> synchronization
>> required by RCU processing and core scheduling concurrently.
>>
>> Fix that by bailing out from core scheduling synchronization in case
>> of pending RCU work. Additionally the RCU softirq is now required to
>> be of higher priority than the scheduling softirqs in order to do
>> RCU processing before entering the scheduler again, as bailing out
>> from
>> the core scheduling synchronization requires to raise another softirq
>> SCHED_SLAVE, which would bypass RCU processing again.
>>
>> Reported-by: Sergey Dyasli <sergey.dyasli@citrix.com>
>> Tested-by: Sergey Dyasli <sergey.dyasli@citrix.com>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>
> In general, I'm fine with this patch and it can have my:
> 
> Acked-by: Dario Faggioli <dfaggioli@suse.com>
> 
> I'd ask for one thing, but that doesn't affect the ack, as it's not
> "my" code. :-)
> 
>> diff --git a/xen/include/xen/softirq.h b/xen/include/xen/softirq.h
>> index b4724f5c8b..1f6c4783da 100644
>> --- a/xen/include/xen/softirq.h
>> +++ b/xen/include/xen/softirq.h
>> @@ -4,10 +4,10 @@
>>   /* Low-latency softirqs come first in the following list. */
>>   enum {
>>       TIMER_SOFTIRQ = 0,
>> +    RCU_SOFTIRQ,
>>       SCHED_SLAVE_SOFTIRQ,
>>       SCHEDULE_SOFTIRQ,
>>       NEW_TLBFLUSH_CLOCK_PERIOD_SOFTIRQ,
>> -    RCU_SOFTIRQ,
>>       TASKLET_SOFTIRQ,
>>       NR_COMMON_SOFTIRQS
>>   };
>>
> So, until now, it was kind of intuitive (at least, it was to me :-) )
> that the TIMER_SOFTIRQ, we want it first, and the SCHEDULE one right
> after it. And the comment above the enum ("Low-latency softirqs come
> first in the following list"), although brief, is effective.
> 
> With the introduction of SCHED_SLAVE, things became slightly more
> complex, but it still is not too far a reach to figure out the fact
> that we want it to be above SCHEDULE, and the reasons for that.
> 
> Now that we're moving RCU from (almost) the very bottom to up here, I
> think we need some more info, there in the code. Sure all the bits and
> pieces are there in the changelogs, but I think it would be rather
> helpful to have them easily available to people trying to understand or
> modifying this code, e.g., with a comment.

That's reasonable.

> 
> I was also thinking that, even better than a comment, would be a
> (build?) BUG_ON if RCU has no smaller value than SCHED_SLAVE and SLAVE.
> Not here, of course, but maybe close to some piece of code that relies
> on this assumption. Something that, if I tomorrow put the SCHED* ones
> on top again, would catch my attention and tell me that I either take
> care of that code path too, or I can't do it.
> 
> However, I'm not sure whether, e.g., the other hunk of this patch would
> be a suitable place for something like this. And I can't, out of the
> top of my head, think of a really good place for where to put it.
> Therefore, I'm "only" asking for the comment... but if you (or others)
> have ideas, that'd be cool. :-)

I think the other hunk is exactly where the BUILD_BUG_ON() should be.
And this is a perfect place for the comment, too, as its placement will
explain the context very well.


Juergen
diff mbox series

Patch

diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index d94b95285f..a099e37b0f 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -2457,13 +2457,17 @@  static struct sched_unit *sched_wait_rendezvous_in(struct sched_unit *prev,
             v = unit2vcpu_cpu(prev, cpu);
         }
         /*
-         * Coming from idle might need to do tasklet work.
+         * Check for any work to be done which might need cpu synchronization.
+         * This is either pending RCU work, or tasklet work when coming from
+         * idle.
          * In order to avoid deadlocks we can't do that here, but have to
-         * continue the idle loop.
+         * schedule the previous vcpu again, which will lead to the desired
+         * processing to be done.
          * Undo the rendezvous_in_cnt decrement and schedule another call of
          * sched_slave().
          */
-        if ( is_idle_unit(prev) && sched_tasklet_check_cpu(cpu) )
+        if ( rcu_pending(cpu) ||
+             (is_idle_unit(prev) && sched_tasklet_check_cpu(cpu)) )
         {
             struct vcpu *vprev = current;
 
diff --git a/xen/include/xen/softirq.h b/xen/include/xen/softirq.h
index b4724f5c8b..1f6c4783da 100644
--- a/xen/include/xen/softirq.h
+++ b/xen/include/xen/softirq.h
@@ -4,10 +4,10 @@ 
 /* Low-latency softirqs come first in the following list. */
 enum {
     TIMER_SOFTIRQ = 0,
+    RCU_SOFTIRQ,
     SCHED_SLAVE_SOFTIRQ,
     SCHEDULE_SOFTIRQ,
     NEW_TLBFLUSH_CLOCK_PERIOD_SOFTIRQ,
-    RCU_SOFTIRQ,
     TASKLET_SOFTIRQ,
     NR_COMMON_SOFTIRQS
 };