From patchwork Thu Apr 30 15:15:57 2020
X-Patchwork-Id: 11520657
From: Juergen Gross
To: xen-devel@lists.xenproject.org
Cc: Juergen Gross, Stefano Stabellini, Julien Grall, Wei Liu, Andrew Cooper,
    Ian Jackson, George Dunlap, Dario Faggioli, Jan Beulich
Subject: [PATCH 1/3] xen/sched: allow rcu work to happen when syncing cpus in core scheduling
Date: Thu, 30 Apr 2020 17:15:57 +0200
Message-Id: <20200430151559.1464-2-jgross@suse.com>
In-Reply-To: <20200430151559.1464-1-jgross@suse.com>
References: <20200430151559.1464-1-jgross@suse.com>

With RCU barriers moved from tasklets to normal RCU processing, cpu
offlining in core scheduling might deadlock, as RCU processing and core
scheduling both require cpu synchronization at the same time. Fix that by
bailing out of the core scheduling synchronization in case of pending RCU
work.

Additionally, the RCU softirq now needs to be of higher priority than the
scheduling softirqs in order to do RCU processing before entering the
scheduler again: bailing out of the core scheduling synchronization
requires raising another softirq, SCHED_SLAVE, which would otherwise
bypass RCU processing again.

Reported-by: Sergey Dyasli
Tested-by: Sergey Dyasli
Signed-off-by: Juergen Gross
Acked-by: Dario Faggioli
---
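Note for reviewers: the priority aspect is easiest to see with a small,
stand-alone sketch. This is not the Xen softirq code, just the same
principle: pending softirqs are serviced from the lowest enum value
upwards, so placing RCU_SOFTIRQ before SCHED_SLAVE_SOFTIRQ guarantees
pending RCU work runs before the scheduler is re-entered via SCHED_SLAVE.

#include <stdio.h>

/* Simplified softirq numbering, mirroring the new ordering in softirq.h. */
enum { TIMER_SOFTIRQ = 0, RCU_SOFTIRQ, SCHED_SLAVE_SOFTIRQ, SCHEDULE_SOFTIRQ,
       NR_SOFTIRQS };

static const char *const names[NR_SOFTIRQS] = {
    "TIMER", "RCU", "SCHED_SLAVE", "SCHEDULE",
};

int main(void)
{
    /* Pretend both RCU work and a SCHED_SLAVE request are pending. */
    unsigned long pending = (1UL << RCU_SOFTIRQ) | (1UL << SCHED_SLAVE_SOFTIRQ);

    while ( pending )
    {
        unsigned int nr = __builtin_ctzl(pending);   /* lowest set bit first */

        pending &= ~(1UL << nr);
        printf("handling %s softirq\n", names[nr]);  /* "RCU" is printed first */
    }

    return 0;
}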
 xen/common/sched/core.c   | 10 +++++++---
 xen/include/xen/softirq.h |  2 +-
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index d94b95285f..a099e37b0f 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -2457,13 +2457,17 @@ static struct sched_unit *sched_wait_rendezvous_in(struct sched_unit *prev,
             v = unit2vcpu_cpu(prev, cpu);
         }
         /*
-         * Coming from idle might need to do tasklet work.
+         * Check for any work to be done which might need cpu synchronization.
+         * This is either pending RCU work, or tasklet work when coming from
+         * idle.
          * In order to avoid deadlocks we can't do that here, but have to
-         * continue the idle loop.
+         * schedule the previous vcpu again, which will lead to the desired
+         * processing to be done.
          * Undo the rendezvous_in_cnt decrement and schedule another call of
          * sched_slave().
          */
-        if ( is_idle_unit(prev) && sched_tasklet_check_cpu(cpu) )
+        if ( rcu_pending(cpu) ||
+             (is_idle_unit(prev) && sched_tasklet_check_cpu(cpu)) )
         {
             struct vcpu *vprev = current;

diff --git a/xen/include/xen/softirq.h b/xen/include/xen/softirq.h
index b4724f5c8b..1f6c4783da 100644
--- a/xen/include/xen/softirq.h
+++ b/xen/include/xen/softirq.h
@@ -4,10 +4,10 @@
 /* Low-latency softirqs come first in the following list. */
 enum {
     TIMER_SOFTIRQ = 0,
+    RCU_SOFTIRQ,
     SCHED_SLAVE_SOFTIRQ,
     SCHEDULE_SOFTIRQ,
     NEW_TLBFLUSH_CLOCK_PERIOD_SOFTIRQ,
-    RCU_SOFTIRQ,
     TASKLET_SOFTIRQ,
     NR_COMMON_SOFTIRQS
 };

From patchwork Thu Apr 30 15:15:58 2020
X-Patchwork-Id: 11520661
From: Juergen Gross
To: xen-devel@lists.xenproject.org
Cc: Juergen Gross, Stefano Stabellini, Julien Grall, Wei Liu, Andrew Cooper,
    Ian Jackson, George Dunlap, Jan Beulich, Roger Pau Monné
Subject: [PATCH 2/3] xen/sched: fix theoretical races accessing vcpu->dirty_cpu
Date: Thu, 30 Apr 2020 17:15:58 +0200
Message-Id: <20200430151559.1464-3-jgross@suse.com>
In-Reply-To: <20200430151559.1464-1-jgross@suse.com>
References: <20200430151559.1464-1-jgross@suse.com>

The dirty_cpu field of struct vcpu denotes which cpu still holds data of
a vcpu. All accesses to this field should be atomic, as the vcpu could be
running concurrently and the field is accessed without any lock held in
most cases.

There are some instances where accesses are not done atomically, and,
even worse, where multiple accesses are done when a single one would
suffice. Correct that in order to avoid potential problems.

Add some assertions to verify dirty_cpu is handled properly.

Signed-off-by: Juergen Gross
---
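Remark (not part of the commit message): the core of the change is the
"read once, then only use the local snapshot" pattern. Below is a
stand-alone sketch of that pattern using C11 atomics instead of Xen's
read_atomic(); the structure, the VCPU_CPU_CLEAN value and the function
are made up here for illustration only.

#include <stdatomic.h>
#include <stdio.h>

#define VCPU_CPU_CLEAN (~0u)            /* "no cpu holds state", illustrative */

struct vcpu_sketch {
    _Atomic unsigned int dirty_cpu;     /* written by the running vcpu */
};

static void sync_state_sketch(struct vcpu_sketch *v, unsigned int this_cpu)
{
    /*
     * Read dirty_cpu exactly once: the field can change concurrently, but
     * every decision below is based on the same snapshot, so the tests and
     * the later use cannot disagree.
     */
    unsigned int dirty_cpu = atomic_load(&v->dirty_cpu);

    if ( dirty_cpu == this_cpu )
        printf("sync state locally on cpu %u\n", this_cpu);
    else if ( dirty_cpu != VCPU_CPU_CLEAN )
        printf("send flush IPI to cpu %u\n", dirty_cpu);
    else
        printf("nothing to do, state is already clean\n");
}

int main(void)
{
    struct vcpu_sketch v;

    atomic_init(&v.dirty_cpu, 3);
    sync_state_sketch(&v, 0);           /* remote case: would IPI cpu 3 */
    sync_state_sketch(&v, 3);           /* local case */
    return 0;
}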
 xen/arch/x86/domain.c   | 14 ++++++++++----
 xen/include/xen/sched.h |  2 +-
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index a4428190d5..f0579a56d1 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1769,6 +1769,7 @@ static void __context_switch(void)

     if ( !is_idle_domain(pd) )
     {
+        ASSERT(read_atomic(&p->dirty_cpu) == cpu);
         memcpy(&p->arch.user_regs, stack_regs, CTXT_SWITCH_STACK_BYTES);
         vcpu_save_fpu(p);
         pd->arch.ctxt_switch->from(p);
@@ -1832,7 +1833,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
 {
     unsigned int cpu = smp_processor_id();
     const struct domain *prevd = prev->domain, *nextd = next->domain;
-    unsigned int dirty_cpu = next->dirty_cpu;
+    unsigned int dirty_cpu = read_atomic(&next->dirty_cpu);

     ASSERT(prev != next);
     ASSERT(local_irq_is_enabled());
@@ -1844,6 +1845,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
     {
         /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
         flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
+        ASSERT(read_atomic(&next->dirty_cpu) == VCPU_CPU_CLEAN);
     }

     _update_runstate_area(prev);
@@ -1956,13 +1958,17 @@ void sync_local_execstate(void)

 void sync_vcpu_execstate(struct vcpu *v)
 {
-    if ( v->dirty_cpu == smp_processor_id() )
+    unsigned int dirty_cpu = read_atomic(&v->dirty_cpu);
+
+    if ( dirty_cpu == smp_processor_id() )
         sync_local_execstate();
-    else if ( vcpu_cpu_dirty(v) )
+    else if ( is_vcpu_dirty_cpu(dirty_cpu) )
    {
        /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
-        flush_mask(cpumask_of(v->dirty_cpu), FLUSH_VCPU_STATE);
+        flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
     }
+    ASSERT(read_atomic(&v->dirty_cpu) != dirty_cpu ||
+           dirty_cpu == VCPU_CPU_CLEAN);
 }

 static int relinquish_memory(
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 195e7ee583..008d3c8861 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -844,7 +844,7 @@ static inline bool is_vcpu_dirty_cpu(unsigned int cpu)

 static inline bool vcpu_cpu_dirty(const struct vcpu *v)
 {
-    return is_vcpu_dirty_cpu(v->dirty_cpu);
+    return is_vcpu_dirty_cpu(read_atomic(&v->dirty_cpu));
 }

 void vcpu_block(void);

From patchwork Thu Apr 30 15:15:59 2020
X-Patchwork-Id: 11520663
From: Juergen Gross
To: xen-devel@lists.xenproject.org
Cc: Juergen Gross, George Dunlap, Dario Faggioli
Subject: [PATCH 3/3] xen/cpupool: fix removing cpu from a cpupool
Date: Thu, 30 Apr 2020 17:15:59 +0200
Message-Id: <20200430151559.1464-4-jgross@suse.com>
In-Reply-To: <20200430151559.1464-1-jgross@suse.com>
References: <20200430151559.1464-1-jgross@suse.com>

Commit cb563d7665f2 ("xen/sched: support core scheduling for moving cpus
to/from cpupools") introduced a regression when trying to remove an
offline cpu from a cpupool: the system would crash in this situation. Fix
that by testing whether the cpu is online.
Fixes: cb563d7665f2 ("xen/sched: support core scheduling for moving cpus to/from cpupools")
Signed-off-by: Juergen Gross
Acked-by: Dario Faggioli
---
 xen/common/sched/cpupool.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/xen/common/sched/cpupool.c b/xen/common/sched/cpupool.c
index d40345b585..de9e25af84 100644
--- a/xen/common/sched/cpupool.c
+++ b/xen/common/sched/cpupool.c
@@ -520,6 +520,9 @@ static int cpupool_unassign_cpu(struct cpupool *c, unsigned int cpu)
     debugtrace_printk("cpupool_unassign_cpu(pool=%d,cpu=%d)\n",
                       c->cpupool_id, cpu);

+    if ( !cpu_online(cpu) )
+        return -EINVAL;
+
     master_cpu = sched_get_resource_cpu(cpu);
     ret = cpupool_unassign_cpu_start(c, master_cpu);
     if ( ret )