From patchwork Tue Mar 1 09:02:12 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?b?SsO8cmdlbiBHcm/Dnw==?=
X-Patchwork-Id: 8463201
From: Juergen Gross
To: xen-devel@lists.xen.org
Date: Tue, 1 Mar 2016 10:02:12 +0100
Message-Id: <1456822933-25041-3-git-send-email-jgross@suse.com>
In-Reply-To: <1456822933-25041-1-git-send-email-jgross@suse.com>
References: <1456822933-25041-1-git-send-email-jgross@suse.com>
Cc: Juergen Gross, wei.liu2@citrix.com, stefano.stabellini@eu.citrix.com,
 george.dunlap@eu.citrix.com, andrew.cooper3@citrix.com,
 dario.faggioli@citrix.com, ian.jackson@eu.citrix.com,
 david.vrabel@citrix.com, jbeulich@suse.com
Subject: [Xen-devel] [PATCH v2 2/3] xen: add hypercall option to temporarily
 pin a vcpu

Some hardware (e.g. Dell Studio 1555 laptops) requires SMIs to be
called on physical cpu 0 only. Linux drivers like dcdbas or i8k try
to achieve this by pinning the running thread to cpu 0, but in Dom0
this is not enough: the vcpu must be pinned to physical cpu 0 via
Xen, too.

Add a stable hypercall option SCHEDOP_pin_temp to the sched_op
hypercall to achieve this. It takes a physical cpu number as its
parameter. If pinning is possible (the calling domain has the
privilege to make the call and the cpu is available in the domain's
cpupool), the calling vcpu is pinned to the specified cpu and the old
cpu affinity is saved. To undo the temporary pinning, a cpu number of
-1 is specified. This restores the original cpu affinity of the vcpu.

Signed-off-by: Juergen Gross
---
V2: - limit operation to hardware domain as suggested by Jan Beulich
    - some style issues corrected as requested by Jan Beulich
    - use fixed width types in interface as requested by Jan Beulich
    - add compat layer checking as requested by Jan Beulich
---
 xen/common/compat/schedule.c |  4 ++
 xen/common/schedule.c        | 92 +++++++++++++++++++++++++++++++++++++++++---
 xen/include/public/sched.h   | 17 ++++++++
 xen/include/xlat.lst         |  1 +
 4 files changed, 109 insertions(+), 5 deletions(-)

diff --git a/xen/common/compat/schedule.c b/xen/common/compat/schedule.c
index 812c550..73b0f01 100644
--- a/xen/common/compat/schedule.c
+++ b/xen/common/compat/schedule.c
@@ -10,6 +10,10 @@
 
 #define do_sched_op compat_sched_op
 
+#define xen_sched_pin_temp sched_pin_temp
+CHECK_sched_pin_temp;
+#undef xen_sched_pin_temp
+
 #define xen_sched_shutdown sched_shutdown
 CHECK_sched_shutdown;
 #undef xen_sched_shutdown
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index b0d4b18..653f852 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -271,6 +271,12 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
     struct scheduler *old_ops;
     void *old_domdata;
 
+    for_each_vcpu ( d, v )
+    {
+        if ( v->affinity_broken )
+            return -EBUSY;
+    }
+
     domdata = SCHED_OP(c->sched, alloc_domdata, d);
     if ( domdata == NULL )
         return -ENOMEM;
@@ -669,6 +675,14 @@ int cpu_disable_scheduler(unsigned int cpu)
             if ( cpumask_empty(&online_affinity) &&
                  cpumask_test_cpu(cpu, v->cpu_hard_affinity) )
             {
+                if ( v->affinity_broken )
+                {
+                    /* The vcpu is temporarily pinned, can't move it. */
+                    vcpu_schedule_unlock_irqrestore(lock, flags, v);
+                    ret = -EBUSY;
+                    break;
+                }
+
                 if (system_state == SYS_STATE_suspend)
                 {
                     cpumask_copy(v->cpu_hard_affinity_saved,
@@ -752,14 +766,20 @@ static int vcpu_set_affinity(
     struct vcpu *v, const cpumask_t *affinity, cpumask_t *which)
 {
     spinlock_t *lock;
+    int ret = 0;
 
     lock = vcpu_schedule_lock_irq(v);
 
-    cpumask_copy(which, affinity);
+    if ( v->affinity_broken )
+        ret = -EBUSY;
+    else
+    {
+        cpumask_copy(which, affinity);
 
-    /* Always ask the scheduler to re-evaluate placement
-     * when changing the affinity */
-    set_bit(_VPF_migrating, &v->pause_flags);
+        /* Always ask the scheduler to re-evaluate placement
+         * when changing the affinity */
+        set_bit(_VPF_migrating, &v->pause_flags);
+    }
 
     vcpu_schedule_unlock_irq(lock, v);
 
@@ -771,7 +791,7 @@ static int vcpu_set_affinity(
         vcpu_migrate(v);
     }
 
-    return 0;
+    return ret;
 }
 
 int vcpu_set_hard_affinity(struct vcpu *v, const cpumask_t *affinity)
@@ -978,6 +998,51 @@ void watchdog_domain_destroy(struct domain *d)
         kill_timer(&d->watchdog_timer[i]);
 }
 
+static long do_pin_temp(int cpu)
+{
+    struct vcpu *v = current;
+    spinlock_t *lock;
+    long ret = -EINVAL;
+
+    lock = vcpu_schedule_lock_irq(v);
+
+    if ( cpu < 0 )
+    {
+        if ( v->affinity_broken )
+        {
+            cpumask_copy(v->cpu_hard_affinity, v->cpu_hard_affinity_saved);
+            v->affinity_broken = 0;
+            set_bit(_VPF_migrating, &v->pause_flags);
+            ret = 0;
+        }
+    }
+    else if ( cpu < nr_cpu_ids )
+    {
+        if ( v->affinity_broken )
+            ret = -EBUSY;
+        else if ( cpumask_test_cpu(cpu, VCPU2ONLINE(v)) )
+        {
+            cpumask_copy(v->cpu_hard_affinity_saved, v->cpu_hard_affinity);
+            v->affinity_broken = 1;
+            cpumask_copy(v->cpu_hard_affinity, cpumask_of(cpu));
+            set_bit(_VPF_migrating, &v->pause_flags);
+            ret = 0;
+        }
+    }
+
+    vcpu_schedule_unlock_irq(lock, v);
+
+    domain_update_node_affinity(v->domain);
+
+    if ( v->pause_flags & VPF_migrating )
+    {
+        vcpu_sleep_nosync(v);
+        vcpu_migrate(v);
+    }
+
+    return ret;
+}
+
 typedef long ret_t;
 
 #endif /* !COMPAT */
@@ -1087,6 +1152,23 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         break;
     }
 
+    case SCHEDOP_pin_temp:
+    {
+        struct sched_pin_temp sched_pin_temp;
+
+        ret = -EFAULT;
+        if ( copy_from_guest(&sched_pin_temp, arg, 1) )
+            break;
+
+        ret = -EPERM;
+        if ( !is_hardware_domain(current->domain) )
+            break;
+
+        ret = do_pin_temp(sched_pin_temp.pcpu);
+
+        break;
+    }
+
     default:
         ret = -ENOSYS;
     }
diff --git a/xen/include/public/sched.h b/xen/include/public/sched.h
index 2219696..a0ce5a6 100644
--- a/xen/include/public/sched.h
+++ b/xen/include/public/sched.h
@@ -118,6 +118,17 @@
  * With id != 0 and timeout != 0, poke watchdog timer and set new timeout.
  */
 #define SCHEDOP_watchdog    6
+
+/*
+ * Temporarily pin the current vcpu to one physical cpu or undo that pinning.
+ * @arg == pointer to sched_pin_temp_t structure.
+ *
+ * Setting pcpu to -1 will undo a previous temporary pinning and restore the
+ * previous cpu affinity. The temporary aspect of the pinning isn't enforced
+ * by the hypervisor.
+ * This call is allowed for the hardware domain only.
+ */
+#define SCHEDOP_pin_temp    7
 /* ` } */
 
 struct sched_shutdown {
@@ -148,6 +159,12 @@ struct sched_watchdog {
 typedef struct sched_watchdog sched_watchdog_t;
 DEFINE_XEN_GUEST_HANDLE(sched_watchdog_t);
 
+struct sched_pin_temp {
+    int32_t pcpu;
+};
+typedef struct sched_pin_temp sched_pin_temp_t;
+DEFINE_XEN_GUEST_HANDLE(sched_pin_temp_t);
+
 /*
  * Reason codes for SCHEDOP_shutdown. These may be interpreted by control
  * software to determine the appropriate action. For the most part, Xen does
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index fda1137..52c7233 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -104,6 +104,7 @@
 ?	pmu_data			pmu.h
 ?	pmu_params			pmu.h
 !	sched_poll			sched.h
+?	sched_pin_temp			sched.h
 ?	sched_remote_shutdown		sched.h
 ?	sched_shutdown			sched.h
 ?	tmem_oid			tmem.h