From patchwork Mon Dec 28 16:59:44 2015
X-Patchwork-Submitter: Joao Martins
X-Patchwork-Id: 7926821
From: Joao Martins <joao.m.martins@oracle.com>
To: xen-devel@lists.xen.org
Cc: Andrew Cooper, Joao Martins, Keir Fraser, Jan Beulich
Date: Mon, 28 Dec 2015 16:59:44 +0000
Message-Id: <1451321985-13728-6-git-send-email-joao.m.martins@oracle.com>
In-Reply-To: <1451321985-13728-1-git-send-email-joao.m.martins@oracle.com>
References: <1451321985-13728-1-git-send-email-joao.m.martins@oracle.com>
X-Mailer: git-send-email 2.1.4
Subject: [Xen-devel] [PATCH RFC 5/6] x86/time: implement PVCLOCK_TSC_STABLE_BIT

When using TSC as clocksource we will solely rely on TSC for updating vcpu time infos (pvti).
Right now, each vCPU takes its tsc_timestamp at a different instant, i.e. at EPOCH + delta. This delta depends on when the CPU calibrates with CPU 0 (the master), and will likely differ across vCPUs. As a result, each vCPU's pvti does not account for its own calibration error, which can make time go backwards: a time read on vCPU B immediately after one on vCPU A may return a smaller value. While this doesn't happen often, I was able to observe (with clocksource=tsc) around 50 warps of < 100 ns in an hour.

This patch proposes relying on host TSC synchronization and passing the master TSC through to the guest when running on a TSC-safe platform. In the rendezvous function we retrieve the platform time in ns and the last count read by the clocksource that was used to compute system time. The master writes both master_tsc_stamp and master_stime, and the other vCPUs (slaves) use them to update their corresponding time infos. This way we can guarantee, on a platform with a constant and reliable TSC, that the time read on vCPU B right after vCPU A is greater, independently of each vCPU's calibration error.

Since pvclock time infos are then monotonic as seen by any vCPU, we set the PVCLOCK_TSC_STABLE_BIT flag, which in turn enables usage of the vDSO on Linux. IIUC, this is similar to how it's implemented on KVM.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
---
 xen/arch/x86/time.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index 3f96ce6..e623891 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -910,6 +910,8 @@ static void __update_vcpu_system_time(struct vcpu *v, int force)

     _u.tsc_timestamp = tsc_stamp;
     _u.system_time = t->stime_local_stamp;
+    if ( clocksource_is_tsc )
+        _u.flags |= PVCLOCK_TSC_STABLE_BIT;

     if ( is_hvm_domain(d) )
         _u.tsc_timestamp += v->arch.hvm_vcpu.cache_tsc_offset;
@@ -1370,9 +1372,12 @@ static void time_calibration_std_rendezvous(void *_r)

     if ( smp_processor_id() == 0 )
     {
+        u64 last_counter;
         while ( atomic_read(&r->semaphore) != (total_cpus - 1) )
             cpu_relax();
-        r->master_stime = read_platform_stime();
+        r->master_stime = read_platform_stime(&last_counter);
+        if ( clocksource_is_tsc )
+            r->master_tsc_stamp = last_counter;
         mb(); /* write r->master_stime /then/ signal */
         atomic_inc(&r->semaphore);
     }
@@ -1384,7 +1389,10 @@ static void time_calibration_std_rendezvous(void *_r)
         mb(); /* receive signal /then/ read r->master_stime */
     }

-    c->local_tsc_stamp = rdtsc();
+    if ( clocksource_is_tsc )
+        c->local_tsc_stamp = r->master_tsc_stamp;
+    else
+        c->local_tsc_stamp = rdtsc();
     c->stime_local_stamp = get_s_time();
     c->stime_master_stamp = r->master_stime;
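
For reviewers, the warp described in the commit message can be illustrated with a minimal, standalone sketch of how a guest turns a pvti into nanoseconds. This is not code from this series. The struct `pvti_sketch` and the helpers `pvti_read_ns()` and `warps()` are hypothetical names, and the struct is a cut-down stand-in for the real vcpu time info that keeps only the fields needed for the arithmetic. The numbers assume a 1 GHz TSC (one tick per ns), encoded as tsc_shift = 1 and a multiplier of 0.5 in 32.32 fixed point.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the fields a guest reads from pvti. */
struct pvti_sketch {
    uint64_t tsc_timestamp;     /* TSC value at which system_time was taken */
    uint64_t system_time;       /* ns at tsc_timestamp */
    uint32_t tsc_to_system_mul; /* TSC -> ns multiplier, 32.32 fixed point */
    int8_t   tsc_shift;         /* shift applied to the TSC delta first */
};

/* Guest-side extrapolation: system_time + scaled (tsc - tsc_timestamp). */
static uint64_t pvti_read_ns(const struct pvti_sketch *p, uint64_t tsc)
{
    uint64_t delta = tsc - p->tsc_timestamp;

    if (p->tsc_shift >= 0)
        delta <<= p->tsc_shift;
    else
        delta >>= -p->tsc_shift;

    /* (delta * mul) >> 32, split so the low-half product cannot overflow. */
    return p->system_time
           + (delta >> 32) * p->tsc_to_system_mul
           + (((delta & 0xffffffffu) * p->tsc_to_system_mul) >> 32);
}

/* Returns 1 if a read on vCPU B at tsc_b (>= tsc_a) is behind vCPU A's. */
static int warps(const struct pvti_sketch *a, const struct pvti_sketch *b,
                 uint64_t tsc_a, uint64_t tsc_b)
{
    return pvti_read_ns(b, tsc_b) < pvti_read_ns(a, tsc_a);
}
```

With per-vCPU stamps, say A = (tsc_timestamp 1000, system_time 1000) and B = (1200, 1150), where B carries a 50 ns calibration error, a read on A at TSC 2000 followed by a read on B at TSC 2010 goes backwards. If both vCPUs extrapolate from the same master pair (1000, 1000), as the patch arranges when clocksource_is_tsc is set, the later TSC read cannot yield an earlier time.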