From patchwork Fri Oct 20 05:37:28 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dongli Zhang
X-Patchwork-Id: 10018941
Date: Thu, 19 Oct 2017 22:37:28 -0700 (PDT)
From: Dongli Zhang
Cc: jgross@suse.com, xen-devel@lists.xenproject.org,
 joao.m.martins@oracle.com, linux-kernel@vger.kernel.org
Subject: Re: [Xen-devel] [PATCH 1/1] xen/time: do not decrease steal time
 after live migration on xen

Hi Boris,

----- boris.ostrovsky@oracle.com wrote:

> On 10/19/2017 04:02 AM, Dongli Zhang wrote:
> > After guest live migration on xen, steal time in /proc/stat
> > (cpustat[CPUTIME_STEAL]) might decrease because steal returned by
> > xen_steal_clock() might be less than this_rq()->prev_steal_time which is
> > derived from previous return value of xen_steal_clock().
> >
> > For instance, steal time of each vcpu is 335 before live migration.
> >
> > cpu 198 0 368 200064 1962 0 0 1340 0 0
> > cpu0 38 0 81 50063 492 0 0 335 0 0
> > cpu1 65 0 97 49763 634 0 0 335 0 0
> > cpu2 38 0 81 50098 462 0 0 335 0 0
> > cpu3 56 0 107 50138 374 0 0 335 0 0
> >
> > After live migration, steal time is reduced to 312.
> >
> > cpu 200 0 370 200330 1971 0 0 1248 0 0
> > cpu0 38 0 82 50123 500 0 0 312 0 0
> > cpu1 65 0 97 49832 634 0 0 312 0 0
> > cpu2 39 0 82 50167 462 0 0 312 0 0
> > cpu3 56 0 107 50207 374 0 0 312 0 0
> >
> > The code in this patch is borrowed from do_stolen_accounting() which has
> > already been removed from linux source code since commit ecb23dc6f2ef
> > ("xen: add steal_clock support on x86"). The core idea of both
> > do_stolen_accounting() and this patch is to avoid accounting new steal
> > clock if it is smaller than previous old steal clock.
> >
> > Similar and more severe issue would impact prior linux 4.8-4.10 as
> > discussed by Michael Las at
> > https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest,
> > which would overflow steal time and lead to 100% st usage in top command
> > for linux 4.8-4.10. A backport of this patch would fix that issue.
> >
> > References: https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest
> > Signed-off-by: Dongli Zhang
> > ---
> >  drivers/xen/time.c | 15 ++++++++++++++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/xen/time.c b/drivers/xen/time.c
> > index ac5f23f..2b3a996 100644
> > --- a/drivers/xen/time.c
> > +++ b/drivers/xen/time.c
> > @@ -19,6 +19,8 @@
> >  /* runstate info updated by Xen */
> >  static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate);
> >
> > +static DEFINE_PER_CPU(u64, xen_old_steal);
> > +
> >  /* return an consistent snapshot of 64-bit time/counter value */
> >  static u64 get64(const u64 *p)
> >  {
> > @@ -83,9 +85,20 @@ bool xen_vcpu_stolen(int vcpu)
> >  u64 xen_steal_clock(int cpu)
> >  {
> >  	struct vcpu_runstate_info state;
> > +	u64 xen_new_steal;
> > +	s64 steal_delta;
> >
> >  	xen_get_runstate_snapshot_cpu(&state, cpu);
> > -	return state.time[RUNSTATE_runnable] + state.time[RUNSTATE_offline];
> > +	xen_new_steal = state.time[RUNSTATE_runnable]
> > +				+ state.time[RUNSTATE_offline];
> > +	steal_delta = xen_new_steal - per_cpu(xen_old_steal, cpu);
> > +
> > +	if (steal_delta < 0)
> > +		xen_new_steal = per_cpu(xen_old_steal, cpu);
> > +	else
> > +		per_cpu(xen_old_steal, cpu) = xen_new_steal;
> > +
> > +	return xen_new_steal;
> >  }
> >
> >  void xen_setup_runstate_info(int cpu)
>
> Can we stash state.time[] during suspend and then add stashed values
> inside xen_get_runstate_snapshot_cpu()?
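
The stash-and-add idea in the question above can be modelled in user space
roughly as follows. This is only a sketch, not kernel code: host_time,
stashed_time, stash_runstate_times() and simulate_migration() are hypothetical
names, and it assumes that the destination host restarts the runstate counters
from zero after migration.

```c
#include <stdint.h>

/* Runstate indices, in the same order as Xen's RUNSTATE_* constants. */
enum { RS_RUNNING, RS_RUNNABLE, RS_BLOCKED, RS_OFFLINE, NR_RUNSTATES };

/* Hypothetical user-space stand-ins; none of these names exist in the kernel. */
uint64_t host_time[NR_RUNSTATES];    /* counters as reported by the current host */
uint64_t stashed_time[NR_RUNSTATES]; /* totals carried over from earlier hosts */

/* Before suspend: fold the current host's counters into the stash. */
void stash_runstate_times(void)
{
	for (int i = 0; i < NR_RUNSTATES; i++)
		stashed_time[i] += host_time[i];
}

/* Assumption: the destination host starts its counters again from zero. */
void simulate_migration(void)
{
	stash_runstate_times();
	for (int i = 0; i < NR_RUNSTATES; i++)
		host_time[i] = 0;
}

/* Snapshot seen by callers: stashed totals plus the current host's counters. */
void get_runstate_snapshot(uint64_t out[NR_RUNSTATES])
{
	for (int i = 0; i < NR_RUNSTATES; i++)
		out[i] = stashed_time[i] + host_time[i];
}

/* The steal clock stays a plain sum of runnable and offline time. */
uint64_t steal_clock(void)
{
	uint64_t snap[NR_RUNSTATES];

	get_runstate_snapshot(snap);
	return snap[RS_RUNNABLE] + snap[RS_OFFLINE];
}
```

In this model the snapshot helper absorbs the migration, so the steal clock
itself needs no comparison against a previous value.
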

Would you like to stash state.time[] during do_suspend() (or xen_suspend()),
or is the code below what you expected?

-------------------------------------------------

Thank you very much!

Dongli Zhang

>
> This will make xen_steal_clock() simpler.
>
> -boris

--- a/drivers/xen/time.c
+++ b/drivers/xen/time.c
@@ -19,6 +19,8 @@
 /* runstate info updated by Xen */
 static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate);
 
+static DEFINE_PER_CPU(u64[4], old_runstate_time);
+
 /* return an consistent snapshot of 64-bit time/counter value */
 static u64 get64(const u64 *p)
 {
@@ -52,6 +54,8 @@ static void xen_get_runstate_snapshot_cpu(struct vcpu_runstate_info *res,
 {
 	u64 state_time;
 	struct vcpu_runstate_info *state;
+	int i;
+	s64 time_delta;
 
 	BUG_ON(preemptible());
 
@@ -64,6 +68,17 @@ static void xen_get_runstate_snapshot_cpu(struct vcpu_runstate_info *res,
 		rmb();	/* Hypervisor might update data. */
 	} while (get64(&state->state_entry_time) != state_time ||
 		 (state_time & XEN_RUNSTATE_UPDATE));
+
+	for (i = 0; i < 4; i++) {
+		if (i == RUNSTATE_runnable || i == RUNSTATE_offline) {
+			time_delta = res->time[i] - per_cpu(old_runstate_time, cpu)[i];
+
+			if (unlikely(time_delta < 0))
+				res->time[i] = per_cpu(old_runstate_time, cpu)[i];
+			else
+				per_cpu(old_runstate_time, cpu)[i] = res->time[i];
+		}
+	}
 }
-------------------------------------------------
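
For reference, the clamping logic in the hunk above can be exercised outside
the kernel with a small user-space model. This is a sketch under assumptions:
clamp_snapshot() is a hypothetical name, a plain global array stands in for
the per-CPU variable, and the locking and preemption rules of the real
function are not modelled.

```c
#include <stdint.h>

/* Runstate indices, in the same order as Xen's RUNSTATE_* constants. */
enum { RS_RUNNING, RS_RUNNABLE, RS_BLOCKED, RS_OFFLINE, NR_RUNSTATES };

/* Stand-in for the per-CPU old_runstate_time array (single CPU only). */
uint64_t old_runstate_time[NR_RUNSTATES];

/*
 * Clamp the runnable and offline counters so they never go backwards.
 * A counter below the remembered value is replaced by that value;
 * otherwise the remembered value is advanced.
 */
void clamp_snapshot(uint64_t time[NR_RUNSTATES])
{
	for (int i = 0; i < NR_RUNSTATES; i++) {
		int64_t delta;

		if (i != RS_RUNNABLE && i != RS_OFFLINE)
			continue;

		/* Unsigned difference reinterpreted as signed, as with s64 in the patch. */
		delta = (int64_t)(time[i] - old_runstate_time[i]);
		if (delta < 0)
			time[i] = old_runstate_time[i];
		else
			old_runstate_time[i] = time[i];
	}
}
```

Feeding it 335 and then 312 for the runnable counter (numbers taken from the
/proc/stat example purely for illustration) returns 335 both times. The
reported value then stays flat until the new counter passes the previous
maximum, so steal time accrued in between is not accounted.
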