From patchwork Wed Oct 25 06:45:15 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dongli Zhang
X-Patchwork-Id: 10025951
From: Dongli Zhang
To: xen-devel@lists.xenproject.org, linux-kernel@vger.kernel.org
Date: Wed, 25 Oct 2017 14:45:15 +0800
Message-Id: <1508913915-7382-1-git-send-email-dongli.zhang@oracle.com>
X-Mailer: git-send-email 2.7.4
Cc: jgross@suse.com, boris.ostrovsky@oracle.com, joao.m.martins@oracle.com
Subject: [Xen-devel] [PATCH v3 1/1] xen/time: do not decrease steal time
 after live migration on xen
List-Id: Xen developer discussion

After guest live migration on xen, steal time in /proc/stat
(cpustat[CPUTIME_STEAL]) might decrease because the steal value returned
by xen_steal_clock() might be less than this_rq()->prev_steal_time, which
is derived from the previous return value of xen_steal_clock().

For instance, the steal time of each vcpu is 335 before live migration.

cpu  198 0 368 200064 1962 0 0 1340 0 0
cpu0 38 0 81 50063 492 0 0 335 0 0
cpu1 65 0 97 49763 634 0 0 335 0 0
cpu2 38 0 81 50098 462 0 0 335 0 0
cpu3 56 0 107 50138 374 0 0 335 0 0

After live migration, steal time is reduced to 312.

cpu  200 0 370 200330 1971 0 0 1248 0 0
cpu0 38 0 82 50123 500 0 0 312 0 0
cpu1 65 0 97 49832 634 0 0 312 0 0
cpu2 39 0 82 50167 462 0 0 312 0 0
cpu3 56 0 107 50207 374 0 0 312 0 0

Since runstate times are cumulative and are cleared by the xen hypervisor
during live migration, the idea of this patch is to accumulate runstate
times into global percpu variables before live migration suspend. Once
the guest VM is resumed, xen_get_runstate_snapshot_cpu() always returns
the sum of the new runstate times and the previously accumulated times
stored in the global percpu variables.

A similar but more severe issue affects the earlier linux 4.8-4.10
kernels, as discussed by Michael Las at
https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest.
On those kernels the steal time overflows, which leads to 100% st usage
in the top command. A backport of this patch would fix that issue.

References: https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest
Signed-off-by: Dongli Zhang
---
Changed since v1:
  * relocate modification to xen_get_runstate_snapshot_cpu

Changed since v2:
  * accumulate runstate times before live migration

---
 drivers/xen/manage.c  |  1 +
 drivers/xen/time.c    | 19 +++++++++++++++++++
 include/xen/xen-ops.h |  1 +
 3 files changed, 21 insertions(+)

diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index c425d03..9aa2955 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -72,6 +72,7 @@ static int xen_suspend(void *data)
 	}
 
 	gnttab_suspend();
+	xen_accumulate_runstate_time();
 	xen_arch_pre_suspend();
 
 	/*
diff --git a/drivers/xen/time.c b/drivers/xen/time.c
index ac5f23f..6df3f82 100644
--- a/drivers/xen/time.c
+++ b/drivers/xen/time.c
@@ -19,6 +19,8 @@
 /* runstate info updated by Xen */
 static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate);
 
+static DEFINE_PER_CPU(u64[4], old_runstate_time);
+
 /* return an consistent snapshot of 64-bit time/counter value */
 static u64 get64(const u64 *p)
 {
@@ -52,6 +54,7 @@ static void xen_get_runstate_snapshot_cpu(struct vcpu_runstate_info *res,
 {
 	u64 state_time;
 	struct vcpu_runstate_info *state;
+	int i;
 
 	BUG_ON(preemptible());
 
@@ -64,6 +67,22 @@ static void xen_get_runstate_snapshot_cpu(struct vcpu_runstate_info *res,
 		rmb();	/* Hypervisor might update data. */
 	} while (get64(&state->state_entry_time) != state_time ||
 		 (state_time & XEN_RUNSTATE_UPDATE));
+
+	for (i = 0; i < 4; i++)
+		res->time[i] += per_cpu(old_runstate_time, cpu)[i];
+}
+
+void xen_accumulate_runstate_time(void)
+{
+	struct vcpu_runstate_info state;
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		xen_get_runstate_snapshot_cpu(&state, cpu);
+		memcpy(per_cpu(old_runstate_time, cpu),
+				state.time,
+				4 * sizeof(u64));
+	}
 }
 
 /*
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index 218e6aa..5680059 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -32,6 +32,7 @@ void xen_resume_notifier_unregister(struct notifier_block *nb);
 bool xen_vcpu_stolen(int vcpu);
 void xen_setup_runstate_info(int cpu);
 void xen_time_setup_guest(void);
+void xen_accumulate_runstate_time(void);
 void xen_get_runstate_snapshot(struct vcpu_runstate_info *res);
 u64 xen_steal_clock(int cpu);
 
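
The accumulate-before-suspend idea from the commit message can be
illustrated with a small user-space model. This is only a sketch under
stated assumptions, not kernel code: hyp_time, old_time, migrate() and
the other helper names are hypothetical stand-ins, NR_CPUS is an
arbitrary model size, and steal time is modelled as the sum of the
runnable and offline counters.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_CPUS    2	/* hypothetical vCPU count for this model */
#define NR_STATES  4	/* running, runnable, blocked, offline */
#define ST_RUNNABLE 1	/* model assumption: index of the runnable counter */
#define ST_OFFLINE  3	/* model assumption: index of the offline counter */

/* Counters as the hypervisor reports them; cleared by live migration. */
static uint64_t hyp_time[NR_CPUS][NR_STATES];
/* Totals saved just before suspend (plays the role of old_runstate_time). */
static uint64_t old_time[NR_CPUS][NR_STATES];

/* Models xen_get_runstate_snapshot_cpu(): raw counters plus saved totals. */
static void snapshot_cpu(uint64_t res[NR_STATES], unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	for (int i = 0; i < NR_STATES; i++)
		res[i] = hyp_time[cpu][i] + old_time[cpu][i];
}

/* Models xen_accumulate_runstate_time(): run once before each suspend. */
static void accumulate_before_suspend(void)
{
	uint64_t state[NR_STATES];

	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
		/* The snapshot already includes earlier saved totals. */
		snapshot_cpu(state, cpu);
		memcpy(old_time[cpu], state, sizeof(state));
	}
}

/* Models the hypervisor clearing the counters during migration. */
static void migrate(void)
{
	memset(hyp_time, 0, sizeof(hyp_time));
}

/* Steal time as the guest would observe it in this model. */
static uint64_t steal_clock(unsigned int cpu)
{
	uint64_t state[NR_STATES];

	snapshot_cpu(state, cpu);
	return state[ST_RUNNABLE] + state[ST_OFFLINE];
}
```

In this model, setting hyp_time[0][ST_RUNNABLE] to 335, then calling
accumulate_before_suspend() followed by migrate(), leaves steal_clock(0)
at 335 instead of letting it fall back. Because each snapshot already
includes the saved totals, repeated migrations keep adding up rather than
overwriting the earlier history.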