From patchwork Fri Jun 3 08:49:39 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wanpeng Li X-Patchwork-Id: 9152273 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 1E76E60751 for ; Fri, 3 Jun 2016 08:50:05 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1086A282E8 for ; Fri, 3 Jun 2016 08:50:05 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0311F28327; Fri, 3 Jun 2016 08:50:04 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=2.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED,FREEMAIL_FROM,RCVD_IN_DNSWL_HI,T_DKIM_INVALID autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 83581282E8 for ; Fri, 3 Jun 2016 08:50:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752523AbcFCIto (ORCPT ); Fri, 3 Jun 2016 04:49:44 -0400 Received: from mail-oi0-f68.google.com ([209.85.218.68]:33485 "EHLO mail-oi0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751739AbcFCItl (ORCPT ); Fri, 3 Jun 2016 04:49:41 -0400 Received: by mail-oi0-f68.google.com with SMTP id m198so15830544oig.0; Fri, 03 Jun 2016 01:49:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=SJ+L6VvXQmMK40jfgg9uz2Y1Vfib9OOBbxIiP7uokmk=; b=wftdTIICzHoEF/HqPW3gIfYKjx8ndbJSLD+OynM+CtNAK9h1XroWUrMHGKZ3z93mzS LVe6chsFzdO9WYR/ElhFgWhMAko9hS9eYpF2fjKgysignNn/8F+4zxiII4f9eTpH1mmf KLwd+3Psjn4f+IHsR6QqecvNmyLJUU6xpI7aJDu2obn41Yvwr/JkYWxOAotSdlO3kCHN q93CgiiDaa+/9KzDqCOFWDckc2z/jlm3XONM3m6YxExxtJOwwxNZu+eHFWOfFgeGqHQ1 MY4U0o0plCPCC6HSSKbh8u9n+64OHJeviHUe4PCjK9R3owsmXKpmto+TGHWkpbJfmOLm Y72Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=SJ+L6VvXQmMK40jfgg9uz2Y1Vfib9OOBbxIiP7uokmk=; b=fZGeUxVEQyHsG8d4TqNiiat0Bj0GyBgaTzBQQAIRNq/PuJxmCJ1QcDBufLvSIAwbKF uyccANFczW5jkHNLbIo5Hd3P0mn25yZ16r76VcRIOpZjDm9iEZoBhAitORU4zG5GAbr2 XonL3fRRxNaes0RremDrl9US2Yy5GUW5Mbs3FDhyd4LH8lc7FehGk2AcXXufNv8/ywRP UtkaqSstyDeRugBUAjN588qOVvWVvBvXv/8BfbhdNJa54hrFeXxhDa8+v4eDEGeygMH2 0VWJ1NOReZIUtqhrJEixBuWefsf8cfW3bFyogtEywSWj+PE49GxvHkz1aKrzTRexBMyq ZA1Q== X-Gm-Message-State: ALyK8tKljflUT74uqLanIPdjKNJVIhQwDPxtbuxXxE6Tt1+FnS/Or+JFhbq5wMoYECJKpldoX1co+Xnz6PBtLg== X-Received: by 10.157.45.129 with SMTP id g1mr1074644otb.57.1464943780338; Fri, 03 Jun 2016 01:49:40 -0700 (PDT) MIME-Version: 1.0 Received: by 10.157.44.172 with HTTP; Fri, 3 Jun 2016 01:49:39 -0700 (PDT) In-Reply-To: <20160603071656.GA7466@gmail.com> References: <1463574454-3587-1-git-send-email-wanpeng.li@hotmail.com> <20160603071656.GA7466@gmail.com> From: Wanpeng Li Date: Fri, 3 Jun 2016 16:49:39 +0800 Message-ID: Subject: Re: [PATCH v3] sched/cputime: add steal time support to full dynticks CPU time accounting To: Ingo Molnar Cc: "linux-kernel@vger.kernel.org" , kvm , Wanpeng Li , "Peter Zijlstra (Intel)" , Rik van Riel , Thomas Gleixner , Frederic Weisbecker , Paolo Bonzini , Radim Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP 2016-06-03 15:16 GMT+08:00 Ingo Molnar : > > * Wanpeng Li wrote: > >> From: Wanpeng Li >> >> This patch adds steal guest time support to full dynticks CPU >> time accounting. After 'commit ff9a9b4c4334 ("sched, time: Switch >> VIRT_CPU_ACCOUNTING_GEN to jiffy granularity")', time is jiffy >> based sampling even if it's still listened to ring boundaries, so >> steal_account_process_tick() is reused to account how much 'ticks' >> are steal time after the last accumulation. > > WTF? This changelog has 4 grammar errors and it sails through review just like > that? > > 1) What does 'time is jiffy based sampling' mean? > 2) what does 'even if it's still listened to ring boundaries' mean? > 3) "how muck 'ticks'"? > 4) "are steal time"? > > So I fixed this to be at least parseable: > > This patch adds guest steal-time support to full dynticks CPU > time accounting. After the following commit: > > ff9a9b4c4334 ("sched, time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity") > > ... time sampling became jiffy based, even if it's still listened > to ring boundaries, so steal_account_process_tick() is reused > to account how many 'ticks' are stolen-time, after the last accumulation. > Thanks, Ingo, my fault! > Although I'm still wondering what this key phrase means: > > even if it's still listened to ring boundaries, > > Could someone please explain what this means? (Beyond the 5th grammar error this > portion has, which I'll fix once it actually makes sense to me...) delta time is accounted through context tracking which can probe on context boundaries such as kernel and userspace(includes syscalls and exceptions entry/exit) when use vtime. > > Furthermore, the real problem that made me go back and tear the changelog apart is > that the code flow itself is incredibly ugly and fragile as hell: > >> write_seqcount_begin(&tsk->vtime_seqcount); >> tsk->vtime_snap_whence = VTIME_SYS; >> if (vtime_delta(tsk)) { >> + cputime_t steal_time; >> + unsigned long delta_st = steal_account_process_tick(); >> delta_cpu = get_vtime_delta(tsk); >> + steal_time = jiffies_to_cputime(delta_st); >> + >> + if (steal_time >= delta_cpu) { >> + write_seqcount_end(&tsk->vtime_seqcount); >> + return; >> + } >> + delta_cpu -= steal_time; >> account_user_time(tsk, delta_cpu, cputime_to_scaled(delta_cpu)); >> } >> write_seqcount_end(&tsk->vtime_seqcount); >> } > > Yeah, a return in the middle of a locking critical section, really?? > > Also, how about basic style details like leaving an extra newline after local > variable definition sections, like every other scheduler function does? > > Also, what's this thing about calling a time unit variable 'delta_cpu'? When I > reviewed this one of my first reactions was: "Why are we comparing time to CPU > ID??". > > Plus as an added bonus a 'delta_st' variable name to count ticks, which variable > is not just badly named but single-use. WTF? > > Something like this looks much better and shorter: > > void vtime_account_user(struct task_struct *tsk) > { > cputime_t delta_time, steal_time; > > write_seqcount_begin(&tsk->vtime_seqcount); > tsk->vtime_snap_whence = VTIME_SYS; > if (vtime_delta(tsk)) { > delta_time = get_vtime_delta(tsk); > steal_time = jiffies_to_cputime(steal_account_process_tick()); > > if (steal_time < delta_time) { > delta_time -= steal_time; > account_user_time(tsk, delta_time, cputime_to_scaled(delta_time)); > } > } > write_seqcount_end(&tsk->vtime_seqcount); > } > > See the consistent, obvious naming of the variables and the clear code flow? Yeah, thank you again, Ingo, I just cleanup the whole patch as your suggestion. --- void vtime_account_system(struct task_struct *tsk) @@ -718,13 +722,18 @@ void vtime_gen_account_irq_exit(struct task_struct *tsk) void vtime_account_user(struct task_struct *tsk) { - cputime_t delta_cpu; + cputime_t delta_time, steal_time; write_seqcount_begin(&tsk->vtime_seqcount); tsk->vtime_snap_whence = VTIME_SYS; if (vtime_delta(tsk)) { - delta_cpu = get_vtime_delta(tsk); - account_user_time(tsk, delta_cpu, cputime_to_scaled(delta_cpu)); + delta_time = get_vtime_delta(tsk); + steal_time = jiffies_to_cputime(steal_account_process_tick()); + + if (steal_time < delta_time) { + delta_time -= steal_time; + account_user_time(tsk, delta_time, cputime_to_scaled(delta_time)); + } } write_seqcount_end(&tsk->vtime_seqcount); } -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c index 75f98c5..9ff036b 100644 --- a/kernel/sched/cputime.c +++ b/kernel/sched/cputime.c @@ -257,7 +257,7 @@ void account_idle_time(cputime_t cputime) cpustat[CPUTIME_IDLE] += (__force u64) cputime; } -static __always_inline bool steal_account_process_tick(void) +static __always_inline unsigned long steal_account_process_tick(void) { #ifdef CONFIG_PARAVIRT if (static_key_false(¶virt_steal_enabled)) { @@ -279,7 +279,7 @@ static __always_inline bool steal_account_process_tick(void) return steal_jiffies; } #endif - return false; + return 0; } /* @@ -691,9 +691,13 @@ static cputime_t get_vtime_delta(struct task_struct *tsk) static void __vtime_account_system(struct task_struct *tsk) { - cputime_t delta_cpu = get_vtime_delta(tsk); + cputime_t delta_time = get_vtime_delta(tsk); + cputime_t steal_time = jiffies_to_cputime(steal_account_process_tick()); - account_system_time(tsk, irq_count(), delta_cpu, cputime_to_scaled(delta_cpu)); + if (steal_time < delta_time) { + delta_time -= steal_time; + account_system_time(tsk, irq_count(), delta_time, cputime_to_scaled(delta_time)); + } }