From patchwork Tue Jan 30 18:44:46 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jackie Li X-Patchwork-Id: 10192229 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id C752960383 for ; Tue, 30 Jan 2018 18:46:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B564D2434C for ; Tue, 30 Jan 2018 18:46:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A9B012580E; Tue, 30 Jan 2018 18:46:50 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id F33052434C for ; Tue, 30 Jan 2018 18:46:49 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0D2966E4FD; Tue, 30 Jan 2018 18:46:48 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by gabe.freedesktop.org (Postfix) with ESMTPS id DC3446E4FD for ; Tue, 30 Jan 2018 18:46:46 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga107.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 30 Jan 2018 10:46:46 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.46,435,1511856000"; d="scan'208";a="26707590" Received: from yli84-z170x-ud5.fm.intel.com ([10.19.83.14]) by fmsmga001.fm.intel.com with ESMTP; 30 Jan 2018 10:46:46 -0800 From: Jackie Li To: intel-gfx@lists.freedesktop.org Date: Tue, 30 Jan 2018 10:44:46 -0800 Message-Id: <1517337886-6164-1-git-send-email-yaodong.li@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <6d9d0e76-1899-dac0-0c95-5724d8d09ad3@intel.com> References: <6d9d0e76-1899-dac0-0c95-5724d8d09ad3@intel.com> Subject: [Intel-gfx] [RFC] drm/i915: cpufreq based dynamic ring freqency table on Gen9 X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP A known issue that ring frequency transition would lead to a full system stall had impacted the media performance when the GT frequency is lower than IA freqency and there's heavy GT workload while having low IA workload. In this case, the ring frequency will be toggled between GT and IA frequency. The solution is to keep the ring frequency higher than or equal to current IA frequency so that we can avoid the toggling between different frequencies issue. This patch monitors the cpufreq frequency notifications (freqency transition notification when governors are used, or new frequency policy when bypassing the governors such as the active mode of intel_pstate) and updates the ring frequency table by calculating the ring frequency for each possible gpu frequency with a new formula ring_freq = max(current IA freq, 2 * gpu_freq). with this we can guarantee that the ring frequency would depend on IA freq when the 2x gpu freq is lower then the active IA freq, and it would depend on 2x gpu freq once the 2x gpu freq is higher than current IA freq. Potential issues with this solution: 0) currently, this patch is now only monitoring the frequency change of CPU 0. it would fail if another logic CPU was using a higher frequency. 1) the latency of the ring freq table update was not considered. Need more understanding to the cost of each ring freq table update. 2) Not sure the impact for low GT workload while the ring frequency is still set to higher or equal to IA frequency. Cc: Chris Wilson Cc: Sagar Arun Kamble Cc: Dmitry Rogozhkin Signed-off-by: Jackie Li --- drivers/gpu/drm/i915/i915_drv.h | 4 ++ drivers/gpu/drm/i915/intel_pm.c | 106 ++++++++++++++++++++++++++++++++++++++-- 2 files changed, 107 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a689396..39311cd 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2044,6 +2044,10 @@ struct drm_i915_private { /* gen6+ GT PM state */ struct intel_gen6_power_mgmt gt_pm; + /* cpufreq notifiers*/ + struct notifier_block cpufreq_transition; + struct notifier_block cpufreq_policy; + /* ilk-only ips/rps state. Everything in here is protected by the global * mchdev_lock in intel_pm.c */ struct intel_ilk_power_mgmt ips; diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 1db79a8..6a7e439 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -6849,7 +6849,7 @@ static void gen6_update_ring_freq(struct drm_i915_private *dev_priv) /* convert DDR frequency from units of 266.6MHz to bandwidth */ min_ring_freq = mult_frac(min_ring_freq, 8, 3); - if (IS_GEN9_BC(dev_priv) || IS_CANNONLAKE(dev_priv)) { + if (IS_CANNONLAKE(dev_priv)) { /* Convert GT frequency to 50 HZ units */ min_gpu_freq = rps->min_freq / GEN9_FREQ_SCALER; max_gpu_freq = rps->max_freq / GEN9_FREQ_SCALER; @@ -6867,7 +6867,7 @@ static void gen6_update_ring_freq(struct drm_i915_private *dev_priv) int diff = max_gpu_freq - gpu_freq; unsigned int ia_freq = 0, ring_freq = 0; - if (IS_GEN9_BC(dev_priv) || IS_CANNONLAKE(dev_priv)) { + if (IS_CANNONLAKE(dev_priv)) { /* * ring_freq = 2 * GT. ring_freq is in 100MHz units * No floor required for ring frequency on SKL. @@ -8024,6 +8024,103 @@ void intel_disable_gt_powersave(struct drm_i915_private *dev_priv) mutex_unlock(&dev_priv->pcu_lock); } +static void gen9_update_ring_freq(struct drm_i915_private *dev_priv, + unsigned int cpu_freq_mhz) +{ + struct intel_rps *rps = &dev_priv->gt_pm.rps; + unsigned int max_gpu_freq, min_gpu_freq; + unsigned int gpu_freq, ring_freq; + unsigned int cpu_freq_50mhz = cpu_freq_mhz / 50; + + GEM_BUG_ON(!IS_GEN9(dev_priv)); + + mutex_lock(&dev_priv->pcu_lock); + + if (IS_GEN9_BC(dev_priv)) { + /* Convert GT frequency to 50MHz units */ + min_gpu_freq = rps->min_freq / GEN9_FREQ_SCALER; + max_gpu_freq = rps->max_freq / GEN9_FREQ_SCALER; + } else { + min_gpu_freq = rps->min_freq; + max_gpu_freq = rps->max_freq; + } + + DRM_INFO("%s: gpu freq in 50MHz unit (%d - %d), (%d - %d)," + "cpu freq in 50MHz unit %d\n", + __func__, + min_gpu_freq, max_gpu_freq, + min_gpu_freq * 50, + max_gpu_freq * 50, + cpu_freq_50mhz); + + for (gpu_freq = max_gpu_freq; gpu_freq >= min_gpu_freq; gpu_freq--) { + /* Ring freq is 100MHz unit */ + ring_freq = max(cpu_freq_50mhz / 2, gpu_freq); + + sandybridge_pcode_write( + dev_priv, + GEN6_PCODE_WRITE_MIN_FREQ_TABLE, + ring_freq << GEN6_PCODE_FREQ_RING_RATIO_SHIFT | + gpu_freq); + } + + mutex_unlock(&dev_priv->pcu_lock); +} + +static int cpufreq_transition_cb(struct notifier_block *nb, unsigned long val, + void *data) +{ + struct cpufreq_freqs *freq = data; + struct drm_i915_private *i915 = container_of(nb, + struct drm_i915_private, + cpufreq_transition); + + if (val != CPUFREQ_POSTCHANGE) + return 0; + + if (freq->cpu != 0) + return 0; + + if (freq->new == freq->old) + return 0; + + gen9_update_ring_freq(i915, freq->new / 1000); + + return 0; +} + +static int cpufreq_policy_cb(struct notifier_block *nb, unsigned long val, + void *data) +{ + struct cpufreq_policy *policy = data; + struct drm_i915_private *i915 = container_of(nb, + struct drm_i915_private, + cpufreq_policy); + + if (val != CPUFREQ_NOTIFY) + return 0; + + if (policy->cpu != 0) + return 0; + + gen9_update_ring_freq(i915, policy->max / 1000); + + return 0; +} + + +static inline +void intel_register_cpufreq_notifiers(struct drm_i915_private *i915) +{ + i915->cpufreq_transition.notifier_call = cpufreq_transition_cb; + i915->cpufreq_policy.notifier_call = cpufreq_policy_cb; + + cpufreq_register_notifier(&i915->cpufreq_transition, + CPUFREQ_TRANSITION_NOTIFIER); + cpufreq_register_notifier(&i915->cpufreq_policy, + CPUFREQ_POLICY_NOTIFIER); +} + static inline void intel_enable_llc_pstate(struct drm_i915_private *i915) { lockdep_assert_held(&i915->pcu_lock); @@ -8031,7 +8128,10 @@ static inline void intel_enable_llc_pstate(struct drm_i915_private *i915) if (i915->gt_pm.llc_pstate.enabled) return; - gen6_update_ring_freq(i915); + if (IS_GEN9(i915)) + intel_register_cpufreq_notifiers(i915); + else + gen6_update_ring_freq(i915); i915->gt_pm.llc_pstate.enabled = true; }