Message ID | 20180116152116.17900-1-mika.kuoppala@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Quoting Mika Kuoppala (2018-01-16 15:21:16)
> There is a suspicion that with aggressive upclocking, power rail
> voltage fluctuations can disrupt C-state transitions, leading
> to system hangs.
>
> When upclocking on 4-CPU Baytrails, bring the CPUs up to C1 and then
> step through the bins gradually towards the target frequency to give
> the hardware some leeway.
>
> We move towards the requested frequency in 1 millisecond intervals.
> Each millisecond, we increase the frequency by half of the bins
> remaining between the current frequency and the target.

Either this is good for everyone or it is not. Doing more punit
accesses seems counterproductive though, and adds 8ms to the initial
request?
-Chris
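For reference, the schedule being questioned here halves the remaining distance to the target once per millisecond, so it converges in roughly log2(gap) steps; that is where the ~8 ms estimate comes from. Below is a minimal user-space sketch of that schedule, using a made-up gap of 100 frequency bins rather than any real Baytrail operating range:

```c
#include <stdio.h>

/*
 * Standalone simulation of the stepping schedule from the patch:
 * every millisecond, move halfway from the current frequency bin
 * towards the target bin.  The gap of 100 bins is an illustrative
 * value, not a real Baytrail frequency range.
 */
int main(void)
{
	unsigned int cur = 0, target = 100, ms = 0;

	while (cur != target) {
		unsigned int diff = target - cur;

		if (diff > 1)
			diff >>= 1;	/* same halving as vlv_rps_write_work() in the patch */

		cur += diff;
		printf("after %2u ms: bin %u\n", ++ms, cur);
	}
	/* Reaches the target in 8 steps (50, 75, 87, 93, 96, 98, 99, 100). */
	return 0;
}
```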
Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2018-01-16 15:21:16)
>> There is a suspicion that with aggressive upclocking, power rail
>> voltage fluctuations can disrupt C-state transitions, leading
>> to system hangs.
>>
>> When upclocking on 4-CPU Baytrails, bring the CPUs up to C1 and then
>> step through the bins gradually towards the target frequency to give
>> the hardware some leeway.
>>
>> We move towards the requested frequency in 1 millisecond intervals.
>> Each millisecond, we increase the frequency by half of the bins
>> remaining between the current frequency and the target.
>
> Either this is good for everyone or it is not. Doing more punit
> accesses seems counterproductive though, and adds 8ms to the initial
> request?

Wanted to see if it is not the punit access in itself but the voltage
ramp-up. We can forget this patch, at least with these values, as it
didn't survive the night.
-Mika
Hi,

On 17-01-18 09:31, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
>> Quoting Mika Kuoppala (2018-01-16 15:21:16)
>>> There is a suspicion that with aggressive upclocking, power rail
>>> voltage fluctuations can disrupt C-state transitions, leading
>>> to system hangs.
>>>
>>> When upclocking on 4-CPU Baytrails, bring the CPUs up to C1 and then
>>> step through the bins gradually towards the target frequency to give
>>> the hardware some leeway.
>>>
>>> We move towards the requested frequency in 1 millisecond intervals.
>>> Each millisecond, we increase the frequency by half of the bins
>>> remaining between the current frequency and the target.
>>
>> Either this is good for everyone or it is not. Doing more punit
>> accesses seems counterproductive though, and adds 8ms to the initial
>> request?
>
> Wanted to see if it is not the punit access in itself but the voltage
> ramp-up. We can forget this patch, at least with these values, as it
> didn't survive the night.

I noticed this patch was lacking the ping Chris' version has. Is that
in the intel_idle driver now? And did your test kernel have the
intel_idle patch? If not, the lack of the ping may be the reason why
this patch did not survive the night.

Regards,

Hans
On Wed, Jan 17, 2018 at 10:31:09AM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > Quoting Mika Kuoppala (2018-01-16 15:21:16)
> >> There is a suspicion that with aggressive upclocking, power rail
> >> voltage fluctuations can disrupt C-state transitions, leading
> >> to system hangs.
> >>
> >> When upclocking on 4-CPU Baytrails, bring the CPUs up to C1 and then
> >> step through the bins gradually towards the target frequency to give
> >> the hardware some leeway.
> >>
> >> We move towards the requested frequency in 1 millisecond intervals.
> >> Each millisecond, we increase the frequency by half of the bins
> >> remaining between the current frequency and the target.
> >
> > Either this is good for everyone or it is not. Doing more punit
> > accesses seems counterproductive though, and adds 8ms to the initial
> > request?
>
> Wanted to see if it is not the punit access in itself but the voltage
> ramp-up. We can forget this patch, at least with these values, as it
> didn't survive the night.

I guess one big problem is that the GPU frequency is just one small part
of the voltage equation. RC6 might play a bigger role, since it could
cause the voltage to alternate between low and high values rapidly. And
the Vnn rail is shared by most devices on the SoC, so even just limiting
things to the GPU might not be entirely sufficient. And if there's some
link between the number of cores and the instability, then the Vcc
rail(s) are perhaps also suspect. For those, I guess we would have to
somehow throttle the CPU C- and P-state transitions.
Hi,

On 17-01-18 16:45, Ville Syrjälä wrote:
> On Wed, Jan 17, 2018 at 10:31:09AM +0200, Mika Kuoppala wrote:
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>>
>>> Quoting Mika Kuoppala (2018-01-16 15:21:16)
>>>> There is a suspicion that with aggressive upclocking, power rail
>>>> voltage fluctuations can disrupt C-state transitions, leading
>>>> to system hangs.
>>>>
>>>> When upclocking on 4-CPU Baytrails, bring the CPUs up to C1 and then
>>>> step through the bins gradually towards the target frequency to give
>>>> the hardware some leeway.
>>>>
>>>> We move towards the requested frequency in 1 millisecond intervals.
>>>> Each millisecond, we increase the frequency by half of the bins
>>>> remaining between the current frequency and the target.
>>>
>>> Either this is good for everyone or it is not. Doing more punit
>>> accesses seems counterproductive though, and adds 8ms to the initial
>>> request?
>>
>> Wanted to see if it is not the punit access in itself but the voltage
>> ramp-up. We can forget this patch, at least with these values, as it
>> didn't survive the night.
>
> I guess one big problem is that the GPU frequency is just one small part
> of the voltage equation. RC6 might play a bigger role, since it could
> cause the voltage to alternate between low and high values rapidly. And
> the Vnn rail is shared by most devices on the SoC, so even just limiting
> things to the GPU might not be entirely sufficient. And if there's some
> link between the number of cores and the instability, then the Vcc
> rail(s) are perhaps also suspect. For those, I guess we would have to
> somehow throttle the CPU C- and P-state transitions.

Right, various users have reported success with this patch:
https://bugzilla.kernel.org/attachment.cgi?id=260585

Often mixed with some patch to also make the GPU frequency changes less
frequent. I guess that when only C7 is available it is used less, or
maybe the punit allows for a larger delay for the rails to stabilize
when coming out of C7, who knows.

Might be worthwhile to try combining this patch with Chris' patch to
see if that helps.

Regards,

Hans
```diff
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index cc659b4b2a45..9c718c2811b8 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1043,6 +1043,13 @@ static int i915_frequency_info(struct seq_file *m, void *unused)
 	seq_printf(m, "current GPU freq: %d MHz\n",
 		   intel_gpu_freq(dev_priv, rps->cur_freq));
 
+	if (IS_VALLEYVIEW(dev_priv)) {
+		seq_printf(m, "target GPU freq: %d MHz\n",
+			   intel_gpu_freq(dev_priv, rps->target_freq));
+		seq_printf(m, "QOS CPU DMA Latency %d usecs\n",
+			   pm_qos_request(PM_QOS_CPU_DMA_LATENCY));
+	}
+
 	seq_printf(m, "max GPU freq: %d MHz\n",
 		   intel_gpu_freq(dev_priv, rps->max_freq));
 
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 6c8da9d20c33..9718afacfa30 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -896,6 +896,10 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv,
 	BUG_ON(device_info->gen > sizeof(device_info->gen_mask) * BITS_PER_BYTE);
 	device_info->gen_mask = BIT(device_info->gen - 1);
 
+	pm_qos_add_request(&dev_priv->gt_pm.rps.qos,
+			   PM_QOS_CPU_DMA_LATENCY,
+			   PM_QOS_DEFAULT_VALUE);
+
 	spin_lock_init(&dev_priv->irq_lock);
 	spin_lock_init(&dev_priv->gpu_error.lock);
 	mutex_init(&dev_priv->backlight_lock);
@@ -953,6 +957,7 @@ static void i915_driver_cleanup_early(struct drm_i915_private *dev_priv)
 	intel_irq_fini(dev_priv);
 	i915_workqueues_cleanup(dev_priv);
 	i915_engines_cleanup(dev_priv);
+	pm_qos_remove_request(&dev_priv->gt_pm.rps.qos);
 }
 
 static int i915_mmio_setup(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c42015b05b47..90cc2788fbbb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -936,6 +936,10 @@ struct intel_rps {
 
 	/* manual wa residency calculations */
 	struct intel_rps_ei ei;
+
+	u8 target_freq; /* vlv target to reach */
+	struct pm_qos_request qos;
+	struct delayed_work vlv_rps_work;
 };
 
 struct intel_rc6 {
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 1db79a860b96..00ef4cb7bd8f 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -6200,7 +6200,7 @@ static int gen6_set_rps(struct drm_i915_private *dev_priv, u8 val)
 	return 0;
 }
 
-static int valleyview_set_rps(struct drm_i915_private *dev_priv, u8 val)
+static int valleyview_set_hw_rps(struct drm_i915_private *dev_priv, u8 val)
 {
 	int err;
 
@@ -6224,6 +6224,69 @@ static int valleyview_set_rps(struct drm_i915_private *dev_priv, u8 val)
 	return 0;
 }
 
+static void valleyview_queue_rps_change(struct drm_i915_private *dev_priv)
+{
+	struct intel_rps * const rps = &dev_priv->gt_pm.rps;
+	unsigned long delay;
+
+	delay = 1 + msecs_to_jiffies(1);
+	queue_delayed_work(system_wq, &rps->vlv_rps_work, delay);
+}
+
+static int valleyview_set_rps(struct drm_i915_private *dev_priv, u8 val)
+{
+	struct intel_rps * const rps = &dev_priv->gt_pm.rps;
+
+	/* 2 cpu Baytrails seems to be safe */
+	if (IS_CHERRYVIEW(dev_priv) || num_online_cpus() <= 2)
+		return valleyview_set_hw_rps(dev_priv, val);
+
+	/*
+	 * For >2 cpu Baytrails we need to get cpus to C1
+	 * and limit the freq rampup.
+	 */
+	rps->target_freq = val;
+	pm_qos_update_request(&dev_priv->gt_pm.rps.qos, 0);
+
+	valleyview_queue_rps_change(dev_priv);
+
+	return 0;
+}
+
+static void vlv_rps_write_work(struct work_struct *work)
+{
+	struct drm_i915_private *dev_priv =
+		container_of(work, typeof(*dev_priv),
+			     gt_pm.rps.vlv_rps_work.work);
+	struct intel_rps * const rps = &dev_priv->gt_pm.rps;
+	u8 next;
+
+	mutex_lock(&dev_priv->pcu_lock);
+
+	if (rps->target_freq == rps->cur_freq) {
+		pm_qos_update_request(&dev_priv->gt_pm.rps.qos,
+				      PM_QOS_DEFAULT_VALUE);
+		mutex_unlock(&dev_priv->pcu_lock);
+		return;
+	}
+
+	if (rps->target_freq > rps->cur_freq) {
+		u8 diff = rps->target_freq - rps->cur_freq;
+
+		if (diff > 1)
+			diff = diff >> 1;
+
+		next = rps->cur_freq + diff;
+	} else {
+		next = rps->target_freq;
+	}
+
+	valleyview_set_hw_rps(dev_priv, next);
+	mutex_unlock(&dev_priv->pcu_lock);
+
+	valleyview_queue_rps_change(dev_priv);
+}
+
 /* vlv_set_rps_idle: Set the frequency to idle, if Gfx clocks are down
  *
  * * If Gfx is Idle, then
@@ -6237,9 +6300,11 @@ static void vlv_set_rps_idle(struct drm_i915_private *dev_priv)
 	u32 val = rps->idle_freq;
 	int err;
 
-	if (rps->cur_freq <= val)
+	if (IS_CHERRYVIEW(dev_priv) && rps->cur_freq <= val)
 		return;
 
+	rps->target_freq = val;
+
 	/* The punit delays the write of the frequency and voltage until it
 	 * determines the GPU is awake. During normal usage we don't want to
 	 * waste power changing the frequency if the GPU is sleeping (rc6).
@@ -9336,6 +9401,9 @@ int intel_freq_opcode(struct drm_i915_private *dev_priv, int val)
 
 void intel_pm_setup(struct drm_i915_private *dev_priv)
 {
+	INIT_DELAYED_WORK(&dev_priv->gt_pm.rps.vlv_rps_work,
+			  vlv_rps_write_work);
+
 	mutex_init(&dev_priv->pcu_lock);
 
 	atomic_set(&dev_priv->gt_pm.rps.num_waiters, 0);
```
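The C-state half of the patch rests on the kernel's PM QoS interface of that era: while a CPU-DMA-latency request of 0 usec is held, cpuidle keeps every CPU out of deep C-states; relaxing the request back to PM_QOS_DEFAULT_VALUE allows them again. A minimal sketch of that add/update/remove pattern, with a hypothetical example_qos object standing in for the patch's rps->qos field:

```c
#include <linux/pm_qos.h>

/* Hypothetical request object; the patch keeps its equivalent in
 * struct intel_rps as rps->qos. */
static struct pm_qos_request example_qos;

static void example_init(void)
{
	/* Register the request once, e.g. at driver init; the default
	 * value places no constraint on C-states. */
	pm_qos_add_request(&example_qos, PM_QOS_CPU_DMA_LATENCY,
			   PM_QOS_DEFAULT_VALUE);
}

static void example_block_deep_cstates(void)
{
	/* 0 usec of tolerated wakeup latency: CPUs stay in C0/C1. */
	pm_qos_update_request(&example_qos, 0);
}

static void example_allow_deep_cstates(void)
{
	/* Back to "don't care": deep C-states are allowed again. */
	pm_qos_update_request(&example_qos, PM_QOS_DEFAULT_VALUE);
}

static void example_fini(void)
{
	pm_qos_remove_request(&example_qos);
}
```

In the patch itself, the 0-usec request is raised in valleyview_set_rps() and relaxed again in vlv_rps_write_work() once the target frequency has been reached.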
There is a suspicion that with aggressive upclocking, power rail
voltage fluctuations can disrupt C-state transitions, leading
to system hangs.

When upclocking on 4-CPU Baytrails, bring the CPUs up to C1 and then
step through the bins gradually towards the target frequency to give
the hardware some leeway.

We move towards the requested frequency in 1 millisecond intervals.
Each millisecond, we increase the frequency by half of the bins
remaining between the current frequency and the target.

This will have an adverse effect on client boosting, delaying reaching
the full frequency by about 8 ms.

References: https://bugzilla.kernel.org/show_bug.cgi?id=109051
References: https://bugs.freedesktop.org/show_bug.cgi?id=102657
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c |  7 ++++
 drivers/gpu/drm/i915/i915_drv.c     |  5 +++
 drivers/gpu/drm/i915/i915_drv.h     |  4 +++
 drivers/gpu/drm/i915/intel_pm.c     | 72 +++++++++++++++++++++++++++++++++++--
 4 files changed, 86 insertions(+), 2 deletions(-)