| Message ID | 1448284006-13596-11-git-send-email-philippe.longepe@linux.intel.com (mailing list archive) |
|---|---|
| State | Superseded, archived |
On 2015.11.23 05:47 Philippe Longepe wrote:

>  static inline int32_t intel_pstate_calc_scaled_busy(struct cpudata *cpu)
>  {
> +	u64 cummulative_iowait, delta_iowait_us;
> +	u64 delta_iowait_mperf;
> +	u64 mperf, now;
>  	struct sample *sample = &cpu->sample;
>  	struct pstate_data *pstate = &cpu->pstate;

To prevent a build "unused variable 'pstate'" warning, I suggest deleting that last line; as of this patch 5 of 5 it is no longer used, i.e.:

```diff
-	struct pstate_data *pstate = &cpu->pstate;
```
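For reference, the diagnostic Doug is preempting would look roughly like this. This is illustrative gcc output only; the line and column numbers are made up and depend on the tree and compiler version:

```
drivers/cpufreq/intel_pstate.c: In function 'intel_pstate_calc_scaled_busy':
drivers/cpufreq/intel_pstate.c:938:22: warning: unused variable 'pstate' [-Wunused-variable]
```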
Thank you Smythies, I found another small error in patch 5/5. Let me do a few tests on a CherryTrail and I'll resubmit a version 2 of this patch.

Thx,
Philippe

On 23/11/2015 17:10, Doug Smythies wrote:

> On 2015.11.23 05:47 Philippe Longepe wrote:
>
>>  static inline int32_t intel_pstate_calc_scaled_busy(struct cpudata *cpu)
>>  {
>> +	u64 cummulative_iowait, delta_iowait_us;
>> +	u64 delta_iowait_mperf;
>> +	u64 mperf, now;
>>  	struct sample *sample = &cpu->sample;
>>  	struct pstate_data *pstate = &cpu->pstate;
>
> To prevent a build "unused variable 'pstate'" warning, I suggest deleting that last line;
> as of this patch 5 of 5 it is no longer used, i.e.:
>
> -	struct pstate_data *pstate = &cpu->pstate;
On 2015.11.23 05:07 Philippe Longepe wrote:

> The time spent in iowait is converted into cycles at max freq,
> so it increases the load during IOs.

...

> +	cummulative_iowait = get_cpu_iowait_time_us(cpu->cpu, &now);

...

Did you consider using get_cpu_idle_time_us(cpu_num, NULL) instead? Why? Because it doesn't include iowait time, so you can just subtract it from the elapsed time to get the load, with iowait counted as busy.

Crudely, below is what I have been trying for a month now (which also makes the "is mperf / tsc allowed or not" discussion moot, because neither is used). Note that it can include iowait or not, as you can see by the commented-out lines (I have been testing both methods):

```diff
@@ -726,21 +737,32 @@ static inline void intel_pstate_sample(struct cpudata *cpu)
 	tsc = rdtsc();
 	local_irq_restore(flags);
 
+	time = ktime_get();
+	idle_us = get_cpu_idle_time_us(cpu_num, NULL);
+//	io_wait_us = get_cpu_iowait_time_us(cpu_num, NULL);
+	if (idle_us == -1ULL) {
+		/* !NO_HZ so we can rely on cpustat.idle */
+		idle = kcpustat_cpu(cpu_num).cpustat[CPUTIME_IDLE];
+//		io_wait = kcpustat_cpu(cpu_num).cpustat[CPUTIME_IOWAIT];
+		idle_us = cputime_to_usecs(idle);
+//		io_wait_us = cputime_to_usecs(io_wait);
+	}
...
+	cpu->sample.duration_us = ktime_us_delta(time, cpu->prev_time);
+	cpu->sample.idle_us = idle_us - cpu->prev_idle_us;
+//	cpu->sample.io_wait_us = io_wait_us - cpu->prev_io_wait_us;
```

And then the load calculation becomes (I use units of tenths of a percent in my stuff):

```diff
-	core_pct = int_tofp(sample->mperf) * int_tofp(1000);
-	core_pct = div64_u64(core_pct, int_tofp(sample->tsc));
+//	core_pct = sample->duration_us - sample->idle_us - sample->io_wait_us;
+	core_pct = sample->duration_us - sample->idle_us;
+	if (core_pct < 0)
+		core_pct = 0;
+	core_pct = int_tofp(core_pct) * int_tofp(1000);
+	core_pct = div64_u64(core_pct, int_tofp(sample->duration_us));
```

In the above, notice the special NO_HZ case. I don't know this area well, but I was of the understanding that the NO_HZ case requires special code (see also fs/proc/stat.c, which is where I stole this stuff). Note that I haven't actually tested NO_HZ mode yet.

...

Doug
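Pulling Doug's fragments together, here is a minimal self-contained sketch of the idle-time approach. The `prev_time` and `prev_idle_us` fields on `struct cpudata`, and the function name, are assumptions added for illustration and are not in the posted patch; `get_cpu_idle_time_us()`, `kcpustat_cpu()`, `cputime_to_usecs()` and the ktime helpers are existing kernel APIs of this era:

```c
/*
 * Sketch only: estimate load as the non-idle fraction of wall time,
 * in tenths of a percent, without touching mperf or tsc.  Because
 * get_cpu_idle_time_us() does not include iowait, time spent waiting
 * on IO counts as busy -- the same goal as the posted patch.
 */
static inline int32_t intel_pstate_calc_idle_busy(struct cpudata *cpu)
{
	ktime_t now = ktime_get();
	u64 idle_us = get_cpu_idle_time_us(cpu->cpu, NULL);
	s64 duration_us, busy_us;

	if (idle_us == -1ULL) {
		/* NO_HZ inactive: fall back to tick-based idle accounting */
		idle_us = cputime_to_usecs(
				kcpustat_cpu(cpu->cpu).cpustat[CPUTIME_IDLE]);
	}

	duration_us = ktime_us_delta(now, cpu->prev_time);	/* assumed field */
	if (duration_us <= 0)
		return 0;

	busy_us = duration_us - (s64)(idle_us - cpu->prev_idle_us);
	if (busy_us < 0)
		busy_us = 0;

	cpu->prev_time = now;					/* assumed field */
	cpu->prev_idle_us = idle_us;				/* assumed field */

	/* fixed-point tenths of a percent, mirroring Doug's fragment */
	return div64_u64(int_tofp(busy_us) * int_tofp(1000),
			 int_tofp(duration_us));
}
```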
```diff
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 4f08fa7..61b31d4 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -114,6 +114,7 @@ struct cpudata {
 	u64	prev_mperf;
 	u64	prev_tsc;
 	struct sample sample;
+	u64	prev_cummulative_iowait;
 };
 
 static struct cpudata **all_cpu_data;
@@ -931,12 +932,22 @@ static inline void intel_pstate_set_sample_time(struct cpudata *cpu)
 	mod_timer_pinned(&cpu->timer, jiffies + delay);
 }
 
-
 static inline int32_t intel_pstate_calc_scaled_busy(struct cpudata *cpu)
 {
+	u64 cummulative_iowait, delta_iowait_us;
+	u64 delta_iowait_mperf;
+	u64 mperf, now;
 	struct sample *sample = &cpu->sample;
 	struct pstate_data *pstate = &cpu->pstate;
-	int64_t core_busy_ratio;
+
+	cummulative_iowait = get_cpu_iowait_time_us(cpu->cpu, &now);
+	/* Convert iowait time into number of IO cycles spent at max_freq */
+	delta_iowait_us = cummulative_iowait - cpu->prev_cummulative_iowait;
+	delta_iowait_mperf = div64_u64(delta_iowait_us * cpu->pstate.scaling *
+		cpu->pstate.max_pstate, 1000);
+
+	mperf = cpu->sample.mperf + delta_iowait_mperf;
+	cpu->prev_cummulative_iowait = cummulative_iowait;
 
 	/*
 	 * The load can be estimated as the ratio of the mperf counter
@@ -944,30 +955,11 @@ static inline int32_t intel_pstate_calc_scaled_busy(struct cpudata *cpu)
 	 * (C0) and the time stamp counter running at the same frequency
 	 * also during C-states.
 	 */
-	sample->cpu_load = div64_u64(100 * sample->mperf, sample->tsc);
-
-	/*
-	 * The target P-state can be estimated with the following formula:
-	 * PercentPerformance = PercentBusy * (delta_aperf/delta_mperf);
-	 * (see Section 14.2 from Intel Software Developer Manual)
-	 * with PercentBusy = 100 * (delta_mperf / delta_tsc) and
-	 * PercentPerformance can be simplified with:
-	 * (delta_mperf * delta_aperf) / (delta_tsc * delta_mperf) =
-	 * delta_aperf / delta_tsc. Finally, we normalize core_busy_ratio,
-	 * which was our actual percent performance to what we requested
-	 * during the last sample period. The result will be a percentage of
-	 * busy at a specified pstate.
-	 */
-	core_busy_ratio = div64_u64(int_tofp(100) * sample->aperf *
-		pstate->max_pstate, sample->tsc * pstate->current_pstate);
+	sample->cpu_load = div64_u64(100 * mperf, sample->tsc);
 
-	sample->freq = div64_u64(sample->aperf * pstate->max_pstate *
-		pstate->scaling, sample->mperf);
-
-	return core_busy_ratio;
+	return sample->cpu_load;
 }
 
-
 static inline int32_t intel_pstate_get_scaled_busy_estimate(struct cpudata *cpu)
 {
 	int32_t core_busy, max_pstate, current_pstate, sample_ratio;
```
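To make the unit conversion in the new code concrete, here is a worked example with made-up numbers: the max_pstate of 24 and the scaling of 100000 (kHz per P-state step, the value used for Core parts) are assumptions, not values from this thread:

```c
/*
 * delta_iowait_us = 500         (500 us of iowait in this sample)
 * scaling         = 100000      (kHz per P-state step, assumed)
 * max_pstate      = 24          (max freq = 24 * 100000 kHz = 2.4 GHz)
 *
 * delta_iowait_mperf = 500 * 100000 * 24 / 1000 = 1200000
 *
 * which is exactly 500 us at 2.4 GHz = 1.2 million cycles; the
 * division by 1000 converts the kHz * us product into cycles.
 * Adding this to sample.mperf before computing
 * cpu_load = 100 * mperf / tsc makes the load estimate treat iowait
 * as if the core had been running at max frequency, so IO-bound
 * periods raise the requested P-state.
 */
```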