Message ID | 1452647483-14244-3-git-send-email-jacob.jun.pan@linux.intel.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
On Tue, 12 Jan 2016, Jacob Pan wrote: > @@ -805,30 +809,18 @@ static int rapl_write_data_raw(struct rapl_domain *rd, > enum rapl_primitives prim, > unsigned long long value) > { > - u64 msr_val; > - u32 msr; > struct rapl_primitive_info *rp = &rpi[prim]; > int cpu; > + u64 bits; > > cpu = find_active_cpu_on_package(rd->package_id); > if (cpu < 0) > return cpu; > - msr = rd->msrs[rp->id]; > - if (rdmsrl_safe_on_cpu(cpu, msr, &msr_val)) { > - dev_dbg(&rd->power_zone.dev, > - "failed to read msr 0x%x on cpu %d\n", msr, cpu); > - return -EIO; > - } > - value = rapl_unit_xlate(rd, rd->package_id, rp->unit, value, 1); > - msr_val &= ~rp->mask; > - msr_val |= value << rp->shift; > - if (wrmsrl_safe_on_cpu(cpu, msr, msr_val)) { > - dev_dbg(&rd->power_zone.dev, > - "failed to write msr 0x%x on cpu %d\n", msr, cpu); > - return -EIO; > - } > > - return 0; > + bits = rapl_unit_xlate(rd, rd->package_id, rp->unit, value, 1); > + bits |= bits << rp->shift; > + > + return rmwmsrl_safe_on_cpu(cpu, rd->msrs[rp->id], rp->mask, bits); So here you actually use that new (misnomed) function, but for > +static void power_limit_irq_save_cpu(void *info) and > +static void power_limit_irq_restore_cpu(void *info) you use a bog standard smp function call. What's the benefit of adding that rmw function over a bog standard smp function call if you can only use it for one instance of the same pattern? Boris asked you the same question here https://lkml.kernel.org/r/20151220152749.GA29805@pd.tnic but you decided to ignore it. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 13 Jan 2016 10:47:26 +0100 (CET) Thomas Gleixner <tglx@linutronix.de> wrote: > So here you actually use that new (misnomed) function, but for > > > +static void power_limit_irq_save_cpu(void *info) > > and > > > +static void power_limit_irq_restore_cpu(void *info) > > you use a bog standard smp function call. What's the benefit of > adding that rmw function over a bog standard smp function call if you > can only use it for one instance of the same pattern? > > Boris asked you the same question here > > https://lkml.kernel.org/r/20151220152749.GA29805@pd.tnic > > but you decided to ignore it. +Borislav, Thanks for bring this out. I didn't mean to ignore. I thought my point was stated in the commit message there was no point of going back and forth. Read-Modify-Write is quite common, not just for RAPL could be used by future code. Sorry if I wasn't clear. Jacob -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jan 13, 2016 at 08:21:13AM -0800, Jacob Pan wrote: > Thanks for bring this out. I didn't mean to ignore. I thought my point > was stated in the commit message there was no point of going back and > forth. Read-Modify-Write is quite common, not just for RAPL could be > used by future code. Sorry if I wasn't clear. But it also shows that it doesn't suffice for all your needs. So why add it? You can much better define your own functions which do all the MSR handling you require and call them with smp_call_function_*.
On Wed, 13 Jan 2016 17:36:10 +0100 Borislav Petkov <bp@alien8.de> wrote: > On Wed, Jan 13, 2016 at 08:21:13AM -0800, Jacob Pan wrote: > > Thanks for bring this out. I didn't mean to ignore. I thought my > > point was stated in the commit message there was no point of going > > back and forth. Read-Modify-Write is quite common, not just for > > RAPL could be used by future code. Sorry if I wasn't clear. > > But it also shows that it doesn't suffice for all your needs. So why > add it? > > You can much better define your own functions which do all the MSR > handling you require and call them with smp_call_function_*. > yeah, that is what I did in the original patch. https://lkml.org/lkml/2015/12/7/1090 Then i was suggested to add a rmw msr api for the common good :), I think it is a good idea since such operation is not limited to RAPL driver. Other register access APIs such as regmap have more complete selections. Jacob -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jan 13, 2016 at 09:51:24AM -0800, Jacob Pan wrote: > yeah, that is what I did in the original patch. > https://lkml.org/lkml/2015/12/7/1090 > > Then i was suggested to add a rmw msr api for the common good :), I > think it is a good idea since such operation is not limited to RAPL > driver. Others are...?
On Wed, 13 Jan 2016 19:04:12 +0100 Borislav Petkov <bp@alien8.de> wrote: > > Then i was suggested to add a rmw msr api for the common good :), I > > think it is a good idea since such operation is not limited to RAPL > > driver. > > Others are...? nothing for the _safe version right now. Quick scan through the code I see the non safe version has a couple can be converted to rmw. e.g. static int cpufreq_p4_setdc(unsigned int cpu, unsigned int newstate) { ... rdmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, &l, &h); if (newstate == DC_DISABLE) { pr_debug("CPU#%d disabling modulation\n", cpu); wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, l & ~(1<<4), h); } else { pr_debug("CPU#%d setting duty cycle to %d%%\n", cpu, ((125 * newstate) / 10)); /* bits 63 - 5 : reserved * bit 4 : enable/disable * bits 3-1 : duty cycle * bit 0 : reserved */ l = (l & ~14); l = l | (1<<4) | ((newstate & 0x7)<<1); wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, l, h); } static int sfi_cpufreq_target(struct cpufreq_policy *policy, unsigned int index) { ... rdmsr_on_cpu(policy->cpu, MSR_IA32_PERF_CTL, &lo, &hi); lo = (lo & ~INTEL_PERF_CTL_MASK) | ((u32) sfi_cpufreq_array[next_perf_state].ctrl_val & INTEL_PERF_CTL_MASK); wrmsr_on_cpu(policy->cpu, MSR_IA32_PERF_CTL, lo, hi); -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jan 13, 2016 at 10:21:38AM -0800, Jacob Pan wrote: > static int cpufreq_p4_setdc(unsigned int cpu, unsigned int newstate) > { > ... > > rdmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, &l, &h); > if (newstate == DC_DISABLE) { > pr_debug("CPU#%d disabling modulation\n", cpu); > wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, l & ~(1<<4), > h); } else { > pr_debug("CPU#%d setting duty cycle to %d%%\n", > cpu, ((125 * newstate) / 10)); > /* bits 63 - 5 : reserved > * bit 4 : enable/disable > * bits 3-1 : duty cycle > * bit 0 : reserved > */ > l = (l & ~14); > l = l | (1<<4) | ((newstate & 0x7)<<1); > wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, l, h); > } This cannot be converted because you need to do the stuff between the rdmsr_on_cpu() and wrmsr_on_cpu() calls. > static int sfi_cpufreq_target(struct cpufreq_policy *policy, unsigned > int index) { > ... > > rdmsr_on_cpu(policy->cpu, MSR_IA32_PERF_CTL, &lo, &hi); > lo = (lo & ~INTEL_PERF_CTL_MASK) | > ((u32) sfi_cpufreq_array[next_perf_state].ctrl_val & > INTEL_PERF_CTL_MASK); > wrmsr_on_cpu(policy->cpu, MSR_IA32_PERF_CTL, lo, hi); Ditto. These two examples prove my point, actually.
On Wed, 13 Jan 2016 20:16:22 +0100 Borislav Petkov <bp@alien8.de> wrote: > On Wed, Jan 13, 2016 at 10:21:38AM -0800, Jacob Pan wrote: > > static int cpufreq_p4_setdc(unsigned int cpu, unsigned int newstate) > > { > > ... > > > > rdmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, &l, &h); > > if (newstate == DC_DISABLE) { > > pr_debug("CPU#%d disabling modulation\n", cpu); > > wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, l & > > ~(1<<4), h); } else { > > pr_debug("CPU#%d setting duty cycle to %d%%\n", > > cpu, ((125 * newstate) / 10)); > > /* bits 63 - 5 : reserved > > * bit 4 : enable/disable > > * bits 3-1 : duty cycle > > * bit 0 : reserved > > */ > > l = (l & ~14); > > l = l | (1<<4) | ((newstate & 0x7)<<1); > > wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, l, h); > > } > > This cannot be converted because you need to do the stuff between the > rdmsr_on_cpu() and wrmsr_on_cpu() calls. > it can be converted if move the below if statement outside read/write pair. if (newstate == DC_DISABLE) { > > static int sfi_cpufreq_target(struct cpufreq_policy *policy, > > unsigned int index) { > > ... > > > > rdmsr_on_cpu(policy->cpu, MSR_IA32_PERF_CTL, &lo, &hi); > > lo = (lo & ~INTEL_PERF_CTL_MASK) | > > ((u32) sfi_cpufreq_array[next_perf_state].ctrl_val & > > INTEL_PERF_CTL_MASK); > > wrmsr_on_cpu(policy->cpu, MSR_IA32_PERF_CTL, lo, hi); > > Ditto. > > These two examples prove my point, actually. same here, it is just clear mask and set mask, why not? -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jan 13, 2016 at 12:10:03PM -0800, Jacob Pan wrote: > > > rdmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, &l, &h); > > > if (newstate == DC_DISABLE) { > > > pr_debug("CPU#%d disabling modulation\n", cpu); > > > wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, l & > > > ~(1<<4), h); } else { > > > pr_debug("CPU#%d setting duty cycle to %d%%\n", > > > cpu, ((125 * newstate) / 10)); > > > /* bits 63 - 5 : reserved > > > * bit 4 : enable/disable > > > * bits 3-1 : duty cycle > > > * bit 0 : reserved > > > */ > > > l = (l & ~14); > > > l = l | (1<<4) | ((newstate & 0x7)<<1); > > > wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, l, h); > > > } > > > > This cannot be converted because you need to do the stuff between the > > rdmsr_on_cpu() and wrmsr_on_cpu() calls. > > > it can be converted if move the below if statement outside read/write > pair. > if (newstate == DC_DISABLE) { You mean something like this (I'm having hard time even figuring out what goes where): if (newstate == DC_DISABLE) { pr_debug("CPU#%d disabling modulation\n", cpu); rmwmsrl_safe_on_cpu(cpu, MSR_IA32_THERM_CONTROL, (1 << 4), 0); } else { pr_debug("CPU#%d setting duty cycle to %d%%\n", cpu, ((125 * newstate) / 10)); rmwmsrl_safe_on_cpu(cpu, MSR_IA32_THERM_CONTROL, 14, (1 << 4) | ((newstate & 0x7)<<1)); } Now this is *absolutely* unreadable and hard to use. The previous version at least showed what happens to which bits. This call site will make everyone go look at the definition of rmwmsrl_safe_on_cpu() and see what those last two arguments do actually. And, again, for the n-th time, this still doesn't work if you need to do other stuff between the rdmsr and wrmsr. So your interface will cover *some* cases but not all. So people should do rmwmsrl_safe_on_cpu() but not always - only if they don't need to do stuff between the reads and the writes. Hmm, no thanks. > > > static int sfi_cpufreq_target(struct cpufreq_policy *policy, > > > unsigned int index) { > > > ... > > > > > > rdmsr_on_cpu(policy->cpu, MSR_IA32_PERF_CTL, &lo, &hi); > > > lo = (lo & ~INTEL_PERF_CTL_MASK) | > > > ((u32) sfi_cpufreq_array[next_perf_state].ctrl_val & > > > INTEL_PERF_CTL_MASK); > > > wrmsr_on_cpu(policy->cpu, MSR_IA32_PERF_CTL, lo, hi); > > > > Ditto. > > > > These two examples prove my point, actually. > > same here, it is just clear mask and set mask, why not? Like this? rmwmsrl_safe_on_cpu(policy->cpu, MSR_IA32_PERF_CTL, INTEL_PERF_CTL_MASK, (u32)sfi_cpufreq_array[next_perf_state].ctrl_val & INTEL_PERF_CTL_MASK); Yikes! So yes, it can work but it is ugly, hard to parse and use, not generic enough, etc, etc. So thanks, but no thanks.
On Wed, 13 Jan 2016, Jacob Pan wrote: > On Wed, 13 Jan 2016 20:16:22 +0100 > Borislav Petkov <bp@alien8.de> wrote: > > > On Wed, Jan 13, 2016 at 10:21:38AM -0800, Jacob Pan wrote: > > > static int cpufreq_p4_setdc(unsigned int cpu, unsigned int newstate) > > > { > > > ... > > > > > > rdmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, &l, &h); > > > if (newstate == DC_DISABLE) { > > > pr_debug("CPU#%d disabling modulation\n", cpu); > > > wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, l & > > > ~(1<<4), h); } else { > > > pr_debug("CPU#%d setting duty cycle to %d%%\n", > > > cpu, ((125 * newstate) / 10)); > > > /* bits 63 - 5 : reserved > > > * bit 4 : enable/disable > > > * bits 3-1 : duty cycle > > > * bit 0 : reserved > > > */ > > > l = (l & ~14); > > > l = l | (1<<4) | ((newstate & 0x7)<<1); > > > wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, l, h); > > > } > > > > This cannot be converted because you need to do the stuff between the > > rdmsr_on_cpu() and wrmsr_on_cpu() calls. > > > it can be converted if move the below if statement outside read/write > pair. > if (newstate == DC_DISABLE) { > > > > static int sfi_cpufreq_target(struct cpufreq_policy *policy, > > > unsigned int index) { > > > ... > > > > > > rdmsr_on_cpu(policy->cpu, MSR_IA32_PERF_CTL, &lo, &hi); > > > lo = (lo & ~INTEL_PERF_CTL_MASK) | > > > ((u32) sfi_cpufreq_array[next_perf_state].ctrl_val & > > > INTEL_PERF_CTL_MASK); > > > wrmsr_on_cpu(policy->cpu, MSR_IA32_PERF_CTL, lo, hi); > > > > Ditto. > > > > These two examples prove my point, actually. > > same here, it is just clear mask and set mask, why not? And what's the actual saving over a simple function which does that rdmsr, modify, wrmsr thing and call it via smp_call_function like you did for 2 of 3 places in the rapl driver? The amount of IPIs is the same. The amount of saved code is questionable. Lets look at your usecase: @@ -805,30 +809,18 @@ static int rapl_write_data_raw(struct rapl_domain *rd, enum rapl_primitives prim, unsigned long long value) { - u64 msr_val; - u32 msr; struct rapl_primitive_info *rp = &rpi[prim]; int cpu; + u64 bits; cpu = find_active_cpu_on_package(rd->package_id); if (cpu < 0) return cpu; - msr = rd->msrs[rp->id]; - if (rdmsrl_safe_on_cpu(cpu, msr, &msr_val)) { - dev_dbg(&rd->power_zone.dev, - "failed to read msr 0x%x on cpu %d\n", msr, cpu); - return -EIO; - } - value = rapl_unit_xlate(rd, rd->package_id, rp->unit, value, 1); - msr_val &= ~rp->mask; - msr_val |= value << rp->shift; - if (wrmsrl_safe_on_cpu(cpu, msr, msr_val)) { - dev_dbg(&rd->power_zone.dev, - "failed to write msr 0x%x on cpu %d\n", msr, cpu); - return -EIO; - } - return 0; + bits = rapl_unit_xlate(rd, rd->package_id, rp->unit, value, 1); + bits |= bits << rp->shift; + + return rmwmsrl_safe_on_cpu(cpu, rd->msrs[rp->id], rp->mask, bits); } So that has: 5 insertions and 17 deletions And the library code adds 65 lines including an export. So the text size balance of this is: Mainline: text data bss dec hex filename 5021 1008 0 6029 178d ../build/arch/x86/lib/built-in.o 10870 1040 24 11934 2e9e ../build/drivers/powercap/intel_rapl.o Your patch (Just the above part which uses the lib stuff) 5385 1008 0 6393 18f9 ../build/arch/x86/lib/built-in.o 10838 1040 24 11902 2e7e ../build/drivers/powercap/intel_rapl.o ----- + 332 Now you have two more possible candidates, which require another 65 lines of different library code and lets assume another 364 bytes of library code. So lets further assume the above examples safe us like the rapl one 32 bytes each, then the net damage is: 632 byte extra text size. So what exactly is the point of this exercise? Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2016-01-13 at 22:26 +0100, Borislav Petkov wrote: [Cut] > > rmwmsrl_safe_on_cpu(policy->cpu, MSR_IA32_PERF_CTL, > INTEL_PERF_CTL_MASK, > (u32)sfi_cpufreq_array[next_perf_state].ctr > l_val & INTEL_PERF_CTL_MASK); > > Yikes! > > So yes, it can work but it is ugly, hard to parse and use, not > generic > enough, etc, etc. > > So thanks, but no thanks. > I agree, in some cases it will not make much sense to use read- modify_write calls, the user may decide whether it makes sense or not. But such interface is not new to Linux kernel: regmap_update_bits(), which is referenced for 346 times. Are you saying that any such calls are not useful? Thanks, Srinivas -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 13 Jan 2016, Srinivas Pandruvada wrote: > On Wed, 2016-01-13 at 22:26 +0100, Borislav Petkov wrote: > > rmwmsrl_safe_on_cpu(policy->cpu, MSR_IA32_PERF_CTL, > > INTEL_PERF_CTL_MASK, > > (u32)sfi_cpufreq_array[next_perf_state].ctr > > l_val & INTEL_PERF_CTL_MASK); > > > > Yikes! > > > > So yes, it can work but it is ugly, hard to parse and use, not > > generic > > enough, etc, etc. > > > > So thanks, but no thanks. > > > I agree, in some cases it will not make much sense to use read- > modify_write calls, the user may decide whether it makes sense or not. > But such interface is not new to Linux kernel: > > regmap_update_bits(), which is referenced for 346 times. > > Are you saying that any such calls are not useful? There are certainly cases when such calls are useful. And those cases are when we have a sufficiently big occurence of similar code which is sufficiently complex to justify the library code and the export. Thanks, tglx
On Wed, 13 Jan 2016 23:02:33 +0100 (CET) Thomas Gleixner <tglx@linutronix.de> wrote: > On Wed, 13 Jan 2016, Srinivas Pandruvada wrote: > > On Wed, 2016-01-13 at 22:26 +0100, Borislav Petkov wrote: > > > rmwmsrl_safe_on_cpu(policy->cpu, MSR_IA32_PERF_CTL, > > > INTEL_PERF_CTL_MASK, > > > (u32)sfi_cpufreq_array[next_perf_state].ctr > > > l_val & INTEL_PERF_CTL_MASK); > > > > > > Yikes! > > > > > > So yes, it can work but it is ugly, hard to parse and use, not > > > generic > > > enough, etc, etc. > > > > > > So thanks, but no thanks. > > > > > I agree, in some cases it will not make much sense to use read- > > modify_write calls, the user may decide whether it makes sense or > > not. But such interface is not new to Linux kernel: > > > > regmap_update_bits(), which is referenced for 346 times. > > > > Are you saying that any such calls are not useful? > > There are certainly cases when such calls are useful. And those cases > are when we have a sufficiently big occurence of similar code which > is sufficiently complex to justify the library code and the export. > The balance of pros and cons depends on the number of occurrence. The lib call overhead is constant where saving from the callers are multiplied. Anyway, I will go back to my original code until we have enough callers to tip the balance. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jan 13, 2016 at 01:54:43PM -0800, Srinivas Pandruvada wrote: > regmap_update_bits(), which is referenced for 346 times. I'm at a loss as to why people keep talking about regmap. $ git grep regmap_update_bits arch/x86/ $ > Are you saying that any such calls are not useful? It seems I have to repeat myself ad absurdum so let me summarize: I don't have anything against useful APIs. Those are not.
On Wed, 13 Jan 2016 22:26:02 +0100 Borislav Petkov <bp@alien8.de> wrote: > You mean something like this (I'm having hard time even figuring out > what goes where): > > if (newstate == DC_DISABLE) { > pr_debug("CPU#%d disabling modulation\n", cpu); > rmwmsrl_safe_on_cpu(cpu, MSR_IA32_THERM_CONTROL, (1 > << 4), 0); } else { > pr_debug("CPU#%d setting duty cycle to %d%%\n", cpu, > ((125 * newstate) / 10)); rmwmsrl_safe_on_cpu(cpu, > MSR_IA32_THERM_CONTROL, 14, (1 << 4) | ((newstate & > 0x7)<<1)); } > > Now this is *absolutely* unreadable and hard to use. The previous > version at least showed what happens to which bits. This call site > will make everyone go look at the definition of rmwmsrl_safe_on_cpu() > and see what those last two arguments do actually. > To me the caller code became more readable. I think you are referring the function name being not readable, which is separate of this conversion. > And, again, for the n-th time, this still doesn't work if you need to > do other stuff between the rdmsr and wrmsr. So your interface will > cover *some* cases but not all. So people should do > rmwmsrl_safe_on_cpu() but not always - only if they don't need to do > stuff between the reads and the writes. > I know, I never disagreed with that :) which is why I am not using it in the other cases in RAPL driver. > Hmm, no thanks. > > > > > static int sfi_cpufreq_target(struct cpufreq_policy *policy, > > > > unsigned int index) { > > > > ... > > > > > > > > rdmsr_on_cpu(policy->cpu, MSR_IA32_PERF_CTL, &lo, &hi); > > > > lo = (lo & ~INTEL_PERF_CTL_MASK) | > > > > ((u32) > > > > sfi_cpufreq_array[next_perf_state].ctrl_val & > > > > INTEL_PERF_CTL_MASK); wrmsr_on_cpu(policy->cpu, > > > > MSR_IA32_PERF_CTL, lo, hi); > > > > > > Ditto. > > > > > > These two examples prove my point, actually. > > > > same here, it is just clear mask and set mask, why not? > > Like this? > > rmwmsrl_safe_on_cpu(policy->cpu, MSR_IA32_PERF_CTL, > INTEL_PERF_CTL_MASK, > (u32)sfi_cpufreq_array[next_perf_state].ctrl_val > & INTEL_PERF_CTL_MASK); > > Yikes! > > So yes, it can work but it is ugly, hard to parse and use, not generic > enough, etc, etc. I don't think the conversion adds extra ugliness. It actually makes it more clear what is the mask to clear and what bits to set. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 01/13/16 14:11, Jacob Pan wrote: > > The balance of pros and cons depends on the number of occurrence. The > lib call overhead is constant where saving from the callers are > multiplied. Anyway, I will go back to my original code until we have > enough callers to tip the balance. > The thing about premature librarization is that it can cause very unnatural code to end up being written. Until it is clear what the APIs we actually need are, we shouldn't force them into a mold. Of course, it often makes sense to make these *local* APIs that, if useful, can get globalized. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jan 13, 2016 at 02:20:07PM -0800, Jacob Pan wrote: > To me the caller code became more readable. Not to me. > I think you are referring the function name being not readable, which > is separate of this conversion. You think wrong. I typed in the example and commented right under it. The old code is much more understandable than the new code. The old code shows what bits get set and cleared, the new code uses masks, sometimes hidden behind defines, which people have to look up to understand what's going on. And then look at the function definition to know which arg which is. And so on and so on... But I'm going to stop wasting my time with this now. I gave you enough arguments against.
On Wed, 2016-01-13 at 23:16 +0100, Borislav Petkov wrote: > On Wed, Jan 13, 2016 at 01:54:43PM -0800, Srinivas Pandruvada wrote: > > regmap_update_bits(), which is referenced for 346 times. > > I'm at a loss as to why people keep talking about regmap. Thomas Gleixner, understood the context and he gave a satisfactory reply. Thanks, Srinivas -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/powercap/intel_rapl.c b/drivers/powercap/intel_rapl.c index 48747c2..b34d6a6 100644 --- a/drivers/powercap/intel_rapl.c +++ b/drivers/powercap/intel_rapl.c @@ -263,7 +263,11 @@ static struct rapl_package *find_package_by_id(int id) /* caller to ensure CPU hotplug lock is held */ static int find_active_cpu_on_package(int package_id) { - int i; + /* try to avoid remote cpu call, use raw since we are preemptible */ + int i = raw_smp_processor_id(); + + if (topology_physical_package_id(i) == package_id) + return i; for_each_online_cpu(i) { if (topology_physical_package_id(i) == package_id) @@ -805,30 +809,18 @@ static int rapl_write_data_raw(struct rapl_domain *rd, enum rapl_primitives prim, unsigned long long value) { - u64 msr_val; - u32 msr; struct rapl_primitive_info *rp = &rpi[prim]; int cpu; + u64 bits; cpu = find_active_cpu_on_package(rd->package_id); if (cpu < 0) return cpu; - msr = rd->msrs[rp->id]; - if (rdmsrl_safe_on_cpu(cpu, msr, &msr_val)) { - dev_dbg(&rd->power_zone.dev, - "failed to read msr 0x%x on cpu %d\n", msr, cpu); - return -EIO; - } - value = rapl_unit_xlate(rd, rd->package_id, rp->unit, value, 1); - msr_val &= ~rp->mask; - msr_val |= value << rp->shift; - if (wrmsrl_safe_on_cpu(cpu, msr, msr_val)) { - dev_dbg(&rd->power_zone.dev, - "failed to write msr 0x%x on cpu %d\n", msr, cpu); - return -EIO; - } - return 0; + bits = rapl_unit_xlate(rd, rd->package_id, rp->unit, value, 1); + bits |= bits << rp->shift; + + return rmwmsrl_safe_on_cpu(cpu, rd->msrs[rp->id], rp->mask, bits); } /* @@ -893,6 +885,21 @@ static int rapl_check_unit_atom(struct rapl_package *rp, int cpu) return 0; } +static void power_limit_irq_save_cpu(void *info) +{ + u32 l, h = 0; + struct rapl_package *rp = (struct rapl_package *)info; + + /* save the state of PLN irq mask bit before disabling it */ + rdmsr_safe(MSR_IA32_PACKAGE_THERM_INTERRUPT, &l, &h); + if (!(rp->power_limit_irq & PACKAGE_PLN_INT_SAVED)) { + rp->power_limit_irq = l & PACKAGE_THERM_INT_PLN_ENABLE; + rp->power_limit_irq |= PACKAGE_PLN_INT_SAVED; + } + l &= ~PACKAGE_THERM_INT_PLN_ENABLE; + wrmsr_safe(MSR_IA32_PACKAGE_THERM_INTERRUPT, l, h); +} + /* REVISIT: * When package power limit is set artificially low by RAPL, LVT @@ -906,7 +913,6 @@ static int rapl_check_unit_atom(struct rapl_package *rp, int cpu) static void package_power_limit_irq_save(int package_id) { - u32 l, h = 0; int cpu; struct rapl_package *rp; @@ -920,20 +926,27 @@ static void package_power_limit_irq_save(int package_id) cpu = find_active_cpu_on_package(package_id); if (cpu < 0) return; - /* save the state of PLN irq mask bit before disabling it */ - rdmsr_safe_on_cpu(cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT, &l, &h); - if (!(rp->power_limit_irq & PACKAGE_PLN_INT_SAVED)) { - rp->power_limit_irq = l & PACKAGE_THERM_INT_PLN_ENABLE; - rp->power_limit_irq |= PACKAGE_PLN_INT_SAVED; - } - l &= ~PACKAGE_THERM_INT_PLN_ENABLE; - wrmsr_on_cpu(cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT, l, h); + smp_call_function_single(cpu, power_limit_irq_save_cpu, rp, 1); +} + +static void power_limit_irq_restore_cpu(void *info) +{ + u32 l, h = 0; + struct rapl_package *rp = (struct rapl_package *)info; + + rdmsr_safe(MSR_IA32_PACKAGE_THERM_INTERRUPT, &l, &h); + + if (rp->power_limit_irq & PACKAGE_THERM_INT_PLN_ENABLE) + l |= PACKAGE_THERM_INT_PLN_ENABLE; + else + l &= ~PACKAGE_THERM_INT_PLN_ENABLE; + + wrmsr_safe(MSR_IA32_PACKAGE_THERM_INTERRUPT, l, h); } /* restore per package power limit interrupt enable state */ static void package_power_limit_irq_restore(int package_id) { - u32 l, h; int cpu; struct rapl_package *rp; @@ -951,14 +964,8 @@ static void package_power_limit_irq_restore(int package_id) /* irq enable state not saved, nothing to restore */ if (!(rp->power_limit_irq & PACKAGE_PLN_INT_SAVED)) return; - rdmsr_safe_on_cpu(cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT, &l, &h); - - if (rp->power_limit_irq & PACKAGE_THERM_INT_PLN_ENABLE) - l |= PACKAGE_THERM_INT_PLN_ENABLE; - else - l &= ~PACKAGE_THERM_INT_PLN_ENABLE; - wrmsr_on_cpu(cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT, l, h); + smp_call_function_single(cpu, power_limit_irq_restore_cpu, rp, 1); } static void set_floor_freq_default(struct rapl_domain *rd, bool mode)
Reduce remote CPU calls for MSR access by combining read modify write into one function. Also optimize search for active CPU on package. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> --- drivers/powercap/intel_rapl.c | 77 +++++++++++++++++++++++-------------------- 1 file changed, 42 insertions(+), 35 deletions(-)