Message ID | 1341494608-16591-2-git-send-email-daniel.lezcano@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thursday, July 05, 2012, Daniel Lezcano wrote: > We have the state index passed as parameter to the 'enter' function. > Most of the drivers assign their 'enter' functions several times in > the cpuidle_state structure, as we have the index, we can delegate > to the driver to handle their own callback array. > > That will have the benefit of removing multiple lines of code in the > different drivers. Hmm. I suppose the cpuidle subsystem was designed the way it was for a reason. Among other things, this was to avoid recurrence in callbacks - please see acpi_idle_enter_bm() for example. Now, if .enter() is moved to the driver structure, it will have to be an all-purpose complicated routine calling itself recursively at least in some cases. I'm not quite convinced that would be an improvement. On the other hand, I don't see anything wrong with setting several callback pointers to the same routine. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/05/2012 10:38 PM, Rafael J. Wysocki wrote: > On Thursday, July 05, 2012, Daniel Lezcano wrote: >> We have the state index passed as parameter to the 'enter' function. >> Most of the drivers assign their 'enter' functions several times in >> the cpuidle_state structure, as we have the index, we can delegate >> to the driver to handle their own callback array. >> >> That will have the benefit of removing multiple lines of code in the >> different drivers. > > Hmm. I suppose the cpuidle subsystem was designed the way it was for a reason. > Among other things, this was to avoid recurrence in callbacks - please see > acpi_idle_enter_bm() for example. > > Now, if .enter() is moved to the driver structure, it will have to be an > all-purpose complicated routine calling itself recursively at least in > some cases. I'm not quite convinced that would be an improvement. > > On the other hand, I don't see anything wrong with setting several callback > pointers to the same routine. Deepthi sent a few months ago a patch moving the per-cpu cpuidle_state to a single structure stored in the driver. The drivers were modified and cleanup to take into account this modification. We saw some regressions like for example the 'disable' which were not per cpu and has been moved to the cpuidle_state_usage (and IMHO it is a workaround more than a real fix). And now we have new architectures (tegra3, big.LITTLE) with different latencies per cpu. Logically we should revert Deepthi's patches but from my POV, going back and forth is not a good solution (also we have to undo all modifications done in the different drivers). The main purpose of all these cleanup patches are to move out all non-data information from the cpuidle_state structure in order to add a new api which could be 'cpuidle_register_cpu_latency(int cpu, struct cpuidle_latencies latencies)'. For this specific patch, the 'enter' function for all the drivers is not used [1] and one of the driver is about to use a single function [2]. So we have only one driver is a couple of functions for this which can be replaced by an array of callbacks in the driver itself as we have the index. [1] http://lists.linaro.org/pipermail/linaro-dev/2012-June/012355.html [2] http://lists.linaro.org/pipermail/linaro-dev/2012-June/012399.html
On Friday, July 06, 2012, Daniel Lezcano wrote: > On 07/05/2012 10:38 PM, Rafael J. Wysocki wrote: > > On Thursday, July 05, 2012, Daniel Lezcano wrote: > >> We have the state index passed as parameter to the 'enter' function. > >> Most of the drivers assign their 'enter' functions several times in > >> the cpuidle_state structure, as we have the index, we can delegate > >> to the driver to handle their own callback array. > >> > >> That will have the benefit of removing multiple lines of code in the > >> different drivers. > > > > Hmm. I suppose the cpuidle subsystem was designed the way it was for a reason. > > Among other things, this was to avoid recurrence in callbacks - please see > > acpi_idle_enter_bm() for example. > > > > Now, if .enter() is moved to the driver structure, it will have to be an > > all-purpose complicated routine calling itself recursively at least in > > some cases. I'm not quite convinced that would be an improvement. > > > > On the other hand, I don't see anything wrong with setting several callback > > pointers to the same routine. > > Deepthi sent a few months ago a patch moving the per-cpu cpuidle_state > to a single structure stored in the driver. The drivers were modified > and cleanup to take into account this modification. > > We saw some regressions like for example the 'disable' which were not > per cpu and has been moved to the cpuidle_state_usage (and IMHO it is a > workaround more than a real fix). And now we have new architectures > (tegra3, big.LITTLE) with different latencies per cpu. Logically we > should revert Deepthi's patches but from my POV, going back and forth is > not a good solution (also we have to undo all modifications done in the > different drivers). > > The main purpose of all these cleanup patches are to move out all > non-data information from the cpuidle_state structure in order to add a > new api which could be 'cpuidle_register_cpu_latency(int cpu, struct > cpuidle_latencies latencies)'. Well, my question is: what specifically is wrong with having callback pointers in struct cpuidle_state in case when that structure is known to be common for all the CPUs in the system? Sure, there are systems in which that is not the case, but then why should we complicate the simplest case in order to address them? > For this specific patch, the 'enter' function for all the drivers is not > used [1] and one of the driver is about to use a single function [2]. So > we have only one driver is a couple of functions for this which can be > replaced by an array of callbacks in the driver itself as we have the index. > > [1] http://lists.linaro.org/pipermail/linaro-dev/2012-June/012355.html I agree with Deepthi. > [2] http://lists.linaro.org/pipermail/linaro-dev/2012-June/012399.html And that is fine and dandy as long as the OMAP people want that. Evidently, that is not wanted by everybody and if it had been, the cpuidle subsystem would have been designed differently. I really would like that thing to be reconsidered before we proceed with these changes. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Daniel, On Friday 06 July 2012 04:28 PM, Daniel Lezcano wrote: > The main purpose of all these cleanup patches are to move out all > non-data information from the cpuidle_state structure in order to add a > new api which could be 'cpuidle_register_cpu_latency(int cpu, struct > cpuidle_latencies latencies)'. So are there any technical difficulties in adding such an api with the non-data information being part of cpuidle_state? Or its just that you think the already duplicated non-data information (per state) is going to be duplicated much more (per state per cpu) in most cases. regards, Rajendra -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/10/2012 10:29 AM, Rajendra Nayak wrote: > Hi Daniel, > > On Friday 06 July 2012 04:28 PM, Daniel Lezcano wrote: >> The main purpose of all these cleanup patches are to move out all >> non-data information from the cpuidle_state structure in order to add a >> new api which could be 'cpuidle_register_cpu_latency(int cpu, struct >> cpuidle_latencies latencies)'. > > So are there any technical difficulties in adding such an api with the > non-data information being part of cpuidle_state? Or its just that you > think the already duplicated non-data information (per state) is going > to be duplicated much more (per state per cpu) in most cases. Yes, it is possible. I am already working on an alternate patchset I will post soon, but I don't like to add more things in a subsystem before cleaning up the code when it makes sense. When I looked at the cpuidle code, I noticed this 'enter' field was used for the same function. With per cpu latency and using the cpuidle_state structure, that means duplication of this field. For an x86_64 architecture with 16 cores and 4 C-states, that is 512 bytes of memory instead of 8 bytes. That is the same for example the 'disable' field which could be stored in the 'flags' field as a bit instead of the moving it to the 'stats' structure with "unsigned long long". As the architectures tend to add more cores [1][2], that means more memory consumption. In a very near future, we will have 100 cores in a cpu. Assuming we have 4 C-states, that is 100 * 4 * 8 = 3200 bytes of memory used for a single function. IMHO, the kernel shouldn't assume the user must add more memory on the system. Anyway, I will post a patchset to use the cpuidle_state per cpu. Thanks -- Daniel [1] http://www.engadget.com/2007/02/11/intel-demonstrates-80-core-processor/ [2] http://www.tilera.com/products/processors/TILE-Gx_Family
> As the architectures tend to add more cores [1][2], that means more > memory consumption. In a very near future, we will have 100 cores in a > cpu. Assuming we have 4 C-states, that is 100 * 4 * 8 = 3200 bytes of > memory used for a single function. IMHO, the kernel shouldn't assume the > user must add more memory on the system. I completely agree. > > Anyway, I will post a patchset to use the cpuidle_state per cpu. Maybe what you were doing wasn't clearly evident with the existing cpuidle core, but once its scaled to add a per cpu state, it might look more meaningful. So if you post your patches to add cpuidle state per cpu, it should certainly give more insight into why this cleanup makes sense. > > Thanks > -- Daniel > > [1] http://www.engadget.com/2007/02/11/intel-demonstrates-80-core-processor/ > [2] http://www.tilera.com/products/processors/TILE-Gx_Family > -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index 0132706..c064144 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -46,7 +46,9 @@ static inline int cpuidle_enter(struct cpuidle_device *dev, struct cpuidle_driver *drv, int index) { struct cpuidle_state *target_state = &drv->states[index]; - return target_state->enter(dev, drv, index); + + return drv->enter ? drv->enter(dev, drv, index) : + target_state->enter(dev, drv, index); } static inline int cpuidle_enter_tk(struct cpuidle_device *dev, diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h index a6b3f2e..4d816c7 100644 --- a/include/linux/cpuidle.h +++ b/include/linux/cpuidle.h @@ -131,6 +131,7 @@ struct cpuidle_driver { struct cpuidle_state states[CPUIDLE_STATE_MAX]; int state_count; int safe_state_index; + int (*enter)(struct cpuidle_device *, struct cpuidle_driver *, int); }; #ifdef CONFIG_CPU_IDLE
We have the state index passed as parameter to the 'enter' function. Most of the drivers assign their 'enter' functions several times in the cpuidle_state structure, as we have the index, we can delegate to the driver to handle their own callback array. That will have the benefit of removing multiple lines of code in the different drivers. In order to smoothly modify the driver, the 'enter' function are in the driver structure and in the cpuidle state structure. That will let the time to modify the different drivers one by one. So the 'cpuidle_enter' function checks if the 'enter' callback is assigned in the driver structure and use it, otherwise it invokes the 'enter' assigned to the cpuidle_state. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> --- drivers/cpuidle/cpuidle.c | 4 +++- include/linux/cpuidle.h | 1 + 2 files changed, 4 insertions(+), 1 deletions(-)