diff mbox

[2/4] cpuidle: define the enter function in the driver structure

Message ID 1341494608-16591-2-git-send-email-daniel.lezcano@linaro.org (mailing list archive)
State New, archived
Headers show

Commit Message

Daniel Lezcano July 5, 2012, 1:23 p.m. UTC
We have the state index passed as parameter to the 'enter' function.
Most of the drivers assign their 'enter' functions several times in
the cpuidle_state structure, as we have the index, we can delegate
to the driver to handle their own callback array.

That will have the benefit of removing multiple lines of code in the
different drivers.

In order to smoothly modify the driver, the 'enter' function are in
the driver structure and in the cpuidle state structure. That will
let the time to modify the different drivers one by one.
So the 'cpuidle_enter' function checks if the 'enter' callback is
assigned in the driver structure and use it, otherwise it invokes
the 'enter' assigned to the cpuidle_state.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 drivers/cpuidle/cpuidle.c |    4 +++-
 include/linux/cpuidle.h   |    1 +
 2 files changed, 4 insertions(+), 1 deletions(-)

Comments

Rafael Wysocki July 5, 2012, 8:38 p.m. UTC | #1
On Thursday, July 05, 2012, Daniel Lezcano wrote:
> We have the state index passed as parameter to the 'enter' function.
> Most of the drivers assign their 'enter' functions several times in
> the cpuidle_state structure, as we have the index, we can delegate
> to the driver to handle their own callback array.
> 
> That will have the benefit of removing multiple lines of code in the
> different drivers.

Hmm. I suppose the cpuidle subsystem was designed the way it was for a reason.
Among other things, this was to avoid recurrence in callbacks - please see
acpi_idle_enter_bm() for example.

Now, if .enter() is moved to the driver structure, it will have to be an
all-purpose complicated routine calling itself recursively at least in
some cases.  I'm not quite convinced that would be an improvement.

On the other hand, I don't see anything wrong with setting several callback
pointers to the same routine.

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Lezcano July 6, 2012, 10:58 a.m. UTC | #2
On 07/05/2012 10:38 PM, Rafael J. Wysocki wrote:
> On Thursday, July 05, 2012, Daniel Lezcano wrote:
>> We have the state index passed as parameter to the 'enter' function.
>> Most of the drivers assign their 'enter' functions several times in
>> the cpuidle_state structure, as we have the index, we can delegate
>> to the driver to handle their own callback array.
>>
>> That will have the benefit of removing multiple lines of code in the
>> different drivers.
> 
> Hmm. I suppose the cpuidle subsystem was designed the way it was for a reason.
> Among other things, this was to avoid recurrence in callbacks - please see
> acpi_idle_enter_bm() for example.
> 
> Now, if .enter() is moved to the driver structure, it will have to be an
> all-purpose complicated routine calling itself recursively at least in
> some cases.  I'm not quite convinced that would be an improvement.
> 
> On the other hand, I don't see anything wrong with setting several callback
> pointers to the same routine.

Deepthi sent a few months ago a patch moving the per-cpu cpuidle_state
to a single structure stored in the driver. The drivers were modified
and cleanup to take into account this modification.

We saw some regressions like for example the 'disable' which were not
per cpu and has been moved to the cpuidle_state_usage (and IMHO it is a
workaround more than a real fix). And now we have new architectures
(tegra3, big.LITTLE) with different latencies per cpu. Logically we
should revert Deepthi's patches but from my POV, going back and forth is
not a good solution (also we have to undo all modifications done in the
different drivers).

The main purpose of all these cleanup patches are to move out all
non-data information from the cpuidle_state structure in order to add a
new api which could be 'cpuidle_register_cpu_latency(int cpu, struct
cpuidle_latencies latencies)'.

For this specific patch, the 'enter' function for all the drivers is not
used [1] and one of the driver is about to use a single function [2]. So
we have only one driver is a couple of functions for this which can be
replaced by an array of callbacks in the driver itself as we have the index.

[1] http://lists.linaro.org/pipermail/linaro-dev/2012-June/012355.html
[2] http://lists.linaro.org/pipermail/linaro-dev/2012-June/012399.html
Rafael Wysocki July 6, 2012, 9:23 p.m. UTC | #3
On Friday, July 06, 2012, Daniel Lezcano wrote:
> On 07/05/2012 10:38 PM, Rafael J. Wysocki wrote:
> > On Thursday, July 05, 2012, Daniel Lezcano wrote:
> >> We have the state index passed as parameter to the 'enter' function.
> >> Most of the drivers assign their 'enter' functions several times in
> >> the cpuidle_state structure, as we have the index, we can delegate
> >> to the driver to handle their own callback array.
> >>
> >> That will have the benefit of removing multiple lines of code in the
> >> different drivers.
> > 
> > Hmm. I suppose the cpuidle subsystem was designed the way it was for a reason.
> > Among other things, this was to avoid recurrence in callbacks - please see
> > acpi_idle_enter_bm() for example.
> > 
> > Now, if .enter() is moved to the driver structure, it will have to be an
> > all-purpose complicated routine calling itself recursively at least in
> > some cases.  I'm not quite convinced that would be an improvement.
> > 
> > On the other hand, I don't see anything wrong with setting several callback
> > pointers to the same routine.
> 
> Deepthi sent a few months ago a patch moving the per-cpu cpuidle_state
> to a single structure stored in the driver. The drivers were modified
> and cleanup to take into account this modification.
> 
> We saw some regressions like for example the 'disable' which were not
> per cpu and has been moved to the cpuidle_state_usage (and IMHO it is a
> workaround more than a real fix). And now we have new architectures
> (tegra3, big.LITTLE) with different latencies per cpu. Logically we
> should revert Deepthi's patches but from my POV, going back and forth is
> not a good solution (also we have to undo all modifications done in the
> different drivers).
> 
> The main purpose of all these cleanup patches are to move out all
> non-data information from the cpuidle_state structure in order to add a
> new api which could be 'cpuidle_register_cpu_latency(int cpu, struct
> cpuidle_latencies latencies)'.

Well, my question is: what specifically is wrong with having callback
pointers in struct cpuidle_state in case when that structure is known
to be common for all the CPUs in the system?

Sure, there are systems in which that is not the case, but then why
should we complicate the simplest case in order to address them?

> For this specific patch, the 'enter' function for all the drivers is not
> used [1] and one of the driver is about to use a single function [2]. So
> we have only one driver is a couple of functions for this which can be
> replaced by an array of callbacks in the driver itself as we have the index.
> 
> [1] http://lists.linaro.org/pipermail/linaro-dev/2012-June/012355.html

I agree with Deepthi.

> [2] http://lists.linaro.org/pipermail/linaro-dev/2012-June/012399.html

And that is fine and dandy as long as the OMAP people want that.  Evidently,
that is not wanted by everybody and if it had been, the cpuidle subsystem
would have been designed differently.

I really would like that thing to be reconsidered before we proceed with
these changes.

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rajendra Nayak July 10, 2012, 8:29 a.m. UTC | #4
Hi Daniel,

On Friday 06 July 2012 04:28 PM, Daniel Lezcano wrote:
> The main purpose of all these cleanup patches are to move out all
> non-data information from the cpuidle_state structure in order to add a
> new api which could be 'cpuidle_register_cpu_latency(int cpu, struct
> cpuidle_latencies latencies)'.

So are there any technical difficulties in adding such an api with the
non-data information being part of cpuidle_state? Or its just that you
think the already duplicated non-data information (per state) is going
to be duplicated much more (per state per cpu) in most cases.

regards,
Rajendra

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Lezcano July 10, 2012, 11:39 a.m. UTC | #5
On 07/10/2012 10:29 AM, Rajendra Nayak wrote:
> Hi Daniel,
> 
> On Friday 06 July 2012 04:28 PM, Daniel Lezcano wrote:
>> The main purpose of all these cleanup patches are to move out all
>> non-data information from the cpuidle_state structure in order to add a
>> new api which could be 'cpuidle_register_cpu_latency(int cpu, struct
>> cpuidle_latencies latencies)'.
> 
> So are there any technical difficulties in adding such an api with the
> non-data information being part of cpuidle_state? Or its just that you
> think the already duplicated non-data information (per state) is going
> to be duplicated much more (per state per cpu) in most cases.

Yes, it is possible. I am already working on an alternate patchset I
will post soon, but I don't like to add more things in a subsystem
before cleaning up the code when it makes sense.

When I looked at the cpuidle code, I noticed this 'enter' field was used
for the same function. With per cpu latency and using the cpuidle_state
structure, that means duplication of this field. For an x86_64
architecture with 16 cores and 4 C-states, that is 512 bytes of memory
instead of 8 bytes.

That is the same for example the 'disable' field which could be stored
in the 'flags' field as a bit instead of the moving it to the 'stats'
structure with "unsigned long long".

As the architectures tend to add more cores [1][2], that means more
memory consumption. In a very near future, we will have 100 cores in a
cpu. Assuming we have 4 C-states, that is 100 * 4 * 8 = 3200 bytes of
memory used for a single function. IMHO, the kernel shouldn't assume the
user must add more memory on the system.

Anyway, I will post a patchset to use the cpuidle_state per cpu.

Thanks
  -- Daniel

[1] http://www.engadget.com/2007/02/11/intel-demonstrates-80-core-processor/
[2] http://www.tilera.com/products/processors/TILE-Gx_Family
Rajendra Nayak July 10, 2012, 12:38 p.m. UTC | #6
> As the architectures tend to add more cores [1][2], that means more
> memory consumption. In a very near future, we will have 100 cores in a
> cpu. Assuming we have 4 C-states, that is 100 * 4 * 8 = 3200 bytes of
> memory used for a single function. IMHO, the kernel shouldn't assume the
> user must add more memory on the system.

I completely agree.
>
> Anyway, I will post a patchset to use the cpuidle_state per cpu.

Maybe what you were doing wasn't clearly evident with the existing
cpuidle core, but once its scaled to add a per cpu state, it might
look more meaningful. So if you post your patches to add cpuidle
state per cpu, it should certainly give more insight into why this
cleanup makes sense.

>
> Thanks
>    -- Daniel
>
> [1] http://www.engadget.com/2007/02/11/intel-demonstrates-80-core-processor/
> [2] http://www.tilera.com/products/processors/TILE-Gx_Family
>

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 0132706..c064144 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -46,7 +46,9 @@  static inline int cpuidle_enter(struct cpuidle_device *dev,
 				struct cpuidle_driver *drv, int index)
 {
 	struct cpuidle_state *target_state = &drv->states[index];
-	return target_state->enter(dev, drv, index);
+
+	return drv->enter ? drv->enter(dev, drv, index) :
+		target_state->enter(dev, drv, index);
 }
 
 static inline int cpuidle_enter_tk(struct cpuidle_device *dev,
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index a6b3f2e..4d816c7 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -131,6 +131,7 @@  struct cpuidle_driver {
 	struct cpuidle_state	states[CPUIDLE_STATE_MAX];
 	int			state_count;
 	int			safe_state_index;
+	int (*enter)(struct cpuidle_device *, struct cpuidle_driver *, int);
 };
 
 #ifdef CONFIG_CPU_IDLE