Message ID | 1399653631-4938-2-git-send-email-broonie@kernel.org (mailing list archive) |
---|---|
State | Not Applicable, archived |
Hi Mark,

On 09/05/14 17:40, Mark Brown wrote:
> From: Mark Brown <broonie@linaro.org>
>
> The big.LITTLE cpufreq driver is useful on arm64 big.LITTLE systems even
> without IKS support since it implements support for clusters with shared
> clocks (a common big.LITTLE configuration). In order to allow it to be
> built provide the non-IKS stubs for arm64, enabling cpufreq with all the
> cores available.
>

I am in the process of using this driver for ARM64 and hit the same issue. I don't like this approach at all. I too did similar changes/hacks which are good for quick testing but not for upstream.

I would like to move all the switcher code out of the driver as an extension. Also the core driver should be made to work with any multi-cluster platform, not just big-little (bL). bL is one of them and bL switcher support should be an extension of it.

The main reason for this is that I see some non-bL multi-cluster platform support getting added, and this driver should ideally support that.

> It may make sense to make an asm-generic version of these stubs instead but
> given that there's only likely to be these two architectures using the code
> and asm-generic stubs also need per architecture updates it's probably more
> trouble than it's worth.
>

I would not take this approach either. As mentioned above, if we can resolve it in that way we may not require this.

Regards,
Sudeep

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, May 09, 2014 at 05:40:30PM +0100, Mark Brown wrote:
> From: Mark Brown <broonie@linaro.org>
>
> The big.LITTLE cpufreq driver is useful on arm64 big.LITTLE systems even
> without IKS support since it implements support for clusters with shared
> clocks (a common big.LITTLE configuration). In order to allow it to be
> built provide the non-IKS stubs for arm64, enabling cpufreq with all the
> cores available.

Have you thought of patching the actual cpufreq driver? Are you adding this code just to avoid compiler errors on arm64 with this driver?

> It may make sense to make an asm-generic version of these stubs instead

asm-generic/bL_switcher.h? I take it as a good joke ;)

> but
> given that there's only likely to be these two architectures using the code
> and asm-generic stubs also need per architecture updates it's probably more
> trouble than it's worth.

Exactly.
On Fri, May 09, 2014 at 06:05:56PM +0100, Sudeep Holla wrote:
> On 09/05/14 17:40, Mark Brown wrote:
> >From: Mark Brown <broonie@linaro.org>

> >The big.LITTLE cpufreq driver is useful on arm64 big.LITTLE systems even
> >without IKS support since it implements support for clusters with shared
> >clocks (a common big.LITTLE configuration). In order to allow it to be
> >built provide the non-IKS stubs for arm64, enabling cpufreq with all the
> >cores available.

> I am in process of using this driver for ARM64 and hit the same issue.
> I don't like this approach at all. I too did similar changes/hacks which are
> good for quick testing but not for upstream.

I'm not a big fan of this either, but then as I indicated on the cpufreq bit of the series I'm not a massive fan of the way this is handled in the first place on either ARM or ARMv8. This at least gives us parity between the two architectures (modulo IKS implementation) which is progress especially given the fact that much of the work done on this stuff is being done on 32 bit due to hardware availability.

Given that the code isn't invasive I think the expediency tradeoff is OK for mainline, it's easy enough to get rid of when we come up with something better but in the meantime it helps actual systems work better in mainline - if we didn't have the ARM implementation already I think it'd be different but we do.

Perfect can be the enemy of good (or at least adequate), one of the problems I'm seeing right now with convincing people to work with mainline is that people are missing lots of important functionality when they look at mainline.

> I would like to move all the switcher code out of the driver as extension.
> Also the core driver should be made to work with any multi-cluster platform not
> just big-little(bL). bL is one of them and bL switcher support should be an
> extension of it.

> The main reason for this is I see some non-bL multi-cluster platform support
> getting added, this driver should ideally support that.

It is not entirely clear to me what you mean by "this driver" or "the core driver" in all the above, sorry.

Personally the solution I'd rather see is cpufreq-cpu0 extended to handle shared clocks which would remove the need to use the big.LITTLE driver on systems not doing IKS. It isn't at all clear to me that cpufreq should understand clusters (or much of anything other than clocks) in the non-IKS case, the sharing of clocks between cores is not directly connected to their clustering. Clustering is important to the scheduler and an understanding of the power and clock sharing constraints that come along with clusters is going to be required there as part of energy aware scheduling but I'm not seeing any obvious reason for the frequency scaling driver to know about this, even with cpufreq governors it's mostly the clock sharing.

For IKS where we're pairing the CPUs up and telling the kernel that switching between the physical clusters is part of scaling the "frequency" of the virtual cluster presented to the rest of the system things are of course different and cpufreq does need to understand the physical clusters.

Having multiple generic frequency scaling drivers feels wrong; there's going to be code duplication between them and it doesn't seem like there should be anything going on that we can't automatically figure out at runtime.
On Fri, May 09, 2014 at 06:47:38PM +0100, Catalin Marinas wrote:
> On Fri, May 09, 2014 at 05:40:30PM +0100, Mark Brown wrote:

> > The big.LITTLE cpufreq driver is useful on arm64 big.LITTLE systems even
> > without IKS support since it implements support for clusters with shared
> > clocks (a common big.LITTLE configuration). In order to allow it to be
> > built provide the non-IKS stubs for arm64, enabling cpufreq with all the
> > cores available.

> Have you thought of patching the actual cpufreq driver? Are you adding
> this code just to avoid compiler errors on arm64 with this driver?

Yes, that was actually my first thought but I wasn't loving the ifdeferry - it's fairly easy to do IIRC but the general taste seems to be towards having stubs rather than ifdefs and the code seemed to be lending itself to that. There was also the fact that ifdefs could have been done for the non-IKS case on 32 bit but instead stubs were provided. If people prefer I can do an ifdeffed version though, it doesn't make much odds.

This was purely to get the driver compiling and hopefully running on ARMv8, there is some user demand for deploying the driver on big.LITTLE systems.

> > It may make sense to make an asm-generic version of these stubs instead

> asm-generic/bL_switcher.h? I take it as a good joke ;)

Hey, perhaps other hardware architectures are implementing similar concepts even now in order to keep up! :P
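To make the stubs-versus-ifdefs trade-off discussed above concrete, here is a minimal userspace sketch, not the actual driver code. HAVE_BL_SWITCHER is a hypothetical macro standing in for the kernel config option, set_cluster() is a hypothetical caller, and ENOTSUPP is defined locally because it is a kernel-internal errno that userspace <errno.h> does not provide. The point is only that the caller contains no #ifdef: all conditional compilation sits in the header-like section.

```c
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal errno value; defined here only for this userspace sketch. */
#ifndef ENOTSUPP
#define ENOTSUPP 524
#endif

/* HAVE_BL_SWITCHER is a hypothetical stand-in for the real config option. */
#ifdef HAVE_BL_SWITCHER
int bL_switch_request(unsigned int cpu, unsigned int new_cluster_id);
bool bL_switcher_get_enabled(void);
void bL_switcher_put_enabled(void);
#else
/* Stubs: same signatures as the real API, but they do nothing. */
static inline int bL_switch_request(unsigned int cpu,
				    unsigned int new_cluster_id)
{
	(void)cpu;
	(void)new_cluster_id;
	return -ENOTSUPP;
}
static inline bool bL_switcher_get_enabled(void) { return false; }
static inline void bL_switcher_put_enabled(void) { }
#endif

/*
 * Hypothetical caller. It builds unchanged with or without the switcher;
 * with the stubs the switch request is never issued and 0 is returned.
 */
static int set_cluster(unsigned int cpu, unsigned int cluster)
{
	int ret = 0;

	if (bL_switcher_get_enabled())
		ret = bL_switch_request(cpu, cluster);
	bL_switcher_put_enabled();

	return ret;
}
```

With the stubs in place the compiler can see that bL_switcher_get_enabled() is constant false and drop the dead branch, which is the usual argument for preferring stubs over #ifdef blocks in the calling code.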
On 09/05/14 18:50, Mark Brown wrote:
> On Fri, May 09, 2014 at 06:05:56PM +0100, Sudeep Holla wrote:
>> On 09/05/14 17:40, Mark Brown wrote:
>
>>> From: Mark Brown <broonie@linaro.org>
>
>>> The big.LITTLE cpufreq driver is useful on arm64 big.LITTLE systems even
>>> without IKS support since it implements support for clusters with shared
>>> clocks (a common big.LITTLE configuration). In order to allow it to be
>>> built provide the non-IKS stubs for arm64, enabling cpufreq with all the
>>> cores available.
>
>> I am in process of using this driver for ARM64 and hit the same issue.
>> I don't like this approach at all. I too did similar changes/hacks which are
>> good for quick testing but not for upstream.
>
> I'm not a big fan of this either, but then as I indicated on the cpufreq
> bit of the series I'm not a massive fan of the way this is handled in
> the first place on either ARM or ARMv8. This at least gives us parity
> between the two architectures (modulo IKS implementation) which is
> progress especially given the fact that much of the work done on this
> stuff is being done on 32 bit due to hardware availability.
>

OK, good to know that you too agree that this is not a good approach.

> Given that the code isn't invasive I think the expediency tradeoff is OK
> for mainline, it's easy enough to get rid of when we come up with
> something better but in the meantime it helps actual systems work
> better in mainline - if we didn't have the ARM implementation already I
> think it'd be different but we do.
>

I disagree, I don't see a real urgency for this on ARM64. Even on ARM32, no single platform other than TC2 is using this (at least as I see in the mainline).

If some real platform needs this support urgently, then we can think of a similar short-term solution as part of adding support for cpufreq on that platform. Do you know any platform that needs this right now?

> Perfect can be the enemy of good (or at least adequate), one of the
> problems I'm seeing right now with convincing people to work with
> mainline is that people are missing lots of important functionality when
> they look at mainline.
>

OK, fair enough. But we can take some time and see if we can work out a better solution rather than jumping to add interim solutions that are unlikely to be used on any real platform.

>> I would like to move all the switcher code out of the driver as extension.
>> Also the core driver should be made to work with any multi-cluster platform not
>> just big-little(bL). bL is one of them and bL switcher support should be an
>> extension of it.
>
>> The main reason for this is I see some non-bL multi-cluster platform support
>> getting added, this driver should ideally support that.
>
> It is not entirely clear to me what you mean by "this driver" or "the
> core driver" in all the above, sorry.
>

I meant the core arm-big-little cpufreq driver.

> Personally the solution I'd rather see is cpufreq-cpu0 extended to
> handle shared clocks which would remove the need to use the big.LITTLE

Ideally yes. IIUC it has dependencies on CPU0 and I have not looked at that driver and all of its users to judge how feasible that is. At least we can have cpufreq-cpu0 for all single cluster systems w/o support for per-CPU DVFS and another which can handle multi-cluster systems (with or w/o per-CPU DVFS). Just a thought...

> driver on systems not doing IKS. It isn't at all clear to me that
> cpufreq should understand clusters (or much of anything other than
> clocks) in the non-IKS case, the sharing of clocks between cores is not

Yes I agree, I had brought this up in one of the discussions around extending OPP bindings. The cluster dependency needs to be removed and it should be derived from the clocks.

> directly connected to their clustering. Clustering is important to the
> scheduler and an understanding of the power and clock sharing
> constraints that come along with clusters is going to be required there
> as part of energy aware scheduling but I'm not seeing any obvious reason
> for the frequency scaling driver to know about this, even with cpufreq
> governors it's mostly the clock sharing.
>

Both CPUFreq and energy aware scheduling need an understanding of clock sharing. Both have similar goals (save power with little or no performance degradation), but the EA scheduler will be more efficient (both in terms of power and performance).

Governors need this knowledge of sharing as it affects the load calculation. (IIRC the maximum load of all the CPUs sharing a clock is taken.)

> For IKS where we're pairing the CPUs up and telling the kernel that
> switching between the physical clusters is part of scaling the
> "frequency" of the virtual cluster presented to the rest of the system
> things are of course different and cpufreq does need to understand the
> physical clusters.
>
> Having multiple generic frequency scaling drivers feels wrong; there's
> going to be code duplication between them and it doesn't seem like there
> should be anything going on that we can't automatically figure out at
> runtime.
>

Completely agree with you.

Regards,
Sudeep
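The remark above about governors taking the maximum load of all CPUs sharing a clock can be sketched roughly as follows. This is only an illustration under assumptions, not the governor code itself: max_load_of_shared_cpus(), the load array and the bitmask representation are all hypothetical names and choices made for this example.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the highest load (for example 0-100) among
 * the CPUs whose bit is set in shared_mask, i.e. the CPUs fed by one clock.
 * Because those CPUs can only change frequency together, the busiest one
 * decides how fast the shared clock has to run.
 */
static unsigned int max_load_of_shared_cpus(const unsigned int *load,
					     size_t ncpus,
					     unsigned long shared_mask)
{
	unsigned int max = 0;
	size_t cpu;

	for (cpu = 0; cpu < ncpus && cpu < sizeof(shared_mask) * 8; cpu++) {
		if ((shared_mask & (1UL << cpu)) && load[cpu] > max)
			max = load[cpu];
	}

	return max;
}
```

Note that the helper only needs to know which CPUs share a clock. Whether those CPUs also form a cluster does not enter into it, which matches the distinction drawn in the discussion.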
On Fri, May 09, 2014 at 07:57:55PM +0100, Sudeep Holla wrote:
> On 09/05/14 18:50, Mark Brown wrote:

> >Given that the code isn't invasive I think the expediency tradeoff is OK
> >for mainline, it's easy enough to get rid of when we come up with
> >something better but in the meantime it helps actual systems work
> >better in mainline - if we didn't have the ARM implementation already I
> >think it'd be different but we do.

> I disagree, I don't see a real urgency for this on ARM64. Even on ARM32, no
> single platform other than TC2 is using this(at least as I see in the mainline).

> If some real platform that needs this support urgently, then we can think of
> similar short-term solution as part of adding support for cpufreq on that
> platform. Do you know any platform that needs this right now ?

There's the big.LITTLE Exynos devices, at least the 5410 has shipped in product using IKS I believe and so should've been using the big.LITTLE cpufreq driver to do the cluster switching; I think some 5420 systems were doing the same. Or reimplementing it which would just be sad. There's some other non-public devices I am aware of - the whole reason I wrote this series was for those.

> >Perfect can be the enemy of good (or at least adequate), one of the
> >problems I'm seeing right now with convincing people to work with
> >mainline is that people are missing lots of important functionality when
> >they look at mainline.

> Ok fair enough. But we can take some time and see if we can workout better
> solution rather than jumping to add interim solutions when is unlikely to be
> used on any real platform.

There are real users who want to use this fairly urgently. I can't be specific, sorry.

> >Personally the solution I'd rather see is cpufreq-cpu0 extended to
> >handle shared clocks which would remove the need to use the big.LITTLE

> Ideally yes. IIUC it has dependencies on CPU0 and I have not looked that
> driver and all of its users to judge how feasible is that. At-least we can
> have cpufreq-cpu0 for all single cluster systems w/o support for per-CPU DVFS
> and another which can handle multi-cluster systems(with or w/o per-CPU DVFS).
> Just a thought...

It's not clear to me that having multiple clusters should require a different driver, the single cluster case is just a specialisation of the multicluster one as far as I can see. It's possible I'm missing something though.

I have to confess I didn't look in detail at cpufreq-cpu0 to make sure it's the best place to start from but it looks clean and does have the regulator stuff, though it's probably better to say that the goal is to merge that and the big.LITTLE driver.

> >directly connected to their clustering. Clustering is important to the
> >scheduler and an understanding of the power and clock sharing
> >constraints that come along with clusters is going to be required there
> >as part of energy aware scheduling but I'm not seeing any obvious reason
> >for the frequency scaling driver to know about this, even with cpufreq
> >governors it's mostly the clock sharing.

> Both CPUFreq and Energy aware scheduling need understanding of clock
> sharing. Both have similar goals(save power with little or no performance
> degradation), but EA scheduler will be more efficient(both in terms of power
> and performance).

> Governors need this knowledge of sharing as it affects the load calculation.
> (IIRC maximum load of all the cpu sharing clocks is taken)

Yes, definitely with respect to the clocks - my point was more about the clustering bit (which is related to but not 100% tied to clocks).
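The idea that the single cluster case is just a specialisation of the multicluster one, and that sharing should be derived from clocks rather than clusters, can be illustrated with a small sketch. Everything here is an assumption made for illustration: build_related_cpus() and the clk_id array are hypothetical, and the real drivers obtain clock information through other means.

```c
#include <stddef.h>

/*
 * Hypothetical sketch: clk_id[cpu] is an opaque identifier for the clock
 * feeding each CPU. For every CPU, fill related[cpu] with a bitmask of all
 * CPUs that share its clock. No notion of "cluster" is used anywhere.
 * ncpus is assumed not to exceed the number of bits in unsigned long.
 */
static void build_related_cpus(const int *clk_id, size_t ncpus,
			       unsigned long *related)
{
	size_t i, j;

	for (i = 0; i < ncpus; i++) {
		related[i] = 0;
		for (j = 0; j < ncpus; j++) {
			if (clk_id[j] == clk_id[i])
				related[i] |= 1UL << j;
		}
	}
}
```

With clock identifiers {7, 7, 9, 9} this yields two groups of two CPUs, and with {7, 7, 7, 7} it yields one group of four. The same code covers both layouts, so under this view no separate single-cluster driver would be needed.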
On Fri, 9 May 2014, Mark Brown wrote:

> On Fri, May 09, 2014 at 06:47:38PM +0100, Catalin Marinas wrote:
> > On Fri, May 09, 2014 at 05:40:30PM +0100, Mark Brown wrote:
>
> > > The big.LITTLE cpufreq driver is useful on arm64 big.LITTLE systems even
> > > without IKS support since it implements support for clusters with shared
> > > clocks (a common big.LITTLE configuration). In order to allow it to be
> > > built provide the non-IKS stubs for arm64, enabling cpufreq with all the
> > > cores available.
>
> > Have you thought of patching the actual cpufreq driver? Are you adding
> > this code just to avoid compiler errors on arm64 with this driver?
>
> Yes, that was actually my first thought but I wasn't loving the
> ifdeferry - it's fairly easy to do IIRC but the general taste seems to
> be towards having stubs rather than ifdefs and the code seemed to be
> lending itself to that. There was also the fact that ifdefs could have
> been done for the non-IKS case on 32 bit but instead stubs were
> provided. If people prefer I can do an ifdeffed version though, it
> doesn't make much odds.

I personally don't understand the b.L cpufreq driver fully, especially with the IKS functionality bolted on top. Obviously I didn't write that part otherwise I would hopefully understand it better. Still, I'd have preferred for the IKS extension to the b.L cpufreq driver to be more isolated in a separate file or the like.

> This was purely to get the driver compiling and hopefully running on
> ARMv8, there is some user demand for deploying the driver on big.LITTLE
> systems.
>
> > > It may make sense to make an asm-generic version of these stubs instead
>
> > asm-generic/bL_switcher.h? I take it as a good joke ;)
>
> Hey, perhaps other hardware architectures are implementing similar
> concepts even now in order to keep up!

Let's hope we won't need to rely on IKS any longer by the time support for them lands in mainline.

Nicolas
On 09/05/14 20:29, Mark Brown wrote:
> On Fri, May 09, 2014 at 07:57:55PM +0100, Sudeep Holla wrote:
>> On 09/05/14 18:50, Mark Brown wrote:
>
>>> Given that the code isn't invasive I think the expediency tradeoff is OK
>>> for mainline, it's easy enough to get rid of when we come up with
>>> something better but in the meantime it helps actual systems work
>>> better in mainline - if we didn't have the ARM implementation already I
>>> think it'd be different but we do.
>
>> I disagree, I don't see a real urgency for this on ARM64. Even on ARM32, no
>> single platform other than TC2 is using this(at least as I see in the mainline).
>
>> If some real platform that needs this support urgently, then we can think of
>> similar short-term solution as part of adding support for cpufreq on that
>> platform. Do you know any platform that needs this right now ?
>
> There's the big.LITTLE Exynos devices, at least the 5410 has shipped in
> product using IKS I believe and so should've been using the big.LITTLE
> cpufreq driver to do the cluster switching; I think some 5420 systems
> were doing the same. Or reimplementing it which would just be sad.
> There's some other non-public devices I am aware of - the whole reason I
> wrote this series was for those.
>

Correct, all these are 32-bit platforms which are now in the process of upstreaming. So far they had their own driver and never used the arm-big-little driver. And this current series deals with 64-bit platforms.

>>> Perfect can be the enemy of good (or at least adequate), one of the
>>> problems I'm seeing right now with convincing people to work with
>>> mainline is that people are missing lots of important functionality when
>>> they look at mainline.
>
>> Ok fair enough. But we can take some time and see if we can workout better
>> solution rather than jumping to add interim solutions when is unlikely to be
>> used on any real platform.
>
> There are real users who want to use this fairly urgently. I can't be
> specific, sorry.
>

That's fine, I understand. But the main argument is that if these platforms will not add support for cpufreq upstream anytime soon, I don't see any value in rushing to this short-term solution.

>>> Personally the solution I'd rather see is cpufreq-cpu0 extended to
>>> handle shared clocks which would remove the need to use the big.LITTLE
>
>> Ideally yes. IIUC it has dependencies on CPU0 and I have not looked that
>> driver and all of its users to judge how feasible is that. At-least we can
>> have cpufreq-cpu0 for all single cluster systems w/o support for per-CPU DVFS
>> and another which can handle multi-cluster systems(with or w/o per-CPU DVFS).
>
>> Just a thought...
>
> It's not clear to me that having multiple clusters should require a
> different driver, the single cluster case is just a specialisation of
> the multicluster one as far as I can see. It's possible I'm missing
> something though.
>

No, you are correct, but that would require a lot of changes and testing. So my proposal above is just a first step to avoid any new drivers, and then we need to merge it with cpufreq-cpu0.

> I have to confess I didn't look in detail at cpufreq-cpu0 to make sure
> it's the best place to start from but it looks clean and does have the
> regulator stuff, though it's probably better to say that the goal is to
> merge that and the big.LITTLE driver.
>

Me neither, hence I didn't want to comment much on merging everything into cpufreq-cpu0. We need to understand all the platforms using it so that a generic solution based on clocks continues to work.

>>> directly connected to their clustering. Clustering is important to the
>>> scheduler and an understanding of the power and clock sharing
>>> constraints that come along with clusters is going to be required there
>>> as part of energy aware scheduling but I'm not seeing any obvious reason
>>> for the frequency scaling driver to know about this, even with cpufreq
>>> governors it's mostly the clock sharing.
>
>> Both CPUFreq and Energy aware scheduling need understanding of clock
>> sharing. Both have similar goals(save power with little or no performance
>> degradation), but EA scheduler will be more efficient(both in terms of power
>> and performance).
>
>> Governors need this knowledge of sharing as it affects the load calculation.
>> (IIRC maximum load of all the cpu sharing clocks is taken)
>
> Yes, definitely with respect to the clocks - my point was more about the
> clustering bit (which is related to but not 100% tied to clocks).
>

Ok, that makes sense.

Regards,
Sudeep
On Mon, May 12, 2014 at 09:34:04AM +0100, Sudeep Holla wrote:
> On 09/05/14 20:29, Mark Brown wrote:
> >On Fri, May 09, 2014 at 07:57:55PM +0100, Sudeep Holla wrote:

> >>If some real platform that needs this support urgently, then we can think of
> >>similar short-term solution as part of adding support for cpufreq on that
> >>platform. Do you know any platform that needs this right now ?

...

> >There's some other non-public devices I am aware of - the whole reason I
> >wrote this series was for those.

> Correct all these are 32-bit platforms which are now in process of upstreaming.
> So far they had their own driver and never used the arm-big-little driver.
> And this current series deals with 64-bit platforms.

Except for the non-public devices on the end of the list there :)

> >There are real users who want to use this fairly urgently. I can't be
> >specific, sorry.

> That's fine, I understand. But the main argument is that if these platforms
> will not add support for cpufreq upstream anytime soon, I don't see any value
> in rushing to this short term solution.

This all gets a bit circular though - the more patches people have to carry to make upstream useful to them, the less useful working with upstream seems to them, meaning the devices are less likely to appear upstream at all, at least in any sort of complete form. One out of tree patch being required isn't going to be a deal breaker by itself but they all add up to a perception that upstream isn't useful.

There's definitely some taste considerations with how far you go to cater for such systems - in this case I'd say it's just tweaking Kconfig to allow people to use code that's already present rather than adding really new code (there's the stubs but they're pretty insubstantial) which means upstream doesn't have to carry anything that wasn't there already.
> This all gets a bit circular though - the more patches people have to
> carry to make upstream useful to them less useful working with upstream
> seems to them, meaning the devices are less likely to appear upstream at
> all at least in any sort of complete form. One out of tree patch being
> required isn't going to be a deal breaker by itself but they all add up
> to a perception that upstream isn't useful.
>
> There's definitely some taste considerations with how far you go to
> cater for such systems - in this case I'd say it's just tweaking Kconfig
> to allow people to use code that's already present rather than adding
> really new code (there's the stubs but they're pretty insubstantial)
> which means upstream doesn't have to carry anything that wasn't there
> already.

Agree, with the original version of this patch I just put all the stubs in drivers/cpufreq/arm_big_little.h to get around needing to add files to arm64 (when it shouldn't need it). At least then the stubs were pretty local to the implementation and didn't require all of the ifdefs in the code (as Mark says that looks horrible).

Mark
diff --git a/arch/arm64/include/asm/bL_switcher.h b/arch/arm64/include/asm/bL_switcher.h
new file mode 100644
index 000000000000..2bee500b7f54
--- /dev/null
+++ b/arch/arm64/include/asm/bL_switcher.h
@@ -0,0 +1,54 @@
+/*
+ * Based on the stubs for the ARM implementation which is:
+ *
+ * Created by: Nicolas Pitre, April 2012
+ * Copyright: (C) 2012-2013 Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef ASM_BL_SWITCHER_H
+#define ASM_BL_SWITCHER_H
+
+#include <linux/notifier.h>
+#include <linux/types.h>
+
+typedef void (*bL_switch_completion_handler)(void *cookie);
+
+static inline int bL_switch_request(unsigned int cpu,
+				    unsigned int new_cluster_id)
+{
+	return -ENOTSUPP;
+}
+
+/*
+ * Register here to be notified about runtime enabling/disabling of
+ * the switcher.
+ *
+ * The notifier chain is called with the switcher activation lock held:
+ * the switcher will not be enabled or disabled during callbacks.
+ * Callbacks must not call bL_switcher_{get,put}_enabled().
+ */
+#define BL_NOTIFY_PRE_ENABLE		0
+#define BL_NOTIFY_POST_ENABLE		1
+#define BL_NOTIFY_PRE_DISABLE		2
+#define BL_NOTIFY_POST_DISABLE		3
+
+static inline int bL_switcher_register_notifier(struct notifier_block *nb)
+{
+	return 0;
+}
+
+static inline int bL_switcher_unregister_notifier(struct notifier_block *nb)
+{
+	return 0;
+}
+
+static inline bool bL_switcher_get_enabled(void) { return false; }
+static inline void bL_switcher_put_enabled(void) { }
+static inline int bL_switcher_trace_trigger(void) { return 0; }
+static inline int bL_switcher_get_logical_index(u32 mpidr) { return -EUNATCH; }
+
+#endif