Message ID | 20130406192957.GQ11459@1wt.eu (mailing list archive) |
---|---|
State | New, archived |
Hi Willy,

On 04/06/2013 09:29 PM, Willy Tarreau wrote:
> Hi,
>
> A patch from Marvell was merged in GCC 4.8 to add support for their
> PJ4 CPU core used in Armada370 and XP. I noticed a steady 2% network
> performance increase using -mcpu=marvell-pj4 and around 1% when using
> -mtune=xscale instead. I have no idea whether it provides anything to
> the Dove platform and I don't have one to test, so I didn't touch it.
>
> Now that 4.8 is released, it would be nice to have this option used
> when supported.

Sure it will be very nice! But what happens when the kernel is built
in multiarch?
It seems to me that in this case gcc will tune the code for PJ4 whereas
the kernel built can also be run on a Cortex-A, a Scorpion or a Krait.

Regards,

>
> Regards,
> Willy
>
> From b5b34f87e753fbd756f3c23536e731bb1aa6bf7f Mon Sep 17 00:00:00 2001
> From: Willy Tarreau <w@1wt.eu>
> Date: Sun, 3 Mar 2013 23:26:41 +0100
> Subject: [PATCH] ARM: makefile: add tuning options for Armada 370/XP
>
> Let's pass -mcpu=marvell-pj4 and fall back to -mtune=xscale for Armada370
> and ArmadaXP. Both settings have shown an improvement over the default
> setting on these chips using gcc-4.7 with and without the Marvell patch
> (typically 2% on network traffic).
>
> Signed-off-by: Willy Tarreau <w@1wt.eu>
> ---
>  arch/arm/Makefile | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/arm/Makefile b/arch/arm/Makefile
> index 15747d9..7dd5418 100644
> --- a/arch/arm/Makefile
> +++ b/arch/arm/Makefile
> @@ -93,6 +93,7 @@ tune-$(CONFIG_CPU_XSC3) :=$(call cc-option,-mtune=xscale,-mtune=strongarm110) -
>  tune-$(CONFIG_CPU_FEROCEON) :=$(call cc-option,-mtune=marvell-f,-mtune=xscale)
>  tune-$(CONFIG_CPU_V6) :=$(call cc-option,-mtune=arm1136j-s,-mtune=strongarm)
>  tune-$(CONFIG_CPU_V6K) :=$(call cc-option,-mtune=arm1136j-s,-mtune=strongarm)
> +tune-$(CONFIG_MACH_ARMADA_370_XP) :=$(call cc-option,-mcpu=marvell-pj4,-mtune=xscale)
>
>  ifeq ($(CONFIG_AEABI),y)
>  CFLAGS_ABI :=-mabi=aapcs-linux -mno-thumb-interwork
>
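
The multiarch concern raised above can be illustrated in plain make terms. The following is a minimal, standalone sketch, not part of the patch: the two `CONFIG_` values stand in for a hypothetical .config with both options enabled, and the `cc-option` probing is left out. Every enabled `tune-$(CONFIG_...)` line expands to an assignment to the same variable, `tune-y`, so the last one wins; assuming `tune-y` is then added to the kernel-wide compiler flags, that single choice applies to every platform built into the image.

```make
# Hypothetical stand-ins for a .config where both options are enabled.
CONFIG_CPU_V6K := y
CONFIG_MACH_ARMADA_370_XP := y

# Both lines expand to "tune-y := ...", so they write to the same variable.
tune-$(CONFIG_CPU_V6K) := -mtune=arm1136j-s
tune-$(CONFIG_MACH_ARMADA_370_XP) := -mcpu=marvell-pj4

# Only the later assignment survives.
$(info tune-y = $(tune-y))

# Empty default target so that "make -f <this file>" exits cleanly.
.PHONY: all
all: ;
```

Run with GNU make, this prints `tune-y = -mcpu=marvell-pj4`: the earlier `-mtune=arm1136j-s` value is gone, and nothing in the variable records which platforms the flag was meant for.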
Hi Grégory,

On Sat, Apr 06, 2013 at 11:48:28PM +0200, Gregory CLEMENT wrote:
> Hi Willy,
>
> On 04/06/2013 09:29 PM, Willy Tarreau wrote:
> > Hi,
> >
> > A patch from Marvell was merged in GCC 4.8 to add support for their
> > PJ4 CPU core used in Armada370 and XP. I noticed a steady 2% network
> > performance increase using -mcpu=marvell-pj4 and around 1% when using
> > -mtune=xscale instead. I have no idea whether it provides anything to
> > the Dove platform and I don't have one to test, so I didn't touch it.
> >
> > Now that 4.8 is released, it would be nice to have this option used
> > when supported.
>
> Sure it will be very nice! But what happens when the kernel is built
> in multiarch?
> It seems to me that in this case gcc will tune the code for PJ4 whereas
> the kernel built can also be run on a Cortex-A, a Scorpion or a Krait.

I have not thought about that. Maybe we're already reaching the limits
of the multiarch mode? After all, pj4 is not a cortex-a, which probably
is the reason that justifies having specific gcc optimizations that are
different from cortex-a.

So in the end, I'm starting to wonder whether it makes sense to permit
non-100% compatible armv7 chips to coexist in the same kernel :-/

Otherwise we could do what already exists on x86, which is to let the
user select which chip he wants to optimize for. I have no idea on the
subject to be honest.

Best regards,
Willy

On Sat, Apr 06, 2013 at 09:29:57PM +0200, Willy Tarreau wrote:
> diff --git a/arch/arm/Makefile b/arch/arm/Makefile
> index 15747d9..7dd5418 100644
> --- a/arch/arm/Makefile
> +++ b/arch/arm/Makefile
> @@ -93,6 +93,7 @@ tune-$(CONFIG_CPU_XSC3) :=$(call cc-option,-mtune=xscale,-mtune=strongarm110) -
>  tune-$(CONFIG_CPU_FEROCEON) :=$(call cc-option,-mtune=marvell-f,-mtune=xscale)
>  tune-$(CONFIG_CPU_V6) :=$(call cc-option,-mtune=arm1136j-s,-mtune=strongarm)
>  tune-$(CONFIG_CPU_V6K) :=$(call cc-option,-mtune=arm1136j-s,-mtune=strongarm)
> +tune-$(CONFIG_MACH_ARMADA_370_XP) :=$(call cc-option,-mcpu=marvell-pj4,-mtune=xscale)

Do not do this. This is not how these options work. Look at all the
above - they all use -mtune=.

The reason for this is that we want to control which instructions are
used (-march=) independently of how the instructions are scheduled
(-mtune=). Using -mcpu= influences both of those in an adverse way - it
can enable instructions which are not present on other CPUs in the same
kernel.

So, the use of -mcpu= is not permitted in the kernel. Instead, use
-march= and -mtune= in an appropriate manner to achieve your goal.

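
The split described in this reply could look roughly like the standalone sketch below. It is an illustration under stated assumptions, not a revised patch from the thread: it assumes GCC accepts `marvell-pj4` as a `-mtune=` value in the same way it does for `-mcpu=`, it uses `-march=armv7-a` as a hypothetical shared baseline, and its `cc-option` is a simplified stand-in for the Kbuild helper of the same name. Whether `-mtune=marvell-pj4` alone recovers the gain measured with `-mcpu=marvell-pj4` is not established anywhere in the thread.

```make
# Simplified stand-in for Kbuild's cc-option (assumption: the real helper
# is more elaborate). It echoes the first flag if $(CC) accepts it and
# otherwise echoes the second flag without checking it.
cc-option = $(shell if $(CC) $(1) -c -x c /dev/null -o /dev/null >/dev/null 2>&1; then echo "$(1)"; else echo "$(2)"; fi)

# Hypothetical stand-in for the .config value.
CONFIG_MACH_ARMADA_370_XP := y

# Instruction selection: one baseline shared by every CPU in the image
# (assumption: ARMv7-A, chosen only for illustration).
arch-y := -march=armv7-a

# Scheduling only: -mtune= does not change which instructions may be emitted.
tune-$(CONFIG_MACH_ARMADA_370_XP) := $(call cc-option,-mtune=marvell-pj4,-mtune=xscale)

$(info arch-y = $(arch-y))
$(info tune-y = $(tune-y))

# Empty default target so that "make -f <this file>" exits cleanly.
.PHONY: all
all: ;
```

With an ARM cross compiler passed as `CC=...`, the printed `tune-y` shows which of the two flags the probe selected. With a non-ARM host compiler the probe fails and the fallback is printed, which mirrors how the unchecked second argument behaves.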
diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 15747d9..7dd5418 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -93,6 +93,7 @@ tune-$(CONFIG_CPU_XSC3) :=$(call cc-option,-mtune=xscale,-mtune=strongarm110) -
 tune-$(CONFIG_CPU_FEROCEON) :=$(call cc-option,-mtune=marvell-f,-mtune=xscale)
 tune-$(CONFIG_CPU_V6) :=$(call cc-option,-mtune=arm1136j-s,-mtune=strongarm)
 tune-$(CONFIG_CPU_V6K) :=$(call cc-option,-mtune=arm1136j-s,-mtune=strongarm)
+tune-$(CONFIG_MACH_ARMADA_370_XP) :=$(call cc-option,-mcpu=marvell-pj4,-mtune=xscale)

 ifeq ($(CONFIG_AEABI),y)
 CFLAGS_ABI :=-mabi=aapcs-linux -mno-thumb-interwork