Message ID | 1428499058-8322-1-git-send-email-andi@firstfloor.org (mailing list archive) |
---|---|
State | New, archived |
On Wed, 8 Apr 2015 06:17:38 -0700 Andi Kleen <andi@firstfloor.org> wrote:

> From: Andi Kleen <ak@linux.intel.com>
>
> gcc 5 has a new no_reorder attribute that prevents top level
> reordering only for that symbol.

I'm having trouble locating gcc documentation which explains all this
stuff.

> Kernels don't like any reordering of initcalls between files, as several
> initcalls depend on each other. LTO previously needed to use
> -fno-toplevel-reordering to prevent boot failures.

That's "-fno-toplevel-reorder", I believe?

> Add a __noreorder wrapper for the no_reorder attribute and use
> it for initcalls.

Head is spinning a bit. As this all appears to be shiny new
added-by-andi gcc functionality, it would be useful if we could have a
few more words describing what it's all about. Reordering of what with
respect to what and why and why is it bad. Why is gcc reordering
things anyway, and what's the downside of preventing this. Why is the
compiler reordering things rather than the linker. etc etc etc.

Please gently educate us ;)

--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Andrew,

On Wed, Apr 08, 2015 at 03:31:12PM -0700, Andrew Morton wrote:
> On Wed, 8 Apr 2015 06:17:38 -0700 Andi Kleen <andi@firstfloor.org> wrote:
>
> > From: Andi Kleen <ak@linux.intel.com>
> >
> > gcc 5 has a new no_reorder attribute that prevents top level
> > reordering only for that symbol.
>
> I'm having trouble locating gcc documentation which explains all this
> stuff.

The official manuals only cover released versions, and gcc 5 is not
released yet, but it's here:

https://github.com/gcc-mirror/gcc/blob/master/gcc/doc/extend.texi#L3505

> > Kernels don't like any reordering of initcalls between files, as several
> > initcalls depend on each other. LTO previously needed to use
> > -fno-toplevel-reordering to prevent boot failures.
>
> That's "-fno-toplevel-reorder", I believe?

Yes.

> > Add a __noreorder wrapper for the no_reorder attribute and use
> > it for initcalls.
>
> Head is spinning a bit. As this all appears to be shiny new
> added-by-andi gcc functionality, it would be useful if we could have a
> few more words describing what it's all about. Reordering of what with
> respect to what and why and why is it bad. Why is gcc reordering
> things anyway, and what's the downside of preventing this. Why is the
> compiler reordering things rather than the linker. etc etc etc.

Ok, let me try.

The original gcc was, for a long time, function-at-a-time: it read one
function, optimized it and wrote it out, then the next. Then gcc 3.x
added unit-at-a-time, where it reads one complete file, optimizes it
completely and writes it out. This has the advantage that it can make
better inlining decisions, it can remove unused statics, it can
propagate execution frequencies over the call tree before optimizing,
and some other things. Then it writes out the unit in call tree
order, which can also lead to better executable layout.
One side effect of this is that the order of top level statements gets
lost, unless you specify -fno-toplevel-reorder.

We had to fix Linux for this sometime in early 2.6, late 2.4. Most
problems were in top level asm() statements that assumed they had a
defined order relative to other variables. To still support programs
doing that, gcc added -fno-toplevel-reorder, which avoids such
reordering, but also disables a small number of optimizations.

Then gcc 4.x added LTO, which takes unit-at-a-time one step further and
optimizes the complete program in the same way at link time. It
actually does not keep the whole program in memory all the time, but
uses various tricks to only look at it in pieces and distribute the
work to multiple cores. To do that it uses partitioning, where the
program is split into different partitions based on its global call
tree, and then each partition is assigned to a compiler process. The
result is a changed order for everything in the final program.

Modern Linux was generally fine with reordering, except for initcalls.
We have a lot of initcalls that assume that some other initcalls
already ran before them, without using priorities. The order is
defined by the Makefile's object file order for the linker. Linkers
generally do not reorder, unless told to. Unfortunately that order
gets lost with LTO.

When I started the LTO patchkit I tried to debug and fix some of these
initcalls, but it was hopeless. It was like a many-headed hydra.

So I needed to use -fno-toplevel-reorder for LTO. In LTO this both
gives worse partitioning (so the build is less balanced between
different cores) and also disables some optimizations, like eliminating
unused variables or some cross file optimizations.

gcc 5 finally gained a way to request the no-toplevel-reorder behaviour
per symbol, with this new attribute. So it can be applied only to the
initcall symbols, and everything else is left alone.

That is what this patch is about.
It's not needed without LTO, but I believe it's useful documentation
even without it.

-Andi
On Thu, 9 Apr 2015 01:50:23 +0200 Andi Kleen <andi@firstfloor.org> wrote:

> > Head is spinning a bit. As this all appears to be shiny new
> > added-by-andi gcc functionality, it would be useful if we could have a
> > few more words describing what it's all about. Reordering of what with
> > respect to what and why and why is it bad. Why is gcc reordering
> > things anyway, and what's the downside of preventing this. Why is the
> > compiler reordering things rather than the linker. etc etc etc.
>
> Ok, let me try.

That was super-useful, thanks. I slurped it into the changelog -
maybe one day it will provide material for Documentation/lto-stuff.txt.

Big picture: do you have a feeling for how much benefit LTO will yield
in the kernel, if/when it's all completed?
On Fri, Apr 10, 2015 at 02:36:29PM -0700, Andrew Morton wrote:
> On Thu, 9 Apr 2015 01:50:23 +0200 Andi Kleen <andi@firstfloor.org> wrote:
>
> > > Head is spinning a bit. As this all appears to be shiny new
> > > added-by-andi gcc functionality, it would be useful if we could have a
> > > few more words describing what it's all about. Reordering of what with
> > > respect to what and why and why is it bad. Why is gcc reordering
> > > things anyway, and what's the downside of preventing this. Why is the
> > > compiler reordering things rather than the linker. etc etc etc.
> >
> > Ok, let me try.
>
> That was super-useful, thanks. I slurped it into the changelog -
> maybe one day it will provide material for Documentation/lto-stuff.txt.
>
> Big picture: do you have a feeling for how much benefit LTO will yield
> in the kernel, if/when it's all completed?

At least nothing of the stuff I usually run seems to be very kernel
compiler dependent in performance. I think other people may benefit
from it. Just looking at the code it is often a lot better.

We've had great results in code size reduction for small systems though.

I also found a range of bugs in the kernel which is good.

The merge is also nearly finished, only a smaller number of patches left.

There are some future technologies which could benefit from it too.

There is still some compile time penalty, although it got a lot better
with 5. I wouldn't expect developers to use it day-to-day, but it can be
a good release mode.

I think it's a good thing to have now, just for the benefits for
shrinking kernels.

-Andi
diff --git a/include/linux/compiler-gcc5.h b/include/linux/compiler-gcc5.h
index efee493..9004c00 100644
--- a/include/linux/compiler-gcc5.h
+++ b/include/linux/compiler-gcc5.h
@@ -42,6 +42,9 @@
 /* Mark a function definition as prohibited from being cloned. */
 #define __noclone	__attribute__((__noclone__))
 
+/* Avoid reordering a top level statement */
+#define __noreorder	__attribute__((no_reorder))
+
 /*
  * Tell the optimizer that something else uses this function or variable.
  */
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 1b45e4a..ac639a1 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -334,6 +334,10 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
 #define noinline
 #endif
 
+#ifndef __noreorder
+#define __noreorder	/* unimplemented */
+#endif
+
 /*
  * Rather then using noinline to prevent stack consumption, use
  * noinline_for_stack instead.  For documentation reasons.
diff --git a/include/linux/init.h b/include/linux/init.h
index 2df8e8d..a0a1244 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -191,7 +191,7 @@ extern bool initcall_debug;
  */
 
 #define __define_initcall(fn, id) \
-	static initcall_t __initcall_##fn##id __used \
+	static initcall_t __initcall_##fn##id __used __noreorder \
 	__attribute__((__section__(".initcall" #id ".init"))) = fn; \
 	LTO_REFERENCE_INITCALL(__initcall_##fn##id)
 
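For readers who want to try the idea outside the kernel, here is a minimal userspace sketch of what the patch does. It is not kernel code: my_noreorder, MY_INITCALL, the my_initcalls section, setup_bus and setup_device are all made-up names for illustration. It assumes an ELF target with a GNU-compatible linker, which provides __start_<section> and __stop_<section> symbols for sections whose name is a valid C identifier. On compilers without no_reorder the macro falls back to nothing, like the compiler.h hunk above.

```c
#include <assert.h>
#include <stddef.h>

/* Older compilers have no __has_attribute; treat every attribute as absent. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/* Hypothetical stand-in for the kernel's __noreorder wrapper. */
#if __has_attribute(no_reorder)
#define my_noreorder __attribute__((no_reorder))
#else
#define my_noreorder /* unimplemented */
#endif

typedef int (*my_initcall_t)(void);

/*
 * Hypothetical stand-in for __define_initcall(): store a pointer to fn in
 * a dedicated section, and ask the compiler not to reorder that pointer
 * against other no_reorder symbols.
 */
#define MY_INITCALL(fn) \
	static my_initcall_t my_initcall_##fn \
	__attribute__((used, section("my_initcalls"))) my_noreorder = fn

/* Section bounds, provided by the linker (assumption: ELF + GNU-style ld). */
extern my_initcall_t __start_my_initcalls[];
extern my_initcall_t __stop_my_initcalls[];

/* Records the order in which the initcalls actually ran. */
static int order_log[8];
static size_t order_len;

static void record(int id)
{
	assert(order_len < sizeof(order_log) / sizeof(order_log[0]));
	order_log[order_len++] = id;
}

/* setup_device is assumed to depend on setup_bus having run first. */
static int setup_bus(void)    { record(1); return 0; }
static int setup_device(void) { record(2); return 0; }

MY_INITCALL(setup_bus);
MY_INITCALL(setup_device);

/* Run every registered initcall in section order; return how many ran. */
size_t run_initcalls(void)
{
	my_initcall_t *call;

	for (call = __start_my_initcalls; call < __stop_my_initcalls; call++)
		(*call)();
	return order_len;
}
```

Calling run_initcalls() from main() should run setup_bus before setup_device. Only the two pointer variables carry the attribute; everything else in the file remains free to be reordered, which is the point of using a per-symbol attribute instead of -fno-toplevel-reorder.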