Message ID | 20231012201743.292149-1-ubizjak@gmail.com (mailing list archive) |
---|---|
Series | Introduce %rip-relative addressing to PER_CPU_VAR macro |
On 10/12/23 13:12, Uros Bizjak wrote:
> The last patch introduces (%rip) suffix and uses it for x86_64 target,
> resulting in a small code size decrease:
>
>     text    data    bss      dec     hex filename
> 25510677 4386685 808388 30705750 1d48856 vmlinux-new.o
> 25510629 4386685 808388 30705702 1d48826 vmlinux-old.o

I feel like I'm missing some of the motivation here.

50 bytes is great and all, but it isn't without the cost of changing
some rules and introducing potential PER_CPU_ARG() vs. PER_CPU_VAR()
confusion.

Are there some other side benefits? What else does this enable?

On Thu, Oct 12, 2023 at 10:53 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 10/12/23 13:12, Uros Bizjak wrote:
> > The last patch introduces (%rip) suffix and uses it for x86_64 target,
> > resulting in a small code size decrease:
> >
> >     text    data    bss      dec     hex filename
> > 25510677 4386685 808388 30705750 1d48856 vmlinux-new.o
> > 25510629 4386685 808388 30705702 1d48826 vmlinux-old.o
>
> I feel like I'm missing some of the motivation here.
>
> 50 bytes is great and all, but it isn't without the cost of changing
> some rules and introducing potential PER_CPU_ARG() vs. PER_CPU_VAR()
> confusion.
>
> Are there some other side benefits? What else does this enable?

These changes are necessary to build the kernel as a Position
Independent Executable (PIE) on x86_64 [1]. And since I was working in
the percpu area, I thought that it was worth implementing them.

[1] https://lore.kernel.org/lkml/cover.1682673542.git.houwenlong.hwl@antgroup.com/

Uros.

On 10/12/23 13:59, Uros Bizjak wrote:
> On Thu, Oct 12, 2023 at 10:53 PM Dave Hansen <dave.hansen@intel.com> wrote:
>>
>> On 10/12/23 13:12, Uros Bizjak wrote:
>>> The last patch introduces (%rip) suffix and uses it for x86_64 target,
>>> resulting in a small code size decrease:
>>>
>>>     text    data    bss      dec     hex filename
>>> 25510677 4386685 808388 30705750 1d48856 vmlinux-new.o
>>> 25510629 4386685 808388 30705702 1d48826 vmlinux-old.o
>>
>> I feel like I'm missing some of the motivation here.
>>
>> 50 bytes is great and all, but it isn't without the cost of changing
>> some rules and introducing potential PER_CPU_ARG() vs. PER_CPU_VAR()
>> confusion.
>>
>> Are there some other side benefits? What else does this enable?
>
> These changes are necessary to build the kernel as Position
> Independent Executable (PIE) on x86_64 [1]. And since I was working in
> percpu area I thought that it was worth implementing them.
>
> [1] https://lore.kernel.org/lkml/cover.1682673542.git.houwenlong.hwl@antgroup.com/
>

Are you PIC-adjusting the percpu variables as well?

-hpa

On Thu, Oct 12, 2023 at 11:08 PM H. Peter Anvin <hpa@zytor.com> wrote:
>
> On 10/12/23 13:59, Uros Bizjak wrote:
> > On Thu, Oct 12, 2023 at 10:53 PM Dave Hansen <dave.hansen@intel.com> wrote:
> >>
> >> On 10/12/23 13:12, Uros Bizjak wrote:
> >>> The last patch introduces (%rip) suffix and uses it for x86_64 target,
> >>> resulting in a small code size decrease:
> >>>
> >>>     text    data    bss      dec     hex filename
> >>> 25510677 4386685 808388 30705750 1d48856 vmlinux-new.o
> >>> 25510629 4386685 808388 30705702 1d48826 vmlinux-old.o
> >>
> >> I feel like I'm missing some of the motivation here.
> >>
> >> 50 bytes is great and all, but it isn't without the cost of changing
> >> some rules and introducing potential PER_CPU_ARG() vs. PER_CPU_VAR()
> >> confusion.
> >>
> >> Are there some other side benefits? What else does this enable?
> >
> > These changes are necessary to build the kernel as Position
> > Independent Executable (PIE) on x86_64 [1]. And since I was working in
> > percpu area I thought that it was worth implementing them.
> >
> > [1] https://lore.kernel.org/lkml/cover.1682673542.git.houwenlong.hwl@antgroup.com/
> >
>
> Are you PIC-adjusting the percpu variables as well?

After this patch (and after fixing percpu_stable_op to use the "a"
operand modifier on GCC), only *one* absolute reference to a percpu
variable remains, in xen-head.S, where:

    movq $INIT_PER_CPU_VAR(fixed_percpu_data),%rax

should be changed to use leaq.

All others should then be (%rip)-relative.

Uros.

On 10/12/23 14:17, Uros Bizjak wrote:
>>
>> Are you PIC-adjusting the percpu variables as well?
>
> After this patch (and after fixing percpu_stable_op to use "a" operand
> modifier on GCC), the only *one* remaining absolute reference to
> percpu variable remain in xen-head.S, where:
>
> movq $INIT_PER_CPU_VAR(fixed_percpu_data),%rax
>
> should be changed to use leaq.
>
> All others should then be (%rip)-relative.
>

I mean, the symbols themselves are relative, not absolute?

-hpa

On Thu, Oct 12, 2023 at 11:22 PM H. Peter Anvin <hpa@zytor.com> wrote:
>
> On 10/12/23 14:17, Uros Bizjak wrote:
> >>
> >> Are you PIC-adjusting the percpu variables as well?
> >
> > After this patch (and after fixing percpu_stable_op to use "a" operand
> > modifier on GCC), the only *one* remaining absolute reference to
> > percpu variable remain in xen-head.S, where:
> >
> > movq $INIT_PER_CPU_VAR(fixed_percpu_data),%rax
> >
> > should be changed to use leaq.
> >
> > All others should then be (%rip)-relative.
> >
>
> I mean, the symbols themselves are relative, not absolute?

The reference to the symbol is relative to the segment register, but
absolute with respect to the location of the instruction. If the
executable changes location, the instruction moves with it and the
reference is no longer valid. A (%rip)-relative reference compensates
for the changed location of the instruction.

Uros.
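
The last point can be checked with a little arithmetic. The following is only a toy model, not kernel code: the addresses, offsets and function names are made up for illustration, the %gs segment base is ignored, and the image is assumed to be moved without any relocation fixups being applied to the absolute operand.

```python
# Toy model: absolute vs. %rip-relative operand when an image is loaded at
# a different base address. All values below are hypothetical.

LINK_BASE = 0x1000000      # assumed link-time base of the image
INSN_END_OFFSET = 0x2007   # assumed offset of the byte after the instruction
SYMBOL_OFFSET = 0x5000     # assumed offset of the referenced symbol


def encode_absolute(link_base, symbol_offset):
    """Absolute operand: the link-time address of the symbol is stored."""
    return link_base + symbol_offset


def encode_rip_relative(insn_end_offset, symbol_offset):
    """%rip-relative operand: distance from the next instruction to the symbol."""
    return symbol_offset - insn_end_offset


def resolve_absolute(operand):
    """The stored address is used as is, wherever the code happens to run."""
    return operand


def resolve_rip_relative(load_base, insn_end_offset, displacement):
    """The displacement is added to the run-time address of the next instruction."""
    return load_base + insn_end_offset + displacement


def compare(load_base):
    """Return (absolute_ok, rip_relative_ok) for an image loaded at load_base."""
    actual = load_base + SYMBOL_OFFSET  # where the symbol really is at run time
    abs_target = resolve_absolute(encode_absolute(LINK_BASE, SYMBOL_OFFSET))
    rel_target = resolve_rip_relative(
        load_base,
        INSN_END_OFFSET,
        encode_rip_relative(INSN_END_OFFSET, SYMBOL_OFFSET),
    )
    return abs_target == actual, rel_target == actual


if __name__ == "__main__":
    for base in (LINK_BASE, LINK_BASE + 0x200000):
        abs_ok, rel_ok = compare(base)
        print(f"base={base:#x} absolute_ok={abs_ok} rip_relative_ok={rel_ok}")
```

In this model the displacement does not depend on the base at all, which is why the %rip-relative form stays valid when the instruction and the symbol move together, while the absolute operand keeps pointing at the link-time address.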