diff mbox series

ARM: OMAP2+: drop unnecessary adrl

Message ID 5a6807f19fd69f2de6622c794639cc5d70b9563a.1585513949.git.stefan@agner.ch (mailing list archive)
State Mainlined
Commit d85d5247885ef2e8192287b895c2e381fa931b0b
Headers show
Series ARM: OMAP2+: drop unnecessary adrl | expand

Commit Message

Stefan Agner March 29, 2020, 8:33 p.m. UTC
The adrl instruction has been introduced with commit dd31394779aa ("ARM:
omap3: Thumb-2 compatibility for sleep34xx.S"), back when this assembly
file was considerably longer. Today adr seems to have enough reach, even
when inserting about 60 instructions between the use site and the label.
Replace adrl with conventional adr instruction.

This allows to build this file using Clang's integrated assembler (which
does not support the adrl pseudo instruction).

Link: https://github.com/ClangBuiltLinux/linux/issues/430
Signed-off-by: Stefan Agner <stefan@agner.ch>
---
 arch/arm/mach-omap2/sleep34xx.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Nick Desaulniers April 1, 2020, 6:02 p.m. UTC | #1
On Sun, Mar 29, 2020 at 1:33 PM Stefan Agner <stefan@agner.ch> wrote:
>
> The adrl instruction has been introduced with commit dd31394779aa ("ARM:
> omap3: Thumb-2 compatibility for sleep34xx.S"), back when this assembly
> file was considerably longer. Today adr seems to have enough reach, even
> when inserting about 60 instructions between the use site and the label.
> Replace adrl with conventional adr instruction.
>
> This allows to build this file using Clang's integrated assembler (which
> does not support the adrl pseudo instruction).

Context: https://github.com/ClangBuiltLinux/linux/issues/430#issuecomment-476124724
If Peter says it's difficult to implement, I trust him.
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

>
> Link: https://github.com/ClangBuiltLinux/linux/issues/430
> Signed-off-by: Stefan Agner <stefan@agner.ch>
> ---
>  arch/arm/mach-omap2/sleep34xx.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm/mach-omap2/sleep34xx.S b/arch/arm/mach-omap2/sleep34xx.S
> index ac1324c6453b..c4e97d35c310 100644
> --- a/arch/arm/mach-omap2/sleep34xx.S
> +++ b/arch/arm/mach-omap2/sleep34xx.S
> @@ -72,7 +72,7 @@ ENTRY(enable_omap3630_toggle_l2_on_restore)
>         stmfd   sp!, {lr}       @ save registers on stack
>         /* Setup so that we will disable and enable l2 */
>         mov     r1, #0x1
> -       adrl    r3, l2dis_3630_offset   @ may be too distant for plain adr
> +       adr     r3, l2dis_3630_offset
>         ldr     r2, [r3]                @ value for offset
>         str     r1, [r2, r3]            @ write to l2dis_3630
>         ldmfd   sp!, {pc}       @ restore regs and return
> --
> 2.25.1
>
> --
> You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-linux+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/5a6807f19fd69f2de6622c794639cc5d70b9563a.1585513949.git.stefan%40agner.ch.
Ard Biesheuvel April 2, 2020, 9:48 a.m. UTC | #2
On Wed, 1 Apr 2020 at 20:02, Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> On Sun, Mar 29, 2020 at 1:33 PM Stefan Agner <stefan@agner.ch> wrote:
> >
> > The adrl instruction has been introduced with commit dd31394779aa ("ARM:
> > omap3: Thumb-2 compatibility for sleep34xx.S"), back when this assembly
> > file was considerably longer. Today adr seems to have enough reach, even
> > when inserting about 60 instructions between the use site and the label.
> > Replace adrl with conventional adr instruction.
> >
> > This allows to build this file using Clang's integrated assembler (which
> > does not support the adrl pseudo instruction).
>
> Context: https://github.com/ClangBuiltLinux/linux/issues/430#issuecomment-476124724
> If Peter says it's difficult to implement, I trust him.
> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
>

I take it this implies that the LLVM linker does not support the
R_ARM_ALU_PC_Gn relocations? Since otherwise, adrl could simply be
expanded to a pair of adds with the appropriate relocations, letting
the linker fix up the immediates (and the ADD vs SUB bits)


> >
> > Link: https://github.com/ClangBuiltLinux/linux/issues/430
> > Signed-off-by: Stefan Agner <stefan@agner.ch>
> > ---
> >  arch/arm/mach-omap2/sleep34xx.S | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/arm/mach-omap2/sleep34xx.S b/arch/arm/mach-omap2/sleep34xx.S
> > index ac1324c6453b..c4e97d35c310 100644
> > --- a/arch/arm/mach-omap2/sleep34xx.S
> > +++ b/arch/arm/mach-omap2/sleep34xx.S
> > @@ -72,7 +72,7 @@ ENTRY(enable_omap3630_toggle_l2_on_restore)
> >         stmfd   sp!, {lr}       @ save registers on stack
> >         /* Setup so that we will disable and enable l2 */
> >         mov     r1, #0x1
> > -       adrl    r3, l2dis_3630_offset   @ may be too distant for plain adr
> > +       adr     r3, l2dis_3630_offset
> >         ldr     r2, [r3]                @ value for offset
> >         str     r1, [r2, r3]            @ write to l2dis_3630
> >         ldmfd   sp!, {pc}       @ restore regs and return
> > --
> > 2.25.1
> >
> > --
> > You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
> > To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-linux+unsubscribe@googlegroups.com.
> > To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/5a6807f19fd69f2de6622c794639cc5d70b9563a.1585513949.git.stefan%40agner.ch.
>
>
>
> --
> Thanks,
> ~Nick Desaulniers
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Peter Smith April 2, 2020, 11:50 a.m. UTC | #3
> I take it this implies that the LLVM linker does not support the
> R_ARM_ALU_PC_Gn relocations? Since otherwise, adrl could simply be
> expanded to a pair of adds with the appropriate relocations, letting
> the linker fix up the immediates (and the ADD vs SUB bits)

Not at the moment. I have a patch in review to add the G0 variants for these in Arm state at reviews.llvm.org/D75349 . As far as I know LLVM MC does not have support for generating the relocations either. This could be added though. I agree that using the G* relocations with a pair of add/sub instructions would be the ideal solution. The adrl psuedo is essentially that but implemented at assembly time. I think it would be possible to implement in LLVM but at the time (4+ years ago) I wasn't confident in finding someone that would think that adrl support was worth the disruption, for example the current Arm assembly backend can only produce 1 instruction as output and adrl requires two.

I'd be happy to look at group relocation support in LLD, I haven't got a lot of spare time so progress is likely to be slow though.

Peter
Ard Biesheuvel April 2, 2020, 12:05 p.m. UTC | #4
On Thu, 2 Apr 2020 at 13:50, Peter Smith <Peter.Smith@arm.com> wrote:
>
> > I take it this implies that the LLVM linker does not support the
> > R_ARM_ALU_PC_Gn relocations? Since otherwise, adrl could simply be
> > expanded to a pair of adds with the appropriate relocations, letting
> > the linker fix up the immediates (and the ADD vs SUB bits)
>
> Not at the moment. I have a patch in review to add the G0 variants for these in Arm state at reviews.llvm.org/D75349 . As far as I know LLVM MC does not have support for generating the relocations either. This could be added though. I agree that using the G* relocations with a pair of add/sub instructions would be the ideal solution. The adrl psuedo is essentially that but implemented at assembly time. I think it would be possible to implement in LLVM but at the time (4+ years ago) I wasn't confident in finding someone that would think that adrl support was worth the disruption, for example the current Arm assembly backend can only produce 1 instruction as output and adrl requires two.
>
> I'd be happy to look at group relocation support in LLD, I haven't got a lot of spare time so progress is likely to be slow though.
>

For Linux, I have proposed another approach in the past, which is to
define a (Linux-local) adr_l macro with unlimited range [0], which
basically comes down to place relative movw/movt pairs for v7+, and
something along the lines of

        ldr <reg>, 222f
111:    add <reg>, <reg>, pc
        .subsection 1
222:    .long <sym> - (111b + 8)
        .previous

for v6 and earlier. Could you comment on whether Clang's integrated
assembler could support anything like this?


Thanks,
Ard.



[0] https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/commit/?h=arm-kaslr-latest&id=fd440f1131553a5201ce3b94905419bd067b93b3
Stefan Agner April 2, 2020, 2:34 p.m. UTC | #5
On 2020-04-02 14:05, Ard Biesheuvel wrote:
> On Thu, 2 Apr 2020 at 13:50, Peter Smith <Peter.Smith@arm.com> wrote:
>>
>> > I take it this implies that the LLVM linker does not support the
>> > R_ARM_ALU_PC_Gn relocations? Since otherwise, adrl could simply be
>> > expanded to a pair of adds with the appropriate relocations, letting
>> > the linker fix up the immediates (and the ADD vs SUB bits)
>>
>> Not at the moment. I have a patch in review to add the G0 variants for these in Arm state at reviews.llvm.org/D75349 . As far as I know LLVM MC does not have support for generating the relocations either. This could be added though. I agree that using the G* relocations with a pair of add/sub instructions would be the ideal solution. The adrl psuedo is essentially that but implemented at assembly time. I think it would be possible to implement in LLVM but at the time (4+ years ago) I wasn't confident in finding someone that would think that adrl support was worth the disruption, for example the current Arm assembly backend can only produce 1 instruction as output and adrl requires two.
>>
>> I'd be happy to look at group relocation support in LLD, I haven't got a lot of spare time so progress is likely to be slow though.
>>
> 
> For Linux, I have proposed another approach in the past, which is to
> define a (Linux-local) adr_l macro with unlimited range [0], which
> basically comes down to place relative movw/movt pairs for v7+, and
> something along the lines of
> 
>         ldr <reg>, 222f
> 111:    add <reg>, <reg>, pc
>         .subsection 1
> 222:    .long <sym> - (111b + 8)
>         .previous

Just to confirm: The instance at hand today seems to be working fine
without adrl, so I guess we are fine here, do you agree?

There are a couple more instances of adrl in arch/arm/crypto/, maybe
that is where the adr_l macro could come in.

--
Stefan

> 
> for v6 and earlier. Could you comment on whether Clang's integrated
> assembler could support anything like this?
> 
> 
> Thanks,
> Ard.
> 
> 
> 
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/commit/?h=arm-kaslr-latest&id=fd440f1131553a5201ce3b94905419bd067b93b3
Ard Biesheuvel April 2, 2020, 2:36 p.m. UTC | #6
On Thu, 2 Apr 2020 at 16:34, Stefan Agner <stefan@agner.ch> wrote:
>
> On 2020-04-02 14:05, Ard Biesheuvel wrote:
> > On Thu, 2 Apr 2020 at 13:50, Peter Smith <Peter.Smith@arm.com> wrote:
> >>
> >> > I take it this implies that the LLVM linker does not support the
> >> > R_ARM_ALU_PC_Gn relocations? Since otherwise, adrl could simply be
> >> > expanded to a pair of adds with the appropriate relocations, letting
> >> > the linker fix up the immediates (and the ADD vs SUB bits)
> >>
> >> Not at the moment. I have a patch in review to add the G0 variants for these in Arm state at reviews.llvm.org/D75349 . As far as I know LLVM MC does not have support for generating the relocations either. This could be added though. I agree that using the G* relocations with a pair of add/sub instructions would be the ideal solution. The adrl psuedo is essentially that but implemented at assembly time. I think it would be possible to implement in LLVM but at the time (4+ years ago) I wasn't confident in finding someone that would think that adrl support was worth the disruption, for example the current Arm assembly backend can only produce 1 instruction as output and adrl requires two.
> >>
> >> I'd be happy to look at group relocation support in LLD, I haven't got a lot of spare time so progress is likely to be slow though.
> >>
> >
> > For Linux, I have proposed another approach in the past, which is to
> > define a (Linux-local) adr_l macro with unlimited range [0], which
> > basically comes down to place relative movw/movt pairs for v7+, and
> > something along the lines of
> >
> >         ldr <reg>, 222f
> > 111:    add <reg>, <reg>, pc
> >         .subsection 1
> > 222:    .long <sym> - (111b + 8)
> >         .previous
>
> Just to confirm: The instance at hand today seems to be working fine
> without adrl, so I guess we are fine here, do you agree?
>

I agree. Apologies for hijacking the thread :-)

> There are a couple more instances of adrl in arch/arm/crypto/, maybe
> that is where the adr_l macro could come in.
>

There are various places in the arch code that could be cleaned up
along these lines.

But you're right - this is a separate discussion that deserves a
thread of its own. I was just satisfying my own curiosity.
Peter Smith April 2, 2020, 5:50 p.m. UTC | #7
> On Thu, 2 Apr 2020 at 13:50, Peter Smith <Peter.Smith@arm.com> wrote:
> >
> > > I take it this implies that the LLVM linker does not support the
> > > R_ARM_ALU_PC_Gn relocations? Since otherwise, adrl could simply be
> > > expanded to a pair of adds with the appropriate relocations, letting
> > > the linker fix up the immediates (and the ADD vs SUB bits)
> >
> > Not at the moment. I have a patch in review to add the G0 variants for these in Arm state at reviews.llvm.org/D75349 . As far as I know LLVM MC does not have support for generating the relocations either. This could be added though. I agree that using the G* relocations with a pair of add/sub instructions would be the ideal solution. The adrl psuedo is essentially that but implemented at assembly time. I think it would be possible to implement in LLVM but at the time (4+ years ago) I wasn't confident in finding someone that would think that adrl support was worth the disruption, for example the current Arm assembly backend can only produce 1 instruction as output and adrl requires two.
> >
> > I'd be happy to look at group relocation support in LLD, I haven't got a lot of spare time so progress is likely to be slow though.
> >

> For Linux, I have proposed another approach in the past, which is to
> define a (Linux-local) adr_l macro with unlimited range [0], which
> basically comes down to place relative movw/movt pairs for v7+, and
> something along the lines of

>         ldr <reg>, 222f
> 111:    add <reg>, <reg>, pc
>         .subsection 1
> 222:    .long <sym> - (111b + 8)
>         .previous
>
> for v6 and earlier. Could you comment on whether Clang's integrated
> assembler could support anything like this?

Apologies for the delay in responding.

That looks like it should work. Empirically the following works in both Clang and GNU as. One potential problem here is that if the section is large and the subsections are dumped at the end the ldr is at risk of going out of range.

 .arm
 .macro mylongadrl reg, sym
 ldr \reg, 222f
111:    add \reg, \reg, pc
        .subsection 1
222:    .long \sym - (111b + 8)
 .previous
 .endm       

 .text
foo:     bx lr
bar:     bx lr
 mylongadrl r0 foo
 mylongadrl r0 bar

> Thanks,
> Ard.
>
>
>
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/commit/?h=arm-kaslr-latest&id=fd440f1131553a5201ce3b94905419bd067b93b3
Tony Lindgren April 17, 2020, 3:23 p.m. UTC | #8
* Ard Biesheuvel <ardb@kernel.org> [200402 14:37]:
> On Thu, 2 Apr 2020 at 16:34, Stefan Agner <stefan@agner.ch> wrote:
> > Just to confirm: The instance at hand today seems to be working fine
> > without adrl, so I guess we are fine here, do you agree?
> >
> 
> I agree. Apologies for hijacking the thread :-)

Yes this seems to work just fine based on a quick test, will
be applying for v5.8.

Thanks,

Tony
diff mbox series

Patch

diff --git a/arch/arm/mach-omap2/sleep34xx.S b/arch/arm/mach-omap2/sleep34xx.S
index ac1324c6453b..c4e97d35c310 100644
--- a/arch/arm/mach-omap2/sleep34xx.S
+++ b/arch/arm/mach-omap2/sleep34xx.S
@@ -72,7 +72,7 @@  ENTRY(enable_omap3630_toggle_l2_on_restore)
 	stmfd	sp!, {lr}	@ save registers on stack
 	/* Setup so that we will disable and enable l2 */
 	mov	r1, #0x1
-	adrl	r3, l2dis_3630_offset	@ may be too distant for plain adr
+	adr	r3, l2dis_3630_offset
 	ldr	r2, [r3]		@ value for offset
 	str	r1, [r2, r3]		@ write to l2dis_3630
 	ldmfd	sp!, {pc}	@ restore regs and return