diff mbox

v7-M: Fixing XIP when the kernel is in ROM

Message ID 20151026012710.GA17365@laptop (mailing list archive)
State New, archived
Headers show

Commit Message

Ezequiel Garcia Oct. 26, 2015, 1:27 a.m. UTC
I've been trying to make my ARM v7-M LPC43xx board
boot a XIP kernel from flash. Currently, this seems
to be broken in mainline due to this:

arch/arm/mm/proc-v7m.S
[..]
        @ SVC to run the kernel in this mode
        badr    r1, 1f
        ldr     r5, [r12, #11 * 4]      @ read the SVC vector entry
        str     r1, [r12, #11 * 4]      @ write the temporary SVC vector entry
        mov     r6, lr                  @ save LR
        mov     r7, sp                  @ save SP
	ldr     sp, =__v7m_setup_stack_top @ <<< Breaks XIP!
        cpsie   i
        svc     #0
1:      cpsid   i
        str     r5, [r12, #11 * 4]      @ restore the original SVC vector entry
        mov     lr, r6                  @ restore LR
        mov     sp, r7                  @ restore SP

Here, a temporary stack is prepared before making a
supervisor call (SVC) to switch to handler mode.

The temporary stack is allocated in the .text.init section
and so this doesn't work when the kernel is executing from ROM.

A similar problem has been reported for v7:

http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/357106.html

While trying to come up with a proper fix, I've noticed how
the stack doesn't seem to be used.

So, I've been trying to understand why the need for the temporary
stack at all, but I still can't get it. 

The below patch seems to work just fine, and allows to boot a
LPC43xx kernel either as XIP from ROM or non-XIP from RAM.

However, I'm still wondering if the stack is really unused or not,
so any lights that can be shed here will be appreciated.

Thanks!

From a7c880c73b8ad2e4c4b07f4d11809ea541a65e1d Mon Sep 17 00:00:00 2001
From: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Date: Sat, 24 Oct 2015 13:27:27 -0300
Subject: [PATCH] ARM: Don't prepare any temporary stack in __v7m_setup

Since __v7m_setup() is implemented as the PROCINFO_INITFUNC
called in head-nommu.S it's called at the very beggining to
do some very basic setup.

The function prepares a temporary stack in the .text.init
section before calling SVC. However, this stack seems to
be completely unused and hence is not needed.

Moreover, this breaks on XIP kernels, when the text is in ROM.
Hence, this commit simply removes the temporary stack setup.

Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
---
 arch/arm/mm/proc-v7m.S | 6 ------
 1 file changed, 6 deletions(-)

Comments

Uwe Kleine-König Oct. 26, 2015, 8:05 a.m. UTC | #1
Hello,

On Sun, Oct 25, 2015 at 10:27:10PM -0300, Ezequiel Garcia wrote:
> I've been trying to make my ARM v7-M LPC43xx board
> boot a XIP kernel from flash. Currently, this seems

I admit I didn't update my efm32 machine for quite some time, but this
can only boot with XIP.

> to be broken in mainline due to this:
> 
> arch/arm/mm/proc-v7m.S
> [..]
>         @ SVC to run the kernel in this mode
>         badr    r1, 1f
>         ldr     r5, [r12, #11 * 4]      @ read the SVC vector entry
>         str     r1, [r12, #11 * 4]      @ write the temporary SVC vector entry
>         mov     r6, lr                  @ save LR
>         mov     r7, sp                  @ save SP
> 	ldr     sp, =__v7m_setup_stack_top @ <<< Breaks XIP!
If you make the tab above 8 spaces the line will align in the git
commit, too.
Hmm, this line is there from the beginning (i.e. 55bdd6941165 ("ARM: Add
base support for ARMv7-M")).

>         cpsie   i
>         svc     #0
> 1:      cpsid   i
>         str     r5, [r12, #11 * 4]      @ restore the original SVC vector entry
>         mov     lr, r6                  @ restore LR
>         mov     sp, r7                  @ restore SP
> 
> Here, a temporary stack is prepared before making a
> supervisor call (SVC) to switch to handler mode.
> 
> The temporary stack is allocated in the .text.init section
> and so this doesn't work when the kernel is executing from ROM.

If sp isn't used, how does it break you setup? Did you try to put a
watchpoint to the location in question?

Best regards
Uwe
Ezequiel Garcia Oct. 26, 2015, 1:12 p.m. UTC | #2
On 26 October 2015 at 05:05, Uwe Kleine-König
<u.kleine-koenig@pengutronix.de> wrote:
> Hello,
>
> On Sun, Oct 25, 2015 at 10:27:10PM -0300, Ezequiel Garcia wrote:
>> I've been trying to make my ARM v7-M LPC43xx board
>> boot a XIP kernel from flash. Currently, this seems
>
> I admit I didn't update my efm32 machine for quite some time, but this
> can only boot with XIP.
>

Executing directly from read-only memory? Hmm, that's odd.

>> to be broken in mainline due to this:
>>
>> arch/arm/mm/proc-v7m.S
>> [..]
>>         @ SVC to run the kernel in this mode
>>         badr    r1, 1f
>>         ldr     r5, [r12, #11 * 4]      @ read the SVC vector entry
>>         str     r1, [r12, #11 * 4]      @ write the temporary SVC vector entry
>>         mov     r6, lr                  @ save LR
>>         mov     r7, sp                  @ save SP
>>       ldr     sp, =__v7m_setup_stack_top @ <<< Breaks XIP!
> If you make the tab above 8 spaces the line will align in the git
> commit, too.
> Hmm, this line is there from the beginning (i.e. 55bdd6941165 ("ARM: Add
> base support for ARMv7-M")).
>

Yes, I know. And was there in Catalin's first patches, hence why I'm asking :-)

>>         cpsie   i
>>         svc     #0
>> 1:      cpsid   i
>>         str     r5, [r12, #11 * 4]      @ restore the original SVC vector entry
>>         mov     lr, r6                  @ restore LR
>>         mov     sp, r7                  @ restore SP
>>
>> Here, a temporary stack is prepared before making a
>> supervisor call (SVC) to switch to handler mode.
>>
>> The temporary stack is allocated in the .text.init section
>> and so this doesn't work when the kernel is executing from ROM.
>
> If sp isn't used, how does it break you setup?

Well, the supervisor call uses the stack, but not the Linux code.
From the Application Note 179:

""
2.7. Supervisor Calls (SVC)
[..]
On the Cortex-M3, the core saves the argument registers to the stack
on the initial exception entry.
A late-arriving exception, taken before the first instruction of the
SVC handler executes,
might corrupt the copy of the arguments still held in R0 to R3. This
means that the stack copy
of the arguments must be used by the SVC handler
""

> Did you try to put a watchpoint to the location in question?
>

Sort of. I used a low level assembly call to printuart and followed
the execution
to that instruction. The CPU never seems to get pass the supervisor call.
Ezequiel Garcia Oct. 27, 2015, 3:35 p.m. UTC | #3
(Let's Cc the maintainers for the rest of the cortex-M platforms.
Hopefully they can help review/test the proposed patch.)

On 26 October 2015 at 10:12, Ezequiel Garcia
<ezequiel@vanguardiasur.com.ar> wrote:
> On 26 October 2015 at 05:05, Uwe Kleine-König
> <u.kleine-koenig@pengutronix.de> wrote:
>> Hello,
>>
>> On Sun, Oct 25, 2015 at 10:27:10PM -0300, Ezequiel Garcia wrote:
>>> I've been trying to make my ARM v7-M LPC43xx board
>>> boot a XIP kernel from flash. Currently, this seems
>>
>> I admit I didn't update my efm32 machine for quite some time, but this
>> can only boot with XIP.
>>
>
> Executing directly from read-only memory? Hmm, that's odd.
>
>>> to be broken in mainline due to this:
>>>
>>> arch/arm/mm/proc-v7m.S
>>> [..]
>>>         @ SVC to run the kernel in this mode
>>>         badr    r1, 1f
>>>         ldr     r5, [r12, #11 * 4]      @ read the SVC vector entry
>>>         str     r1, [r12, #11 * 4]      @ write the temporary SVC vector entry
>>>         mov     r6, lr                  @ save LR
>>>         mov     r7, sp                  @ save SP
>>>       ldr     sp, =__v7m_setup_stack_top @ <<< Breaks XIP!
>> If you make the tab above 8 spaces the line will align in the git
>> commit, too.
>> Hmm, this line is there from the beginning (i.e. 55bdd6941165 ("ARM: Add
>> base support for ARMv7-M")).
>>
>
> Yes, I know. And was there in Catalin's first patches, hence why I'm asking :-)
>
>>>         cpsie   i
>>>         svc     #0
>>> 1:      cpsid   i
>>>         str     r5, [r12, #11 * 4]      @ restore the original SVC vector entry
>>>         mov     lr, r6                  @ restore LR
>>>         mov     sp, r7                  @ restore SP
>>>
>>> Here, a temporary stack is prepared before making a
>>> supervisor call (SVC) to switch to handler mode.
>>>
>>> The temporary stack is allocated in the .text.init section
>>> and so this doesn't work when the kernel is executing from ROM.
>>
>> If sp isn't used, how does it break you setup?
>
> Well, the supervisor call uses the stack, but not the Linux code.
> From the Application Note 179:
>
> ""
> 2.7. Supervisor Calls (SVC)
> [..]
> On the Cortex-M3, the core saves the argument registers to the stack
> on the initial exception entry.
> A late-arriving exception, taken before the first instruction of the
> SVC handler executes,
> might corrupt the copy of the arguments still held in R0 to R3. This
> means that the stack copy
> of the arguments must be used by the SVC handler
> ""
>
>> Did you try to put a watchpoint to the location in question?
>>
>
> Sort of. I used a low level assembly call to printuart and followed
> the execution
> to that instruction. The CPU never seems to get pass the supervisor call.
> --
> Ezequiel García, VanguardiaSur
> www.vanguardiasur.com.ar
Maxime Coquelin Oct. 27, 2015, 4:03 p.m. UTC | #4
Hi Ezequiel,

On 10/27/2015 04:35 PM, Ezequiel Garcia wrote:
>> >>>The temporary stack is allocated in the .text.init section
>> >>>and so this doesn't work when the kernel is executing from ROM.
> >>
> >>If sp isn't used, how does it break you setup?
STM32 machine works fine with XIP from internal flash memory, so as Uwe, 
I'm surprised it solves your problem.

I will have a try with your patch later today.

Regards,
Maxime
Uwe Kleine-König Oct. 27, 2015, 8:21 p.m. UTC | #5
Hello Ezequiel,

On Sun, Oct 25, 2015 at 10:27:10PM -0300, Ezequiel Garcia wrote:
> I've been trying to make my ARM v7-M LPC43xx board
> boot a XIP kernel from flash. Currently, this seems
> to be broken in mainline due to this:
> 
> arch/arm/mm/proc-v7m.S
> [..]
>         @ SVC to run the kernel in this mode
>         badr    r1, 1f
>         ldr     r5, [r12, #11 * 4]      @ read the SVC vector entry
>         str     r1, [r12, #11 * 4]      @ write the temporary SVC vector entry
>         mov     r6, lr                  @ save LR
>         mov     r7, sp                  @ save SP
> 	ldr     sp, =__v7m_setup_stack_top @ <<< Breaks XIP!

How does this fail for you?

>         cpsie   i
>         svc     #0
> 1:      cpsid   i
>         str     r5, [r12, #11 * 4]      @ restore the original SVC vector entry
>         mov     lr, r6                  @ restore LR
>         mov     sp, r7                  @ restore SP
> 
> Here, a temporary stack is prepared before making a
> supervisor call (SVC) to switch to handler mode.

OK, the effect of svc is that something is written to where sp points
to. On my efm32 nothing obvious happens when something random is written
there. I guess if that results in some CFI commands I have a problem
though. What about your machine?

> The temporary stack is allocated in the .text.init section
> and so this doesn't work when the kernel is executing from ROM.
> 
> A similar problem has been reported for v7:
> 
> http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/357106.html
> 
> While trying to come up with a proper fix, I've noticed how
> the stack doesn't seem to be used.
> 
> So, I've been trying to understand why the need for the temporary
> stack at all, but I still can't get it. 
> 
> The below patch seems to work just fine, and allows to boot a
> LPC43xx kernel either as XIP from ROM or non-XIP from RAM.
> 
> However, I'm still wondering if the stack is really unused or not,
> so any lights that can be shed here will be appreciated.
> 
> Thanks!
> 
> From a7c880c73b8ad2e4c4b07f4d11809ea541a65e1d Mon Sep 17 00:00:00 2001
> From: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
> Date: Sat, 24 Oct 2015 13:27:27 -0300
> Subject: [PATCH] ARM: Don't prepare any temporary stack in __v7m_setup
> 
> Since __v7m_setup() is implemented as the PROCINFO_INITFUNC
> called in head-nommu.S it's called at the very beggining to
> do some very basic setup.
> 
> The function prepares a temporary stack in the .text.init
> section before calling SVC. However, this stack seems to
> be completely unused and hence is not needed.
> 
> Moreover, this breaks on XIP kernels, when the text is in ROM.
> Hence, this commit simply removes the temporary stack setup.
> 
> Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
> ---
>  arch/arm/mm/proc-v7m.S | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/arch/arm/mm/proc-v7m.S b/arch/arm/mm/proc-v7m.S
> index 67d9209077c6..6a383e619a0c 100644
> --- a/arch/arm/mm/proc-v7m.S
> +++ b/arch/arm/mm/proc-v7m.S
> @@ -103,7 +103,6 @@ __v7m_setup:
>  	str	r1, [r12, #11 * 4]	@ write the temporary SVC vector entry
>  	mov	r6, lr			@ save LR
>  	mov	r7, sp			@ save SP
> -	ldr	sp, =__v7m_setup_stack_top
>  	cpsie	i
>  	svc	#0
>  1:	cpsid	i
> @@ -123,11 +122,6 @@ __v7m_setup:
>  	ret	lr
>  ENDPROC(__v7m_setup)
>  
> -	.align 2
> -__v7m_setup_stack:
> -	.space	4 * 8				@ 8 registers
> -__v7m_setup_stack_top:
> -

The effect of your patch is that the value of sp as it is when
__v7m_setup is entered is used. I didn't check, but I wouldn't be
surprised if that's the value of sp when the boot loader gave control to
Linux. This might or might not work. Something more robust would be
better of course.

Best regards
Uwe
Maxime Coquelin Oct. 27, 2015, 8:25 p.m. UTC | #6
2015-10-27 17:03 GMT+01:00 Maxime Coquelin <maxime.coquelin@st.com>:
> Hi Ezequiel,
>
> On 10/27/2015 04:35 PM, Ezequiel Garcia wrote:
>>>
>>> >>>The temporary stack is allocated in the .text.init section
>>> >>>and so this doesn't work when the kernel is executing from ROM.
>>
>> >>
>> >>If sp isn't used, how does it break you setup?
>
> STM32 machine works fine with XIP from internal flash memory, so as Uwe, I'm
> surprised it solves your problem.
>
> I will have a try with your patch later today.

Just tested, and confirm it works with (and without) your patch in XIP on STM32.

Regards,
Maxime
Stefan Agner Oct. 27, 2015, 8:33 p.m. UTC | #7
On 2015-10-27 13:25, Maxime Coquelin wrote:
> 2015-10-27 17:03 GMT+01:00 Maxime Coquelin <maxime.coquelin@st.com>:
>> Hi Ezequiel,
>>
>> On 10/27/2015 04:35 PM, Ezequiel Garcia wrote:
>>>>
>>>> >>>The temporary stack is allocated in the .text.init section
>>>> >>>and so this doesn't work when the kernel is executing from ROM.
>>>
>>> >>
>>> >>If sp isn't used, how does it break you setup?
>>
>> STM32 machine works fine with XIP from internal flash memory, so as Uwe, I'm
>> surprised it solves your problem.
>>
>> I will have a try with your patch later today.
> 
> Just tested, and confirm it works with (and without) your patch in XIP on STM32.

It probably depends what exactly happens when the CPU core tries to use
the stack. I mean, the CPU core will issue a write to the flash, and how
that behaves is probably implementation specific. However, I would
expect that it should lead to some kind of fault... Do we have fault
handlers at that time?

Anyway, I guess Ezequiel's fix is the right thing to do...

--
Stefan
Ezequiel Garcia Oct. 27, 2015, 8:57 p.m. UTC | #8
On 27 October 2015 at 17:21, Uwe Kleine-König
<u.kleine-koenig@pengutronix.de> wrote:
> Hello Ezequiel,
>
> On Sun, Oct 25, 2015 at 10:27:10PM -0300, Ezequiel Garcia wrote:
>> I've been trying to make my ARM v7-M LPC43xx board
>> boot a XIP kernel from flash. Currently, this seems
>> to be broken in mainline due to this:
>>
>> arch/arm/mm/proc-v7m.S
>> [..]
>>         @ SVC to run the kernel in this mode
>>         badr    r1, 1f
>>         ldr     r5, [r12, #11 * 4]      @ read the SVC vector entry
>>         str     r1, [r12, #11 * 4]      @ write the temporary SVC vector entry
>>         mov     r6, lr                  @ save LR
>>         mov     r7, sp                  @ save SP
>>       ldr     sp, =__v7m_setup_stack_top @ <<< Breaks XIP!
>
> How does this fail for you?
>

My CPU just seems to stall.
I've added calls to the printch in arch/arm/kernel/debug.S
and can't get past the SVC call.

As Stefan suggested, maybe a fault handler is called,
but I'm not getting any fault message printed.

>>         cpsie   i
>>         svc     #0
>> 1:      cpsid   i
>>         str     r5, [r12, #11 * 4]      @ restore the original SVC vector entry
>>         mov     lr, r6                  @ restore LR
>>         mov     sp, r7                  @ restore SP
>>
>> Here, a temporary stack is prepared before making a
>> supervisor call (SVC) to switch to handler mode.
>
> OK, the effect of svc is that something is written to where sp points
> to. On my efm32 nothing obvious happens when something random is written
> there. I guess if that results in some CFI commands I have a problem
> though. What about your machine?
>

Good, I was under the same impression. My only explanation for the fact
that Maxime and you seem to have no problem was that a store
to the internal flash doesn't stall the MCU.

On my side, I have U-Boot and the DTB on the internal flash,
and the kernel is on SPIFI, which can be configured to allow
code execution.

Haven't checked what the effect is if you try to write there.
In any case, it seems wrong to have the stack point to flash.

>> The temporary stack is allocated in the .text.init section
>> and so this doesn't work when the kernel is executing from ROM.
>>
>> A similar problem has been reported for v7:
>>
>> http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/357106.html
>>
>> While trying to come up with a proper fix, I've noticed how
>> the stack doesn't seem to be used.
>>
>> So, I've been trying to understand why the need for the temporary
>> stack at all, but I still can't get it.
>>
>> The below patch seems to work just fine, and allows to boot a
>> LPC43xx kernel either as XIP from ROM or non-XIP from RAM.
>>
>> However, I'm still wondering if the stack is really unused or not,
>> so any lights that can be shed here will be appreciated.
>>
>> Thanks!
>>
>> From a7c880c73b8ad2e4c4b07f4d11809ea541a65e1d Mon Sep 17 00:00:00 2001
>> From: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
>> Date: Sat, 24 Oct 2015 13:27:27 -0300
>> Subject: [PATCH] ARM: Don't prepare any temporary stack in __v7m_setup
>>
>> Since __v7m_setup() is implemented as the PROCINFO_INITFUNC
>> called in head-nommu.S it's called at the very beggining to
>> do some very basic setup.
>>
>> The function prepares a temporary stack in the .text.init
>> section before calling SVC. However, this stack seems to
>> be completely unused and hence is not needed.
>>
>> Moreover, this breaks on XIP kernels, when the text is in ROM.
>> Hence, this commit simply removes the temporary stack setup.
>>
>> Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
>> ---
>>  arch/arm/mm/proc-v7m.S | 6 ------
>>  1 file changed, 6 deletions(-)
>>
>> diff --git a/arch/arm/mm/proc-v7m.S b/arch/arm/mm/proc-v7m.S
>> index 67d9209077c6..6a383e619a0c 100644
>> --- a/arch/arm/mm/proc-v7m.S
>> +++ b/arch/arm/mm/proc-v7m.S
>> @@ -103,7 +103,6 @@ __v7m_setup:
>>       str     r1, [r12, #11 * 4]      @ write the temporary SVC vector entry
>>       mov     r6, lr                  @ save LR
>>       mov     r7, sp                  @ save SP
>> -     ldr     sp, =__v7m_setup_stack_top
>>       cpsie   i
>>       svc     #0
>>  1:   cpsid   i
>> @@ -123,11 +122,6 @@ __v7m_setup:
>>       ret     lr
>>  ENDPROC(__v7m_setup)
>>
>> -     .align 2
>> -__v7m_setup_stack:
>> -     .space  4 * 8                           @ 8 registers
>> -__v7m_setup_stack_top:
>> -
>
> The effect of your patch is that the value of sp as it is when
> __v7m_setup is entered is used. I didn't check, but I wouldn't be
> surprised if that's the value of sp when the boot loader gave control to
> Linux. This might or might not work. Something more robust would be
> better of course.
>

Well, this is where I'm a bit lost, as I was unable to find where
the stack is setup in Linux.

But on the other side, the current code, just sets a *temporary*
stack to fit the 8 registers needed for the SVC call. The stack
pointer is restored right after that.

So in reality, the effect of the patch above is to require some
additional space for the already in-use stack.

I can't see why it wouldn't be robust.
Uwe Kleine-König Oct. 27, 2015, 9:20 p.m. UTC | #9
Hello Ezequiel,

On Tue, Oct 27, 2015 at 05:57:46PM -0300, Ezequiel Garcia wrote:
> On 27 October 2015 at 17:21, Uwe Kleine-König
> <u.kleine-koenig@pengutronix.de> wrote:
> > Hello Ezequiel,
> >
> > On Sun, Oct 25, 2015 at 10:27:10PM -0300, Ezequiel Garcia wrote:
> >> I've been trying to make my ARM v7-M LPC43xx board
> >> boot a XIP kernel from flash. Currently, this seems
> >> to be broken in mainline due to this:
> >>
> >> arch/arm/mm/proc-v7m.S
> >> [..]
> >>         @ SVC to run the kernel in this mode
> >>         badr    r1, 1f
> >>         ldr     r5, [r12, #11 * 4]      @ read the SVC vector entry
> >>         str     r1, [r12, #11 * 4]      @ write the temporary SVC vector entry
> >>         mov     r6, lr                  @ save LR
> >>         mov     r7, sp                  @ save SP
> >>       ldr     sp, =__v7m_setup_stack_top @ <<< Breaks XIP!
> >
> > How does this fail for you?
> >
> 
> My CPU just seems to stall.
> I've added calls to the printch in arch/arm/kernel/debug.S
> and can't get past the SVC call.
Can you try to add something like:

	stmdb	sp!, {r0-r4}

after setting sp as above and check if that makes the machine hang, too?

> As Stefan suggested, maybe a fault handler is called,
> but I'm not getting any fault message printed.

Hmm, the fault handler should be up, didn't verify though.

> >>         cpsie   i
> >>         svc     #0
> >> 1:      cpsid   i
> >>         str     r5, [r12, #11 * 4]      @ restore the original SVC vector entry
> >>         mov     lr, r6                  @ restore LR
> >>         mov     sp, r7                  @ restore SP
> >>
> >> Here, a temporary stack is prepared before making a
> >> supervisor call (SVC) to switch to handler mode.
> >
> > OK, the effect of svc is that something is written to where sp points
> > to. On my efm32 nothing obvious happens when something random is written
> > there. I guess if that results in some CFI commands I have a problem
> > though. What about your machine?
> >
> 
> Good, I was under the same impression. My only explanation for the fact
> that Maxime and you seem to have no problem was that a store
> to the internal flash doesn't stall the MCU.
> 
> On my side, I have U-Boot and the DTB on the internal flash,
> and the kernel is on SPIFI, which can be configured to allow
> code execution.
> 
> Haven't checked what the effect is if you try to write there.
> In any case, it seems wrong to have the stack point to flash.

That's right, so we need some writeable memory there. Where should we
take that from?

> >> The temporary stack is allocated in the .text.init section
> >> and so this doesn't work when the kernel is executing from ROM.
> >> diff --git a/arch/arm/mm/proc-v7m.S b/arch/arm/mm/proc-v7m.S
> >> index 67d9209077c6..6a383e619a0c 100644
> >> --- a/arch/arm/mm/proc-v7m.S
> >> +++ b/arch/arm/mm/proc-v7m.S
> >> @@ -103,7 +103,6 @@ __v7m_setup:
> >>       str     r1, [r12, #11 * 4]      @ write the temporary SVC vector entry
> >>       mov     r6, lr                  @ save LR
> >>       mov     r7, sp                  @ save SP
> >> -     ldr     sp, =__v7m_setup_stack_top
> >>       cpsie   i
> >>       svc     #0
> >>  1:   cpsid   i
> >> @@ -123,11 +122,6 @@ __v7m_setup:
> >>       ret     lr
> >>  ENDPROC(__v7m_setup)
> >>
> >> -     .align 2
> >> -__v7m_setup_stack:
> >> -     .space  4 * 8                           @ 8 registers
> >> -__v7m_setup_stack_top:
> >> -
> >
> > The effect of your patch is that the value of sp as it is when
> > __v7m_setup is entered is used. I didn't check, but I wouldn't be
> > surprised if that's the value of sp when the boot loader gave control to
> > Linux. This might or might not work. Something more robust would be
> > better of course.
> >
> 
> Well, this is where I'm a bit lost, as I was unable to find where
> the stack is setup in Linux.

Hmm, that's long time ago that I worked on that code. Didn't find it but
only looked a few minutes. Looking at arch/arm/kernel/head-nommu.S it
seems __lookup_processor_type is called without explicitly initializing
sp first. Hmm.

> But on the other side, the current code, just sets a *temporary*
> stack to fit the 8 registers needed for the SVC call. The stack
> pointer is restored right after that.
> 
> So in reality, the effect of the patch above is to require some
> additional space for the already in-use stack.

With your patch and on top of arch/arm/kernel/head-nommu.S adding

	ldr	sp, =(some_address_in_ROM)

crashes your kernel, right? This means we either have to find a save
location to put the stack to, or need to document that the sp register
needs to be setup to have some stack available when the kernel image is
jumped into. I don't like the latter.

Best regards
Uwe
Maxime Coquelin Oct. 27, 2015, 9:33 p.m. UTC | #10
2015-10-27 21:33 GMT+01:00 Stefan Agner <stefan@agner.ch>:
> On 2015-10-27 13:25, Maxime Coquelin wrote:
>> 2015-10-27 17:03 GMT+01:00 Maxime Coquelin <maxime.coquelin@st.com>:
>>> Hi Ezequiel,
>>>
>>> On 10/27/2015 04:35 PM, Ezequiel Garcia wrote:
>>>>>
>>>>> >>>The temporary stack is allocated in the .text.init section
>>>>> >>>and so this doesn't work when the kernel is executing from ROM.
>>>>
>>>> >>
>>>> >>If sp isn't used, how does it break you setup?
>>>
>>> STM32 machine works fine with XIP from internal flash memory, so as Uwe, I'm
>>> surprised it solves your problem.
>>>
>>> I will have a try with your patch later today.
>>
>> Just tested, and confirm it works with (and without) your patch in XIP on STM32.
>
> It probably depends what exactly happens when the CPU core tries to use
> the stack. I mean, the CPU core will issue a write to the flash, and how
> that behaves is probably implementation specific. However, I would
> expect that it should lead to some kind of fault... Do we have fault
> handlers at that time?

Yes we have fault handlers at that time.
I agree the behaviour is probably implementation specific.

>
> Anyway, I guess Ezequiel's fix is the right thing to do...
Or, maybe we should put the stack in RAM as Uwe suggest?

Regards,
Maxime
Ezequiel Garcia Oct. 27, 2015, 9:46 p.m. UTC | #11
On 27 October 2015 at 18:33, Maxime Coquelin <mcoquelin.stm32@gmail.com> wrote:
> 2015-10-27 21:33 GMT+01:00 Stefan Agner <stefan@agner.ch>:
>> On 2015-10-27 13:25, Maxime Coquelin wrote:
>>> 2015-10-27 17:03 GMT+01:00 Maxime Coquelin <maxime.coquelin@st.com>:
>>>> Hi Ezequiel,
>>>>
>>>> On 10/27/2015 04:35 PM, Ezequiel Garcia wrote:
>>>>>>
>>>>>> >>>The temporary stack is allocated in the .text.init section
>>>>>> >>>and so this doesn't work when the kernel is executing from ROM.
>>>>>
>>>>> >>
>>>>> >>If sp isn't used, how does it break you setup?
>>>>
>>>> STM32 machine works fine with XIP from internal flash memory, so as Uwe, I'm
>>>> surprised it solves your problem.
>>>>
>>>> I will have a try with your patch later today.
>>>
>>> Just tested, and confirm it works with (and without) your patch in XIP on STM32.
>>
>> It probably depends what exactly happens when the CPU core tries to use
>> the stack. I mean, the CPU core will issue a write to the flash, and how
>> that behaves is probably implementation specific. However, I would
>> expect that it should lead to some kind of fault... Do we have fault
>> handlers at that time?
>
> Yes we have fault handlers at that time.
> I agree the behaviour is probably implementation specific.
>
>>
>> Anyway, I guess Ezequiel's fix is the right thing to do...
> Or, maybe we should put the stack in RAM as Uwe suggest?
>

It seems the current fix would be used as-is (no need to
set a temporary stack before the SVC call).

And on top of that, we need a to set a proper stack as soon
as the kernel takes control.

Am I right?
Maxime Coquelin Oct. 27, 2015, 9:52 p.m. UTC | #12
2015-10-27 22:46 GMT+01:00 Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>:
> On 27 October 2015 at 18:33, Maxime Coquelin <mcoquelin.stm32@gmail.com> wrote:
>> 2015-10-27 21:33 GMT+01:00 Stefan Agner <stefan@agner.ch>:
...
>>> Anyway, I guess Ezequiel's fix is the right thing to do...
>> Or, maybe we should put the stack in RAM as Uwe suggest?
>>
>
> It seems the current fix would be used as-is (no need to
> set a temporary stack before the SVC call).
>
> And on top of that, we need a to set a proper stack as soon
> as the kernel takes control.
>
> Am I right?
IMHO, yes, that's how I understand it.

Maxime
Ezequiel Garcia Oct. 27, 2015, 10:08 p.m. UTC | #13
On 27 October 2015 at 18:52, Maxime Coquelin <mcoquelin.stm32@gmail.com> wrote:
> 2015-10-27 22:46 GMT+01:00 Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>:
>> On 27 October 2015 at 18:33, Maxime Coquelin <mcoquelin.stm32@gmail.com> wrote:
>>> 2015-10-27 21:33 GMT+01:00 Stefan Agner <stefan@agner.ch>:
> ...
>>>> Anyway, I guess Ezequiel's fix is the right thing to do...
>>> Or, maybe we should put the stack in RAM as Uwe suggest?
>>>
>>
>> It seems the current fix would be used as-is (no need to
>> set a temporary stack before the SVC call).
>>
>> And on top of that, we need a to set a proper stack as soon
>> as the kernel takes control.
>>
>> Am I right?
> IMHO, yes, that's how I understand it.
>

After some deeper code parsing, it seems the stack
is set in __mmap_switched.

Although it seems that happens after __v7m_setup,
ARM noMMU has worked like this since ages now, is v7-M
really different to need a special treatment?
Russell King - ARM Linux Oct. 27, 2015, 10:40 p.m. UTC | #14
On Tue, Oct 27, 2015 at 10:20:20PM +0100, Uwe Kleine-König wrote:
> Hello Ezequiel,
> 
> On Tue, Oct 27, 2015 at 05:57:46PM -0300, Ezequiel Garcia wrote:
> > On 27 October 2015 at 17:21, Uwe Kleine-König
> > <u.kleine-koenig@pengutronix.de> wrote:
> > > Hello Ezequiel,
> > >
> > > On Sun, Oct 25, 2015 at 10:27:10PM -0300, Ezequiel Garcia wrote:
> > >> I've been trying to make my ARM v7-M LPC43xx board
> > >> boot a XIP kernel from flash. Currently, this seems
> > >> to be broken in mainline due to this:
> > >>
> > >> arch/arm/mm/proc-v7m.S
> > >> [..]
> > >>         @ SVC to run the kernel in this mode
> > >>         badr    r1, 1f
> > >>         ldr     r5, [r12, #11 * 4]      @ read the SVC vector entry
> > >>         str     r1, [r12, #11 * 4]      @ write the temporary SVC vector entry
> > >>         mov     r6, lr                  @ save LR
> > >>         mov     r7, sp                  @ save SP
> > >>       ldr     sp, =__v7m_setup_stack_top @ <<< Breaks XIP!
> > >
> > > How does this fail for you?
> > >
> > 
> > My CPU just seems to stall.
> > I've added calls to the printch in arch/arm/kernel/debug.S
> > and can't get past the SVC call.
> Can you try to add something like:
> 
> 	stmdb	sp!, {r0-r4}

Wait one moment, and try reading the code.

The call path to this point is: entry text at stext in head-nommu.S
which goes on to call __v7m_setup.

Nothing in that path sets up a stack, so the value of 'sp' is
undefined - it's entirely whatever's left in the register after the
boot loader has passed control to the kernel.

So, it's pointless proc-v7m.S saving and restoring the stack pointer
around this.  It also means that there probably isn't a reliable
stack here.
Uwe Kleine-König Oct. 28, 2015, 7:34 a.m. UTC | #15
Hello Russell,

On Tue, Oct 27, 2015 at 10:40:19PM +0000, Russell King - ARM Linux wrote:
> On Tue, Oct 27, 2015 at 10:20:20PM +0100, Uwe Kleine-König wrote:
> > Hello Ezequiel,
> > 
> > On Tue, Oct 27, 2015 at 05:57:46PM -0300, Ezequiel Garcia wrote:
> > > On 27 October 2015 at 17:21, Uwe Kleine-König
> > > <u.kleine-koenig@pengutronix.de> wrote:
> > > > Hello Ezequiel,
> > > >
> > > > On Sun, Oct 25, 2015 at 10:27:10PM -0300, Ezequiel Garcia wrote:
> > > >> I've been trying to make my ARM v7-M LPC43xx board
> > > >> boot a XIP kernel from flash. Currently, this seems
> > > >> to be broken in mainline due to this:
> > > >>
> > > >> arch/arm/mm/proc-v7m.S
> > > >> [..]
> > > >>         @ SVC to run the kernel in this mode
> > > >>         badr    r1, 1f
> > > >>         ldr     r5, [r12, #11 * 4]      @ read the SVC vector entry
> > > >>         str     r1, [r12, #11 * 4]      @ write the temporary SVC vector entry
> > > >>         mov     r6, lr                  @ save LR
> > > >>         mov     r7, sp                  @ save SP
> > > >>       ldr     sp, =__v7m_setup_stack_top @ <<< Breaks XIP!
> > > >
> > > > How does this fail for you?
> > > >
> > > 
> > > My CPU just seems to stall.
> > > I've added calls to the printch in arch/arm/kernel/debug.S
> > > and can't get past the SVC call.
> > Can you try to add something like:
> > 
> > 	stmdb	sp!, {r0-r4}
> 
> Wait one moment, and try reading the code.
> 
> The call path to this point is: entry text at stext in head-nommu.S
> which goes on to call __v7m_setup.
> 
> Nothing in that path sets up a stack, so the value of 'sp' is
> undefined - it's entirely whatever's left in the register after the
> boot loader has passed control to the kernel.

Yeah, that's what I found, too. With this test I want to find out if
it's really the write to ROM memory that makes the machine hang.

Best regards
Uwe
Uwe Kleine-König Oct. 28, 2015, 7:43 a.m. UTC | #16
Hello Ezequiel,

On Tue, Oct 27, 2015 at 07:08:46PM -0300, Ezequiel Garcia wrote:
> On 27 October 2015 at 18:52, Maxime Coquelin <mcoquelin.stm32@gmail.com> wrote:
> > 2015-10-27 22:46 GMT+01:00 Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>:
> >> On 27 October 2015 at 18:33, Maxime Coquelin <mcoquelin.stm32@gmail.com> wrote:
> >>> 2015-10-27 21:33 GMT+01:00 Stefan Agner <stefan@agner.ch>:
> > ...
> >>>> Anyway, I guess Ezequiel's fix is the right thing to do...
> >>> Or, maybe we should put the stack in RAM as Uwe suggest?
> >>>
> >>
> >> It seems the current fix would be used as-is (no need to
> >> set a temporary stack before the SVC call).
> >>
> >> And on top of that, we need a to set a proper stack as soon
> >> as the kernel takes control.
> >>
> >> Am I right?
> > IMHO, yes, that's how I understand it.
> >
> 
> After some deeper code parsing, it seems the stack
> is set in __mmap_switched.
> 
> Although it seems that happens after __v7m_setup,
> ARM noMMU has worked like this since ages now, is v7-M
> really different to need a special treatment?

Yeah, v7-M needs stack in it's setup function, while earlier no-MMU
machines don't. (I didn't check that though.)

So the current situation is that in svc something is written to
__v7m_setup_stack which doesn't seem to hurt in practise on most
machines, but on your's is provokes a hang.

With your patch the same something is written to whereever sp points to
when the kernel is entered. This is hardly better.

So the right fix is to move __v7m_setup_stack to .data I guess.

Best regards
Uwe
diff mbox

Patch

diff --git a/arch/arm/mm/proc-v7m.S b/arch/arm/mm/proc-v7m.S
index 67d9209077c6..6a383e619a0c 100644
--- a/arch/arm/mm/proc-v7m.S
+++ b/arch/arm/mm/proc-v7m.S
@@ -103,7 +103,6 @@  __v7m_setup:
 	str	r1, [r12, #11 * 4]	@ write the temporary SVC vector entry
 	mov	r6, lr			@ save LR
 	mov	r7, sp			@ save SP
-	ldr	sp, =__v7m_setup_stack_top
 	cpsie	i
 	svc	#0
 1:	cpsid	i
@@ -123,11 +122,6 @@  __v7m_setup:
 	ret	lr
 ENDPROC(__v7m_setup)
 
-	.align 2
-__v7m_setup_stack:
-	.space	4 * 8				@ 8 registers
-__v7m_setup_stack_top:
-
 	define_processor_functions v7m, dabort=nommu_early_abort, pabort=legacy_pabort, nommu=1
 
 	.section ".rodata"