
[v3,1/6] xen/arm: Skip initializing the BSS section when it is empty

Message ID 20241010140351.309922-2-ayan.kumar.halder@amd.com (mailing list archive)
State Superseded
Series Enable early bootup of AArch64 MPU systems

Commit Message

Ayan Kumar Halder Oct. 10, 2024, 2:03 p.m. UTC
If the BSS section is empty, then the function can just return.

Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
---
Changes from :-

v1..v2 - New patch introduced in v3. 

 xen/arch/arm/arm64/head.S | 2 ++
 1 file changed, 2 insertions(+)

Comments

Frediano Ziglio Oct. 10, 2024, 2:43 p.m. UTC | #1
On Thu, Oct 10, 2024 at 3:05 PM Ayan Kumar Halder
<ayan.kumar.halder@amd.com> wrote:
>
> If the BSS section is empty, then the function can just return.
>
> Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
> ---
> Changes from :-
>
> v1..v2 - New patch introduced in v3.
>
>  xen/arch/arm/arm64/head.S | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> index 14c3720d80..72c7b24498 100644
> --- a/xen/arch/arm/arm64/head.S
> +++ b/xen/arch/arm/arm64/head.S
> @@ -346,6 +346,8 @@ FUNC_LOCAL(zero_bss)
>          PRINT("- Zero BSS -\r\n")
>          ldr   x0, =__bss_start       /* x0 := vaddr(__bss_start) */
>          ldr   x1, =__bss_end         /* x1 := vaddr(__bss_end)   */
> +        cmp   x1, x0
> +        beq   skip_bss
>
>  1:      str   xzr, [x0], #8
>          cmp   x0, x1

Why not just transform the "do while" loop into a "while" one and
jump straight to the cmp?

Something like (not tested)

diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index 14c3720d80..987f243578 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -346,9 +346,10 @@ FUNC_LOCAL(zero_bss)
        PRINT("- Zero BSS -\r\n")
        ldr   x0, =__bss_start       /* x0 := vaddr(__bss_start) */
        ldr   x1, =__bss_end         /* x1 := vaddr(__bss_end)   */
+        b     2f

1:      str   xzr, [x0], #8
-        cmp   x0, x1
+2:      cmp   x0, x1
        b.lo  1b

skip_bss:

Frediano
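For readers following along in C, the two loop shapes being compared correspond roughly to the sketch below. This is illustrative only, not part of the patch; the function names are invented, and the assembly in the comments is paraphrased from the diffs above.

```c
#include <assert.h>
#include <stdint.h>

/* Original patch: guard the existing do-while loop with an explicit
 * emptiness check before entering it. */
static void zero_bss_guarded(uint64_t *start, uint64_t *end)
{
    if (start == end)           /* cmp x1, x0; beq skip_bss */
        return;
    do {
        *start++ = 0;           /* 1: str xzr, [x0], #8 */
    } while (start < end);      /* cmp x0, x1; b.lo 1b */
}

/* Frediano's suggestion: rotate the loop into a while by branching to
 * the comparison first, so an empty range falls through naturally. */
static void zero_bss_rotated(uint64_t *start, uint64_t *end)
{
    goto check;                 /* b 2f */
loop:
    *start++ = 0;               /* 1: str xzr, [x0], #8 */
check:
    if (start < end)            /* 2: cmp x0, x1; b.lo 1b */
        goto loop;
}
```

Both variants handle an empty BSS without writing anything; they differ only in whether the exit test is duplicated up front or reached by an initial unconditional branch.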
Ayan Kumar Halder Oct. 10, 2024, 5:29 p.m. UTC | #2
Hi Frediano,

Appreciate your feedback.

On 10/10/2024 15:43, Frediano Ziglio wrote:
> On Thu, Oct 10, 2024 at 3:05 PM Ayan Kumar Halder
> <ayan.kumar.halder@amd.com> wrote:
>> If the BSS section is empty, then the function can just return.
>>
>> Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
>> ---
>> Changes from :-
>>
>> v1..v2 - New patch introduced in v3.
>>
>>   xen/arch/arm/arm64/head.S | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
>> index 14c3720d80..72c7b24498 100644
>> --- a/xen/arch/arm/arm64/head.S
>> +++ b/xen/arch/arm/arm64/head.S
>> @@ -346,6 +346,8 @@ FUNC_LOCAL(zero_bss)
>>           PRINT("- Zero BSS -\r\n")
>>           ldr   x0, =__bss_start       /* x0 := vaddr(__bss_start) */
>>           ldr   x1, =__bss_end         /* x1 := vaddr(__bss_end)   */
>> +        cmp   x1, x0
>> +        beq   skip_bss
>>
>>   1:      str   xzr, [x0], #8
>>           cmp   x0, x1
> Why not just transforming the "do while" loop into a "while" one and
> just jump to cmp?
>
> Something like (not tested)
>
> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> index 14c3720d80..987f243578 100644
> --- a/xen/arch/arm/arm64/head.S
> +++ b/xen/arch/arm/arm64/head.S
> @@ -346,9 +346,10 @@ FUNC_LOCAL(zero_bss)
>          PRINT("- Zero BSS -\r\n")
>          ldr   x0, =__bss_start       /* x0 := vaddr(__bss_start) */
>          ldr   x1, =__bss_end         /* x1 := vaddr(__bss_end)   */
> +        b     2f
>
> 1:      str   xzr, [x0], #8
> -        cmp   x0, x1
> +2:      cmp   x0, x1
>          b.lo  1b

I am not really sure this implementation is better than the previous 
one. The drawback of my implementation is that we have an extra 'cmp' 
instruction. The drawback of this implementation is that we need an 
extra label and there is an unconditional branch after the ldr (which 
hurts readability). Maybe I am biased. :)

How does this look ?

FUNC_LOCAL(zero_bss)
         /* Zero BSS only when requested */
         cbnz  x26, skip_bss

         PRINT("- Zero BSS -\r\n")
         ldr   x0, =__bss_start       /* x0 := vaddr(__bss_start) */
         ldr   x1, =__bss_end         /* x1 := vaddr(__bss_end)   */
1:       cmp   x1, x0
         beq   skip_bss

         str   xzr, [x0], #8
         b     1b

skip_bss:
         ret
END(zero_bss)

- Ayan
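In C terms, the third variant above puts the test at the top of the loop, so the emptiness check and the loop exit test are the same comparison. A rough sketch (illustrative only; the function name is invented):

```c
#include <assert.h>
#include <stdint.h>

/* Test-at-top loop: a single label, but the cmp + conditional branch
 * are executed on every iteration, not just once up front. */
static void zero_bss_test_at_top(uint64_t *start, uint64_t *end)
{
    for (;;) {
        if (end == start)       /* 1: cmp x1, x0; beq skip_bss */
            return;
        *start++ = 0;           /* str xzr, [x0], #8 */
    }                           /* b 1b */
}
```

This trades the extra up-front comparison of the first version, and the extra label of the second, for one more instruction inside the loop body.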
Frediano Ziglio Oct. 10, 2024, 9:31 p.m. UTC | #3
On Thu, Oct 10, 2024 at 6:29 PM Ayan Kumar Halder <ayankuma@amd.com> wrote:
>
> Hi Frediano,
>
> Appreciate your feedback.
>
> On 10/10/2024 15:43, Frediano Ziglio wrote:
> > On Thu, Oct 10, 2024 at 3:05 PM Ayan Kumar Halder
> > <ayan.kumar.halder@amd.com> wrote:
> >> If the BSS section is empty, then the function can just return.
> >>
> >> Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
> >> ---
> >> Changes from :-
> >>
> >> v1..v2 - New patch introduced in v3.
> >>
> >>   xen/arch/arm/arm64/head.S | 2 ++
> >>   1 file changed, 2 insertions(+)
> >>
> >> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> >> index 14c3720d80..72c7b24498 100644
> >> --- a/xen/arch/arm/arm64/head.S
> >> +++ b/xen/arch/arm/arm64/head.S
> >> @@ -346,6 +346,8 @@ FUNC_LOCAL(zero_bss)
> >>           PRINT("- Zero BSS -\r\n")
> >>           ldr   x0, =__bss_start       /* x0 := vaddr(__bss_start) */
> >>           ldr   x1, =__bss_end         /* x1 := vaddr(__bss_end)   */
> >> +        cmp   x1, x0
> >> +        beq   skip_bss
> >>
> >>   1:      str   xzr, [x0], #8
> >>           cmp   x0, x1
> > Why not just transforming the "do while" loop into a "while" one and
> > just jump to cmp?
> >
> > Something like (not tested)
> >
> > diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> > index 14c3720d80..987f243578 100644
> > --- a/xen/arch/arm/arm64/head.S
> > +++ b/xen/arch/arm/arm64/head.S
> > @@ -346,9 +346,10 @@ FUNC_LOCAL(zero_bss)
> >          PRINT("- Zero BSS -\r\n")
> >          ldr   x0, =__bss_start       /* x0 := vaddr(__bss_start) */
> >          ldr   x1, =__bss_end         /* x1 := vaddr(__bss_end)   */
> > +        b     2f
> >
> > 1:      str   xzr, [x0], #8
> > -        cmp   x0, x1
> > +2:      cmp   x0, x1
> >          b.lo  1b
>
> I am not really sure if this implementation is better than the previous
> one. The drawback of my implementation is that we have an extra 'cmp'
> instruction. The drawback of this implementation is that we need an
> extra label and there is an un-conditional branch after ldr (readability
> is difficult). May be I am biased. :)
>
> How does this look ?
>
> FUNC_LOCAL(zero_bss)
>          /* Zero BSS only when requested */
>          cbnz  x26, skip_bss
>
>          PRINT("- Zero BSS -\r\n")
>          ldr   x0, =__bss_start       /* x0 := vaddr(__bss_start) */
>          ldr   x1, =__bss_end         /* x1 := vaddr(__bss_end) */
> 1:     cmp   x1, x0
>          beq   skip_bss
>
>          str   xzr, [x0], #8
>          b     1b
>
> skip_bss:
>          ret
> END(zero_bss)
>
> - Ayan
>

No strong feelings about it. Your first version was fine too; I suppose
it is just personal preference, and since this is not a hot path, if you
think the first version was more readable I am fine with it. Between
your two versions, I prefer the first: it has fewer instructions in the
loop. I have probably seen too much compiler-generated code (jumping to
the condition at the end is common there), so to me it reads as the more
natural form.

Frediano
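The compiler idiom mentioned here, lowering a while loop by jumping to a condition placed at the bottom (often called loop rotation or loop inversion), can be sketched in general like this. This is an illustrative example, not taken from the patch:

```c
#include <assert.h>

/* "while (i < n) { sum += i; i++; }" is commonly lowered by compilers
 * into the shape below: an initial jump to the test, the body above the
 * test, and a single conditional branch back. The loop proper then
 * contains only the body plus one compare-and-branch per iteration. */
static int sum_below(int n)
{
    int i = 0, sum = 0;
    goto check;
body:
    sum += i;
    i++;
check:
    if (i < n)
        goto body;
    return sum;                 /* n <= 0 falls straight through */
}
```

The n <= 0 case never executes the body, which is exactly the property the empty-BSS check needs.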
Frediano Ziglio Oct. 10, 2024, 9:32 p.m. UTC | #4
On Thu, Oct 10, 2024 at 3:05 PM Ayan Kumar Halder
<ayan.kumar.halder@amd.com> wrote:
>
> If the BSS section is empty, then the function can just return.
>
> Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
> ---
> Changes from :-
>
> v1..v2 - New patch introduced in v3.
>
>  xen/arch/arm/arm64/head.S | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> index 14c3720d80..72c7b24498 100644
> --- a/xen/arch/arm/arm64/head.S
> +++ b/xen/arch/arm/arm64/head.S
> @@ -346,6 +346,8 @@ FUNC_LOCAL(zero_bss)
>          PRINT("- Zero BSS -\r\n")
>          ldr   x0, =__bss_start       /* x0 := vaddr(__bss_start) */
>          ldr   x1, =__bss_end         /* x1 := vaddr(__bss_end)   */
> +        cmp   x1, x0
> +        beq   skip_bss
>
>  1:      str   xzr, [x0], #8
>          cmp   x0, x1

Reviewed-by: Frediano Ziglio <frediano.ziglio@cloud.com>

Frediano
Luca Fancellu Oct. 14, 2024, 6:41 p.m. UTC | #5
Hi Ayan,

> On 10 Oct 2024, at 15:03, Ayan Kumar Halder <ayan.kumar.halder@amd.com> wrote:
> 
> If the BSS section is empty, then the function can just return.
> 
> Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>

Reviewed-by: Luca Fancellu <luca.fancellu@arm.com <mailto:luca.fancellu@arm.com>>
Julien Grall Oct. 18, 2024, 1:41 p.m. UTC | #6
Hi Ayan,

On 10/10/2024 15:03, Ayan Kumar Halder wrote:
> If the BSS section is empty, then the function can just return.

This is more than "can", right? If we don't do it, we will end up 
zeroing outside of the BSS. This could be critical data...

Also, I am tempted to suggest adding a Fixes tag because even if it is 
unlikely the BSS will be empty in the current Xen, it is also not impossible.

The tag would be:

Fixes: dac84b66cc9a ("xen: arm64: initial build + config changes, start 
of day code")

> 
> Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>

I saw the discussion. I don't have a strong opinion on the exact 
approach chosen for zeroing. With the commit message updated:

Acked-by: Julien Grall <jgrall@amazon.com>

> ---
> Changes from :-
> 
> v1..v2 - New patch introduced in v3.
> 
>   xen/arch/arm/arm64/head.S | 2 ++

Don't we need a similar change on the arm32 code?

>   1 file changed, 2 insertions(+)
> 
> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> index 14c3720d80..72c7b24498 100644
> --- a/xen/arch/arm/arm64/head.S
> +++ b/xen/arch/arm/arm64/head.S
> @@ -346,6 +346,8 @@ FUNC_LOCAL(zero_bss)
>           PRINT("- Zero BSS -\r\n")
>           ldr   x0, =__bss_start       /* x0 := vaddr(__bss_start) */
>           ldr   x1, =__bss_end         /* x1 := vaddr(__bss_end)   */
> +        cmp   x1, x0
> +        beq   skip_bss
>   
>   1:      str   xzr, [x0], #8
>           cmp   x0, x1

Cheers,
Ayan Kumar Halder Oct. 21, 2024, 11:12 a.m. UTC | #7
On 18/10/2024 14:41, Julien Grall wrote:
> Hi Ayan,
Hi Julien,
>
> On 10/10/2024 15:03, Ayan Kumar Halder wrote:
>> If the BSS section is empty, then the function can just return.
>
> This is more than "can", right? If we don't do it, we will end up to 
> zero outside of BSS. This could be critical data...
s/can just/should
>
> Also, I am tempted to suggest to add a Fixes tag because even if it is 
> unlikely BSS will be zero in the current Xen, it is also not unlikely.
>
> The tag would be:
>
> Fixes: dac84b66cc9a ("xen: arm64: initial build + config changes, 
> start of day code")
Ack.
>
>>
>> Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
>
> I saw the discussion. I don't have a strong opinion on the exact 
> approach choosen for zeroing. With the commit message updated:
>
> Acked-by: Julien Grall <jgrall@amazon.com>

I propose that this patch be committed. The changes to the commit 
message can be done on commit.

>
>> ---
>> Changes from :-
>>
>> v1..v2 - New patch introduced in v3.
>>
>>   xen/arch/arm/arm64/head.S | 2 ++
>
> Don't we need a similar change on the arm32 code?

I haven't looked at the arm32 code. My idea is to get the early boot 
(i.e. the asm part) of Xen working on R82 first and then make similar 
changes for R52 (i.e. arm32).

That way, when we start plumbing the late boot (i.e. start_xen() 
onwards), we can test it on both R82 and R52, as a significant part of 
the code will be common.

Let me know if this sounds ok.

- Ayan

>
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
>> index 14c3720d80..72c7b24498 100644
>> --- a/xen/arch/arm/arm64/head.S
>> +++ b/xen/arch/arm/arm64/head.S
>> @@ -346,6 +346,8 @@ FUNC_LOCAL(zero_bss)
>>           PRINT("- Zero BSS -\r\n")
>>           ldr   x0, =__bss_start       /* x0 := vaddr(__bss_start) */
>>           ldr   x1, =__bss_end         /* x1 := vaddr(__bss_end)   */
>> +        cmp   x1, x0
>> +        beq   skip_bss
>>     1:      str   xzr, [x0], #8
>>           cmp   x0, x1
>
> Cheers,
>
Julien Grall Oct. 21, 2024, 3:24 p.m. UTC | #8
Hi Ayan,

On 21/10/2024 12:12, Ayan Kumar Halder wrote:
> 
> On 18/10/2024 14:41, Julien Grall wrote:
>> Hi Ayan,
> Hi Julien,
>>
>> On 10/10/2024 15:03, Ayan Kumar Halder wrote:
>>> If the BSS section is empty, then the function can just return.
>>
>> This is more than "can", right? If we don't do it, we will end up to 
>> zero outside of BSS. This could be critical data...
> s/can just/should
>>
>> Also, I am tempted to suggest to add a Fixes tag because even if it is 
>> unlikely BSS will be zero in the current Xen, it is also not unlikely.
>>
>> The tag would be:
>>
>> Fixes: dac84b66cc9a ("xen: arm64: initial build + config changes, 
>> start of day code")
> Ack.
>>
>>>
>>> Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
>>
>> I saw the discussion. I don't have a strong opinion on the exact 
>> approach choosen for zeroing. With the commit message updated:
>>
>> Acked-by: Julien Grall <jgrall@amazon.com>
> 
> I propose that this patch to be committed. The changes to the commit 
> message can be done on commit.
> 
>>
>>> ---
>>> Changes from :-
>>>
>>> v1..v2 - New patch introduced in v3.
>>>
>>>   xen/arch/arm/arm64/head.S | 2 ++
>>
>> Don't we need a similar change on the arm32 code?
> 
> I haven't looked at the arm32 code. My idea is to get the earlyboot (ie 
> the asm part) of Xen working on R82 and then do the similar changes for 
> R52 (ie arm32).

AFAIU, this change is not related to the MPU, so I would prefer to keep 
the arm32 code in sync. I am happy to send a patch for it if you don't 
have time.

Cheers,
Ayan Kumar Halder Oct. 21, 2024, 3:38 p.m. UTC | #9
On 21/10/2024 16:24, Julien Grall wrote:
> Hi Ayan,
Hi Julien,
>
> On 21/10/2024 12:12, Ayan Kumar Halder wrote:
>>
>> On 18/10/2024 14:41, Julien Grall wrote:
>>> Hi Ayan,
>> Hi Julien,
>>>
>>> On 10/10/2024 15:03, Ayan Kumar Halder wrote:
>>>> If the BSS section is empty, then the function can just return.
>>>
>>> This is more than "can", right? If we don't do it, we will end up to 
>>> zero outside of BSS. This could be critical data...
>> s/can just/should
>>>
>>> Also, I am tempted to suggest to add a Fixes tag because even if it 
>>> is unlikely BSS will be zero in the current Xen, it is also not 
>>> unlikely.
>>>
>>> The tag would be:
>>>
>>> Fixes: dac84b66cc9a ("xen: arm64: initial build + config changes, 
>>> start of day code")
>> Ack.
>>>
>>>>
>>>> Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
>>>
>>> I saw the discussion. I don't have a strong opinion on the exact 
>>> approach choosen for zeroing. With the commit message updated:
>>>
>>> Acked-by: Julien Grall <jgrall@amazon.com>
>>
>> I propose that this patch to be committed. The changes to the commit 
>> message can be done on commit.
>>
>>>
>>>> ---
>>>> Changes from :-
>>>>
>>>> v1..v2 - New patch introduced in v3.
>>>>
>>>>   xen/arch/arm/arm64/head.S | 2 ++
>>>
>>> Don't we need a similar change on the arm32 code?
>>
>> I haven't looked at the arm32 code. My idea is to get the earlyboot 
>> (ie the asm part) of Xen working on R82 and then do the similar 
>> changes for R52 (ie arm32).
>
> AFAIU, this change is not related to the MPU. So I would rather prefer 
> if we keep this change in sync. I am happy to send a patch for it if 
> you don't have time.

No worries, I can do it.

I will include the arm32 changes in this patch, so that we have a single 
patch covering the issue for both arm64 and arm32.

- Ayan

Patch

diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index 14c3720d80..72c7b24498 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -346,6 +346,8 @@  FUNC_LOCAL(zero_bss)
         PRINT("- Zero BSS -\r\n")
         ldr   x0, =__bss_start       /* x0 := vaddr(__bss_start) */
         ldr   x1, =__bss_end         /* x1 := vaddr(__bss_end)   */
+        cmp   x1, x0
+        beq   skip_bss
 
 1:      str   xzr, [x0], #8
         cmp   x0, x1