diff mbox series

[1/4] arm64: Do not enable IRQs for ct_user_exit

Message ID 1555424556-46023-2-git-send-email-julien.thierry@arm.com (mailing list archive)
State New, archived
Series arm64: Fix save/restore with IRQ priority masking

Commit Message

Julien Thierry April 16, 2019, 2:22 p.m. UTC
For el0_dbg and el0_error, DAIF bits get explicitly cleared before
calling ct_user_exit.

When context tracking is disabled, DAIF gets set (almost) immediately
after. When context tracking is enabled, among the first things done
is disabling IRQs.

What is actually needed is:
- PSR.D = 0 so the system can be debugged (should be already the case)
- PSR.A = 0 so async errors can be handled during context tracking

Do not clear PSR.I in those two locations.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kernel/entry.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--
1.9.1

Comments

James Morse April 25, 2019, 10:21 a.m. UTC | #1
Hi Julien,

On 16/04/2019 15:22, Julien Thierry wrote:
> For el0_dbg and el0_error, DAIF bits get explicitly cleared before
> calling ct_user_exit.
> 
> When context tracking is disabled, DAIF gets set (almost) immediately
> after.

When we end up in kernel_exit,


> When context tracking is enabled, among the first things done
> is disabling IRQs.

Is this somewhere different?


> What is actually needed is:
> - PSR.D = 0 so the system can be debugged (should be already the case)
> - PSR.A = 0 so async errors can be handled during context tracking
> 
> Do not clear PSR.I in those two locations.

Is the motivation here to avoid the time it takes to toggle the I bit, given we're going
to shortly toggle it back? (I would have thought that would get lost in the noise of
toggling the other three bits)

Or is it because ct_user_exit should be called with interrupts masked?


If it's the latter, this is a wider bug we've always had!
746647c75afb switches a fistful of enable_dbg_and_irq calls to enable_daif.
All the other el0_* sites would need their enable_daif/ct_user_exit calls flipping.


...

It looks like ct_user_exit should be called before we unmask interrupts. It gives RCU a
poke (so interrupts that occur before it may make suspicious use of RCU), and it's
modifying per-cpu variables. If we were preempted and switched to another thread that entered
user-space, it would look like we'd entered user-space twice.


> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index c50a7a7..6a38903 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -855,7 +855,7 @@ el0_dbg:
>  	mov	x1, x25
>  	mov	x2, sp
>  	bl	do_debug_exception
> -	enable_daif
> +	enable_da_f
>  	ct_user_exit
>  	b	ret_to_user

If ct_user_exit needs to be called with interrupts masked:
Fixes: 746647c75afb ("arm64: entry.S convert el0_sync")


> @@ -907,7 +907,7 @@ el0_error_naked:
>  	enable_dbg
>  	mov	x0, sp
>  	bl	do_serror
> -	enable_daif
> +	enable_da_f
>  	ct_user_exit
>  	b	ret_to_user
>  ENDPROC(el0_error)

If ct_user_exit needs to be called with interrupts masked:
Fixes: a92d4d1454ab ("arm64: entry.S: move SError handling into a C function for future
expansion")


Thanks,

James
Julien Thierry April 29, 2019, 7:21 a.m. UTC | #2
Hi James,

On 25/04/2019 11:21, James Morse wrote:
> Hi Julien,
> 
> On 16/04/2019 15:22, Julien Thierry wrote:
>> For el0_dbg and el0_error, DAIF bits get explicitly cleared before
>> calling ct_user_exit.
>>
>> When context tracking is disabled, DAIF gets set (almost) immediately
>> after.
> 
> When we end up in kernel_exit,
> 
> 
>> When context tracking is enabled, among the first things done
>> is disabling IRQs.
> 
> Is this somewhere different?
> 
> 
>> What is actually needed is:
>> - PSR.D = 0 so the system can be debugged (should be already the case)
>> - PSR.A = 0 so async errors can be handled during context tracking
>>
>> Do not clear PSR.I in those two locations.
> 
> Is the motivation here to avoid the time it takes to toggle the I bit, given we're going
> to shortly toggle it back? (I would have thought that would get lost in the noise of
> toggling the other three bits)
> 
> Or is it because ct_user_exit should be called with interrupts masked?
> 

I must admit my main motivation was getting that user_exit call into a
consistent state so I don't need to toggle PMR values too often once I
introduce the new values (in patch 3).

I don't expect a performance improvement from this patch from not
clearing the I bit while clearing the other two. It just didn't make
much sense to me that the I bit would get cleared at those sites.

> 
> If it's the latter, this is a wider bug we've always had!
> 746647c75afb switches a fistful of enable_dbg_and_irq calls to enable_daif.
> All the other el0_* sites would need their enable_daif/ct_user_exit calls flipping.
> 
> 
> ...
> 
> It looks like ct_user_exit should be called before we unmask interrupts. It gives RCU a
> poke (so interrupts that occur before it may make suspicious use of RCU), and it's
> modifying per-cpu variables. If we were preempted and switched to another thread that entered
> user-space, it would look like we'd entered user-space twice.
> 

If that is the case, I can add the Fixes tag, but I might need to take a
deeper look into it.

Thanks for the pointers.

> 
>> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
>> index c50a7a7..6a38903 100644
>> --- a/arch/arm64/kernel/entry.S
>> +++ b/arch/arm64/kernel/entry.S
>> @@ -855,7 +855,7 @@ el0_dbg:
>>  	mov	x1, x25
>>  	mov	x2, sp
>>  	bl	do_debug_exception
>> -	enable_daif
>> +	enable_da_f
>>  	ct_user_exit
>>  	b	ret_to_user
> 
> If ct_user_exit needs to be called with interrupts masked:
> Fixes: 746647c75afb ("arm64: entry.S convert el0_sync")
> 
> 
>> @@ -907,7 +907,7 @@ el0_error_naked:
>>  	enable_dbg
>>  	mov	x0, sp
>>  	bl	do_serror
>> -	enable_daif
>> +	enable_da_f
>>  	ct_user_exit
>>  	b	ret_to_user
>>  ENDPROC(el0_error)
> 
> If ct_user_exit needs to be called with interrupts masked:
> Fixes: a92d4d1454ab ("arm64: entry.S: move SError handling into a C function for future
> expansion")
> 
> 

Cheers,

Patch

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index c50a7a7..6a38903 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -855,7 +855,7 @@  el0_dbg:
 	mov	x1, x25
 	mov	x2, sp
 	bl	do_debug_exception
-	enable_daif
+	enable_da_f
 	ct_user_exit
 	b	ret_to_user
 el0_inv:
@@ -907,7 +907,7 @@  el0_error_naked:
 	enable_dbg
 	mov	x0, sp
 	bl	do_serror
-	enable_daif
+	enable_da_f
 	ct_user_exit
 	b	ret_to_user
 ENDPROC(el0_error)