diff mbox

do page fault in atomic bug on arm

Message ID 20171124155649.GT31757@n2100.armlinux.org.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Russell King (Oracle) Nov. 24, 2017, 3:56 p.m. UTC
On Fri, Nov 24, 2017 at 11:09:30PM +0800, Alex Shi wrote:
> 
> >> [   53.302718] softirqs last  enabled at (11474): [<c034c5d0>] __do_softirq+0x280/0x5ac
> >> [   53.310494] softirqs last disabled at (11433): [<c034cc98>] irq_exit+0xf4/0x158
> >> [   53.317837] CPU: 0 PID: 1691 Comm: ftracetest Not tainted 4.9.55-dirty #1
> >> [   53.324652] Hardware name: Generic DRA74X (Flattened Device Tree)
> >> [   53.330857] [<c03114d8>] (unwind_backtrace) from [<c030cb18>] (show_stack+0x10/0x14)
> >> [   53.338644] [<c030cb18>] (show_stack) from [<c067e604>] (dump_stack+0xa4/0xd0)
> >> [   53.345908] [<c067e604>] (dump_stack) from [<c0373808>] (___might_sleep+0x1ac/0x2a0)
> >> [   53.353694] [<c0373808>] (___might_sleep) from [<c0d60ec8>] (do_page_fault+0x25c/0x428)
> >> [   53.361739] [<c0d60ec8>] (do_page_fault) from [<c03013e8>] (do_PrefetchAbort+0x38/0x9c)
> >> [   53.369780] [<c03013e8>] (do_PrefetchAbort) from [<c0d605a8>] (__pabt_svc+0x68/0xa0)
> >> [   53.377557] Exception stack(0xec6fbfa8 to 0xec6fbff0)
> >> [   53.382629] bfa0:                   00000001 00000001 ffffffff 00000000 0010ac68 00000007
> >> [   53.390845] bfc0: 00000001 0000003f 00000009 0000000c fffffffa be9d27a4 000e31fc ec6fbff8
> >> [   53.399055] bfe0: b6e6d49c b6e6d49c 40070093 ffffffff
> >> [   53.404137] [<c0d605a8>] (__pabt_svc) from [<b6e6d49c>] (0xb6e6d49c)
> > 
> > It also doesn't help that the backtrace stops at this point, and it looks
> > very strange:
> > 
> > 1. the value of PC looks like it's outside of the module space.
> > 2. the CPSR indicates that the CPU was in SVC mode in the parent context
> >    with IRQs disabled.
> > 3. We're right at the top of the kernel stack, which suggests no further
> >    stack frames above this.
> > 
> > We should never be in SVC mode without further stack frames on the kernel
> > stack.
> > 
> > We don't seem to have overflowed the kernel stack, as the thread info
> > seems correct - and it would also be unlikely that the saved SP value
> > would end in ff8 in the exception stack frame.
> 
> Hi Russell,
> 
> Sorry for response late!
> Is this SP was stained by sth? As my understand, SP should be times of
> 32bits. But why stack print out correct with a incorrect SP?

There's nothing wrong with SP.

Looking deeper at this, I think that the kernel stack somehow got
corrupted earlier:

irq event stamp: 12924
hardirqs last  enabled at (12923): [<c0307f10>] no_work_pending+0x4/0x30
hardirqs last disabled at (12924): [<c0d605a0>] __pabt_svc+0x60/0xa0

The hard IRQ disable is as a result of taking a prefetch abort in
SVC mode.  The saved context agrees with that:

                        R0       R1       R2       R3       R4       R5
bfa0:                   00000001 00000001 ffffffff 00000000 0010ac68 00000007
      R6       R7       R8       R9       R10      FP       IP       SP
bfc0: 00000001 0000003f 00000009 0000000c fffffffa be9d27a4 000e31fc ec6fbff8
      LR       PC       PSR
bfe0: b6e6d49c b6e6d49c 40070093 ffffffff

The PSR lower 5 bits are 0x13, which is SVC mode.  Bit 7 set means IRQs
disabled.  The PC address was 0xb6e6d49c.

The last record we have of interrupts being enabled was in
no_work_pending, which is the exit path to usermode - but if we were
returning to usermode:

(a) how did we get into SVC mode instead
(b) why are interrupts disabled
(c) why was mm->mmap_sem still held

Can you try the following patch to try and catch the problem earlier?
I haven't tested it myself, and adding code may move things around in
the kernel and make this bug disappear.

Comments

Alex Shi Nov. 26, 2017, 2:58 p.m. UTC | #1
On 11/24/2017 11:56 PM, Russell King - ARM Linux wrote:
> On Fri, Nov 24, 2017 at 11:09:30PM +0800, Alex Shi wrote:
>>
>>>> [   53.302718] softirqs last  enabled at (11474): [<c034c5d0>] __do_softirq+0x280/0x5ac
>>>> [   53.310494] softirqs last disabled at (11433): [<c034cc98>] irq_exit+0xf4/0x158
>>>> [   53.317837] CPU: 0 PID: 1691 Comm: ftracetest Not tainted 4.9.55-dirty #1
>>>> [   53.324652] Hardware name: Generic DRA74X (Flattened Device Tree)
>>>> [   53.330857] [<c03114d8>] (unwind_backtrace) from [<c030cb18>] (show_stack+0x10/0x14)
>>>> [   53.338644] [<c030cb18>] (show_stack) from [<c067e604>] (dump_stack+0xa4/0xd0)
>>>> [   53.345908] [<c067e604>] (dump_stack) from [<c0373808>] (___might_sleep+0x1ac/0x2a0)
>>>> [   53.353694] [<c0373808>] (___might_sleep) from [<c0d60ec8>] (do_page_fault+0x25c/0x428)
>>>> [   53.361739] [<c0d60ec8>] (do_page_fault) from [<c03013e8>] (do_PrefetchAbort+0x38/0x9c)
>>>> [   53.369780] [<c03013e8>] (do_PrefetchAbort) from [<c0d605a8>] (__pabt_svc+0x68/0xa0)
>>>> [   53.377557] Exception stack(0xec6fbfa8 to 0xec6fbff0)
>>>> [   53.382629] bfa0:                   00000001 00000001 ffffffff 00000000 0010ac68 00000007
>>>> [   53.390845] bfc0: 00000001 0000003f 00000009 0000000c fffffffa be9d27a4 000e31fc ec6fbff8
>>>> [   53.399055] bfe0: b6e6d49c b6e6d49c 40070093 ffffffff
>>>> [   53.404137] [<c0d605a8>] (__pabt_svc) from [<b6e6d49c>] (0xb6e6d49c)
>>>
>>> It also doesn't help that the backtrace stops at this point, and it looks
>>> very strange:
>>>
>>> 1. the value of PC looks like it's outside of the module space.
>>> 2. the CPSR indicates that the CPU was in SVC mode in the parent context
>>>    with IRQs disabled.
>>> 3. We're right at the top of the kernel stack, which suggests no further
>>>    stack frames above this.
>>>
>>> We should never be in SVC mode without further stack frames on the kernel
>>> stack.
>>>
>>> We don't seem to have overflowed the kernel stack, as the thread info
>>> seems correct - and it would also be unlikely that the saved SP value
>>> would end in ff8 in the exception stack frame.
>>
>> Hi Russell,
>>
>> Sorry for response late!
>> Is this SP was stained by sth? As my understand, SP should be times of
>> 32bits. But why stack print out correct with a incorrect SP?
> 
> There's nothing wrong with SP.

Got it. Thanks a lot!
> 
> Looking deeper at this, I think that the kernel stack somehow got
> corrupted earlier:
> 
> irq event stamp: 12924
> hardirqs last  enabled at (12923): [<c0307f10>] no_work_pending+0x4/0x30
> hardirqs last disabled at (12924): [<c0d605a0>] __pabt_svc+0x60/0xa0
> 
> The hard IRQ disable is as a result of taking a prefetch abort in
> SVC mode.  The saved context agrees with that:
> 
>                         R0       R1       R2       R3       R4       R5
> bfa0:                   00000001 00000001 ffffffff 00000000 0010ac68 00000007
>       R6       R7       R8       R9       R10      FP       IP       SP
> bfc0: 00000001 0000003f 00000009 0000000c fffffffa be9d27a4 000e31fc ec6fbff8
>       LR       PC       PSR
> bfe0: b6e6d49c b6e6d49c 40070093 ffffffff
> 
> The PSR lower 5 bits are 0x13, which is SVC mode.  Bit 7 set means IRQs
> disabled.  The PC address was 0xb6e6d49c.
> 

Thank you very much for detailed explanations! :)

> The last record we have of interrupts being enabled was in
> no_work_pending, which is the exit path to usermode - but if we were
> returning to usermode:
> 
> (a) how did we get into SVC mode instead

We return to user mode correctly in no_work_pending, the irq enabled
then we get a pabt_svc?

> (b) why are interrupts disabled

it was disabled in __pabt_svc, objdump show a extra irq disable with arm
defconfig.
...
 2a0:   ebfffffe        bl      0 <v7_pabort>
 2a4:   f10c0080        cpsid   i
...
> (c) why was mm->mmap_sem still held

the bug was caught here, just with mmap_sem, the pabt_svc happens.
> 
> Can you try the following patch to try and catch the problem earlier?
> I haven't tested it myself, and adding code may move things around in
> the kernel and make this bug disappear.

Do we still need to try this patch? I saw you tried it and do further more.
> 
> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
> index d523cd8439a3..ff577177b286 100644
> --- a/arch/arm/kernel/entry-header.S
> +++ b/arch/arm/kernel/entry-header.S
> @@ -299,6 +299,8 @@
>  	@ ARM mode restore
>  	mov	r2, sp
>  	ldr	r1, [r2, #\offset + S_PSR]	@ get calling cpsr
> +	tst	r1, #0xcf
> +	bne	oops
>  	ldr	lr, [r2, #\offset + S_PC]!	@ get pc
>  	msr	spsr_cxsf, r1			@ save in spsr_svc
>  #if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_32v6K)
> @@ -314,6 +316,15 @@
>  						@ after ldm {}^
>  	add	sp, sp, #\offset + PT_REGS_SIZE
>  	movs	pc, lr				@ return & move spsr_svc into cpsr
> +oops:	.word	0xe7f001f2

This oops cause allnoconfig with arm failed in build. but it's fine for
a multi_v7_defconfig
arch/arm/kernel/entry-common.S:106: Error: symbol `oops' is already defined

> +	.pushsection .rodata.str, "aMS", %progbits, 1
> +2:	.asciz	"Returning to usermode but unexpected PSR bits set?"
> +	.popsection
> +	.pushsection __bug_table, "aw"
> +	.align	2
> +	.word	oops, 2b
> +	.hword	\@
> +	.popsection
>  #elif defined(CONFIG_CPU_V7M)
>  	@ V7M restore.
>  	@ Note that we don't need to do clrex here as clearing the local
> 
>
Alex Shi Nov. 26, 2017, 3:23 p.m. UTC | #2
Sorry, this patch should be passed off, since we got your new fixing
patches. I shouldn't mentioned here.

Regards
Alex

On 11/26/2017 10:58 PM, Alex Shi wrote:
>> Can you try the following patch to try and catch the problem earlier?
>> I haven't tested it myself, and adding code may move things around in
>> the kernel and make this bug disappear.
> Do we still need to try this patch? I saw you tried it and do further more.
>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
>> index d523cd8439a3..ff577177b286 100644
>> --- a/arch/arm/kernel/entry-header.S
>> +++ b/arch/arm/kernel/entry-header.S
>> @@ -299,6 +299,8 @@
>>  	@ ARM mode restore
>>  	mov	r2, sp
>>  	ldr	r1, [r2, #\offset + S_PSR]	@ get calling cpsr
>> +	tst	r1, #0xcf
>> +	bne	oops
>>  	ldr	lr, [r2, #\offset + S_PC]!	@ get pc
>>  	msr	spsr_cxsf, r1			@ save in spsr_svc
>>  #if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_32v6K)
>> @@ -314,6 +316,15 @@
>>  						@ after ldm {}^
>>  	add	sp, sp, #\offset + PT_REGS_SIZE
>>  	movs	pc, lr				@ return & move spsr_svc into cpsr
>> +oops:	.word	0xe7f001f2
> This oops cause allnoconfig with arm failed in build. but it's fine for
> a multi_v7_defconfig
> arch/arm/kernel/entry-common.S:106: Error: symbol `oops' is already defined
> 
>> +	.pushsection .rodata.str, "aMS", %progbits, 1
>> +2:	.asciz	"Returning to usermode but unexpected PSR bits set?"
>> +	.popsection
>> +	.pushsection __bug_table, "aw"
>> +	.align	2
>> +	.word	oops, 2b
>> +	.hword	\@
>> +	.popsection
>>  #elif defined(CONFIG_CPU_V7M)
>>  	@ V7M restore.
>>  	@ Note that we don't need to do clrex here as clearing the local
diff mbox

Patch

diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
index d523cd8439a3..ff577177b286 100644
--- a/arch/arm/kernel/entry-header.S
+++ b/arch/arm/kernel/entry-header.S
@@ -299,6 +299,8 @@ 
 	@ ARM mode restore
 	mov	r2, sp
 	ldr	r1, [r2, #\offset + S_PSR]	@ get calling cpsr
+	tst	r1, #0xcf
+	bne	oops
 	ldr	lr, [r2, #\offset + S_PC]!	@ get pc
 	msr	spsr_cxsf, r1			@ save in spsr_svc
 #if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_32v6K)
@@ -314,6 +316,15 @@ 
 						@ after ldm {}^
 	add	sp, sp, #\offset + PT_REGS_SIZE
 	movs	pc, lr				@ return & move spsr_svc into cpsr
+oops:	.word	0xe7f001f2
+	.pushsection .rodata.str, "aMS", %progbits, 1
+2:	.asciz	"Returning to usermode but unexpected PSR bits set?"
+	.popsection
+	.pushsection __bug_table, "aw"
+	.align	2
+	.word	oops, 2b
+	.hword	\@
+	.popsection
 #elif defined(CONFIG_CPU_V7M)
 	@ V7M restore.
 	@ Note that we don't need to do clrex here as clearing the local