Failed to boot ARM64 boards for recent linux-next

Message ID 077cecdd-7982-dd33-454f-e38cc571366c@arm.com (mailing list archive)
State New, archived

Commit Message

Marc Zyngier March 20, 2018, 9:01 a.m. UTC
Hi Shawn,

On 20/03/18 08:48, Shawn Lin wrote:
> Hi Marc,
> 
>      I was able to boot my RK3399 board with in linux-next-20180314,
> but not today. My bisect robot shows me it was introduced by
> 
> commit d6062a6d62c643a06c393745d032da3e6441d4bd
> Author: Marc Zyngier <marc.zyngier@arm.com>
> Date:   Fri Mar 9 14:53:19 2018 +0000
> 
>      irqchip/gic-v3: Reset APgRn registers at boot time
> 
>      Booting a crash kernel while in an interrupt handler is likely
>      to leave the Active Priority Registers with some state that
>      is not relevant to the new kernel, and is likely to lead
>      to erratic behaviours such as interrupts not firing as their
>      priority is already active.
> 
>      As a sanity measure, wipe the APRs clean on startup. We make
>      sure to wipe both group 0 and 1 registers in order to avoid
>      any surprise.
> 
> 
> The panic log is here:
> https://paste.ubuntu.com/p/7WrJJDG6JQ/
> 
> Is it a known issue or is there a coming patch for that?

Interesting. No, that wasn't the intention, but I may have missed a key
detail (group 0 access traps to EL3 if SCR_EL3.FIQ==1). Can you have a
go at the following hack, just to narrow it down:

Let me know if that helps.

Thanks,

	M.
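
As an illustration of the narrowing-down hack (the diff itself is in the
Patch section), here is a minimal, standalone sketch of the same
fall-through logic with the group 0 writes made conditional. The names
mock_write_gicreg(), reset_aprs(), the group0_ok flag and the register
enum are made up for this sketch; the real code uses write_gicreg() on
the ICC_AP0Rn_EL1/ICC_AP1Rn_EL1 system registers.

```c
#include <stdbool.h>

/* Hypothetical IDs standing in for ICC_AP0Rn_EL1 / ICC_AP1Rn_EL1. */
enum { AP0R0, AP0R1, AP0R2, AP0R3, AP1R0, AP1R1, AP1R2, AP1R3, NR_REGS };

static unsigned int regs[NR_REGS];
static bool written[NR_REGS];

/* Mock of write_gicreg(): records the write instead of issuing an MSR. */
static void mock_write_gicreg(unsigned int val, int reg)
{
	regs[reg] = val;
	written[reg] = true;
}

/*
 * Reset the active priority registers for a CPU interface implementing
 * `pribits` priority bits (the `val + 1` of the original switch).
 * Group 0 registers are only touched when `group0_ok` is true.
 */
static void reset_aprs(int pribits, bool group0_ok)
{
	if (group0_ok) {
		switch (pribits) {
		case 8:
		case 7:
			mock_write_gicreg(0, AP0R3);
			mock_write_gicreg(0, AP0R2);
			/* fall through */
		case 6:
			mock_write_gicreg(0, AP0R1);
			/* fall through */
		case 5:
		case 4:
			mock_write_gicreg(0, AP0R0);
		}
	}

	switch (pribits) {
	case 8:
	case 7:
		mock_write_gicreg(0, AP1R3);
		mock_write_gicreg(0, AP1R2);
		/* fall through */
	case 6:
		mock_write_gicreg(0, AP1R1);
		/* fall through */
	case 5:
	case 4:
		mock_write_gicreg(0, AP1R0);
	}
}
```

Calling reset_aprs(5, false) touches only AP1R0, which corresponds to
what the hack does on a CPU interface with 5 priority bits.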

Comments

Shawn Lin March 20, 2018, 9:32 a.m. UTC | #1
Hi Marc,

On 2018/3/20 17:01, Marc Zyngier wrote:
> Hi Shawn,
> 
> On 20/03/18 08:48, Shawn Lin wrote:
>> Hi Marc,
>>
>>       I was able to boot my RK3399 board with in linux-next-20180314,
>> but not today. My bisect robot shows me it was introduced by
>>
>> commit d6062a6d62c643a06c393745d032da3e6441d4bd
>> Author: Marc Zyngier <marc.zyngier@arm.com>
>> Date:   Fri Mar 9 14:53:19 2018 +0000
>>
>>       irqchip/gic-v3: Reset APgRn registers at boot time
>>
>>       Booting a crash kernel while in an interrupt handler is likely
>>       to leave the Active Priority Registers with some state that
>>       is not relevant to the new kernel, and is likely to lead
>>       to erratic behaviours such as interrupts not firing as their
>>       priority is already active.
>>
>>       As a sanity measure, wipe the APRs clean on startup. We make
>>       sure to wipe both group 0 and 1 registers in order to avoid
>>       any surprise.
>>
>>
>> The panic log is here:
>> https://paste.ubuntu.com/p/7WrJJDG6JQ/
>>
>> Is it a known issue or is there a coming patch for that?
> 
>   Interesting. No, that wasn't the intention, but I may have missed a key
> detail (group 0 access traps to EL3 if SCR_EL3.FIQ==1). Can you have a
> go at the following hack, just to narrow it down:
> 
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index 5bb7bb22f1c1..f8ff43b1d4f8 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -570,16 +570,12 @@ static void gic_cpu_sys_reg_init(void)
>   	switch(val + 1) {
>   	case 8:
>   	case 7:
> -		write_gicreg(0, ICC_AP0R3_EL1);
>   		write_gicreg(0, ICC_AP1R3_EL1);
> -		write_gicreg(0, ICC_AP0R2_EL1);
>   		write_gicreg(0, ICC_AP1R2_EL1);
>   	case 6:
> -		write_gicreg(0, ICC_AP0R1_EL1);
>   		write_gicreg(0, ICC_AP1R1_EL1);
>   	case 5:
>   	case 4:
> -		write_gicreg(0, ICC_AP0R0_EL1);
>   		write_gicreg(0, ICC_AP1R0_EL1);
>   	}
> 
> Let me know if that helps.
> 

It works for me. Thanks!

> Thanks,
> 
> 	M.
>
Shawn Lin March 20, 2018, 9:39 a.m. UTC | #2
Hi Marc

On 2018/3/20 17:32, Shawn Lin wrote:
> Hi Marc,
> 
> On 2018/3/20 17:01, Marc Zyngier wrote:
>> Hi Shawn,
>>
>> On 20/03/18 08:48, Shawn Lin wrote:
>>> Hi Marc,
>>>
>>>       I was able to boot my RK3399 board with in linux-next-20180314,
>>> but not today. My bisect robot shows me it was introduced by
>>>
>>> commit d6062a6d62c643a06c393745d032da3e6441d4bd
>>> Author: Marc Zyngier <marc.zyngier@arm.com>
>>> Date:   Fri Mar 9 14:53:19 2018 +0000
>>>
>>>       irqchip/gic-v3: Reset APgRn registers at boot time
>>>
>>>       Booting a crash kernel while in an interrupt handler is likely
>>>       to leave the Active Priority Registers with some state that
>>>       is not relevant to the new kernel, and is likely to lead
>>>       to erratic behaviours such as interrupts not firing as their
>>>       priority is already active.
>>>
>>>       As a sanity measure, wipe the APRs clean on startup. We make
>>>       sure to wipe both group 0 and 1 registers in order to avoid
>>>       any surprise.
>>>
>>>
>>> The panic log is here:
>>> https://paste.ubuntu.com/p/7WrJJDG6JQ/
>>>
>>> Is it a known issue or is there a coming patch for that?
>>
>>   Interesting. No, that wasn't the intention, but I may have missed a key
>> detail (group 0 access traps to EL3 if SCR_EL3.FIQ==1). Can you have a
>> go at the following hack, just to narrow it down:
>>
>> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
>> index 5bb7bb22f1c1..f8ff43b1d4f8 100644
>> --- a/drivers/irqchip/irq-gic-v3.c
>> +++ b/drivers/irqchip/irq-gic-v3.c
>> @@ -570,16 +570,12 @@ static void gic_cpu_sys_reg_init(void)
>>       switch(val + 1) {
>>       case 8:
>>       case 7:
>> -        write_gicreg(0, ICC_AP0R3_EL1);
>>           write_gicreg(0, ICC_AP1R3_EL1);
>> -        write_gicreg(0, ICC_AP0R2_EL1);
>>           write_gicreg(0, ICC_AP1R2_EL1);
>>       case 6:
>> -        write_gicreg(0, ICC_AP0R1_EL1);
>>           write_gicreg(0, ICC_AP1R1_EL1);
>>       case 5:
>>       case 4:
>> -        write_gicreg(0, ICC_AP0R0_EL1);
>>           write_gicreg(0, ICC_AP1R0_EL1);
>>       }
>>
>> Let me know if that helps.
>>
> 
> It works for me. Thanks!


Also another patch warns a lot when booting the kernel. Is there
anything else I could do to let it go? Seems I am using broken
dts for requesting IRQ_TYPE_NONE there?

[    0.000000] WARNING: CPU: 0 PID: 0 at drivers/irqchip/irq-gic-v3.c:909 gic_irq_domain_translate+0x84/0xe8
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.16.0-rc6-next-20180320-00006-g841c1d1-dirty #257
[    0.000000] Hardware name: Excavator-RK3399 Board (DT)
[    0.000000] pstate: 60000085 (nZCv daIf -PAN -UAO)
[    0.000000] pc : gic_irq_domain_translate+0x84/0xe8
[    0.000000] lr : irq_create_fwspec_mapping+0x64/0x328
[    0.000000] sp : ffff000009033cb0
[    0.000000] x29: ffff000009033cb0 x28: 0000000000000002
[    0.000000] x27: ffff8000f280fc90 x26: 0000000000000003
[    0.000000] x25: 0000000000000000 x24: ffff8000f280fc80
[    0.000000] x23: ffff00000903c8f8 x22: ffff00000903c000
[    0.000000] x21: ffff000009033d88 x20: ffff000009039000
[    0.000000] x19: ffff8000f2825000 x18: ffffffffffffffff
[    0.000000] x17: 000000000000000a x16: 00000000000007ff
[    0.000000] x15: ffff0000090396c8 x14: 31407570632f7375
[    0.000000] x13: 70632f207b205d31 x12: 5b312d6e6f697469
[    0.000000] x11: 747261702d747075 x10: 727265746e69206e
[    0.000000] x9 : 6f69746974726170 x8 : 407570632f737570
[    0.000000] x7 : 0000000000000000 x6 : 0000000000000002
[    0.000000] x5 : 0000000000000001 x4 : ffff000008c153f8
[    0.000000] x3 : ffff000009033cec x2 : ffff000009033cf0
[    0.000000] x1 : ffff000009033d88 x0 : 0000000000000000
[    0.000000] Call trace:
[    0.000000]  gic_irq_domain_translate+0x84/0xe8
[    0.000000]  gic_populate_ppi_partitions+0x1fc/0x280
[    0.000000]  gic_of_init+0x174/0x214
[    0.000000]  of_irq_init+0x180/0x2e8
[    0.000000]  irqchip_init+0x14/0x38
[    0.000000]  init_IRQ+0xfc/0x130
[    0.000000]  start_kernel+0x284/0x414
[    0.000000] ---[ end trace 5a16819db6b2d5d2 ]---

commit 6ef6386ef7c15bea21afce06f951c87de7e2a562
Author: Marc Zyngier <marc.zyngier@arm.com>
Date:   Fri Mar 16 14:35:17 2018 +0000

     irqchip/gic-v3: Loudly complain about the use of IRQ_TYPE_NONE

     There is a huge number of broken device trees out there. Just
     grepping through the tree for the use of IRQ_TYPE_NONE in conjunction
     with the GIC is scary.

     People just don't realise that IRQ_TYPE_NONE just doesn't exist, and
     you just get whatever junk was there before. So let's make them aware
     of the issue.

     Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>


> 
>> Thanks,
>>
>>     M.
>>
> 
> 
>
Jeffy Chen March 20, 2018, 10:13 a.m. UTC | #3
Hi Shawn,

On 03/20/2018 05:39 PM, Shawn Lin wrote:
>
>
> Also another patch warns a lot when booting the kernel. Is there
> anything else I could do to let it go? Seems I am using broken
> dts for requesting IRQ_TYPE_NONE there?

could be:
https://github.com/torvalds/linux/blob/master/drivers/irqchip/irq-gic-v3.c#L1145

		struct irq_fwspec ppi_fwspec = {
			.fwnode		= gic_data.fwnode,
			.param_count	= 3,
			.param		= {
				[0]	= 1,
				[1]	= i,
				[2]	= IRQ_TYPE_NONE, <--
			},
		};

		irq = irq_create_fwspec_mapping(&ppi_fwspec);


>
> [    0.000000] WARNING: CPU: 0 PID: 0 at
> drivers/irqchip/irq-gic-v3.c:909 gic_irq_domain_translate+0x84/0xe8
> [    0.000000] Modules linked in:
> [    0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
> 4.16.0-rc6-next-20180320-00006-g841c1d1-dirty #257
> [    0.000000] Hardware name: Excavator-RK3399 Board (DT)
> [    0.000000] pstate: 60000085 (nZCv daIf -PAN -UAO)
> [    0.000000] pc : gic_irq_domain_translate+0x84/0xe8
> [    0.000000] lr : irq_create_fwspec_mapping+0x64/0x328
> [    0.000000] sp : ffff000009033cb0
> [    0.000000] x29: ffff000009033cb0 x28: 0000000000000002
> [    0.000000] x27: ffff8000f280fc90 x26: 0000000000000003
> [    0.000000] x25: 0000000000000000 x24: ffff8000f280fc80
> [    0.000000] x23: ffff00000903c8f8 x22: ffff00000903c000
> [    0.000000] x21: ffff000009033d88 x20: ffff000009039000
> [    0.000000] x19: ffff8000f2825000 x18: ffffffffffffffff
> [    0.000000] x17: 000000000000000a x16: 00000000000007ff
> [    0.000000] x15: ffff0000090396c8 x14: 31407570632f7375
> [    0.000000] x13: 70632f207b205d31 x12: 5b312d6e6f697469
> [    0.000000] x11: 747261702d747075 x10: 727265746e69206e
> [    0.000000] x9 : 6f69746974726170 x8 : 407570632f737570
> [    0.000000] x7 : 0000000000000000 x6 : 0000000000000002
> [    0.000000] x5 : 0000000000000001 x4 : ffff000008c153f8
> [    0.000000] x3 : ffff000009033cec x2 : ffff000009033cf0
> [    0.000000] x1 : ffff000009033d88 x0 : 0000000000000000
> [    0.000000] Call trace:
> [    0.000000]  gic_irq_domain_translate+0x84/0xe8
> [    0.000000]  gic_populate_ppi_partitions+0x1fc/0x280
> [    0.000000]  gic_of_init+0x174/0x214
> [    0.000000]  of_irq_init+0x180/0x2e8
> [    0.000000]  irqchip_init+0x14/0x38
> [    0.000000]  init_IRQ+0xfc/0x130
> [    0.000000]  start_kernel+0x284/0x414
> [    0.000000] ---[ end trace 5a16819db6b2d5d2 ]---
>
> commit 6ef6386ef7c15bea21afce06f951c87de7e2a562
> Author: Marc Zyngier <marc.zyngier@arm.com>
> Date:   Fri Mar 16 14:35:17 2018 +0000
>
>      irqchip/gic-v3: Loudly complain about the use of IRQ_TYPE_NONE
>
>      There is a huge number of broken device trees out there. Just
>      grepping through the tree for the use of IRQ_TYPE_NONE in conjunction
>      with the GIC is scary.
>
>      People just don't realise that IRQ_TYPE_NONE just doesn't exist, and
>      you just get whatever junk was there before. So let's make them aware
>      of the issue.
>
>      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Marc Zyngier March 20, 2018, 10:33 a.m. UTC | #4
On 20/03/18 09:39, Shawn Lin wrote:
> Hi Marc
> 
> On 2018/3/20 17:32, Shawn Lin wrote:
>> Hi Marc,
>>
>> On 2018/3/20 17:01, Marc Zyngier wrote:
>>> Hi Shawn,
>>>
>>> On 20/03/18 08:48, Shawn Lin wrote:
>>>> Hi Marc,
>>>>
>>>>       I was able to boot my RK3399 board with in linux-next-20180314,
>>>> but not today. My bisect robot shows me it was introduced by
>>>>
>>>> commit d6062a6d62c643a06c393745d032da3e6441d4bd
>>>> Author: Marc Zyngier <marc.zyngier@arm.com>
>>>> Date:   Fri Mar 9 14:53:19 2018 +0000
>>>>
>>>>       irqchip/gic-v3: Reset APgRn registers at boot time
>>>>
>>>>       Booting a crash kernel while in an interrupt handler is likely
>>>>       to leave the Active Priority Registers with some state that
>>>>       is not relevant to the new kernel, and is likely to lead
>>>>       to erratic behaviours such as interrupts not firing as their
>>>>       priority is already active.
>>>>
>>>>       As a sanity measure, wipe the APRs clean on startup. We make
>>>>       sure to wipe both group 0 and 1 registers in order to avoid
>>>>       any surprise.
>>>>
>>>>
>>>> The panic log is here:
>>>> https://paste.ubuntu.com/p/7WrJJDG6JQ/
>>>>
>>>> Is it a known issue or is there a coming patch for that?
>>>
>>>   Interesting. No, that wasn't the intention, but I may have missed a key
>>> detail (group 0 access traps to EL3 if SCR_EL3.FIQ==1). Can you have a
>>> go at the following hack, just to narrow it down:
>>>
>>> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
>>> index 5bb7bb22f1c1..f8ff43b1d4f8 100644
>>> --- a/drivers/irqchip/irq-gic-v3.c
>>> +++ b/drivers/irqchip/irq-gic-v3.c
>>> @@ -570,16 +570,12 @@ static void gic_cpu_sys_reg_init(void)
>>>       switch(val + 1) {
>>>       case 8:
>>>       case 7:
>>> -        write_gicreg(0, ICC_AP0R3_EL1);
>>>           write_gicreg(0, ICC_AP1R3_EL1);
>>> -        write_gicreg(0, ICC_AP0R2_EL1);
>>>           write_gicreg(0, ICC_AP1R2_EL1);
>>>       case 6:
>>> -        write_gicreg(0, ICC_AP0R1_EL1);
>>>           write_gicreg(0, ICC_AP1R1_EL1);
>>>       case 5:
>>>       case 4:
>>> -        write_gicreg(0, ICC_AP0R0_EL1);
>>>           write_gicreg(0, ICC_AP1R0_EL1);
>>>       }
>>>
>>> Let me know if that helps.
>>>
>>
>> It works for me. Thanks!
> 
> 
> Also another patch warns a lot when booting the kernel. Is there
> anything else I could do to let it go? Seems I am using broken
> dts for requesting IRQ_TYPE_NONE there?

Indeed. IRQ_TYPE_NONE is not something you should ever feed to the GIC.

Thanks,

	M.
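
To make the complaint concrete, here is a rough, standalone sketch of
the kind of check that produces the warning. It is not the driver code:
translate_sketch() and none_warnings are hypothetical names, and the
IRQ_TYPE_* values are redefined locally (mirroring include/linux/irq.h)
so the example builds on its own. The SPI/PPI offsets follow the usual
three-cell GIC specifier convention.

```c
#include <stdio.h>

/* Local copies of the trigger-type flags, for a standalone build. */
#define IRQ_TYPE_NONE        0x0
#define IRQ_TYPE_EDGE_RISING 0x1
#define IRQ_TYPE_LEVEL_HIGH  0x4
#define IRQ_TYPE_SENSE_MASK  0xf

/* Hypothetical counter so callers can see how many warnings fired. */
static int none_warnings;

/*
 * Translate a three-cell specifier {kind, number, flags} into a
 * hardware IRQ number and a trigger type. Returns 0 on success and
 * -1 on an unknown interrupt kind.
 */
static int translate_sketch(const unsigned int param[3],
			    unsigned int *hwirq, unsigned int *type)
{
	switch (param[0]) {
	case 0:			/* SPI */
		*hwirq = param[1] + 32;
		break;
	case 1:			/* PPI */
		*hwirq = param[1] + 16;
		break;
	default:
		return -1;
	}

	*type = param[2] & IRQ_TYPE_SENSE_MASK;

	/* A specifier without a trigger type is a broken one: say so. */
	if (*type == IRQ_TYPE_NONE) {
		none_warnings++;
		fprintf(stderr, "warning: IRQ_TYPE_NONE for hwirq %u\n",
			*hwirq);
	}

	return 0;
}
```

A specifier such as {1, 7, IRQ_TYPE_NONE} still translates (to hwirq 23)
but is counted as a warning, whereas {0, 5, IRQ_TYPE_LEVEL_HIGH} passes
silently.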
Marc Zyngier March 20, 2018, 1:34 p.m. UTC | #5
On 20/03/18 10:13, JeffyChen wrote:
> Hi Shawn,
> 
> On 03/20/2018 05:39 PM, Shawn Lin wrote:
>>
>>
>> Also another patch warns a lot when booting the kernel. Is there
>> anything else I could do to let it go? Seems I am using broken
>> dts for requesting IRQ_TYPE_NONE there?
> 
> could be:
> https://github.com/torvalds/linux/blob/master/drivers/irqchip/irq-gic-v3.c#L1145
> 
> 		struct irq_fwspec ppi_fwspec = {
> 			.fwnode		= gic_data.fwnode,
> 			.param_count	= 3,
> 			.param		= {
> 				[0]	= 1,
> 				[1]	= i,
> 				[2]	= IRQ_TYPE_NONE, <--
> 			},
> 		};
> 
> 		irq = irq_create_fwspec_mapping(&ppi_fwspec);

Probably is. Caught at my own game, fun! ;-)

Turning that NONE into LEVEL will work, as there are no known PPIs
configured as edge (especially in a partitioned system), but the general
case isn't pretty. I'll queue a workaround for now, and will look at
addressing the more general issue.

Thanks,

	M.
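
For illustration only, "turning that NONE into LEVEL" could look like
the sketch below. struct fwspec_sketch and make_ppi_fwspec() are
hypothetical stand-ins for struct irq_fwspec and the partition code,
and the choice of level-high (rather than level-low) is an assumption.
This is not the workaround that was queued.

```c
/* Local copy of the flag value from include/linux/irq.h. */
#define IRQ_TYPE_LEVEL_HIGH 0x4

/* Hypothetical, cut-down stand-in for struct irq_fwspec. */
struct fwspec_sketch {
	int param_count;
	unsigned int param[3];
};

/* Build a PPI specifier that carries an explicit trigger type. */
static struct fwspec_sketch make_ppi_fwspec(unsigned int ppi)
{
	struct fwspec_sketch spec = {
		.param_count	= 3,
		.param		= {
			[0]	= 1,	/* PPI */
			[1]	= ppi,
			[2]	= IRQ_TYPE_LEVEL_HIGH,	/* was IRQ_TYPE_NONE */
		},
	};

	return spec;
}
```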
John Garry March 20, 2018, 3:52 p.m. UTC | #6
On 20/03/2018 13:34, Marc Zyngier wrote:
> On 20/03/18 10:13, JeffyChen wrote:
>> Hi Shawn,
>>
>> On 03/20/2018 05:39 PM, Shawn Lin wrote:
>>>
>>>
>>> Also another patch warns a lot when booting the kernel. Is there
>>> anything else I could do to let it go? Seems I am using broken
>>> dts for requesting IRQ_TYPE_NONE there?
>>
>> could be:
>> https://github.com/torvalds/linux/blob/master/drivers/irqchip/irq-gic-v3.c#L1145
>>
>> 		struct irq_fwspec ppi_fwspec = {
>> 			.fwnode		= gic_data.fwnode,
>> 			.param_count	= 3,
>> 			.param		= {
>> 				[0]	= 1,
>> 				[1]	= i,
>> 				[2]	= IRQ_TYPE_NONE, <--
>> 			},
>> 		};
>>
>> 		irq = irq_create_fwspec_mapping(&ppi_fwspec);
>
> Probably is. Caught at my own game, fun! ;-)
>
> Turning that NONE into LEVEL will work, as there is no known PPIs
> configured as edge (specially in a partitioned system), but the general
> case isn't pretty. I'll queue a workaround for now, and will look at
> addressing the more general issue.
>

JFYI, reverting the original patch mentioned by Shawn resolved the boot 
hang I was seeing on my Huawei D03. My D05 was fine without the revert.

Thanks,
John

> Thanks,
>
> 	M.
>
Marc Zyngier March 20, 2018, 4:06 p.m. UTC | #7
Hi John,

On 20/03/18 15:52, John Garry wrote:
> On 20/03/2018 13:34, Marc Zyngier wrote:
>> On 20/03/18 10:13, JeffyChen wrote:
>>> Hi Shawn,
>>>
>>> On 03/20/2018 05:39 PM, Shawn Lin wrote:
>>>>
>>>>
>>>> Also another patch warns a lot when booting the kernel. Is there
>>>> anything else I could do to let it go? Seems I am using broken
>>>> dts for requesting IRQ_TYPE_NONE there?
>>>
>>> could be:
>>> https://github.com/torvalds/linux/blob/master/drivers/irqchip/irq-gic-v3.c#L1145
>>>
>>> 		struct irq_fwspec ppi_fwspec = {
>>> 			.fwnode		= gic_data.fwnode,
>>> 			.param_count	= 3,
>>> 			.param		= {
>>> 				[0]	= 1,
>>> 				[1]	= i,
>>> 				[2]	= IRQ_TYPE_NONE, <--
>>> 			},
>>> 		};
>>>
>>> 		irq = irq_create_fwspec_mapping(&ppi_fwspec);
>>
>> Probably is. Caught at my own game, fun! ;-)
>>
>> Turning that NONE into LEVEL will work, as there is no known PPIs
>> configured as edge (specially in a partitioned system), but the general
>> case isn't pretty. I'll queue a workaround for now, and will look at
>> addressing the more general issue.
>>
> 
> JFYI, reverting the original patch mentioned by Shawn resolved the boot 
> hang I was seeing on my Huawei D03. My D05 was fine without the revert.

I guess your D05 doesn't set SCR_EL3.FIQ, meaning that no Group-0
interrupts can reach the firmware. Hopefully it doesn't need them.

Would you mind testing the last patch I posted earlier today[1]?

Thanks,

	M.

[1] https://www.spinics.net/lists/arm-kernel/msg642440.html
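
As a rough illustration of how software might find out whether Group 0
is under EL3 control, the sketch below models the non-secure view of the
priority mask register. It is a simplified model written for this note,
not code from the patch in [1]: struct pmr_model, ns_write_pmr(),
ns_read_pmr() and probe_group0() are all hypothetical names. The
assumption is that, with SCR_EL3.FIQ set, a non-secure priority write is
shifted right by one with bit 7 forced, and values below 0x80 read back
as zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of one CPU interface's priority mask register. */
struct pmr_model {
	uint8_t pmr;		/* value as held by the hardware */
	int pribits;		/* implemented priority bits, 4..8 */
	bool scr_el3_fiq;	/* true if Group 0 is owned by EL3 */
};

/* Non-secure write: unimplemented low-order bits are dropped. */
static void ns_write_pmr(struct pmr_model *m, uint8_t val)
{
	uint8_t impl_mask = (uint8_t)(0xff << (8 - m->pribits));

	if (m->scr_el3_fiq)
		val = (uint8_t)((val >> 1) | 0x80);	/* forced into NS range */

	m->pmr = val & impl_mask;
}

/* Non-secure read: undo the shift, or return 0 for secure-only values. */
static uint8_t ns_read_pmr(const struct pmr_model *m)
{
	if (!m->scr_el3_fiq)
		return m->pmr;
	if (m->pmr < 0x80)
		return 0;
	return (uint8_t)(m->pmr << 1);
}

/*
 * Write the highest non-zero priority and read it back. If the shift
 * swallowed the only set bit, the read returns 0 and Group 0 is
 * considered out of reach.
 */
static bool probe_group0(struct pmr_model *m)
{
	uint8_t old = m->pmr;
	uint8_t val;

	ns_write_pmr(m, (uint8_t)(1u << (8 - m->pribits)));
	val = ns_read_pmr(m);
	m->pmr = old;		/* restore the previous mask */

	return val != 0;
}
```

In this model a board that sets SCR_EL3.FIQ makes probe_group0() return
false, so the AP0Rn writes would be skipped, while a board that leaves
it clear returns true.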
John Garry March 20, 2018, 4:52 p.m. UTC | #8
On 20/03/2018 16:06, Marc Zyngier wrote:
> Hi John,
>
> On 20/03/18 15:52, John Garry wrote:
>> On 20/03/2018 13:34, Marc Zyngier wrote:
>>> On 20/03/18 10:13, JeffyChen wrote:
>>>> Hi Shawn,
>>>>
>>>> On 03/20/2018 05:39 PM, Shawn Lin wrote:
>>>>>
>>>>>
>>>>> Also another patch warns a lot when booting the kernel. Is there
>>>>> anything else I could do to let it go? Seems I am using broken
>>>>> dts for requesting IRQ_TYPE_NONE there?
>>>>
>>>> could be:
>>>> https://github.com/torvalds/linux/blob/master/drivers/irqchip/irq-gic-v3.c#L1145
>>>>
>>>> 		struct irq_fwspec ppi_fwspec = {
>>>> 			.fwnode		= gic_data.fwnode,
>>>> 			.param_count	= 3,
>>>> 			.param		= {
>>>> 				[0]	= 1,
>>>> 				[1]	= i,
>>>> 				[2]	= IRQ_TYPE_NONE, <--
>>>> 			},
>>>> 		};
>>>>
>>>> 		irq = irq_create_fwspec_mapping(&ppi_fwspec);
>>>
>>> Probably is. Caught at my own game, fun! ;-)
>>>
>>> Turning that NONE into LEVEL will work, as there is no known PPIs
>>> configured as edge (specially in a partitioned system), but the general
>>> case isn't pretty. I'll queue a workaround for now, and will look at
>>> addressing the more general issue.
>>>
>>
>> JFYI, reverting the original patch mentioned by Shawn resolved the boot
>> hang I was seeing on my Huawei D03. My D05 was fine without the revert.
>
> I guess your D05 doesn't set SCR_EL3.FIQ, meaning that no Group-0
> interrupts can reach the firmware. Hopefully it doesn't need them.
>
> Would you mind testing the last patch I posted earlier today[1]?
>

Hi Marc,

Yes, [1] allows my D03 to boot. D05 is still ok.

If you're going to send this same patch to the list then feel free to add:
Tested-by: John Garry <john.garry@huawei.com>

Cheers,
John

> Thanks,
>
> 	M.
>
> [1] https://www.spinics.net/lists/arm-kernel/msg642440.html
>
Marc Zyngier March 20, 2018, 5 p.m. UTC | #9
On 20/03/18 16:52, John Garry wrote:
> On 20/03/2018 16:06, Marc Zyngier wrote:
>> Hi John,
>>
>> On 20/03/18 15:52, John Garry wrote:
>>> On 20/03/2018 13:34, Marc Zyngier wrote:
>>>> On 20/03/18 10:13, JeffyChen wrote:
>>>>> Hi Shawn,
>>>>>
>>>>> On 03/20/2018 05:39 PM, Shawn Lin wrote:
>>>>>>
>>>>>>
>>>>>> Also another patch warns a lot when booting the kernel. Is there
>>>>>> anything else I could do to let it go? Seems I am using broken
>>>>>> dts for requesting IRQ_TYPE_NONE there?
>>>>>
>>>>> could be:
>>>>> https://github.com/torvalds/linux/blob/master/drivers/irqchip/irq-gic-v3.c#L1145
>>>>>
>>>>> 		struct irq_fwspec ppi_fwspec = {
>>>>> 			.fwnode		= gic_data.fwnode,
>>>>> 			.param_count	= 3,
>>>>> 			.param		= {
>>>>> 				[0]	= 1,
>>>>> 				[1]	= i,
>>>>> 				[2]	= IRQ_TYPE_NONE, <--
>>>>> 			},
>>>>> 		};
>>>>>
>>>>> 		irq = irq_create_fwspec_mapping(&ppi_fwspec);
>>>>
>>>> Probably is. Caught at my own game, fun! ;-)
>>>>
>>>> Turning that NONE into LEVEL will work, as there is no known PPIs
>>>> configured as edge (specially in a partitioned system), but the general
>>>> case isn't pretty. I'll queue a workaround for now, and will look at
>>>> addressing the more general issue.
>>>>
>>>
>>> JFYI, reverting the original patch mentioned by Shawn resolved the boot
>>> hang I was seeing on my Huawei D03. My D05 was fine without the revert.
>>
>> I guess your D05 doesn't set SCR_EL3.FIQ, meaning that no Group-0
>> interrupts can reach the firmware. Hopefully it doesn't need them.
>>
>> Would you mind testing the last patch I posted earlier today[1]?
>>
> 
> Hi Marc,
> 
> Yes, [1] allows my D03 to boot. D05 is still ok.
> 
> If you're going to send this same patch to the list then feel free to add:
> Tested-by: John Garry <john.garry@huawei.com>

OK, thanks for that. There is still a nit on my Chromebook, and I cannot
yet explain why.

Digging.

	M.
Shawn Lin March 21, 2018, 12:34 a.m. UTC | #10
Hi Marc

On 2018/3/21 1:00, Marc Zyngier wrote:
> On 20/03/18 16:52, John Garry wrote:
>> On 20/03/2018 16:06, Marc Zyngier wrote:
>>> Hi John,
>>>
>>> On 20/03/18 15:52, John Garry wrote:
>>>> On 20/03/2018 13:34, Marc Zyngier wrote:
>>>>> On 20/03/18 10:13, JeffyChen wrote:
>>>>>> Hi Shawn,
>>>>>>
>>>>>> On 03/20/2018 05:39 PM, Shawn Lin wrote:
>>>>>>>
>>>>>>>
>>>>>>> Also another patch warns a lot when booting the kernel. Is there
>>>>>>> anything else I could do to let it go? Seems I am using broken
>>>>>>> dts for requesting IRQ_TYPE_NONE there?
>>>>>>
>>>>>> could be:
>>>>>> https://github.com/torvalds/linux/blob/master/drivers/irqchip/irq-gic-v3.c#L1145
>>>>>>
>>>>>> 		struct irq_fwspec ppi_fwspec = {
>>>>>> 			.fwnode		= gic_data.fwnode,
>>>>>> 			.param_count	= 3,
>>>>>> 			.param		= {
>>>>>> 				[0]	= 1,
>>>>>> 				[1]	= i,
>>>>>> 				[2]	= IRQ_TYPE_NONE, <--
>>>>>> 			},
>>>>>> 		};
>>>>>>
>>>>>> 		irq = irq_create_fwspec_mapping(&ppi_fwspec);
>>>>>
>>>>> Probably is. Caught at my own game, fun! ;-)
>>>>>
>>>>> Turning that NONE into LEVEL will work, as there is no known PPIs
>>>>> configured as edge (specially in a partitioned system), but the general
>>>>> case isn't pretty. I'll queue a workaround for now, and will look at
>>>>> addressing the more general issue.
>>>>>
>>>>
>>>> JFYI, reverting the original patch mentioned by Shawn resolved the boot
>>>> hang I was seeing on my Huawei D03. My D05 was fine without the revert.
>>>
>>> I guess your D05 doesn't set SCR_EL3.FIQ, meaning that no Group-0
>>> interrupts can reach the firmware. Hopefully it doesn't need them.
>>>
>>> Would you mind testing the last patch I posted earlier today[1]?
>>>
>>
>> Hi Marc,
>>
>> Yes, [1] allows my D03 to boot. D05 is still ok.
>>
>> If you're going to send this same patch to the list then feel free to add:
>> Tested-by: John Garry <john.garry@huawei.com>
> 
> OK, thanks for that. There is still a nit on my Chromebook, and I cannot
> yet explain why.

It works fine now with your updated patch.

For rk3399-sapphire-excavator board,
Tested-by: Shawn Lin <shawn.lin@rock-chips.com>


@Jeffy

Could you help test Marc's latest patch on your RK3399 kevin/Gru or
whatever Chromebook?

> 
> Digging.
> 
> 	M.
>
Jeffy Chen March 21, 2018, 6:10 a.m. UTC | #11
Hi Shawn,

On 03/21/2018 08:34 AM, Shawn Lin wrote:
>
> @Jeffy
>
> Could you help test Marc's latest patch on your RK3399 kevin/Gru or
> whatever Chromebook?
tested on my chromebook kevin, it works with that patch.
Marc Zyngier March 21, 2018, 8:52 a.m. UTC | #12
Hi Jeffy,

On 21/03/18 06:10, JeffyChen wrote:
> Hi Shawn,
> 
> On 03/21/2018 08:34 AM, Shawn Lin wrote:
>>
>> @Jeffy
>>
>> Could you help test Marc's latest patch on your RK3399 kevin/Gru or
>> whatever Chromebook?
> tested on my chromebook kevin, it works with that patch.

This is very odd. It completely fails on mine, which is annoying.
Do you have any special firmware? Or the stock firmware?

Thanks,

	M.
Jeffy Chen March 21, 2018, 9:18 a.m. UTC | #13
Hi Marc,

On 03/21/2018 04:52 PM, Marc Zyngier wrote:
> Hi Jeffy,
>
> On 21/03/18 06:10, JeffyChen wrote:
>> Hi Shawn,
>>
>> On 03/21/2018 08:34 AM, Shawn Lin wrote:
>>>
>>> @Jeffy
>>>
>>> Could you help test Marc's latest patch on your RK3399 kevin/Gru or
>>> whatever Chromebook?
>> tested on my chromebook kevin, it works with that patch.
>
> This is very odd. It completely fails on mine, which is annoying.
> Do you have any special firmware? Or the stock firmware?
>

hmmm, my kevin's firmware is official:
localhost / # cat /var/log/bios_info.txt
version              | Google_Kevin.8785.241.0
ro bios version      | Google_Kevin.8785.241.0

I also tested it on my dru (another RK3399 Chromebook, forced to use
kevin's dtb), and it works too (although it dies at the end, after i2c is probed).


my kernel:
a9be5aeb6a31 irqchip/gic-v3: Check availability of Group0 before resetting AP0Rn
170b1830a3e7 Add linux-next specific files for 20180320
90385ca35f2f Merge branch 'akpm/master'
0a54b4dd1975 sparc64: NG4 memset 32 bits overflow

my kernel image and config:
https://emailattachment.net/sites/default/files/files/public/kernel.img
https://emailattachment.net/sites/default/files/files/public/config.config



> Thanks,
>
> 	M.
>
Marc Zyngier March 21, 2018, 9:33 a.m. UTC | #14
On 21/03/18 09:18, JeffyChen wrote:
> Hi Marc,
> 
> On 03/21/2018 04:52 PM, Marc Zyngier wrote:
>> Hi Jeffy,
>>
>> On 21/03/18 06:10, JeffyChen wrote:
>>> Hi Shawn,
>>>
>>> On 03/21/2018 08:34 AM, Shawn Lin wrote:
>>>>
>>>> @Jeffy
>>>>
>>>> Could you help test Marc's latest patch on your RK3399 kevin/Gru or
>>>> whatever Chromebook?
>>> tested on my chromebook kevin, it works with that patch.
>>
>> This is very odd. It completely fails on mine, which is annoying.
>> Do you have any special firmware? Or the stock firmware?
>>
> 
> hmmm, my kevin's firmware is official:
> localhost / # cat /var/log/bios_info.txt
> version              | Google_Kevin.8785.241.0
> ro bios version      | Google_Kevin.8785.241.0

(booting into ChromeOS) Definitely not what I have on mine:
8785.220.0 as the runtime version, and 8785.94.7 as the RO one.

This is a retail machine, bought in November. So what you have cannot be
the stock firmware.

What changes related to the GIC have been applied to this FW?

Thanks,

	M.
Jeffy Chen March 21, 2018, 12:04 p.m. UTC | #15
Hi Marc,

On 03/21/2018 05:33 PM, Marc Zyngier wrote:
>>>>> @Jeffy
>>>>>
>>>>> Could you help test Marc's latest patch on your RK3399 kevin/Gru or
>>>>> whatever Chromebook?
>>>> tested on my chromebook kevin, it works with that patch.
>>>
>>> This is very odd. It completely fails on mine, which is annoying.
>>> Do you have any special firmware? Or the stock firmware?
>>>
>>
>> hmmm, my kevin's firmware is official:
>> localhost / # cat /var/log/bios_info.txt
>> version              | Google_Kevin.8785.241.0
>> ro bios version      | Google_Kevin.8785.241.0
> (booting into ChromeOS) Definitely not what I have on mine:
> 8785.220.0 as the runtime version, and 8785.94.7 as the RO one.
>
> This is a retail machine, bought in November. So what you have cannot be
> the stock firmware.
>
> What changes related to the GIC have been applied to this FW?


I flashed my firmware back to 8785.220.0, and it still works:
localhost tmp # head -2 /var/log/bios_info.txt
version              | Google_Kevin.8785.220.0
ro bios version      | Google_Kevin.8785.220.0


dmesg:
https://emailattachment.net/sites/default/files/files/public/dmesg.dmesg

firmware 8785.220.0:
https://emailattachment.net/sites/default/files/files/public/image.dev.bin

>
> Thanks,
>
> 	M.
> -- Jazz is not dead. It just smells funny...
Marc Zyngier March 21, 2018, 12:20 p.m. UTC | #16
Hi Jeffy,

On 21/03/18 12:04, JeffyChen wrote:
> Hi Marc,
> 
> On 03/21/2018 05:33 PM, Marc Zyngier wrote:
>>>>>>>>>> @Jeffy
>>>>>>>>>>
>>>>>>>>>> Could you help test Marc's latest patch on your RK3399 kevin/Gru or
>>>>>>>>>> whatever Chromebook?
>>>>>>>> tested on my chromebook kevin, it works with that patch.
>>>>>>
>>>>>> This is very odd. It completely fails on mine, which is annoying.
>>>>>> Do you have any special firmware? Or the stock firmware?
>>>>>>
>>>>
>>>> hmmm, my kevin's firmware is official:
>>>> localhost / # cat /var/log/bios_info.txt
>>>> version              | Google_Kevin.8785.241.0
>>>> ro bios version      | Google_Kevin.8785.241.0
>> (booting into ChromeOS) Definitely not what I have on mine:
>> 8785.220.0 as the runtime version, and 8785.94.7 as the RO one.
>>
>> This is a retail machine, bought in November. So what you have cannot be
>> the stock firmware.
>>
>> What changes related to the GIC have been applied to this FW?
> 
> 
> i tried to flash my firmware back to 8785.220.0, and still works...:
> localhost tmp # head -2 /var/log/bios_info.txt
> version              | Google_Kevin.8785.220.0
> ro bios version      | Google_Kevin.8785.220.0
Interesting. That patch was breaking my machine yesterday, and it
doesn't seem to break today. Still investigating.

	M.
Patch

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 5bb7bb22f1c1..f8ff43b1d4f8 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -570,16 +570,12 @@  static void gic_cpu_sys_reg_init(void)
 	switch(val + 1) {
 	case 8:
 	case 7:
-		write_gicreg(0, ICC_AP0R3_EL1);
 		write_gicreg(0, ICC_AP1R3_EL1);
-		write_gicreg(0, ICC_AP0R2_EL1);
 		write_gicreg(0, ICC_AP1R2_EL1);
 	case 6:
-		write_gicreg(0, ICC_AP0R1_EL1);
 		write_gicreg(0, ICC_AP1R1_EL1);
 	case 5:
 	case 4:
-		write_gicreg(0, ICC_AP0R0_EL1);
 		write_gicreg(0, ICC_AP1R0_EL1);
 	}