diff mbox series

GICv3: Add restart handler to detach CPU from GICv3

Message ID 20221216162128.10808-1-joakim.tjernlund@infinera.com (mailing list archive)
State New, archived
Headers show
Series GICv3: Add restart handler to detach CPU from GICv3 | expand

Commit Message

Joakim Tjernlund Dec. 16, 2022, 4:21 p.m. UTC
Needed for reboot without resetting the whole GIC

Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>
---
 drivers/irqchip/irq-gic-v3.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

Comments

Joakim Tjernlund Jan. 3, 2023, 4:27 p.m. UTC | #1
On Fri, 2022-12-16 at 17:21 +0100, Joakim Tjernlund wrote:

Ping?

> Needed for reboot without resetting the whole GIC
> 
> Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>
> ---
>  drivers/irqchip/irq-gic-v3.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index dd7e7f061e8f..1989ea3d2db4 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -23,6 +23,7 @@
>  #include <linux/irqchip/arm-gic-common.h>
>  #include <linux/irqchip/arm-gic-v3.h>
>  #include <linux/irqchip/irq-partition-percpu.h>
> +#include <linux/reboot.h>
>  
>  #include <asm/cputype.h>
>  #include <asm/exception.h>
> @@ -1359,6 +1360,22 @@ static void gic_cpu_pm_init(void)
>  static inline void gic_cpu_pm_init(void) { }
>  #endif /* CONFIG_CPU_PM */
>  
> +static int gicv3_restart_notify(struct notifier_block *nb,
> +                              unsigned long mode, void *cmd)
> +{
> +       if (gic_dist_security_disabled()) {
> +               gic_write_grpen1(0);
> +               gic_enable_redist(false);
> +       }
> +
> +       return NOTIFY_DONE;
> +}
> +
> +static struct notifier_block gicv3_restart_nb = {
> +       .notifier_call = gicv3_restart_notify,
> +       .priority = 255, /* Call last */
> +};
> +
>  static struct irq_chip gic_chip = {
>  	.name			= "GICv3",
>  	.irq_mask		= gic_mask_irq,
> @@ -1849,6 +1866,7 @@ static int __init gic_init_bases(void __iomem *dist_base,
>  	gic_cpu_init();
>  	gic_smp_init();
>  	gic_cpu_pm_init();
> +	register_restart_handler(&gicv3_restart_nb);
>  
>  	if (gic_dist_supports_lpis()) {
>  		its_init(handle, &gic_data.rdists, gic_data.domain);
Mark Rutland Jan. 4, 2023, 4:04 p.m. UTC | #2
On Tue, Jan 03, 2023 at 04:27:07PM +0000, Joakim Tjernlund wrote:
> On Fri, 2022-12-16 at 17:21 +0100, Joakim Tjernlund wrote:
> 
> Ping?

To whom?

You don't appeared to have Cc'd any relevant maintainer, and people are still
on holiday, so it's extremely likely this will be missed.

For the maintainer, please use scripts/get_maintainer.pl, e.g.

| [mark@lakrids:~/src/linux]% ./scripts/get_maintainer.pl -f drivers/irqchip/irq-gic-v3.c
| Thomas Gleixner <tglx@linutronix.de> (maintainer:IRQCHIP DRIVERS)
| Marc Zyngier <maz@kernel.org> (maintainer:IRQCHIP DRIVERS)
| linux-kernel@vger.kernel.org (open list:IRQCHIP DRIVERS)

Note: I've Cc'd Marc, who wrote the GICv3 driver.

> > Needed for reboot without resetting the whole GIC

This doesn't really explain what you're trying to do nor why.

Why do you need to "reboot without resetting the whole GIC" ?

Do you encounter a problem if we try to reset the whole GIC?

Is this for kexec?

Is this for some use-case enabled by out-of-tree code?

Thanks,
Mark.

> > 
> > Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>
> > ---
> >  drivers/irqchip/irq-gic-v3.c | 18 ++++++++++++++++++
> >  1 file changed, 18 insertions(+)
> > 
> > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> > index dd7e7f061e8f..1989ea3d2db4 100644
> > --- a/drivers/irqchip/irq-gic-v3.c
> > +++ b/drivers/irqchip/irq-gic-v3.c
> > @@ -23,6 +23,7 @@
> >  #include <linux/irqchip/arm-gic-common.h>
> >  #include <linux/irqchip/arm-gic-v3.h>
> >  #include <linux/irqchip/irq-partition-percpu.h>
> > +#include <linux/reboot.h>
> >  
> >  #include <asm/cputype.h>
> >  #include <asm/exception.h>
> > @@ -1359,6 +1360,22 @@ static void gic_cpu_pm_init(void)
> >  static inline void gic_cpu_pm_init(void) { }
> >  #endif /* CONFIG_CPU_PM */
> >  
> > +static int gicv3_restart_notify(struct notifier_block *nb,
> > +                              unsigned long mode, void *cmd)
> > +{
> > +       if (gic_dist_security_disabled()) {
> > +               gic_write_grpen1(0);
> > +               gic_enable_redist(false);
> > +       }
> > +
> > +       return NOTIFY_DONE;
> > +}
> > +
> > +static struct notifier_block gicv3_restart_nb = {
> > +       .notifier_call = gicv3_restart_notify,
> > +       .priority = 255, /* Call last */
> > +};
> > +
> >  static struct irq_chip gic_chip = {
> >  	.name			= "GICv3",
> >  	.irq_mask		= gic_mask_irq,
> > @@ -1849,6 +1866,7 @@ static int __init gic_init_bases(void __iomem *dist_base,
> >  	gic_cpu_init();
> >  	gic_smp_init();
> >  	gic_cpu_pm_init();
> > +	register_restart_handler(&gicv3_restart_nb);
> >  
> >  	if (gic_dist_supports_lpis()) {
> >  		its_init(handle, &gic_data.rdists, gic_data.domain);
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Marc Zyngier Jan. 4, 2023, 4:50 p.m. UTC | #3
On Wed, 04 Jan 2023 16:04:14 +0000,
Mark Rutland <mark.rutland@arm.com> wrote:
> 
> On Tue, Jan 03, 2023 at 04:27:07PM +0000, Joakim Tjernlund wrote:
> > On Fri, 2022-12-16 at 17:21 +0100, Joakim Tjernlund wrote:
> > 
> > Ping?
> 
> To whom?
> 
> You don't appeared to have Cc'd any relevant maintainer, and people are still
> on holiday, so it's extremely likely this will be missed.

That, plus nobody reads the list looking for this sort of things.

> 
> For the maintainer, please use scripts/get_maintainer.pl, e.g.
> 
> | [mark@lakrids:~/src/linux]% ./scripts/get_maintainer.pl -f drivers/irqchip/irq-gic-v3.c
> | Thomas Gleixner <tglx@linutronix.de> (maintainer:IRQCHIP DRIVERS)
> | Marc Zyngier <maz@kernel.org> (maintainer:IRQCHIP DRIVERS)
> | linux-kernel@vger.kernel.org (open list:IRQCHIP DRIVERS)
> 
> Note: I've Cc'd Marc, who wrote the GICv3 driver.

Cheers Mark, much appreciated.

> 
> > > Needed for reboot without resetting the whole GIC
> 
> This doesn't really explain what you're trying to do nor why.
> 
> Why do you need to "reboot without resetting the whole GIC" ?
> 
> Do you encounter a problem if we try to reset the whole GIC?
> 
> Is this for kexec?
> 
> Is this for some use-case enabled by out-of-tree code?

All valid questions. This smells of a terrible hack...

The interesting aspect is that this is only done when DS=1, probably
meaning that they are doing this in a VM. it also rely on some
(unbounded) UNPRED behaviour as ProcessorSleep is entered without
any consideration for Group0... Good luck with that.

Anyway, I don't think we want any of this stuff. Certainly not without
a cast-iron justification.

	M.
Joakim Tjernlund Jan. 4, 2023, 5:23 p.m. UTC | #4
On Wed, 2023-01-04 at 16:50 +0000, Marc Zyngier wrote:
> On Wed, 04 Jan 2023 16:04:14 +0000,
> Mark Rutland <mark.rutland@arm.com> wrote:
> > 
> > On Tue, Jan 03, 2023 at 04:27:07PM +0000, Joakim Tjernlund wrote:
> > > On Fri, 2022-12-16 at 17:21 +0100, Joakim Tjernlund wrote:
> > > 
> > > Ping?
> > 
> > To whom?
> > 
> > You don't appeared to have Cc'd any relevant maintainer, and people are still
> > on holiday, so it's extremely likely this will be missed.
> 
> That, plus nobody reads the list looking for this sort of things.
> 
> > 
> > For the maintainer, please use scripts/get_maintainer.pl, e.g.
> > 
> > > [mark@lakrids:~/src/linux]% ./scripts/get_maintainer.pl -f drivers/irqchip/irq-gic-v3.c
> > > Thomas Gleixner <tglx@linutronix.de> (maintainer:IRQCHIP DRIVERS)
> > > Marc Zyngier <maz@kernel.org> (maintainer:IRQCHIP DRIVERS)
> > > linux-kernel@vger.kernel.org (open list:IRQCHIP DRIVERS)
> > 
> > Note: I've Cc'd Marc, who wrote the GICv3 driver.
> 
> Cheers Mark, much appreciated.

Sorry for missing that extra maintainer CC:

> 
> > 
> > > > Needed for reboot without resetting the whole GIC
> > 
> > This doesn't really explain what you're trying to do nor why.
> > 
> > Why do you need to "reboot without resetting the whole GIC" ?
> > 
> > Do you encounter a problem if we try to reset the whole GIC?
> > 
> > Is this for kexec?
> > 
> > Is this for some use-case enabled by out-of-tree code?
> 
> All valid questions. This smells of a terrible hack...

Yes, all god Q's.

> 
> The interesting aspect is that this is only done when DS=1, probably
> meaning that they are doing this in a VM. it also rely on some

Nope, on our custom target.

> (unbounded) UNPRED behaviour as ProcessorSleep is entered without
> any consideration for Group0... Good luck with that.

hmm, I am doing the same as PM does which also needs DS=1 so I figured
this was uncontroversial. 

> 
> Anyway, I don't think we want any of this stuff. Certainly not without
> a cast-iron justification.

We use several cores but only one run Linux so a Linux reboot will
only reset its own core.

Without this patch I either need to reset the GIC as part of the reboot or
I get RWP timeout when linux starts again. Resetting the GIC kills IRQ on the other cores
for a long time which is unwanted.

The RWP timeout comes from lost HW handshake between core and GIC.
Is there another way to regain the HW handshake ?

 Jocke
Marc Zyngier Jan. 4, 2023, 6:48 p.m. UTC | #5
On Wed, 04 Jan 2023 17:23:42 +0000,
Joakim Tjernlund <Joakim.Tjernlund@infinera.com> wrote:
> 
> On Wed, 2023-01-04 at 16:50 +0000, Marc Zyngier wrote:
> > On Wed, 04 Jan 2023 16:04:14 +0000,
> > Mark Rutland <mark.rutland@arm.com> wrote:
> > > 
> > > On Tue, Jan 03, 2023 at 04:27:07PM +0000, Joakim Tjernlund wrote:
> > > > On Fri, 2022-12-16 at 17:21 +0100, Joakim Tjernlund wrote:
> > > > 
> > > > Ping?
> > > 
> > > To whom?
> > > 
> > > You don't appeared to have Cc'd any relevant maintainer, and people are still
> > > on holiday, so it's extremely likely this will be missed.
> > 
> > That, plus nobody reads the list looking for this sort of things.
> > 
> > > 
> > > For the maintainer, please use scripts/get_maintainer.pl, e.g.
> > > 
> > > > [mark@lakrids:~/src/linux]% ./scripts/get_maintainer.pl -f drivers/irqchip/irq-gic-v3.c
> > > > Thomas Gleixner <tglx@linutronix.de> (maintainer:IRQCHIP DRIVERS)
> > > > Marc Zyngier <maz@kernel.org> (maintainer:IRQCHIP DRIVERS)
> > > > linux-kernel@vger.kernel.org (open list:IRQCHIP DRIVERS)
> > > 
> > > Note: I've Cc'd Marc, who wrote the GICv3 driver.
> > 
> > Cheers Mark, much appreciated.
> 
> Sorry for missing that extra maintainer CC:
> 
> > 
> > > 
> > > > > Needed for reboot without resetting the whole GIC
> > > 
> > > This doesn't really explain what you're trying to do nor why.
> > > 
> > > Why do you need to "reboot without resetting the whole GIC" ?
> > > 
> > > Do you encounter a problem if we try to reset the whole GIC?
> > > 
> > > Is this for kexec?
> > > 
> > > Is this for some use-case enabled by out-of-tree code?
> > 
> > All valid questions. This smells of a terrible hack...
> 
> Yes, all god Q's.

And no answer?

> 
> > 
> > The interesting aspect is that this is only done when DS=1, probably
> > meaning that they are doing this in a VM. it also rely on some
> 
> Nope, on our custom target.

And you run with DS=1? On bare metal? Humpf...

> 
> > (unbounded) UNPRED behaviour as ProcessorSleep is entered without
> > any consideration for Group0... Good luck with that.
> 
> hmm, I am doing the same as PM does which also needs DS=1 so I figured
> this was uncontroversial.

I seriously doubt anyone is actually using that code in anger. I
always found it dodgy, and I'm pretty sure it is totally broken.

> 
> > 
> > Anyway, I don't think we want any of this stuff. Certainly not without
> > a cast-iron justification.
> 
> We use several cores but only one run Linux so a Linux reboot will
> only reset its own core.

And what happens with the distributor? One of the key assumption of
the GIC architecture is that there is only one operating system in
control of it, the whole of it. The only way to share it is by
virtualising it. Shades of Jailhouse...

> 
> Without this patch I either need to reset the GIC as part of the
> reboot or I get RWP timeout when linux starts again. Resetting the
> GIC kills IRQ on the other cores for a long time which is unwanted.

But that's what happens anyway when Linux boots (we reinitialise the
whole distributor). So what is this about?

> 
> The RWP timeout comes from lost HW handshake between core and GIC.
> Is there another way to regain the HW handshake ?

That's the firmware's job. But overall, getting these timeouts means
your GIC has locked up. It would be more interesting to understand
*why* you get in that situation, which I presume is due to the way the
driver initialises itself.

Assuming you use kexec to reboot your Linux instance, does the
following help (totally untested)?

	M.

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 997104d4338e..0db35a07ffdb 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -1186,12 +1186,25 @@ static int gic_dist_supports_lpis(void)
 static void gic_cpu_init(void)
 {
 	void __iomem *rbase;
+	bool g0;
 	int i;
 
 	/* Register ourselves with the rest of the world */
 	if (gic_populate_rdist())
 		return;
 
+	g0 = gic_has_group0();
+
+	if (read_gicreg(ICC_IGRPEN1_EL1) ||
+	    (g0 && read_gicreg(ICC_IGRPEN0_EL1))) {
+		if (g0)
+			write_gicreg(0, ICC_IGRPEN0_EL1);
+
+		write_gicreg(0, ICC_IGRPEN1_EL1);
+		isb();
+		gic_enable_redist(false);
+	}
+
 	gic_enable_redist(true);
 
 	WARN((gic_data.ppi_nr > 16 || GIC_ESPI_NR != 0) &&
Joakim Tjernlund Jan. 4, 2023, 7:52 p.m. UTC | #6
On Wed, 2023-01-04 at 18:48 +0000, Marc Zyngier wrote:
> On Wed, 04 Jan 2023 17:23:42 +0000,
> Joakim Tjernlund <Joakim.Tjernlund@infinera.com> wrote:
> > 
> > On Wed, 2023-01-04 at 16:50 +0000, Marc Zyngier wrote:
> > > On Wed, 04 Jan 2023 16:04:14 +0000,
> > > Mark Rutland <mark.rutland@arm.com> wrote:
> > > > 
> > > > On Tue, Jan 03, 2023 at 04:27:07PM +0000, Joakim Tjernlund wrote:
> > > > > On Fri, 2022-12-16 at 17:21 +0100, Joakim Tjernlund wrote:
> > > > > 
> > > > > Ping?
> > > > 
> > > > To whom?
> > > > 
> > > > You don't appeared to have Cc'd any relevant maintainer, and people are still
> > > > on holiday, so it's extremely likely this will be missed.
> > > 
> > > That, plus nobody reads the list looking for this sort of things.
> > > 
> > > > 
> > > > For the maintainer, please use scripts/get_maintainer.pl, e.g.
> > > > 
> > > > > [mark@lakrids:~/src/linux]% ./scripts/get_maintainer.pl -f drivers/irqchip/irq-gic-v3.c
> > > > > Thomas Gleixner <tglx@linutronix.de> (maintainer:IRQCHIP DRIVERS)
> > > > > Marc Zyngier <maz@kernel.org> (maintainer:IRQCHIP DRIVERS)
> > > > > linux-kernel@vger.kernel.org (open list:IRQCHIP DRIVERS)
> > > > 
> > > > Note: I've Cc'd Marc, who wrote the GICv3 driver.
> > > 
> > > Cheers Mark, much appreciated.
> > 
> > Sorry for missing that extra maintainer CC:
> > 
> > > 
> > > > 
> > > > > > Needed for reboot without resetting the whole GIC
> > > > 
> > > > This doesn't really explain what you're trying to do nor why.
> > > > 
> > > > Why do you need to "reboot without resetting the whole GIC" ?
> > > > 
> > > > Do you encounter a problem if we try to reset the whole GIC?
> > > > 
> > > > Is this for kexec?
> > > > 
> > > > Is this for some use-case enabled by out-of-tree code?
> > > 
> > > All valid questions. This smells of a terrible hack...
> > 
> > Yes, all god Q's.
> 
> And no answer?

No kexec, out of tree kernel with minor mods
> 
> > 
> > > 
> > > The interesting aspect is that this is only done when DS=1, probably
> > > meaning that they are doing this in a VM. it also rely on some
> > 
> > Nope, on our custom target.
> 
> And you run with DS=1? On bare metal? Humpf...

Yes, we control all SW running on this target.

> 
> > 
> > > (unbounded) UNPRED behaviour as ProcessorSleep is entered without
> > > any consideration for Group0... Good luck with that.

No Group0 IRQ used(or so I think but I could be mistaken)

> > 
> > hmm, I am doing the same as PM does which also needs DS=1 so I figured
> > this was uncontroversial.
> 
> I seriously doubt anyone is actually using that code in anger. I
> always found it dodgy, and I'm pretty sure it is totally broken.

I see

> 
> > 
> > > 
> > > Anyway, I don't think we want any of this stuff. Certainly not without
> > > a cast-iron justification.
> > 
> > We use several cores but only one run Linux so a Linux reboot will
> > only reset its own core.
> 
> And what happens with the distributor? One of the key assumption of
> the GIC architecture is that there is only one operating system in
> control of it, the whole of it. The only way to share it is by
> virtualising it. Shades of Jailhouse...

Distributor is not touched until Linux is going up again

> 
> > 
> > Without this patch I either need to reset the GIC as part of the
> > reboot or I get RWP timeout when linux starts again. Resetting the
> > GIC kills IRQ on the other cores for a long time which is unwanted.
> 
> But that's what happens anyway when Linux boots (we reinitialise the
> whole distributor). So what is this about?

At that point IRQ will only be lost shortly until the other cores are reloaded by Linux user space

> 
> > 
> > The RWP timeout comes from lost HW handshake between core and GIC.
> > Is there another way to regain the HW handshake ?
> 
> That's the firmware's job. But overall, getting these timeouts means
> your GIC has locked up. It would be more interesting to understand

FW here is u-boot starting from EL3

> *why* you get in that situation, which I presume is due to the way the
> driver initialises itself.
> 
> Assuming you use kexec to reboot your Linux instance, does the
> following help (totally untested)?

no kexec and no it did not help.
I get the first RWP here:
static void __init gic_dist_init(void)
{
	unsigned int i;
	u64 affinity;
	void __iomem *base = gic_data.dist_base;
	u32 val;

	/* Disable the distributor */
	writel_relaxed(0, base + GICD_CTLR);
	gic_dist_wait_for_rwp(); <------ RWP timeout

I guess it is impossible to recover GIC HW handshake once it is lost ?


> 
> 	M.
> 
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index 997104d4338e..0db35a07ffdb 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -1186,12 +1186,25 @@ static int gic_dist_supports_lpis(void)
>  static void gic_cpu_init(void)
>  {
>  	void __iomem *rbase;
> +	bool g0;
>  	int i;
>  
>  	/* Register ourselves with the rest of the world */
>  	if (gic_populate_rdist())
>  		return;
>  
> +	g0 = gic_has_group0();
> +
> +	if (read_gicreg(ICC_IGRPEN1_EL1) ||
> +	    (g0 && read_gicreg(ICC_IGRPEN0_EL1))) {
> +		if (g0)
> +			write_gicreg(0, ICC_IGRPEN0_EL1);
> +
> +		write_gicreg(0, ICC_IGRPEN1_EL1);
> +		isb();
> +		gic_enable_redist(false);
> +	}
> +
>  	gic_enable_redist(true);
>  
>  	WARN((gic_data.ppi_nr > 16 || GIC_ESPI_NR != 0) &&
>
Marc Zyngier Jan. 5, 2023, 10:37 a.m. UTC | #7
On Wed, 04 Jan 2023 19:52:33 +0000,
Joakim Tjernlund <Joakim.Tjernlund@infinera.com> wrote:
> 
> On Wed, 2023-01-04 at 18:48 +0000, Marc Zyngier wrote:
> > On Wed, 04 Jan 2023 17:23:42 +0000,
> > Joakim Tjernlund <Joakim.Tjernlund@infinera.com> wrote:
> > > 
> > > On Wed, 2023-01-04 at 16:50 +0000, Marc Zyngier wrote:
> > > > On Wed, 04 Jan 2023 16:04:14 +0000,
> > > > Mark Rutland <mark.rutland@arm.com> wrote:
> > > > > 
> > > > > On Tue, Jan 03, 2023 at 04:27:07PM +0000, Joakim Tjernlund wrote:
> > > > > > On Fri, 2022-12-16 at 17:21 +0100, Joakim Tjernlund wrote:
> > > > > > 
> > > > > > Ping?
> > > > > 
> > > > > To whom?
> > > > > 
> > > > > You don't appeared to have Cc'd any relevant maintainer, and people are still
> > > > > on holiday, so it's extremely likely this will be missed.
> > > > 
> > > > That, plus nobody reads the list looking for this sort of things.
> > > > 
> > > > > 
> > > > > For the maintainer, please use scripts/get_maintainer.pl, e.g.
> > > > > 
> > > > > > [mark@lakrids:~/src/linux]% ./scripts/get_maintainer.pl -f drivers/irqchip/irq-gic-v3.c
> > > > > > Thomas Gleixner <tglx@linutronix.de> (maintainer:IRQCHIP DRIVERS)
> > > > > > Marc Zyngier <maz@kernel.org> (maintainer:IRQCHIP DRIVERS)
> > > > > > linux-kernel@vger.kernel.org (open list:IRQCHIP DRIVERS)
> > > > > 
> > > > > Note: I've Cc'd Marc, who wrote the GICv3 driver.
> > > > 
> > > > Cheers Mark, much appreciated.
> > > 
> > > Sorry for missing that extra maintainer CC:
> > > 
> > > > 
> > > > > 
> > > > > > > Needed for reboot without resetting the whole GIC
> > > > > 
> > > > > This doesn't really explain what you're trying to do nor why.
> > > > > 
> > > > > Why do you need to "reboot without resetting the whole GIC" ?
> > > > > 
> > > > > Do you encounter a problem if we try to reset the whole GIC?
> > > > > 
> > > > > Is this for kexec?
> > > > > 
> > > > > Is this for some use-case enabled by out-of-tree code?
> > > > 
> > > > All valid questions. This smells of a terrible hack...
> > > 
> > > Yes, all god Q's.
> > 
> > And no answer?
> 
> No kexec, out of tree kernel with minor mods
> > 
> > > 
> > > > 
> > > > The interesting aspect is that this is only done when DS=1, probably
> > > > meaning that they are doing this in a VM. it also rely on some
> > > 
> > > Nope, on our custom target.
> > 
> > And you run with DS=1? On bare metal? Humpf...
> 
> Yes, we control all SW running on this target.
> 
> > 
> > > 
> > > > (unbounded) UNPRED behaviour as ProcessorSleep is entered without
> > > > any consideration for Group0... Good luck with that.
> 
> No Group0 IRQ used(or so I think but I could be mistaken)
> 
> > > 
> > > hmm, I am doing the same as PM does which also needs DS=1 so I figured
> > > this was uncontroversial.
> > 
> > I seriously doubt anyone is actually using that code in anger. I
> > always found it dodgy, and I'm pretty sure it is totally broken.
> 
> I see
> 
> > 
> > > 
> > > > 
> > > > Anyway, I don't think we want any of this stuff. Certainly not without
> > > > a cast-iron justification.
> > > 
> > > We use several cores but only one run Linux so a Linux reboot will
> > > only reset its own core.
> > 
> > And what happens with the distributor? One of the key assumption of
> > the GIC architecture is that there is only one operating system in
> > control of it, the whole of it. The only way to share it is by
> > virtualising it. Shades of Jailhouse...
> 
> Distributor is not touched until Linux is going up again
> 
> > 
> > > 
> > > Without this patch I either need to reset the GIC as part of the
> > > reboot or I get RWP timeout when linux starts again. Resetting the
> > > GIC kills IRQ on the other cores for a long time which is unwanted.
> > 
> > But that's what happens anyway when Linux boots (we reinitialise the
> > whole distributor). So what is this about?
> 
> At that point IRQ will only be lost shortly until the other cores
> are reloaded by Linux user space

How different is that from reseting the GIC altogether?

> 
> > 
> > > 
> > > The RWP timeout comes from lost HW handshake between core and GIC.
> > > Is there another way to regain the HW handshake ?
> > 
> > That's the firmware's job. But overall, getting these timeouts means
> > your GIC has locked up. It would be more interesting to understand
> 
> FW here is u-boot starting from EL3
> 
> > *why* you get in that situation, which I presume is due to the way the
> > driver initialises itself.
> > 
> > Assuming you use kexec to reboot your Linux instance, does the
> > following help (totally untested)?
> 
> no kexec and no it did not help.
> I get the first RWP here:
> static void __init gic_dist_init(void)
> {
> 	unsigned int i;
> 	u64 affinity;
> 	void __iomem *base = gic_data.dist_base;
> 	u32 val;
> 
> 	/* Disable the distributor */
> 	writel_relaxed(0, base + GICD_CTLR);
> 	gic_dist_wait_for_rwp(); <------ RWP timeout
> 
> I guess it is impossible to recover GIC HW handshake once it is lost ?

This is trying to teardown the distributor, but the downstream
redistributor (and CPU interface) are still up. The architecture
doesn't specify what can happen in this case, and I wouldn't be
surprised if you were triggering some really bad HW bugs. Please
provide the exact version of your GIC (at least the GICD_IIDR value).

Since you're in control of your firmware, you should reset things
there, because even if we hack something here for the boot CPU, this
says nothing of the other CPUs.

	M.
diff mbox series

Patch

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index dd7e7f061e8f..1989ea3d2db4 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -23,6 +23,7 @@ 
 #include <linux/irqchip/arm-gic-common.h>
 #include <linux/irqchip/arm-gic-v3.h>
 #include <linux/irqchip/irq-partition-percpu.h>
+#include <linux/reboot.h>
 
 #include <asm/cputype.h>
 #include <asm/exception.h>
@@ -1359,6 +1360,22 @@  static void gic_cpu_pm_init(void)
 static inline void gic_cpu_pm_init(void) { }
 #endif /* CONFIG_CPU_PM */
 
+static int gicv3_restart_notify(struct notifier_block *nb,
+                              unsigned long mode, void *cmd)
+{
+       if (gic_dist_security_disabled()) {
+               gic_write_grpen1(0);
+               gic_enable_redist(false);
+       }
+
+       return NOTIFY_DONE;
+}
+
+static struct notifier_block gicv3_restart_nb = {
+       .notifier_call = gicv3_restart_notify,
+       .priority = 255, /* Call last */
+};
+
 static struct irq_chip gic_chip = {
 	.name			= "GICv3",
 	.irq_mask		= gic_mask_irq,
@@ -1849,6 +1866,7 @@  static int __init gic_init_bases(void __iomem *dist_base,
 	gic_cpu_init();
 	gic_smp_init();
 	gic_cpu_pm_init();
+	register_restart_handler(&gicv3_restart_nb);
 
 	if (gic_dist_supports_lpis()) {
 		its_init(handle, &gic_data.rdists, gic_data.domain);