Message ID | 20191107033709.17575-1-peng.fan@nxp.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | arch: arm: vgic-v3: fix GICD_ISACTIVER range | expand |
On Thu, 7 Nov 2019, Peng Fan wrote: > The end should be GICD_ISACTIVERN not GICD_ISACTIVER. > > Signed-off-by: Peng Fan <peng.fan@nxp.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Juergen, I think this fix should be in the release (and also backported to stable trees.) > --- > xen/arch/arm/vgic-v3.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c > index 422b94f902..e802f2055a 100644 > --- a/xen/arch/arm/vgic-v3.c > +++ b/xen/arch/arm/vgic-v3.c > @@ -706,7 +706,7 @@ static int __vgic_v3_distr_common_mmio_read(const char *name, struct vcpu *v, > goto read_as_zero; > > /* Read the active status of an IRQ via GICD/GICR is not supported */ > - case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVER): > + case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVERN): > case VRANGE32(GICD_ICACTIVER, GICD_ICACTIVERN): > goto read_as_zero; > > -- > 2.16.4 >
Hi, Sorry for the formatting. On Sat, 9 Nov 2019, 04:27 Stefano Stabellini, <sstabellini@kernel.org> wrote: > On Thu, 7 Nov 2019, Peng Fan wrote: > > The end should be GICD_ISACTIVERN not GICD_ISACTIVER. > > > > Signed-off-by: Peng Fan <peng.fan@nxp.com> > > Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> > To be honest, I am not sure the code is correct. A read to those registers should tell you the list of interrupts active. As we always return 0, this will not return the correct state of the GIC. I know that returning the list of actives interrupts is complicated with the old vGIC, but I don't think silently ignoring it is a good idea. The question here is why the guest accessed those registers? What is it trying to figure out? > Juergen, I think this fix should be in the release (and also > backported to stable trees.) > Without an understanding of the problem, I disagree with this request (see above). As an aside, the range ISPENDR has the same issue. Cheers, > > > > --- > > xen/arch/arm/vgic-v3.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c > > index 422b94f902..e802f2055a 100644 > > --- a/xen/arch/arm/vgic-v3.c > > +++ b/xen/arch/arm/vgic-v3.c > > @@ -706,7 +706,7 @@ static int __vgic_v3_distr_common_mmio_read(const > char *name, struct vcpu *v, > > goto read_as_zero; > > > > /* Read the active status of an IRQ via GICD/GICR is not supported > */ > > - case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVER): > > + case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVERN): > > case VRANGE32(GICD_ICACTIVER, GICD_ICACTIVERN): > > goto read_as_zero; > > > > -- > > 2.16.4 > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xenproject.org > https://lists.xenproject.org/mailman/listinfo/xen-devel
On Sat, 9 Nov 2019, Julien Grall wrote: > On Sat, 9 Nov 2019, 04:27 Stefano Stabellini, <sstabellini@kernel.org> wrote: > On Thu, 7 Nov 2019, Peng Fan wrote: > > The end should be GICD_ISACTIVERN not GICD_ISACTIVER. > > > > Signed-off-by: Peng Fan <peng.fan@nxp.com> > > Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> > > > To be honest, I am not sure the code is correct. A read to those registers should tell you the list of interrupts active. As we always > return 0, this will not return the correct state of the GIC. > > I know that returning the list of actives interrupts is complicated with the old vGIC, but I don't think silently ignoring it is a good > idea. > The question here is why the guest accessed those registers? What is it trying to figure out? We are not going to solve the general problem at this stage. At the moment the code: - ignore the first register only - print an error and return an IO_ABORT error for the other regs For the inconsistency alone the second option is undesirable. Also it doesn't match the write implementation, which does the same thing for all the GICD_ISACTIVER* regs instead of having a special treatment for the first one only. It looks like a typo in the original patch to me. The proposed patch switches the behavior to: - silently ignore all the GICD_ISACTIVER* regs (as proposed) is an improvement. > Juergen, I think this fix should be in the release (and also > backported to stable trees.) > > > Without an understanding of the problem, I disagree with this request (see above). > > As an aside, the range ISPENDR has the same issue. You meant GICD_ICPENDR, right? Yep, that one is suffering from the same typo mistake too. > > --- > > xen/arch/arm/vgic-v3.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c > > index 422b94f902..e802f2055a 100644 > > --- a/xen/arch/arm/vgic-v3.c > > +++ b/xen/arch/arm/vgic-v3.c > > @@ -706,7 +706,7 @@ static int __vgic_v3_distr_common_mmio_read(const char *name, struct vcpu *v, > > goto read_as_zero; > > > > /* Read the active status of an IRQ via GICD/GICR is not supported */ > > - case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVER): > > + case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVERN): > > case VRANGE32(GICD_ICACTIVER, GICD_ICACTIVERN): > > goto read_as_zero; > > > > -- > > 2.16.4 > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xenproject.org > https://lists.xenproject.org/mailman/listinfo/xen-devel > > >
Hi Julien, Inline marked with [Peng Fan] From: Julien Grall <julien.grall.oss@gmail.com> Sent: 2019年11月9日 6:44 To: Stefano Stabellini <sstabellini@kernel.org>; Andre Przywara <andre.przywara@arm.com> Cc: Peng Fan <peng.fan@nxp.com>; Jürgen Groß <jgross@suse.com>; julien.grall@arm.com; xen-devel@lists.xen.org Subject: Re: [Xen-devel] [PATCH] arch: arm: vgic-v3: fix GICD_ISACTIVER range Hi, Sorry for the formatting. On Sat, 9 Nov 2019, 04:27 Stefano Stabellini, <mailto:sstabellini@kernel.org> wrote: On Thu, 7 Nov 2019, Peng Fan wrote: > The end should be GICD_ISACTIVERN not GICD_ISACTIVER. > > Signed-off-by: Peng Fan <mailto:peng.fan@nxp.com> Reviewed-by: Stefano Stabellini <mailto:sstabellini@kernel.org> To be honest, I am not sure the code is correct. A read to those registers should tell you the list of interrupts active. As we always return 0, this will not return the correct state of the GIC. I know that returning the list of actives interrupts is complicated with the old vGIC, but I don't think silently ignoring it is a good idea. The question here is why the guest accessed those registers? What is it trying to figure out? [Peng Fan] I am running Linux 5.4 kernel dom0, gic_peek_irq triggers abort. Juergen, I think this fix should be in the release (and also backported to stable trees.) Without an understanding of the problem, I disagree with this request (see above). As an aside, the range ISPENDR has the same issue. [Peng Fan] Should I include this change in v2? Or develop new method to fix the issue? But at least dom0 abort when boot. Thanks, Peng. Cheers, > --- > xen/arch/arm/vgic-v3.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c > index 422b94f902..e802f2055a 100644 > --- a/xen/arch/arm/vgic-v3.c > +++ b/xen/arch/arm/vgic-v3.c > @@ -706,7 +706,7 @@ static int __vgic_v3_distr_common_mmio_read(const char *name, struct vcpu *v, > goto read_as_zero; > > /* Read the active status of an IRQ via GICD/GICR is not supported */ > - case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVER): > + case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVERN): > case VRANGE32(GICD_ICACTIVER, GICD_ICACTIVERN): > goto read_as_zero; > > -- > 2.16.4 >
On Mon, 11 Nov 2019 11:01:07 -0800 (PST) Stefano Stabellini <sstabellini@kernel.org> wrote: Hi, > On Sat, 9 Nov 2019, Julien Grall wrote: > > On Sat, 9 Nov 2019, 04:27 Stefano Stabellini, <sstabellini@kernel.org> wrote: > > On Thu, 7 Nov 2019, Peng Fan wrote: > > > The end should be GICD_ISACTIVERN not GICD_ISACTIVER. > > > > > > Signed-off-by: Peng Fan <peng.fan@nxp.com> > > > > Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> > > > > > > To be honest, I am not sure the code is correct. A read to those registers should tell you the list of interrupts active. As we always > > return 0, this will not return the correct state of the GIC. > > > > I know that returning the list of actives interrupts is complicated with the old vGIC, but I don't think silently ignoring it is a good > > idea. > > The question here is why the guest accessed those registers? What is it trying to figure out? I see Linux querying the active state (IRQCHIP_STATE_ACTIVE) at two relevant points for ARM: - In kernel/irq/manage.c, in __synchronize_hardirq(). - In KVM's arch timer emulation code. I think the latter is of no concern (yet), but the first might actually trigger. At the moment it's beyond me what it actually does, but maybe some IRQ changes (RT, threaded IRQs?) trigger this now? > We are not going to solve the general problem at this stage. At the > moment the code: > > - ignore the first register only > - print an error and return an IO_ABORT error for the other regs > > For the inconsistency alone the second option is undesirable. Also it > doesn't match the write implementation, which does the same thing for > all the GICD_ISACTIVER* regs instead of having a special treatment for > the first one only. It looks like a typo in the original patch to me. > > The proposed patch switches the behavior to: > > - silently ignore all the GICD_ISACTIVER* regs (as proposed) > > is an improvement. Yeah, I agree. Getting the actual active state of a virtual IRQ is at least expensive, as you have to bring all VCPUs back, to sync the LR content back into something Xen can access. This is an architectural property of the GIC virtualisation, as normally the acknowledge happens without exiting, also the EOI, so Xen cannot know which state an IRQ is in while the VCPU is running. Think: Schrödinger ;-) Regarding this patch: The original code looks indeed like a typo to me: On the read side both ISACTIVERx and ICACTIVERx behave the same, so they should be handled the same. And returning 0 is probably the best approximation we can do at the moment. The other solution is to add GICv3 support to the new VGIC ;-), as this at least solves the case when we deliberately inject an active IRQ. We could extend this logic to find out which IRQs in this block *could* possibly be active, then bring those VCPUs back to Xen. Cheers, Andre. > > Juergen, I think this fix should be in the release (and also > > backported to stable trees.) > > > > > > Without an understanding of the problem, I disagree with this request (see above). > > > > As an aside, the range ISPENDR has the same issue. > > You meant GICD_ICPENDR, right? Yep, that one is suffering from the same > typo mistake too. > > > > > --- > > > xen/arch/arm/vgic-v3.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c > > > index 422b94f902..e802f2055a 100644 > > > --- a/xen/arch/arm/vgic-v3.c > > > +++ b/xen/arch/arm/vgic-v3.c > > > @@ -706,7 +706,7 @@ static int __vgic_v3_distr_common_mmio_read(const char *name, struct vcpu *v, > > > goto read_as_zero; > > > > > > /* Read the active status of an IRQ via GICD/GICR is not supported */ > > > - case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVER): > > > + case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVERN): > > > case VRANGE32(GICD_ICACTIVER, GICD_ICACTIVERN): > > > goto read_as_zero; > > > > > > -- > > > 2.16.4 > > > > > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xenproject.org > > https://lists.xenproject.org/mailman/listinfo/xen-devel > > > > > >
On Tue, 12 Nov 2019, Peng Fan wrote: > Hi Julien, > > Inline marked with [Peng Fan] Please use plain text emails on xen-devel (and other open source development mailing lists.) > From: Julien Grall <julien.grall.oss@gmail.com> > Sent: 2019年11月9日 6:44 > To: Stefano Stabellini <sstabellini@kernel.org>; Andre Przywara <andre.przywara@arm.com> > Cc: Peng Fan <peng.fan@nxp.com>; Jürgen Groß <jgross@suse.com>; julien.grall@arm.com; xen-devel@lists.xen.org > Subject: Re: [Xen-devel] [PATCH] arch: arm: vgic-v3: fix GICD_ISACTIVER range > > Hi, > > Sorry for the formatting. > On Sat, 9 Nov 2019, 04:27 Stefano Stabellini, <mailto:sstabellini@kernel.org> wrote: > On Thu, 7 Nov 2019, Peng Fan wrote: > > The end should be GICD_ISACTIVERN not GICD_ISACTIVER. > > > > Signed-off-by: Peng Fan <mailto:peng.fan@nxp.com> > > Reviewed-by: Stefano Stabellini <mailto:sstabellini@kernel.org> > > To be honest, I am not sure the code is correct. A read to those registers should tell you the list of interrupts active. As we always return 0, this will not return the correct state of the GIC. > > I know that returning the list of actives interrupts is complicated with the old vGIC, but I don't think silently ignoring it is a good idea. > > The question here is why the guest accessed those registers? What is it trying to figure out? > > [Peng Fan] I am running Linux 5.4 kernel dom0, gic_peek_irq triggers abort. > > > > Juergen, I think this fix should be in the release (and also > backported to stable trees.) > > Without an understanding of the problem, I disagree with this request (see above). > > As an aside, the range ISPENDR has the same issue. > > [Peng Fan] Should I include this change in v2? Or develop new method to fix the issue? > But at least dom0 abort when boot. Also considering Andre's reply, yes, please send another patch to fix for ISPENDR too. It doesn't have to be the same patch. Thank you! > > > > > --- > > xen/arch/arm/vgic-v3.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c > > index 422b94f902..e802f2055a 100644 > > --- a/xen/arch/arm/vgic-v3.c > > +++ b/xen/arch/arm/vgic-v3.c > > @@ -706,7 +706,7 @@ static int __vgic_v3_distr_common_mmio_read(const char *name, struct vcpu *v, > > goto read_as_zero; > > > > /* Read the active status of an IRQ via GICD/GICR is not supported */ > > - case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVER): > > + case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVERN): > > case VRANGE32(GICD_ICACTIVER, GICD_ICACTIVERN): > > goto read_as_zero; > > > > -- > > 2.16.4 > > > > _______________________________________________ > Xen-devel mailing list > mailto:Xen-devel@lists.xenproject.org > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.xenproject.org%2Fmailman%2Flistinfo%2Fxen-devel&data=02%7C01%7Cpeng.fan%40nxp.com%7C33f2e907cdc84ed0a48608d7649d359e%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637088498678782239&sdata=G3FA2vefr56FeUX5QVZQwSzG22nfv1m%2F0fKIDOnfuFQ%3D&reserved=0 >
On Tue, 12 Nov 2019, 04:01 Stefano Stabellini, <sstabellini@kernel.org> wrote: > On Sat, 9 Nov 2019, Julien Grall wrote: > > On Sat, 9 Nov 2019, 04:27 Stefano Stabellini, <sstabellini@kernel.org> > wrote: > > On Thu, 7 Nov 2019, Peng Fan wrote: > > > The end should be GICD_ISACTIVERN not GICD_ISACTIVER. > > > > > > Signed-off-by: Peng Fan <peng.fan@nxp.com> > > > > Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> > > > > > > To be honest, I am not sure the code is correct. A read to those > registers should tell you the list of interrupts active. As we always > > return 0, this will not return the correct state of the GIC. > > > > I know that returning the list of actives interrupts is complicated with > the old vGIC, but I don't think silently ignoring it is a good > > idea. > > The question here is why the guest accessed those registers? What is it > trying to figure out? > > We are not going to solve the general problem at this stage. At the > moment the code: > > - ignore the first register only > - print an error and return an IO_ABORT error for the other regs > > For the inconsistency alone the second option is undesirable. Also it > doesn't match the write implementation, which does the same thing for > all the GICD_ISACTIVER* regs instead of having a special treatment for > the first one only. It looks like a typo in the original patch to me. > > The proposed patch switches the behavior to: > > - silently ignore all the GICD_ISACTIVER* regs (as proposed) > is an improvement. > Peng mentioned that Linux is accessing it, so the worst thing we can do is lying to the guest (as you suggest here). I would definitely not call that an improvement. In the current state, it is a Nack. If there were a warning, then I would be more inclined to see this patch going through. Cheers,
On 13/11/2019 01:08, Julien Grall wrote: Hi, > On Tue, 12 Nov 2019, 04:01 Stefano Stabellini, <sstabellini@kernel.org > <mailto:sstabellini@kernel.org>> wrote: > > On Sat, 9 Nov 2019, Julien Grall wrote: > > On Sat, 9 Nov 2019, 04:27 Stefano Stabellini, > <sstabellini@kernel.org <mailto:sstabellini@kernel.org>> wrote: > > On Thu, 7 Nov 2019, Peng Fan wrote: > > > The end should be GICD_ISACTIVERN not GICD_ISACTIVER. > > > > > > Signed-off-by: Peng Fan <peng.fan@nxp.com > <mailto:peng.fan@nxp.com>> > > > > Reviewed-by: Stefano Stabellini <sstabellini@kernel.org > <mailto:sstabellini@kernel.org>> > > > > > > To be honest, I am not sure the code is correct. A read to those > registers should tell you the list of interrupts active. As we always > > return 0, this will not return the correct state of the GIC. > > > > I know that returning the list of actives interrupts is > complicated with the old vGIC, but I don't think silently ignoring > it is a good > > idea. > > The question here is why the guest accessed those registers? What > is it trying to figure out? > > We are not going to solve the general problem at this stage. At the > moment the code: > > - ignore the first register only > - print an error and return an IO_ABORT error for the other regs > > For the inconsistency alone the second option is undesirable. Also it > doesn't match the write implementation, which does the same thing for > all the GICD_ISACTIVER* regs instead of having a special treatment for > the first one only. It looks like a typo in the original patch to me. > > The proposed patch switches the behavior to: > > - silently ignore all the GICD_ISACTIVER* regs (as proposed) > > > is an improvement. > > > Peng mentioned that Linux is accessing it, so the worst thing we can do > is lying to the guest (as you suggest here). I would definitely not call > that an improvement. The ISACTIVER range is wrong in the description, it covers only one register, not multiple. This is obviously a typo, since it's correct in both GICv2 and in the high level switch/case in GICv3. Reading from outside of any range will inject an abort into the guest, which runs in kernel space. This will probably result in a guest crash. I would consider not crashing an improvement. About "lying" to the guest: Typically an IRQ is just active for a very short time, so 0 is a very good answer, actually. The old VGIC in KVM did exactly the same: vgic_reg_access(mmio, NULL, offset, ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED); The proper solution would be: 1) Track the state of the active bit when we can observe it, so when the guest exits with an active IRQ. The new VGIC does that. 2) Kick out all VCPUs that have IRQs in that given rank, and sample the active bit from the LRs. Sounds pretty horrible, and chances are very high you will get all 0s there. So if I compare "fix those two typos and preserve the state that the Xen VGIC has been in for years" to "create a lot of racy code for a rare corner case", the first one surely wins. That doesn't mean we should never try it, but surely this fix needs to go in meanwhile. > In the current state, it is a Nack. If there were a warning, then I > would be more inclined to see this patch going through. Do you mean a warning that we are about to lie to the guest? That sounds pretty useless, since nobody can do anything about it. Plus we have already those warnings on writing to these registers, and this always confuses people and triggered pointless bug reports. I think the old VGIC has bigger problems ;-) Cheers, Andre
On Wed, 13 Nov 2019, 10:55 André Przywara, <andre.przywara@arm.com> wrote: > On 13/11/2019 01:08, Julien Grall wrote: > > Hi, > > > On Tue, 12 Nov 2019, 04:01 Stefano Stabellini, <sstabellini@kernel.org > > <mailto:sstabellini@kernel.org>> wrote: > > > > On Sat, 9 Nov 2019, Julien Grall wrote: > > > On Sat, 9 Nov 2019, 04:27 Stefano Stabellini, > > <sstabellini@kernel.org <mailto:sstabellini@kernel.org>> wrote: > > > On Thu, 7 Nov 2019, Peng Fan wrote: > > > > The end should be GICD_ISACTIVERN not GICD_ISACTIVER. > > > > > > > > Signed-off-by: Peng Fan <peng.fan@nxp.com > > <mailto:peng.fan@nxp.com>> > > > > > > Reviewed-by: Stefano Stabellini <sstabellini@kernel.org > > <mailto:sstabellini@kernel.org>> > > > > > > > > > To be honest, I am not sure the code is correct. A read to those > > registers should tell you the list of interrupts active. As we always > > > return 0, this will not return the correct state of the GIC. > > > > > > I know that returning the list of actives interrupts is > > complicated with the old vGIC, but I don't think silently ignoring > > it is a good > > > idea. > > > The question here is why the guest accessed those registers? What > > is it trying to figure out? > > > > We are not going to solve the general problem at this stage. At the > > moment the code: > > > > - ignore the first register only > > - print an error and return an IO_ABORT error for the other regs > > > > For the inconsistency alone the second option is undesirable. Also it > > doesn't match the write implementation, which does the same thing for > > all the GICD_ISACTIVER* regs instead of having a special treatment > for > > the first one only. It looks like a typo in the original patch to me. > > > > The proposed patch switches the behavior to: > > > > - silently ignore all the GICD_ISACTIVER* regs (as proposed) > > > > > > is an improvement. > > > > > > Peng mentioned that Linux is accessing it, so the worst thing we can do > > is lying to the guest (as you suggest here). I would definitely not call > > that an improvement. > > The ISACTIVER range is wrong in the description, it covers only one > register, not multiple. This is obviously a typo, since it's correct in > both GICv2 and in the high level switch/case in GICv3. Reading from > outside of any range will inject an abort into the guest, which runs in > kernel space. This will probably result in a guest crash. I would > consider not crashing an improvement. > It is not. Neither the current approach to silently doing it. > About "lying" to the guest: Typically an IRQ is just active for a very > short time, so 0 is a very good answer, actually. So why does Linux is checking it? What will happen if there were actually an active interrupt but don't report it? The old VGIC in KVM > did exactly the same: > vgic_reg_access(mmio, NULL, offset, > ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED); > > The proper solution would be: > 1) Track the state of the active bit when we can observe it, so when the > guest exits with an active IRQ. The new VGIC does that. > 2) Kick out all VCPUs that have IRQs in that given rank, and sample the > active bit from the LRs. Sounds pretty horrible, and chances are very > high you will get all 0s there. > > So if I compare "fix those two typos and preserve the state that the Xen > VGIC has been in for years" to "create a lot of racy code for a rare > corner case", the first one surely wins. > That doesn't mean we should never try it, but surely this fix needs to > go in meanwhile. > I don't believe this patch to go in is the correct solution not from a technical PoV but to get things properly fixed. > > In the current state, it is a Nack. If there were a warning, then I > > would be more inclined to see this patch going through. > > Do you mean a warning that we are about to lie to the guest? That sounds > pretty useless, since nobody can do anything about it. Plus we have > already those warnings on writing to these registers, and this always > confuses people and triggered pointless bug reports. > Well, the warning has the benefits to annoy people. If we do it silently, then we don't encourage to fix it. > I think the old VGIC has bigger problems ;-) > I agree, but nobody seems to be willing to fix it... My only leverage here is pushing for a warning to annoy the user. So I maintain my request for a warning. Cheers, > Cheers, > Andre >
On Tue, 12 Nov 2019, 11:45 Peng Fan, <peng.fan@nxp.com> wrote: > Hi Julien, > > Inline marked with [Peng Fan] > > From: Julien Grall <julien.grall.oss@gmail.com> > Sent: 2019年11月9日 6:44 > To: Stefano Stabellini <sstabellini@kernel.org>; Andre Przywara < > andre.przywara@arm.com> > Cc: Peng Fan <peng.fan@nxp.com>; Jürgen Groß <jgross@suse.com>; > julien.grall@arm.com; xen-devel@lists.xen.org > Subject: Re: [Xen-devel] [PATCH] arch: arm: vgic-v3: fix GICD_ISACTIVER > range > > Hi, > > Sorry for the formatting. > On Sat, 9 Nov 2019, 04:27 Stefano Stabellini, <mailto: > sstabellini@kernel.org> wrote: > On Thu, 7 Nov 2019, Peng Fan wrote: > > The end should be GICD_ISACTIVERN not GICD_ISACTIVER. > > > > Signed-off-by: Peng Fan <mailto:peng.fan@nxp.com> > > Reviewed-by: Stefano Stabellini <mailto:sstabellini@kernel.org> > > To be honest, I am not sure the code is correct. A read to those registers > should tell you the list of interrupts active. As we always return 0, this > will not return the correct state of the GIC. > > I know that returning the list of actives interrupts is complicated with > the old vGIC, but I don't think silently ignoring it is a good idea. > > The question here is why the guest accessed those registers? What is it > trying to figure out? > > [Peng Fan] I am running Linux 5.4 kernel dom0, gic_peek_irq triggers abort. > Do you have a call stack trace for this? Cheers,
> Do you have a call stack trace for this?
(XEN) d0v1: vGICD: unhandled read r1 offset 0x000324
(XEN) traps.c:1994:d0v1 HSR=0x93810006 pc=0xffff8000104f2bb4 gva=0xffff800010010324 gpa=0x00000051a00324
[ 1.780564] Unhandled fault at 0xffff800010010324
[ 1.785771] Mem abort info:
[ 1.788899] ESR = 0x96000000
[ 1.792320] EC = 0x25: DABT (current EL), IL = 32 bits
[ 1.798196] SET = 0, FnV = 0
[ 1.801615] EA = 0, S1PTW = 0
[ 1.805124] Data abort info:
[ 1.808350] ISV = 0, ISS = 0x00000000
[ 1.812620] CM = 0, WnR = 0
[ 1.815943] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000a17f6000
[ 1.823344] [ffff800010010324] pgd=000000095ffff003, pud=000000095fffe003, pmd=000000095fffd003, pte=00e8000051a00707
[ 1.835016] Internal error: ttbr address size fault: 96000000 [#1] PREEMPT SMP
[ 1.842983] Modules linked in:
[ 1.846403] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc5-03191-g20646f88034a-dirty #1774
[ 1.855980] Hardware name: Freescale i.MX8QM MEK DOM0 (DT)
[ 1.862059] pstate: a0000085 (NzCv daIf -PAN -UAO)
[ 1.867374] pc : gic_peek_irq+0x6c/0xb0
[ 1.871638] lr : gic_peek_irq+0x28/0xb0
[ 1.875899] sp : ffff80001002ba80
[ 1.879599] x29: ffff80001002ba80 x28: 0000000000000000
[ 1.885483] x27: 0000000000000000 x26: ffff800011800514
[ 1.891369] x25: 0000000000000000 x24: 0000000000000000
[ 1.897246] x23: ffff0008dc3e5400 x22: ffff0008dc3e5458
[ 1.903129] x21: ffff800011d798c8 x20: 0000000000000001
[ 1.909011] x19: ffff80001002bb07 x18: 0000000000000004
[ 1.914893] x17: ffff8000112243c0 x16: 00000000deadbeef
[ 1.920780] x15: ffff0008dc01b490 x14: ffff0008dbcde5c0
[ 1.926658] x13: ffffffffffffff00 x12: ffffffffffffffff
[ 1.932540] x11: 0000000000000008 x10: ffffffffffffffff
[ 1.938422] x9 : c1c1c1c1c1c1c1c1 x8 : 0000000000000000
[ 1.944304] x7 : ffffffffffffffff x6 : ffff0008dc3e5458
[ 1.950186] x5 : ffff800011d798c8 x4 : 0000000000000008
[ 1.956069] x3 : 0000000000000324 x2 : fffffffffffffd03
[ 1.961951] x1 : ffff800010010324 x0 : 0000000000000300
[ 1.967835] Call trace:
[ 1.970588] gic_peek_irq+0x6c/0xb0
[ 1.974481] gic_irq_get_irqchip_state+0x70/0xa0
[ 1.979607] __synchronize_hardirq+0xd8/0xe0
[ 1.984343] __free_irq+0x13c/0x2b8
[ 1.988232] free_irq+0x30/0x70
[ 1.991749] devm_irq_release+0x14/0x20
[ 1.996019] release_nodes+0x1b0/0x220
[ 2.000189] devres_release_all+0x34/0x50
[ 2.004652] really_probe+0x1b8/0x308
[ 2.008728] driver_probe_device+0x54/0xe8
[ 2.013281] device_driver_attach+0x6c/0x78
[ 2.017929] __driver_attach+0x54/0xd0
[ 2.022106] bus_for_each_dev+0x70/0xc0
[ 2.026372] driver_attach+0x20/0x28
[ 2.030359] bus_add_driver+0x178/0x1d8
[ 2.034630] driver_register+0x60/0x110
[ 2.038900] __platform_driver_register+0x44/0x50
[ 2.044121] fec_driver_init+0x18/0x20
[ 2.048294] do_one_initcall+0x74/0x1b0
[ 2.052564] kernel_init_freeable+0x194/0x22c
[ 2.057403] kernel_init+0x10/0x100
[ 2.061293] ret_from_fork+0x10/0x18
[ 2.065279] Code: 53057c63 d37e6863 8b204063 8b030021 (b9400021)
[ 2.072024] ---[ end trace 62c8d1d1d10ae3de ]---
[ 2.077158] note: swapper/0[1] exited with preempt_count 1
[ 2.083296] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 2.091659] SMP: stopping secondary CPUs
[ 2.096037] Kernel Offset: disabled
[ 2.099914] CPU features: 0x0002,2000200c
[ 2.104371] Memory Limit: none
[ 2.107788] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
Regards,
Peng.
Hi Andre, On 12/11/2019 12:46, Andre Przywara wrote: > On Mon, 11 Nov 2019 11:01:07 -0800 (PST) > Stefano Stabellini <sstabellini@kernel.org> wrote: >> On Sat, 9 Nov 2019, Julien Grall wrote: >>> On Sat, 9 Nov 2019, 04:27 Stefano Stabellini, <sstabellini@kernel.org> wrote: >>> On Thu, 7 Nov 2019, Peng Fan wrote: >>> > The end should be GICD_ISACTIVERN not GICD_ISACTIVER. >>> > >>> > Signed-off-by: Peng Fan <peng.fan@nxp.com> >>> >>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> >>> >>> >>> To be honest, I am not sure the code is correct. A read to those registers should tell you the list of interrupts active. As we always >>> return 0, this will not return the correct state of the GIC. >>> >>> I know that returning the list of actives interrupts is complicated with the old vGIC, but I don't think silently ignoring it is a good >>> idea. >>> The question here is why the guest accessed those registers? What is it trying to figure out? > > I see Linux querying the active state (IRQCHIP_STATE_ACTIVE) at two relevant points for ARM: > - In kernel/irq/manage.c, in __synchronize_hardirq(). > - In KVM's arch timer emulation code. > > I think the latter is of no concern (yet), but the first might actually trigger. At the moment it's beyond me what it actually does, but maybe some IRQ changes (RT, threaded IRQs?) trigger this now?\ It happens I am sitting right next to Marc now, so I had a chat with him about this :). Let me try to summarize the discussion here. __synchronize_hardirq() is used to ensure that all active interrupts have been handled before continuing. When sync_chip == false, we only ensure that all in progress interrupts (from Linux PoV) are handled before continue. sync_chip == true will additionally ensure that any interrupt that were acknowledge but not yet marked as in progress by the kernel are also handled. The assumption is this is called after the interrupt has been masked/disabled. The call to __synchronize_hardirq() in free_irq() (as reported by Peng stack trace) was introduced recently (see [1]). It is not entirely clear whether this would affect anyone using Linux 5.4 or just a limited subset of users. Anyhow, this is a genuine bug and I think returning 0 is only papering over the bug with no long-term resolution. As Marc pointed out, Even the old vGIC in KVM was not spec compliant and thankfully this was fixed in the new vGIC. As I said in a different sub-thread, I would reluctanly be ok to see returning 0 as long as we have add a warning for *every* access. Long-term, the current vGIC should really get killed. Cheers, [1] 62e0468650c30f0298822c580f382b16328119f6 "genirq: Add optional hardware synchronization for shutdown"
On Tue, 19 Nov 2019, Julien Grall wrote: > Hi Andre, > > On 12/11/2019 12:46, Andre Przywara wrote: > > On Mon, 11 Nov 2019 11:01:07 -0800 (PST) > > Stefano Stabellini <sstabellini@kernel.org> wrote: > > > On Sat, 9 Nov 2019, Julien Grall wrote: > > > > On Sat, 9 Nov 2019, 04:27 Stefano Stabellini, <sstabellini@kernel.org> > > > > wrote: > > > > On Thu, 7 Nov 2019, Peng Fan wrote: > > > > > The end should be GICD_ISACTIVERN not GICD_ISACTIVER. > > > > > > > > > > Signed-off-by: Peng Fan <peng.fan@nxp.com> > > > > > > > > Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> > > > > > > > > > > > > To be honest, I am not sure the code is correct. A read to those > > > > registers should tell you the list of interrupts active. As we always > > > > return 0, this will not return the correct state of the GIC. > > > > > > > > I know that returning the list of actives interrupts is complicated with > > > > the old vGIC, but I don't think silently ignoring it is a good > > > > idea. > > > > The question here is why the guest accessed those registers? What is it > > > > trying to figure out? > > > > I see Linux querying the active state (IRQCHIP_STATE_ACTIVE) at two relevant > > points for ARM: > > - In kernel/irq/manage.c, in __synchronize_hardirq(). > > - In KVM's arch timer emulation code. > > > > I think the latter is of no concern (yet), but the first might actually > > trigger. At the moment it's beyond me what it actually does, but maybe some > > IRQ changes (RT, threaded IRQs?) trigger this now?\ > It happens I am sitting right next to Marc now, so I had a chat with him about > this :). Let me try to summarize the discussion here. > > __synchronize_hardirq() is used to ensure that all active interrupts have been > handled before continuing. When sync_chip == false, we only ensure that all in > progress interrupts (from Linux PoV) are handled before continue. > > sync_chip == true will additionally ensure that any interrupt that were > acknowledge but not yet marked as in progress by the kernel are also handled. > The assumption is this is called after the interrupt has been masked/disabled. > > The call to __synchronize_hardirq() in free_irq() (as reported by Peng stack > trace) was introduced recently (see [1]). It is not entirely clear whether > this would affect anyone using Linux 5.4 or just a limited subset of users. > > Anyhow, this is a genuine bug and I think returning 0 is only papering over > the bug with no long-term resolution. As Marc pointed out, Even the old vGIC > in KVM was not spec compliant and thankfully this was fixed in the new vGIC. > > As I said in a different sub-thread, I would reluctanly be ok to see returning > 0 as long as we have add a warning for *every* access. Long-term, the current > vGIC should really get killed. I appreciate your intention of raising awareness and getting help in fixing things in the community, which is the right thing to do. However, I am doubtful of the usefulness of a warning to achieve the goal. Maybe it would be best to curate an "open issues" section somewhere, even just as an email after every release or as part of the release notes, or as a jira ticket, or a wikipage, or a document under docs/. Actually, an "open issues" document under docs/ could be a good idea for this and other similar items. A warning is a blunt instrument that lacks the subtleties necessary to raise the attention in the right way and achieve the goal of getting help and contributions. Especially it has the risk of causing problems and confusion with less knowledgeable users. Maybe a dprintk warning message (only DEBUG builds) could avoid some of the issues, while still gaining attention of savvy developers who could understand what it means. But I think that the "open issues" document would be more effective.
Hi Stefano, On 21/11/2019 18:50, Stefano Stabellini wrote: > On Tue, 19 Nov 2019, Julien Grall wrote: >> Hi Andre, >> >> On 12/11/2019 12:46, Andre Przywara wrote: >>> On Mon, 11 Nov 2019 11:01:07 -0800 (PST) >>> Stefano Stabellini <sstabellini@kernel.org> wrote: >>>> On Sat, 9 Nov 2019, Julien Grall wrote: >>>>> On Sat, 9 Nov 2019, 04:27 Stefano Stabellini, <sstabellini@kernel.org> >>>>> wrote: >>>>> On Thu, 7 Nov 2019, Peng Fan wrote: >>>>> > The end should be GICD_ISACTIVERN not GICD_ISACTIVER. >>>>> > >>>>> > Signed-off-by: Peng Fan <peng.fan@nxp.com> >>>>> >>>>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> >>>>> >>>>> >>>>> To be honest, I am not sure the code is correct. A read to those >>>>> registers should tell you the list of interrupts active. As we always >>>>> return 0, this will not return the correct state of the GIC. >>>>> >>>>> I know that returning the list of actives interrupts is complicated with >>>>> the old vGIC, but I don't think silently ignoring it is a good >>>>> idea. >>>>> The question here is why the guest accessed those registers? What is it >>>>> trying to figure out? >>> >>> I see Linux querying the active state (IRQCHIP_STATE_ACTIVE) at two relevant >>> points for ARM: >>> - In kernel/irq/manage.c, in __synchronize_hardirq(). >>> - In KVM's arch timer emulation code. >>> >>> I think the latter is of no concern (yet), but the first might actually >>> trigger. At the moment it's beyond me what it actually does, but maybe some >>> IRQ changes (RT, threaded IRQs?) trigger this now?\ >> It happens I am sitting right next to Marc now, so I had a chat with him about >> this :). Let me try to summarize the discussion here. >> >> __synchronize_hardirq() is used to ensure that all active interrupts have been >> handled before continuing. When sync_chip == false, we only ensure that all in >> progress interrupts (from Linux PoV) are handled before continue. >> >> sync_chip == true will additionally ensure that any interrupt that were >> acknowledge but not yet marked as in progress by the kernel are also handled. >> The assumption is this is called after the interrupt has been masked/disabled. >> >> The call to __synchronize_hardirq() in free_irq() (as reported by Peng stack >> trace) was introduced recently (see [1]). It is not entirely clear whether >> this would affect anyone using Linux 5.4 or just a limited subset of users. >> >> Anyhow, this is a genuine bug and I think returning 0 is only papering over >> the bug with no long-term resolution. As Marc pointed out, Even the old vGIC >> in KVM was not spec compliant and thankfully this was fixed in the new vGIC. >> >> As I said in a different sub-thread, I would reluctanly be ok to see returning >> 0 as long as we have add a warning for *every* access. Long-term, the current >> vGIC should really get killed. > > I appreciate your intention of raising awareness and getting help in > fixing things in the community, which is the right thing to do. However, > I am doubtful of the usefulness of a warning to achieve the goal. Maybe > it would be best to curate an "open issues" section somewhere, even just > as an email after every release or as part of the release notes, or as a > jira ticket, or a wikipage, or a document under docs/. The state of the brokeness of the vGIC have been documented numerous time on the ML (see [1]) and on JIRA (see XEN-92). And there are probably more not reported (Marc mentionned a new one during the week). Yet no-one ever looked at finishing the new vGIC. So allow me to doubt that the documentation is going to help here... > > Actually, an "open issues" document under docs/ could be a good idea for > this and other similar items. > > A warning is a blunt instrument that lacks the subtleties necessary to > raise the attention in the right way and achieve the goal of getting > help and contributions. Especially it has the risk of causing problems > and confusion with less knowledgeable users. Maybe a dprintk warning > message (only DEBUG builds) could avoid some of the issues, while still > gaining attention of savvy developers who could understand what it means. > But I think that the "open issues" document would be more effective. See above, this is not the first time the state of the vGIC was raised. While I agree this is a blunt instruction, all the other options had no effect so far. So with this solution maybe someone will soon realized there are effort to put in the core architecture of Xen. Now, if you suggest me a plan (a person, a date of completion...) for fixing properly the vGIC then I would be more willing to accept the workaround suggested here. Cheers, [1] https://lists.xenproject.org/archives/html/xen-devel/2018-02/msg00784.html
On Thu, 21 Nov 2019, Julien Grall wrote: > > > As I said in a different sub-thread, I would reluctanly be ok to see > > > returning > > > 0 as long as we have add a warning for *every* access. Long-term, the > > > current > > > vGIC should really get killed. After speaking with Julien and reading more about the registers in question and the Linux changes in 5.4, I changed my mind and agree with the warning. Peng, would you be willing to submit a new update of the patch, adding a warning also? You can reuse the words used in the warning from vgic_v2_distr_mmio_write, something like: printk(XENLOG_G_ERR "%pv: vGICD: unhandled read from ISACTIVER%d\n", v, gicd_reg - GICD_ISACTIVER); Juergen, this is an easy-to-understand typo fix, plus a new warning. It is important to be able to run Linux 5.4 (see https://marc.info/?l=xen-devel&m=157372234429588). I think it should go in 4.13.
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c index 422b94f902..e802f2055a 100644 --- a/xen/arch/arm/vgic-v3.c +++ b/xen/arch/arm/vgic-v3.c @@ -706,7 +706,7 @@ static int __vgic_v3_distr_common_mmio_read(const char *name, struct vcpu *v, goto read_as_zero; /* Read the active status of an IRQ via GICD/GICR is not supported */ - case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVER): + case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVERN): case VRANGE32(GICD_ICACTIVER, GICD_ICACTIVERN): goto read_as_zero;
The end should be GICD_ISACTIVERN not GICD_ISACTIVER. Signed-off-by: Peng Fan <peng.fan@nxp.com> --- xen/arch/arm/vgic-v3.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)