Message ID | 1406302152-32335-1-git-send-email-will.deacon@arm.com (mailing list archive) |
---|---|
State | New, archived |
On 07/25/2014 10:29 AM, Will Deacon wrote:
> If the physical address of GICV isn't page-aligned, then we end up
> creating a stage-2 mapping of the page containing it, which causes us to
> map neighbouring memory locations directly into the guest.
>
> As an example, consider a platform with GICV at physical 0x2c02f000
> running a 64k-page host kernel. If qemu maps this into the guest at
> 0x80010000, then guest physical addresses 0x80010000 - 0x8001efff will
> map host physical region 0x2c020000 - 0x2c02efff. Accesses to these
> physical regions may cause UNPREDICTABLE behaviour, for example, on the
> Juno platform this will cause an SError exception to EL3, which brings
> down the entire physical CPU resulting in RCU stalls / HYP panics / host
> crashing / wasted weeks of debugging.

No denying this is a problem.

>
> SBSA recommends that systems alias the 4k GICV across the bounding 64k
> region, in which case GICV physical could be described as 0x2c020000 in
> the above scenario.

The problem with this patch is the gicv is really 8K. The reason you
would map at a 60K offset (0xf000), and why we do on our SOC, is so that
the 8K gicv would pick up the last 4K from the first page and the first
4K from the next page. With your patch it is impossible to map all 8K
of the gicv with 64K pages.

My SOC which works fine with kvm now will go to not working with kvm
after this patch.

>
> This patch fixes the problem by failing the vgic probe if the physical
> base address or the size of GICV aren't page-aligned. Note that this
> generated a warning in dmesg about freeing enabled IRQs, so I had to
> move the IRQ enabling later in the probe.
>
> Cc: Christoffer Dall <christoffer.dall@linaro.org>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: Gleb Natapov <gleb@kernel.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Joel Schopp <joel.schopp@amd.com>
> Cc: Don Dutile <ddutile@redhat.com>
> Acked-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>
> v1 ->v2 : Added size alignment check and Peter's ack. Could this go in
>           for 3.16 please?
>
>  virt/kvm/arm/vgic.c | 24 ++++++++++++++++++++----
>  1 file changed, 20 insertions(+), 4 deletions(-)
>
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 56ff9bebb577..476d3bf540a8 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -1526,17 +1526,33 @@ int kvm_vgic_hyp_init(void)
>  		goto out_unmap;
>  	}
>
> -	kvm_info("%s@%llx IRQ%d\n", vgic_node->name,
> -		 vctrl_res.start, vgic_maint_irq);
> -	on_each_cpu(vgic_init_maintenance_interrupt, NULL, 1);
> -
>  	if (of_address_to_resource(vgic_node, 3, &vcpu_res)) {
>  		kvm_err("Cannot obtain VCPU resource\n");
>  		ret = -ENXIO;
>  		goto out_unmap;
>  	}
> +
> +	if (!PAGE_ALIGNED(vcpu_res.start)) {
> +		kvm_err("GICV physical address 0x%llx not page aligned\n",
> +			(unsigned long long)vcpu_res.start);
> +		ret = -ENXIO;
> +		goto out_unmap;
> +	}
> +
> +	if (!PAGE_ALIGNED(resource_size(&vcpu_res))) {
> +		kvm_err("GICV size 0x%llx not a multiple of page size 0x%lx\n",
> +			(unsigned long long)resource_size(&vcpu_res),
> +			PAGE_SIZE);
> +		ret = -ENXIO;
> +		goto out_unmap;
> +	}
> +
>  	vgic_vcpu_base = vcpu_res.start;
>
> +	kvm_info("%s@%llx IRQ%d\n", vgic_node->name,
> +		 vctrl_res.start, vgic_maint_irq);
> +	on_each_cpu(vgic_init_maintenance_interrupt, NULL, 1);
> +
>  	goto out;
>
>  out_unmap:

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On 25 July 2014 16:56, Joel Schopp <joel.schopp@amd.com> wrote:
> The problem with this patch is the gicv is really 8K. The reason you
> would map at a 60K offset (0xf000), and why we do on our SOC, is so that
> the 8K gicv would pick up the last 4K from the first page and the first
> 4K from the next page. With your patch it is impossible to map all 8K
> of the gicv with 64K pages.
>
> My SOC which works fine with kvm now will go to not working with kvm
> after this patch.

Your SOC currently works by fluke because the guest doesn't
look at the last 4K of the GICC. If you're happy with it continuing
to work by fluke you could make your device tree say it had a
64K GICV region with a 64K-aligned base.

To make it work not by fluke but actually correctly requires
Marc's patchset, at a minimum.

thanks
-- PMM

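For readers following along, the two checks the patch adds can be tried against the region descriptions discussed in this thread. This is only a user-space sketch, not kernel code: `is_page_aligned` and `gicv_region_ok` are made-up names standing in for the kernel's PAGE_ALIGNED() tests, the page size is passed in explicitly instead of coming from PAGE_SIZE, and the addresses are the ones from Will's example rather than any real SoC's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space illustration only; the kernel uses PAGE_ALIGNED() and PAGE_SIZE. */
bool is_page_aligned(uint64_t value, uint64_t page_size)
{
	/* Assumes page_size is a non-zero power of two. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return (value & (page_size - 1)) == 0;
}

/*
 * Hypothetical helper: true if a GICV region described by (start, size)
 * would pass both checks added by the patch for the given host page size.
 */
bool gicv_region_ok(uint64_t start, uint64_t size, uint64_t page_size)
{
	return is_page_aligned(start, page_size) &&
	       is_page_aligned(size, page_size);
}
```

Under these assumptions, an 8K region at 0x2c02f000 is rejected on a 64K-page host, while a 64K region at the 64K-aligned base 0x2c020000 is accepted. On a 4K-page host the original description passes both checks.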
On Fri, Jul 25, 2014 at 04:56:18PM +0100, Joel Schopp wrote:
>
> On 07/25/2014 10:29 AM, Will Deacon wrote:
> > If the physical address of GICV isn't page-aligned, then we end up
> > creating a stage-2 mapping of the page containing it, which causes us to
> > map neighbouring memory locations directly into the guest.
> >
> > As an example, consider a platform with GICV at physical 0x2c02f000
> > running a 64k-page host kernel. If qemu maps this into the guest at
> > 0x80010000, then guest physical addresses 0x80010000 - 0x8001efff will
> > map host physical region 0x2c020000 - 0x2c02efff. Accesses to these
> > physical regions may cause UNPREDICTABLE behaviour, for example, on the
> > Juno platform this will cause an SError exception to EL3, which brings
> > down the entire physical CPU resulting in RCU stalls / HYP panics / host
> > crashing / wasted weeks of debugging.
> No denying this is a problem.
> > SBSA recommends that systems alias the 4k GICV across the bounding 64k
> > region, in which case GICV physical could be described as 0x2c020000 in
> > the above scenario.
> The problem with this patch is the gicv is really 8K. The reason you
> would map at a 60K offset (0xf000), and why we do on our SOC, is so that
> the 8K gicv would pick up the last 4K from the first page and the first
> 4K from the next page. With your patch it is impossible to map all 8K
> of the gicv with 64K pages.

Please, help me with an alternative. If we drop the size alignment
check, then we can miss some dangerous cases such as the one highlighted
previously by Peter.

> My SOC which works fine with kvm now will go to not working with kvm
> after this patch.

Right, but my only alternative is have CONFIG_KVM depends on
!64K_PAGES, which sucks for everybody. Your device-tree entry has to
change *anyway*, because as it stands we're mapping 60k of unknown stuff
into the guest, which the kernel needs to know is safe.

Will

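The "60k of unknown stuff" figure follows from rounding the GICV base down to the start of its page. A minimal user-space sketch of that arithmetic, assuming a power-of-two page size; `page_base` and `bytes_below_in_page` are illustrative names, not kernel functions:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative user-space helpers, not kernel code. */

/* Round addr down to the start of the page that contains it. */
uint64_t page_base(uint64_t addr, uint64_t page_size)
{
	/* Assumes page_size is a non-zero power of two. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return addr & ~(page_size - 1);
}

/*
 * Bytes in the same page that sit below addr. A whole-page stage-2
 * mapping of the page containing addr exposes these to the guest too.
 */
uint64_t bytes_below_in_page(uint64_t addr, uint64_t page_size)
{
	return addr - page_base(addr, page_size);
}
```

For GICV at 0x2c02f000 with 64K pages, `page_base` gives 0x2c020000 and `bytes_below_in_page` gives 0xf000 (60K), which is the host physical range 0x2c020000 - 0x2c02efff quoted above. With 4K pages the same address is already page-aligned and nothing extra is exposed.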
On 07/25/2014 11:02 AM, Peter Maydell wrote:
> On 25 July 2014 16:56, Joel Schopp <joel.schopp@amd.com> wrote:
>> The problem with this patch is the gicv is really 8K. The reason you
>> would map at a 60K offset (0xf000), and why we do on our SOC, is so that
>> the 8K gicv would pick up the last 4K from the first page and the first
>> 4K from the next page. With your patch it is impossible to map all 8K
>> of the gicv with 64K pages.
>>
>> My SOC which works fine with kvm now will go to not working with kvm
>> after this patch.
> Your SOC currently works by fluke because the guest doesn't
> look at the last 4K of the GICC. If you're happy with it continuing
> to work by fluke you could make your device tree say it had a
> 64K GICV region with a 64K-aligned base.
>
> To make it work not by fluke but actually correctly requires
> Marc's patchset, at a minimum.

Since we aren't actually using the last 4K of the gicv at the moment I
suppose I could drop my objections to this patch and change my device
tree until Marc's patchset provides a proper solution for the gicv's
second 4K that works for everybody.

Acked-by: Joel Schopp <joel.schopp@amd.com>

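The "second 4K" issue comes from an 8K region placed at a 60K offset straddling two 64K pages. A rough user-space sketch that counts how many pages a region touches; `pages_spanned` is a hypothetical helper, and the base address is the one from Will's example, since the real SoC addresses are not given in the thread:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of pages touched by the byte range
 * [start, start + size). Assumes size > 0 and a power-of-two page_size.
 */
uint64_t pages_spanned(uint64_t start, uint64_t size, uint64_t page_size)
{
	assert(size != 0);
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	uint64_t first = start & ~(page_size - 1);              /* page holding the first byte */
	uint64_t last = (start + size - 1) & ~(page_size - 1);  /* page holding the last byte */

	return (last - first) / page_size + 1;
}
```

With 64K pages, 8K at 0x2c02f000 touches two pages (the last 4K of one and the first 4K of the next), whereas a 4K region at the same base, or a 64K region at 0x2c020000, touches only one.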
On Fri, Jul 25, 2014 at 05:24:18PM +0100, Joel Schopp wrote:
>
> On 07/25/2014 11:02 AM, Peter Maydell wrote:
> > On 25 July 2014 16:56, Joel Schopp <joel.schopp@amd.com> wrote:
> >> The problem with this patch is the gicv is really 8K. The reason you
> >> would map at a 60K offset (0xf000), and why we do on our SOC, is so that
> >> the 8K gicv would pick up the last 4K from the first page and the first
> >> 4K from the next page. With your patch it is impossible to map all 8K
> >> of the gicv with 64K pages.
> >>
> >> My SOC which works fine with kvm now will go to not working with kvm
> >> after this patch.
> > Your SOC currently works by fluke because the guest doesn't
> > look at the last 4K of the GICC. If you're happy with it continuing
> > to work by fluke you could make your device tree say it had a
> > 64K GICV region with a 64K-aligned base.
> >
> > To make it work not by fluke but actually correctly requires
> > Marc's patchset, at a minimum.
>
> Since we aren't actually using the last 4K of the gicv at the moment I
> suppose I could drop my objections to this patch and change my device
> tree until Marc's patchset provides a proper solution for the gicv's
> second 4K that works for everybody.
>
> Acked-by: Joel Schopp <joel.schopp@amd.com>

Thanks, Joel.

Will

On Fri, Jul 25 2014 at 4:29:12 pm BST, Will Deacon <will.deacon@arm.com> wrote:
> If the physical address of GICV isn't page-aligned, then we end up
> creating a stage-2 mapping of the page containing it, which causes us to
> map neighbouring memory locations directly into the guest.
>
> As an example, consider a platform with GICV at physical 0x2c02f000
> running a 64k-page host kernel. If qemu maps this into the guest at
> 0x80010000, then guest physical addresses 0x80010000 - 0x8001efff will
> map host physical region 0x2c020000 - 0x2c02efff. Accesses to these
> physical regions may cause UNPREDICTABLE behaviour, for example, on the
> Juno platform this will cause an SError exception to EL3, which brings
> down the entire physical CPU resulting in RCU stalls / HYP panics / host
> crashing / wasted weeks of debugging.
>
> SBSA recommends that systems alias the 4k GICV across the bounding 64k
> region, in which case GICV physical could be described as 0x2c020000 in
> the above scenario.
>
> This patch fixes the problem by failing the vgic probe if the physical
> base address or the size of GICV aren't page-aligned. Note that this
> generated a warning in dmesg about freeing enabled IRQs, so I had to
> move the IRQ enabling later in the probe.
>
> Cc: Christoffer Dall <christoffer.dall@linaro.org>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: Gleb Natapov <gleb@kernel.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Joel Schopp <joel.schopp@amd.com>
> Cc: Don Dutile <ddutile@redhat.com>
> Acked-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Will Deacon <will.deacon@arm.com>

Looks good to me:

Acked-by: Marc Zyngier <marc.zyngier@arm.com>

Christoffer, can you please take this as an urgent fix?

Thanks,

	M.

On Wed, Jul 30, 2014 at 11:47:40AM +0100, Marc Zyngier wrote:
> On Fri, Jul 25 2014 at 4:29:12 pm BST, Will Deacon <will.deacon@arm.com> wrote:
> > If the physical address of GICV isn't page-aligned, then we end up
> > creating a stage-2 mapping of the page containing it, which causes us to
> > map neighbouring memory locations directly into the guest.
> >
> > As an example, consider a platform with GICV at physical 0x2c02f000
> > running a 64k-page host kernel. If qemu maps this into the guest at
> > 0x80010000, then guest physical addresses 0x80010000 - 0x8001efff will
> > map host physical region 0x2c020000 - 0x2c02efff. Accesses to these
> > physical regions may cause UNPREDICTABLE behaviour, for example, on the
> > Juno platform this will cause an SError exception to EL3, which brings
> > down the entire physical CPU resulting in RCU stalls / HYP panics / host
> > crashing / wasted weeks of debugging.
> >
> > SBSA recommends that systems alias the 4k GICV across the bounding 64k
> > region, in which case GICV physical could be described as 0x2c020000 in
> > the above scenario.
> >
> > This patch fixes the problem by failing the vgic probe if the physical
> > base address or the size of GICV aren't page-aligned. Note that this
> > generated a warning in dmesg about freeing enabled IRQs, so I had to
> > move the IRQ enabling later in the probe.
> >
> > Cc: Christoffer Dall <christoffer.dall@linaro.org>
> > Cc: Marc Zyngier <marc.zyngier@arm.com>
> > Cc: Gleb Natapov <gleb@kernel.org>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Joel Schopp <joel.schopp@amd.com>
> > Cc: Don Dutile <ddutile@redhat.com>
> > Acked-by: Peter Maydell <peter.maydell@linaro.org>
> > Signed-off-by: Will Deacon <will.deacon@arm.com>
>
> Looks good to me:
>
> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
>
> Christoffer, can you please take this as an urgent fix?
>

Yes, sorry for the delay,

Applied to master and notified the KVM guys to try and get it into 3.16.

Thanks,
-Christoffer

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 56ff9bebb577..476d3bf540a8 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1526,17 +1526,33 @@ int kvm_vgic_hyp_init(void)
 		goto out_unmap;
 	}
 
-	kvm_info("%s@%llx IRQ%d\n", vgic_node->name,
-		 vctrl_res.start, vgic_maint_irq);
-	on_each_cpu(vgic_init_maintenance_interrupt, NULL, 1);
-
 	if (of_address_to_resource(vgic_node, 3, &vcpu_res)) {
 		kvm_err("Cannot obtain VCPU resource\n");
 		ret = -ENXIO;
 		goto out_unmap;
 	}
+
+	if (!PAGE_ALIGNED(vcpu_res.start)) {
+		kvm_err("GICV physical address 0x%llx not page aligned\n",
+			(unsigned long long)vcpu_res.start);
+		ret = -ENXIO;
+		goto out_unmap;
+	}
+
+	if (!PAGE_ALIGNED(resource_size(&vcpu_res))) {
+		kvm_err("GICV size 0x%llx not a multiple of page size 0x%lx\n",
+			(unsigned long long)resource_size(&vcpu_res),
+			PAGE_SIZE);
+		ret = -ENXIO;
+		goto out_unmap;
+	}
+
 	vgic_vcpu_base = vcpu_res.start;
 
+	kvm_info("%s@%llx IRQ%d\n", vgic_node->name,
+		 vctrl_res.start, vgic_maint_irq);
+	on_each_cpu(vgic_init_maintenance_interrupt, NULL, 1);
+
 	goto out;
 
 out_unmap: