Message ID | 20140725093127.GB5269@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 25 July 2014 10:31, Will Deacon <will.deacon@arm.com> wrote: > On Thu, Jul 24, 2014 at 09:05:39PM +0100, Peter Maydell wrote: >> On 24 July 2014 20:55, Will Deacon <will.deacon@arm.com> wrote: >> > Again, that can be solved by introduced Marc's attr for determining the >> > GICV offset within the 64k page. I don't think that's -stable material. >> >> Agreed that we don't want to put Marc's patchset in -stable >> (and that without it systems with GICV in their host devicetree >> at pagebase+60K are unusable, so we're not actually regressing >> anything if we put this into stable). But... >> >> >> I can't think of any way of determining whether a particular >> >> system gets this right or wrong automatically, which suggests >> >> perhaps we need to allow the device tree to specify that the >> >> GICV is 64k-page-safe... >> > >> > When we support such systems, I also think we'll need a device-tree change. >> > My main concern right now is stopping the ability to hose the entire machine >> > by trying to instantiate a virtual GIC. >> >> ...I don't see how your patch prevents instantiating a VGIC >> and hosing the machine on a system where the 64K >> with the GICV registers in it goes >> [GICV registers] [machine blows up if you read this] >> 0K 8K 64K > > True, if such a machine existed, then this patch wouldn't detect it. I don't > think we support anything like that in mainline at the moment, but the > following additional diff should solve the problem, no? Well, the apm-storm.dtsi specifies a 64K aligned GICV base but a size of 8K, so presumably it qualifies. (If the GICV size should really be 64K and that's an APM device tree bug then this patch is going to make KVM on 64K page hosts stop working on that board... somebody with access to the h/w docs for the APM boards needs to check that, I guess.) > diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c > index fa9a95b3ed19..476d3bf540a8 100644 > --- a/virt/kvm/arm/vgic.c > +++ b/virt/kvm/arm/vgic.c > @@ -1539,6 +1539,14 @@ int kvm_vgic_hyp_init(void) > goto out_unmap; > } > > + if (!PAGE_ALIGNED(resource_size(&vcpu_res))) { > + kvm_err("GICV size 0x%llx not a multiple of page size 0x%lx\n", > + (unsigned long long)resource_size(&vcpu_res), > + PAGE_SIZE); > + ret = -ENXIO; > + goto out_unmap; > + } > + > vgic_vcpu_base = vcpu_res.start; > > kvm_info("%s@%llx IRQ%d\n", vgic_node->name, Yes, I think if we check both start and size are page aligned then this will both: (a) avoid using the VGIC on systems where it's going to allow the guest to take down the machine (b) avoid using the VGIC if it's not at a page boundary, pending Marc's patchset landing and making that case actually work rather than do weird things. thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>> I can't think of any way of determining whether a particular >>>> system gets this right or wrong automatically, which suggests >>>> perhaps we need to allow the device tree to specify that the >>>> GICV is 64k-page-safe... >>> When we support such systems, I also think we'll need a device-tree change. >>> My main concern right now is stopping the ability to hose the entire machine >>> by trying to instantiate a virtual GIC. >> ...I don't see how your patch prevents instantiating a VGIC >> and hosing the machine on a system where the 64K >> with the GICV registers in it goes >> [GICV registers] [machine blows up if you read this] >> 0K 8K 64K > True, if such a machine existed, then this patch wouldn't detect it. I don't > think we support anything like that in mainline at the moment, but the > following additional diff should solve the problem, no? > > diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c > index fa9a95b3ed19..476d3bf540a8 100644 > --- a/virt/kvm/arm/vgic.c > +++ b/virt/kvm/arm/vgic.c > @@ -1539,6 +1539,14 @@ int kvm_vgic_hyp_init(void) > goto out_unmap; > } > > + if (!PAGE_ALIGNED(resource_size(&vcpu_res))) { > + kvm_err("GICV size 0x%llx not a multiple of page size 0x%lx\n", > + (unsigned long long)resource_size(&vcpu_res), > + PAGE_SIZE); > + ret = -ENXIO; > + goto out_unmap; > + } > + > vgic_vcpu_base = vcpu_res.start; > > kvm_info("%s@%llx IRQ%d\n", vgic_node->name, This would break with my SOC device tree which looks like this. Note this device tree works just fine without checks. gic: interrupt-controller@e1101000 { compatible = "arm,gic-400-v2m"; #interrupt-cells = <3>; #address-cells = <0>; interrupt-controller; msi-controller; reg = <0x0 0xe1110000 0 0x1000>, /* gic dist */ <0x0 0xe112f000 0 0x2000>, /* gic cpu */ <0x0 0xe114f000 0 0x2000>, /* gic virtual ic*/ <0x0 0xe116f000 0 0x2000>, /* gic virtual cpu*/ <0x0 0xe1180000 0 0x1000>; /* gic msi */ interrupts = <1 8 0xf04>; }; -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Joel, On Fri, Jul 25, 2014 at 03:02:58PM +0100, Joel Schopp wrote: > >>>> I can't think of any way of determining whether a particular > >>>> system gets this right or wrong automatically, which suggests > >>>> perhaps we need to allow the device tree to specify that the > >>>> GICV is 64k-page-safe... > >>> When we support such systems, I also think we'll need a device-tree change. > >>> My main concern right now is stopping the ability to hose the entire machine > >>> by trying to instantiate a virtual GIC. > >> ...I don't see how your patch prevents instantiating a VGIC > >> and hosing the machine on a system where the 64K > >> with the GICV registers in it goes > >> [GICV registers] [machine blows up if you read this] > >> 0K 8K 64K > > True, if such a machine existed, then this patch wouldn't detect it. I don't > > think we support anything like that in mainline at the moment, but the > > following additional diff should solve the problem, no? > > > > diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c > > index fa9a95b3ed19..476d3bf540a8 100644 > > --- a/virt/kvm/arm/vgic.c > > +++ b/virt/kvm/arm/vgic.c > > @@ -1539,6 +1539,14 @@ int kvm_vgic_hyp_init(void) > > goto out_unmap; > > } > > > > + if (!PAGE_ALIGNED(resource_size(&vcpu_res))) { > > + kvm_err("GICV size 0x%llx not a multiple of page size 0x%lx\n", > > + (unsigned long long)resource_size(&vcpu_res), > > + PAGE_SIZE); > > + ret = -ENXIO; > > + goto out_unmap; > > + } > > + > > vgic_vcpu_base = vcpu_res.start; > > > > kvm_info("%s@%llx IRQ%d\n", vgic_node->name, > This would break with my SOC device tree which looks like this. Note > this device tree works just fine without checks. > > gic: interrupt-controller@e1101000 { > compatible = "arm,gic-400-v2m"; > #interrupt-cells = <3>; > #address-cells = <0>; > interrupt-controller; > msi-controller; > reg = <0x0 0xe1110000 0 0x1000>, /* gic dist */ > <0x0 0xe112f000 0 0x2000>, /* gic cpu */ > <0x0 0xe114f000 0 0x2000>, /* gic virtual ic*/ > <0x0 0xe116f000 0 0x2000>, /* gic virtual cpu*/ > <0x0 0xe1180000 0 0x1000>; /* gic msi */ > interrupts = <1 8 0xf04>; > }; I appreciate it may work, but that's only because the kernel is actually using an alias of GICV at 0xe1160000 by accident. I would say that you're getting away with passing an incorrect description. Will -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 25 July 2014 15:02, Joel Schopp <joel.schopp@amd.com> wrote: >> --- a/virt/kvm/arm/vgic.c >> +++ b/virt/kvm/arm/vgic.c >> @@ -1539,6 +1539,14 @@ int kvm_vgic_hyp_init(void) >> goto out_unmap; >> } >> >> + if (!PAGE_ALIGNED(resource_size(&vcpu_res))) { >> + kvm_err("GICV size 0x%llx not a multiple of page size 0x%lx\n", >> + (unsigned long long)resource_size(&vcpu_res), >> + PAGE_SIZE); >> + ret = -ENXIO; >> + goto out_unmap; >> + } >> + >> vgic_vcpu_base = vcpu_res.start; >> >> kvm_info("%s@%llx IRQ%d\n", vgic_node->name, > This would break with my SOC device tree which looks like this. Note > this device tree works just fine without checks. > > gic: interrupt-controller@e1101000 { > compatible = "arm,gic-400-v2m"; > #interrupt-cells = <3>; > #address-cells = <0>; > interrupt-controller; > msi-controller; > reg = <0x0 0xe1110000 0 0x1000>, /* gic dist */ > <0x0 0xe112f000 0 0x2000>, /* gic cpu */ > <0x0 0xe114f000 0 0x2000>, /* gic virtual ic*/ > <0x0 0xe116f000 0 0x2000>, /* gic virtual cpu*/ > <0x0 0xe1180000 0 0x1000>; /* gic msi */ > interrupts = <1 8 0xf04>; > }; That has a non-aligned GICV -- does it really work on a mainline kernel without Marc's patches to cope with GICV which aren't at the base of a page? I can't see how... thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/25/2014 09:08 AM, Will Deacon wrote: > Hi Joel, > > On Fri, Jul 25, 2014 at 03:02:58PM +0100, Joel Schopp wrote: >>>>>> I can't think of any way of determining whether a particular >>>>>> system gets this right or wrong automatically, which suggests >>>>>> perhaps we need to allow the device tree to specify that the >>>>>> GICV is 64k-page-safe... >>>>> When we support such systems, I also think we'll need a device-tree change. >>>>> My main concern right now is stopping the ability to hose the entire machine >>>>> by trying to instantiate a virtual GIC. >>>> ...I don't see how your patch prevents instantiating a VGIC >>>> and hosing the machine on a system where the 64K >>>> with the GICV registers in it goes >>>> [GICV registers] [machine blows up if you read this] >>>> 0K 8K 64K >>> True, if such a machine existed, then this patch wouldn't detect it. I don't >>> think we support anything like that in mainline at the moment, but the >>> following additional diff should solve the problem, no? >>> >>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c >>> index fa9a95b3ed19..476d3bf540a8 100644 >>> --- a/virt/kvm/arm/vgic.c >>> +++ b/virt/kvm/arm/vgic.c >>> @@ -1539,6 +1539,14 @@ int kvm_vgic_hyp_init(void) >>> goto out_unmap; >>> } >>> >>> + if (!PAGE_ALIGNED(resource_size(&vcpu_res))) { >>> + kvm_err("GICV size 0x%llx not a multiple of page size 0x%lx\n", >>> + (unsigned long long)resource_size(&vcpu_res), >>> + PAGE_SIZE); >>> + ret = -ENXIO; >>> + goto out_unmap; >>> + } >>> + >>> vgic_vcpu_base = vcpu_res.start; >>> >>> kvm_info("%s@%llx IRQ%d\n", vgic_node->name, >> This would break with my SOC device tree which looks like this. Note >> this device tree works just fine without checks. >> >> gic: interrupt-controller@e1101000 { >> compatible = "arm,gic-400-v2m"; >> #interrupt-cells = <3>; >> #address-cells = <0>; >> interrupt-controller; >> msi-controller; >> reg = <0x0 0xe1110000 0 0x1000>, /* gic dist */ >> <0x0 0xe112f000 0 0x2000>, /* gic cpu */ >> <0x0 0xe114f000 0 0x2000>, /* gic virtual ic*/ >> <0x0 0xe116f000 0 0x2000>, /* gic virtual cpu*/ >> <0x0 0xe1180000 0 0x1000>; /* gic msi */ >> interrupts = <1 8 0xf04>; >> }; > I appreciate it may work, but that's only because the kernel is actually > using an alias of GICV at 0xe1160000 by accident. I would say that you're > getting away with passing an incorrect description. The problem is that by the spec the size is 0x2000 and was never properly rearchitected from 4K to variable page size support. The workaround is that each of the 4K in the 64K page (16 of them) are really mapped to the same location in hardware. So the contents of 0xe1160000 is the same as the contents of 0xe116f000. See “Appendix F: GIC-400 and 64KB Translation Granule” in v2.1 of the ARM _Server Base System Architecture_ specification. Now if we were to change the entry to <0x0 0xe1160000 0 0x2000> it would be obviously broken because the second half would map to a duplicate of the first half. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jul 25, 2014 at 03:16:15PM +0100, Joel Schopp wrote: > On 07/25/2014 09:08 AM, Will Deacon wrote: > >> This would break with my SOC device tree which looks like this. Note > >> this device tree works just fine without checks. > >> > >> gic: interrupt-controller@e1101000 { > >> compatible = "arm,gic-400-v2m"; > >> #interrupt-cells = <3>; > >> #address-cells = <0>; > >> interrupt-controller; > >> msi-controller; > >> reg = <0x0 0xe1110000 0 0x1000>, /* gic dist */ > >> <0x0 0xe112f000 0 0x2000>, /* gic cpu */ > >> <0x0 0xe114f000 0 0x2000>, /* gic virtual ic*/ > >> <0x0 0xe116f000 0 0x2000>, /* gic virtual cpu*/ > >> <0x0 0xe1180000 0 0x1000>; /* gic msi */ > >> interrupts = <1 8 0xf04>; > >> }; > > I appreciate it may work, but that's only because the kernel is actually > > using an alias of GICV at 0xe1160000 by accident. I would say that you're > > getting away with passing an incorrect description. > The problem is that by the spec the size is 0x2000 and was never > properly rearchitected from 4K to variable page size support. The > workaround is that each of the 4K in the 64K page (16 of them) are > really mapped to the same location in hardware. So the contents of > 0xe1160000 is the same as the contents of 0xe116f000. See “Appendix F: > GIC-400 and 64KB Translation Granule” in v2.1 of the ARM _Server Base > System Architecture_ specification. I've read that document, but it's not mandated and I don't think I have a single piece of hardware that actually follows it. Even the CPUs don't seem to perform the aliasing suggesting there (take a look at the A57 and A53 TRMs -- there are reserved regions in there). > Now if we were to change the entry to <0x0 0xe1160000 0 0x2000> it would > be obviously broken because the second half would map to a duplicate of > the first half. I think you'd need something like <0x0 0xe1160000 0 0x20000> and we'd have to change the GIC driver to do the right thing. What we currently have is unsafe on most hardware, yet happens to work on your system. Will -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/25/2014 09:23 AM, Will Deacon wrote: > On Fri, Jul 25, 2014 at 03:16:15PM +0100, Joel Schopp wrote: >> On 07/25/2014 09:08 AM, Will Deacon wrote: >>>> This would break with my SOC device tree which looks like this. Note >>>> this device tree works just fine without checks. >>>> >>>> gic: interrupt-controller@e1101000 { >>>> compatible = "arm,gic-400-v2m"; >>>> #interrupt-cells = <3>; >>>> #address-cells = <0>; >>>> interrupt-controller; >>>> msi-controller; >>>> reg = <0x0 0xe1110000 0 0x1000>, /* gic dist */ >>>> <0x0 0xe112f000 0 0x2000>, /* gic cpu */ >>>> <0x0 0xe114f000 0 0x2000>, /* gic virtual ic*/ >>>> <0x0 0xe116f000 0 0x2000>, /* gic virtual cpu*/ >>>> <0x0 0xe1180000 0 0x1000>; /* gic msi */ >>>> interrupts = <1 8 0xf04>; >>>> }; >>> I appreciate it may work, but that's only because the kernel is actually >>> using an alias of GICV at 0xe1160000 by accident. I would say that you're >>> getting away with passing an incorrect description. >> The problem is that by the spec the size is 0x2000 and was never >> properly rearchitected from 4K to variable page size support. The >> workaround is that each of the 4K in the 64K page (16 of them) are >> really mapped to the same location in hardware. So the contents of >> 0xe1160000 is the same as the contents of 0xe116f000. See “Appendix F: >> GIC-400 and 64KB Translation Granule” in v2.1 of the ARM _Server Base >> System Architecture_ specification. > I've read that document, but it's not mandated and I don't think I have a > single piece of hardware that actually follows it. Even the CPUs don't seem > to perform the aliasing suggesting there (take a look at the A57 and A53 > TRMs -- there are reserved regions in there). Not mandated, but it is obviously highly encouraged, and at a minimum valid. The SOC sitting under my desk follows it. > >> Now if we were to change the entry to <0x0 0xe1160000 0 0x2000> it would >> be obviously broken because the second half would map to a duplicate of >> the first half. > I think you'd need something like <0x0 0xe1160000 0 0x20000> and we'd have > to change the GIC driver to do the right thing. What we currently have is > unsafe on most hardware, yet happens to work on your system. If you change the GIC driver,kvm, kvmtool, and qemu to do the right thing I'd be happy to change the device tree entry to <0x0 0xe1160000 0 0x20000> or any other value of your choosing. I really have no opinion here on how it should be done other than what is there now currently works and I'd like to have whatever we do in the future continue to work. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c index fa9a95b3ed19..476d3bf540a8 100644 --- a/virt/kvm/arm/vgic.c +++ b/virt/kvm/arm/vgic.c @@ -1539,6 +1539,14 @@ int kvm_vgic_hyp_init(void) goto out_unmap; } + if (!PAGE_ALIGNED(resource_size(&vcpu_res))) { + kvm_err("GICV size 0x%llx not a multiple of page size 0x%lx\n", + (unsigned long long)resource_size(&vcpu_res), + PAGE_SIZE); + ret = -ENXIO; + goto out_unmap; + } + vgic_vcpu_base = vcpu_res.start; kvm_info("%s@%llx IRQ%d\n", vgic_node->name,