Message ID | 20190818140710.23920-1-maz@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | KVM: arm/arm64: vgic: Allow more than 256 vcpus for KVM_IRQ_LINE |
On Sun, Aug 18, 2019 at 03:07:10PM +0100, Marc Zyngier wrote:
> While parts of the VGIC support a large number of vcpus (we
> bravely allow up to 512), other parts are more limited.
>
> One of these limits is visible in the KVM_IRQ_LINE ioctl, which
> only allows 256 vcpus to be signalled when using the CPU or PPI
> types. Unfortunately, we've cornered ourselves badly by allocating
> all the bits in the irq field.
>
> Since the irq_type subfield (8 bit wide) is currently only taking
> the values 0, 1 and 2 (and we have been careful not to allow anything
> else), let's reduce this field to only 4 bits, and allocate the
> remaining 4 bits to a vcpu2_index, which acts as a multiplier:
>
>   vcpu_id = 256 * vcpu2_index + vcpu_index
>
> With that, and a new capability (KVM_CAP_ARM_IRQ_LINE_LAYOUT_2)
> allowing this to be discovered, it becomes possible to inject
> PPIs to up to 4096 vcpus. But please just don't.

Do you actually need a new capability for this? Older kernels reject
non-zero upper bits in the 'irq_type', so isn't that enough to probe
for this directly?

Will
On 19/08/2019 08:41, Will Deacon wrote:
> On Sun, Aug 18, 2019 at 03:07:10PM +0100, Marc Zyngier wrote:
>> While parts of the VGIC support a large number of vcpus (we
>> bravely allow up to 512), other parts are more limited.
>>
>> One of these limits is visible in the KVM_IRQ_LINE ioctl, which
>> only allows 256 vcpus to be signalled when using the CPU or PPI
>> types. Unfortunately, we've cornered ourselves badly by allocating
>> all the bits in the irq field.
>>
>> Since the irq_type subfield (8 bit wide) is currently only taking
>> the values 0, 1 and 2 (and we have been careful not to allow anything
>> else), let's reduce this field to only 4 bits, and allocate the
>> remaining 4 bits to a vcpu2_index, which acts as a multiplier:
>>
>>   vcpu_id = 256 * vcpu2_index + vcpu_index
>>
>> With that, and a new capability (KVM_CAP_ARM_IRQ_LINE_LAYOUT_2)
>> allowing this to be discovered, it becomes possible to inject
>> PPIs to up to 4096 vcpus. But please just don't.
>
> Do you actually need a new capability for this? Older kernels reject
> non-zero upper bits in the 'irq_type', so isn't that enough to probe
> for this directly?

'Probing' is a bit of an overstatement. You'll get an error back when
userspace will try to inject a PPI into a vcpu whose ID is in the new
range. But nothing at VM creation time will indicate the interrupt
injection API supports more than 256 vcpus.

I think userspace should be able to fail the creation of such large VM
immediately, before actually running it.

M.
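A minimal sketch of the creation-time check argued for above, assuming the VMM has already queried KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 through the KVM_CHECK_EXTENSION ioctl and passes the returned value in. The helper name and the two limit macros are hypothetical and not part of the patch; the limits themselves (256 and 4096) come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the values are taken from the commit message. */
#define LEGACY_IRQ_LINE_MAX_VCPUS  256   /* 8-bit vcpu_index only */
#define LAYOUT2_IRQ_LINE_MAX_VCPUS 4096  /* 4-bit vcpu2_index * 256 */

/*
 * Decide at VM creation time whether every vcpu can be targeted by
 * KVM_IRQ_LINE. layout2_cap is assumed to be the value returned by
 * KVM_CHECK_EXTENSION for KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 (0 or negative
 * when the capability is not advertised).
 */
static bool irq_line_can_target_all(int nr_vcpus, int layout2_cap)
{
	int limit;

	assert(nr_vcpus > 0);

	limit = layout2_cap > 0 ? LAYOUT2_IRQ_LINE_MAX_VCPUS
				: LEGACY_IRQ_LINE_MAX_VCPUS;
	return nr_vcpus <= limit;
}
```

With this shape, a VMM could refuse to create, say, a 300-vcpu guest on a kernel without the capability instead of discovering the failure at the first interrupt injection.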
Hi Marc,

On 8/18/19 4:07 PM, Marc Zyngier wrote:
> While parts of the VGIC support a large number of vcpus (we
> bravely allow up to 512), other parts are more limited.
>
> One of these limits is visible in the KVM_IRQ_LINE ioctl, which
> only allows 256 vcpus to be signalled when using the CPU or PPI
> types. Unfortunately, we've cornered ourselves badly by allocating
> all the bits in the irq field.
>
> Since the irq_type subfield (8 bit wide) is currently only taking
> the values 0, 1 and 2 (and we have been careful not to allow anything
> else), let's reduce this field to only 4 bits, and allocate the
> remaining 4 bits to a vcpu2_index, which acts as a multiplier:
>
>   vcpu_id = 256 * vcpu2_index + vcpu_index
>
> With that, and a new capability (KVM_CAP_ARM_IRQ_LINE_LAYOUT_2)
> allowing this to be discovered, it becomes possible to inject
> PPIs to up to 4096 vcpus. But please just don't.
>
> Reported-by: Zenghui Yu <yuzenghui@huawei.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
>  Documentation/virt/kvm/api.txt    | 8 ++++++--
>  arch/arm/include/uapi/asm/kvm.h   | 4 +++-
>  arch/arm64/include/uapi/asm/kvm.h | 4 +++-
>  include/uapi/linux/kvm.h          | 1 +
>  virt/kvm/arm/arm.c                | 2 ++
>  5 files changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.txt b/Documentation/virt/kvm/api.txt
> index 2d067767b617..85518bfb2a99 100644
> --- a/Documentation/virt/kvm/api.txt
> +++ b/Documentation/virt/kvm/api.txt
> @@ -753,8 +753,8 @@ in-kernel irqchip (GIC), and for in-kernel irqchip can tell the GIC to
>  use PPIs designated for specific cpus. The irq field is interpreted
>  like this:
>
> -  bits:  | 31 ... 24 | 23 ... 16  | 15 ... 0 |
> -  field: | irq_type  | vcpu_index | irq_id   |
> +  bits:  | 31 ... 28   | 27 ... 24 | 23 ... 16  | 15 ... 0 |
> +  field: | vcpu2_index | irq_type  | vcpu_index | irq_id   |
>
>  The irq_type field has the following values:
>  - irq_type[0]: out-of-kernel GIC: irq_id 0 is IRQ, irq_id 1 is FIQ
> @@ -766,6 +766,10 @@ The irq_type field has the following values:
>
>  In both cases, level is used to assert/deassert the line.
>
> +When KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 is supported, the target vcpu is
> +identified as (256 * vcpu2_index + vcpu_index). Otherwise, vcpu2_index
> +must be zero.
> +
>  struct kvm_irq_level {
>  	union {
>  		__u32 irq;     /* GSI */
> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
> index a4217c1a5d01..2769360f195c 100644
> --- a/arch/arm/include/uapi/asm/kvm.h
> +++ b/arch/arm/include/uapi/asm/kvm.h
> @@ -266,8 +266,10 @@ struct kvm_vcpu_events {
>  #define KVM_DEV_ARM_ITS_CTRL_RESET		4
>
>  /* KVM_IRQ_LINE irq field index values */
> +#define KVM_ARM_IRQ_VCPU2_SHIFT		28
> +#define KVM_ARM_IRQ_VCPU2_MASK		0xf
>  #define KVM_ARM_IRQ_TYPE_SHIFT		24
> -#define KVM_ARM_IRQ_TYPE_MASK		0xff
> +#define KVM_ARM_IRQ_TYPE_MASK		0xf
>  #define KVM_ARM_IRQ_VCPU_SHIFT		16
>  #define KVM_ARM_IRQ_VCPU_MASK		0xff
>  #define KVM_ARM_IRQ_NUM_SHIFT		0
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 9a507716ae2f..67c21f9bdbad 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -325,8 +325,10 @@ struct kvm_vcpu_events {
>  #define KVM_ARM_VCPU_TIMER_IRQ_PTIMER		1
>
>  /* KVM_IRQ_LINE irq field index values */
> +#define KVM_ARM_IRQ_VCPU2_SHIFT		28
> +#define KVM_ARM_IRQ_VCPU2_MASK		0xf
>  #define KVM_ARM_IRQ_TYPE_SHIFT		24
> -#define KVM_ARM_IRQ_TYPE_MASK		0xff
> +#define KVM_ARM_IRQ_TYPE_MASK		0xf
>  #define KVM_ARM_IRQ_VCPU_SHIFT		16
>  #define KVM_ARM_IRQ_VCPU_MASK		0xff
>  #define KVM_ARM_IRQ_NUM_SHIFT		0
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 5e3f12d5359e..5414b6588fbb 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -996,6 +996,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_ARM_PTRAUTH_ADDRESS 171
>  #define KVM_CAP_ARM_PTRAUTH_GENERIC 172
>  #define KVM_CAP_PMU_EVENT_FILTER 173
> +#define KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 174
>
>  #ifdef KVM_CAP_IRQ_ROUTING
>
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index 35a069815baf..c1385911de69 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -182,6 +182,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	int r;
>  	switch (ext) {
>  	case KVM_CAP_IRQCHIP:
> +	case KVM_CAP_ARM_IRQ_LINE_LAYOUT_2:
>  		r = vgic_present;
>  		break;
>  	case KVM_CAP_IOEVENTFD:
> @@ -888,6 +889,7 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level,
>
>  	irq_type = (irq >> KVM_ARM_IRQ_TYPE_SHIFT) & KVM_ARM_IRQ_TYPE_MASK;
>  	vcpu_idx = (irq >> KVM_ARM_IRQ_VCPU_SHIFT) & KVM_ARM_IRQ_VCPU_MASK;
> +	vcpu_idx += ((irq >> KVM_ARM_IRQ_VCPU2_SHIFT) & KVM_ARM_IRQ_VCPU2_MASK) * (KVM_ARM_IRQ_VCPU_MASK + 1);
>  	irq_num = (irq >> KVM_ARM_IRQ_NUM_SHIFT) & KVM_ARM_IRQ_NUM_MASK;
>
>  	trace_kvm_irq_line(irq_type, vcpu_idx, irq_num, irq_level->level);
>

Thank you for the patch!

Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric
Hi Marc,

On 2019/8/18 22:07, Marc Zyngier wrote:
> While parts of the VGIC support a large number of vcpus (we
> bravely allow up to 512), other parts are more limited.
>
> One of these limits is visible in the KVM_IRQ_LINE ioctl, which
> only allows 256 vcpus to be signalled when using the CPU or PPI
> types. Unfortunately, we've cornered ourselves badly by allocating
> all the bits in the irq field.
>
> Since the irq_type subfield (8 bit wide) is currently only taking
> the values 0, 1 and 2 (and we have been careful not to allow anything
> else), let's reduce this field to only 4 bits, and allocate the
> remaining 4 bits to a vcpu2_index, which acts as a multiplier:
>
>   vcpu_id = 256 * vcpu2_index + vcpu_index
>
> With that, and a new capability (KVM_CAP_ARM_IRQ_LINE_LAYOUT_2)
> allowing this to be discovered, it becomes possible to inject
> PPIs to up to 4096 vcpus. But please just don't.
>
> Reported-by: Zenghui Yu <yuzenghui@huawei.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---

Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>

And tested together with Eric's patches (KVM+QEMU).

Thanks,
zenghui
On Sun, 18 Aug 2019 at 15:07, Marc Zyngier <maz@kernel.org> wrote:
>
> While parts of the VGIC support a large number of vcpus (we
> bravely allow up to 512), other parts are more limited.
>
> One of these limits is visible in the KVM_IRQ_LINE ioctl, which
> only allows 256 vcpus to be signalled when using the CPU or PPI
> types. Unfortunately, we've cornered ourselves badly by allocating
> all the bits in the irq field.
>
> Since the irq_type subfield (8 bit wide) is currently only taking
> the values 0, 1 and 2 (and we have been careful not to allow anything
> else), let's reduce this field to only 4 bits, and allocate the
> remaining 4 bits to a vcpu2_index, which acts as a multiplier:
>
>   vcpu_id = 256 * vcpu2_index + vcpu_index
>
> With that, and a new capability (KVM_CAP_ARM_IRQ_LINE_LAYOUT_2)
> allowing this to be discovered, it becomes possible to inject
> PPIs to up to 4096 vcpus. But please just don't.
>
> Reported-by: Zenghui Yu <yuzenghui@huawei.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> diff --git a/Documentation/virt/kvm/api.txt b/Documentation/virt/kvm/api.txt
> index 2d067767b617..85518bfb2a99 100644
> --- a/Documentation/virt/kvm/api.txt
> +++ b/Documentation/virt/kvm/api.txt
> @@ -753,8 +753,8 @@ in-kernel irqchip (GIC), and for in-kernel irqchip can tell the GIC to
>  use PPIs designated for specific cpus. The irq field is interpreted
>  like this:
>
> -  bits:  | 31 ... 24 | 23 ... 16  | 15 ... 0 |
> -  field: | irq_type  | vcpu_index | irq_id   |
> +  bits:  | 31 ... 28   | 27 ... 24 | 23 ... 16  | 15 ... 0 |
> +  field: | vcpu2_index | irq_type  | vcpu_index | irq_id   |
>
>  The irq_type field has the following values:
>  - irq_type[0]: out-of-kernel GIC: irq_id 0 is IRQ, irq_id 1 is FIQ
> @@ -766,6 +766,10 @@ The irq_type field has the following values:
>
>  In both cases, level is used to assert/deassert the line.
>
> +When KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 is supported, the target vcpu is
> +identified as (256 * vcpu2_index + vcpu_index). Otherwise, vcpu2_index
> +must be zero.
> +
>  struct kvm_irq_level {
>  	union {
>  		__u32 irq;     /* GSI */
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index 35a069815baf..c1385911de69 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -182,6 +182,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	int r;
>  	switch (ext) {
>  	case KVM_CAP_IRQCHIP:
> +	case KVM_CAP_ARM_IRQ_LINE_LAYOUT_2:
>  		r = vgic_present;
>  		break;

Shouldn't we be advertising the capability always, not just if the
VGIC is present? The KVM_IRQ_LINE ioctl can be used for directly
signalling IRQs to vCPUs even if we're using an out-of-kernel irqchip
model.

The general principle of the API change/extension looks OK to me.

thanks
-- PMM
diff --git a/Documentation/virt/kvm/api.txt b/Documentation/virt/kvm/api.txt
index 2d067767b617..85518bfb2a99 100644
--- a/Documentation/virt/kvm/api.txt
+++ b/Documentation/virt/kvm/api.txt
@@ -753,8 +753,8 @@ in-kernel irqchip (GIC), and for in-kernel irqchip can tell the GIC to
 use PPIs designated for specific cpus. The irq field is interpreted
 like this:
 
-  bits:  | 31 ... 24 | 23 ... 16  | 15 ... 0 |
-  field: | irq_type  | vcpu_index | irq_id   |
+  bits:  | 31 ... 28   | 27 ... 24 | 23 ... 16  | 15 ... 0 |
+  field: | vcpu2_index | irq_type  | vcpu_index | irq_id   |
 
 The irq_type field has the following values:
 - irq_type[0]: out-of-kernel GIC: irq_id 0 is IRQ, irq_id 1 is FIQ
@@ -766,6 +766,10 @@ The irq_type field has the following values:
 
 In both cases, level is used to assert/deassert the line.
 
+When KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 is supported, the target vcpu is
+identified as (256 * vcpu2_index + vcpu_index). Otherwise, vcpu2_index
+must be zero.
+
 struct kvm_irq_level {
 	union {
 		__u32 irq;     /* GSI */
diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
index a4217c1a5d01..2769360f195c 100644
--- a/arch/arm/include/uapi/asm/kvm.h
+++ b/arch/arm/include/uapi/asm/kvm.h
@@ -266,8 +266,10 @@ struct kvm_vcpu_events {
 #define KVM_DEV_ARM_ITS_CTRL_RESET		4
 
 /* KVM_IRQ_LINE irq field index values */
+#define KVM_ARM_IRQ_VCPU2_SHIFT		28
+#define KVM_ARM_IRQ_VCPU2_MASK		0xf
 #define KVM_ARM_IRQ_TYPE_SHIFT		24
-#define KVM_ARM_IRQ_TYPE_MASK		0xff
+#define KVM_ARM_IRQ_TYPE_MASK		0xf
 #define KVM_ARM_IRQ_VCPU_SHIFT		16
 #define KVM_ARM_IRQ_VCPU_MASK		0xff
 #define KVM_ARM_IRQ_NUM_SHIFT		0
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 9a507716ae2f..67c21f9bdbad 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -325,8 +325,10 @@ struct kvm_vcpu_events {
 #define KVM_ARM_VCPU_TIMER_IRQ_PTIMER		1
 
 /* KVM_IRQ_LINE irq field index values */
+#define KVM_ARM_IRQ_VCPU2_SHIFT		28
+#define KVM_ARM_IRQ_VCPU2_MASK		0xf
 #define KVM_ARM_IRQ_TYPE_SHIFT		24
-#define KVM_ARM_IRQ_TYPE_MASK		0xff
+#define KVM_ARM_IRQ_TYPE_MASK		0xf
 #define KVM_ARM_IRQ_VCPU_SHIFT		16
 #define KVM_ARM_IRQ_VCPU_MASK		0xff
 #define KVM_ARM_IRQ_NUM_SHIFT		0
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 5e3f12d5359e..5414b6588fbb 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -996,6 +996,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_ARM_PTRAUTH_ADDRESS 171
 #define KVM_CAP_ARM_PTRAUTH_GENERIC 172
 #define KVM_CAP_PMU_EVENT_FILTER 173
+#define KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 174
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 35a069815baf..c1385911de69 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -182,6 +182,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	int r;
 	switch (ext) {
 	case KVM_CAP_IRQCHIP:
+	case KVM_CAP_ARM_IRQ_LINE_LAYOUT_2:
 		r = vgic_present;
 		break;
 	case KVM_CAP_IOEVENTFD:
@@ -888,6 +889,7 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level,
 
 	irq_type = (irq >> KVM_ARM_IRQ_TYPE_SHIFT) & KVM_ARM_IRQ_TYPE_MASK;
 	vcpu_idx = (irq >> KVM_ARM_IRQ_VCPU_SHIFT) & KVM_ARM_IRQ_VCPU_MASK;
+	vcpu_idx += ((irq >> KVM_ARM_IRQ_VCPU2_SHIFT) & KVM_ARM_IRQ_VCPU2_MASK) * (KVM_ARM_IRQ_VCPU_MASK + 1);
 	irq_num = (irq >> KVM_ARM_IRQ_NUM_SHIFT) & KVM_ARM_IRQ_NUM_MASK;
 
 	trace_kvm_irq_line(irq_type, vcpu_idx, irq_num, irq_level->level);
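For illustration, the decode performed in the kvm_vm_ioctl_irq_line() hunk can be mirrored in a standalone userspace sketch. The shift and mask values are copied from the header changes in this patch, except KVM_ARM_IRQ_NUM_MASK, whose value is assumed here from the 16-bit irq_id field in the documented layout. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the patch's uapi header changes. */
#define KVM_ARM_IRQ_VCPU2_SHIFT 28
#define KVM_ARM_IRQ_VCPU2_MASK  0xf
#define KVM_ARM_IRQ_TYPE_SHIFT  24
#define KVM_ARM_IRQ_TYPE_MASK   0xf
#define KVM_ARM_IRQ_VCPU_SHIFT  16
#define KVM_ARM_IRQ_VCPU_MASK   0xff
#define KVM_ARM_IRQ_NUM_SHIFT   0
/* Assumption: irq_id occupies bits 15..0, so the mask is 16 bits wide. */
#define KVM_ARM_IRQ_NUM_MASK    0xffff

/* Hypothetical container for the decoded subfields. */
struct irq_line_fields {
	uint32_t irq_type;
	uint32_t vcpu_idx;
	uint32_t irq_num;
};

/* Split the 32-bit irq value the same way the patched ioctl handler does. */
static struct irq_line_fields decode_irq(uint32_t irq)
{
	struct irq_line_fields f;

	f.irq_type = (irq >> KVM_ARM_IRQ_TYPE_SHIFT) & KVM_ARM_IRQ_TYPE_MASK;
	f.vcpu_idx = (irq >> KVM_ARM_IRQ_VCPU_SHIFT) & KVM_ARM_IRQ_VCPU_MASK;
	/* vcpu2_index scales by 256, i.e. KVM_ARM_IRQ_VCPU_MASK + 1. */
	f.vcpu_idx += ((irq >> KVM_ARM_IRQ_VCPU2_SHIFT) & KVM_ARM_IRQ_VCPU2_MASK) *
		      (KVM_ARM_IRQ_VCPU_MASK + 1);
	f.irq_num = (irq >> KVM_ARM_IRQ_NUM_SHIFT) & KVM_ARM_IRQ_NUM_MASK;

	/* 4 + 8 bits of vcpu index can never reach 4096. */
	assert(f.vcpu_idx < 4096);
	return f;
}
```

A value built with the old layout (top four bits zero) decodes to the same vcpu index as before, which is why existing userspace keeps working.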
While parts of the VGIC support a large number of vcpus (we
bravely allow up to 512), other parts are more limited.

One of these limits is visible in the KVM_IRQ_LINE ioctl, which
only allows 256 vcpus to be signalled when using the CPU or PPI
types. Unfortunately, we've cornered ourselves badly by allocating
all the bits in the irq field.

Since the irq_type subfield (8 bit wide) is currently only taking
the values 0, 1 and 2 (and we have been careful not to allow anything
else), let's reduce this field to only 4 bits, and allocate the
remaining 4 bits to a vcpu2_index, which acts as a multiplier:

  vcpu_id = 256 * vcpu2_index + vcpu_index

With that, and a new capability (KVM_CAP_ARM_IRQ_LINE_LAYOUT_2)
allowing this to be discovered, it becomes possible to inject
PPIs to up to 4096 vcpus. But please just don't.

Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 Documentation/virt/kvm/api.txt    | 8 ++++++--
 arch/arm/include/uapi/asm/kvm.h   | 4 +++-
 arch/arm64/include/uapi/asm/kvm.h | 4 +++-
 include/uapi/linux/kvm.h          | 1 +
 virt/kvm/arm/arm.c                | 2 ++
 5 files changed, 15 insertions(+), 4 deletions(-)
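As a sketch of what the userspace side of this layout could look like, the helper below packs an irq_type, a flat vcpu_id and an irq_id into the 32-bit irq value, splitting vcpu_id according to vcpu_id = 256 * vcpu2_index + vcpu_index. The function name and its error convention are hypothetical, and the value of KVM_ARM_IRQ_NUM_MASK is assumed from the 16-bit irq_id field; the other macros are copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the patch's uapi header changes. */
#define KVM_ARM_IRQ_VCPU2_SHIFT 28
#define KVM_ARM_IRQ_VCPU2_MASK  0xf
#define KVM_ARM_IRQ_TYPE_SHIFT  24
#define KVM_ARM_IRQ_TYPE_MASK   0xf
#define KVM_ARM_IRQ_VCPU_SHIFT  16
#define KVM_ARM_IRQ_VCPU_MASK   0xff
#define KVM_ARM_IRQ_NUM_SHIFT   0
/* Assumption: irq_id occupies bits 15..0, so the mask is 16 bits wide. */
#define KVM_ARM_IRQ_NUM_MASK    0xffff

/*
 * Hypothetical helper: pack the fields into *out and return 0, or
 * return -1 if any field does not fit the new layout.
 */
static int encode_irq(uint32_t irq_type, uint32_t vcpu_id, uint32_t irq_id,
		      uint32_t *out)
{
	uint32_t vcpu_index, vcpu2_index;

	if (irq_type > KVM_ARM_IRQ_TYPE_MASK || irq_id > KVM_ARM_IRQ_NUM_MASK)
		return -1;
	if (vcpu_id >= 4096)	/* 16 * 256 addressable vcpus */
		return -1;

	vcpu_index = vcpu_id & KVM_ARM_IRQ_VCPU_MASK;		/* vcpu_id % 256 */
	vcpu2_index = vcpu_id / (KVM_ARM_IRQ_VCPU_MASK + 1);	/* vcpu_id / 256 */
	assert(vcpu2_index <= KVM_ARM_IRQ_VCPU2_MASK);

	*out = (vcpu2_index << KVM_ARM_IRQ_VCPU2_SHIFT) |
	       (irq_type << KVM_ARM_IRQ_TYPE_SHIFT) |
	       (vcpu_index << KVM_ARM_IRQ_VCPU_SHIFT) |
	       (irq_id << KVM_ARM_IRQ_NUM_SHIFT);
	return 0;
}
```

For example, vcpu_id 300 splits into vcpu2_index 1 and vcpu_index 44, so encoding irq_type 2 with irq_id 27 gives 0x122c001b. On a kernel that does not advertise KVM_CAP_ARM_IRQ_LINE_LAYOUT_2, callers would need to keep vcpu_id below 256 so that vcpu2_index stays zero.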