Message ID | 1606780777-30718-1-git-send-email-igor.druzhinin@citrix.com
---|---
State | New, archived
Series | x86/IRQ: bump max number of guests for a shared IRQ to 31
On 01.12.2020 00:59, Igor Druzhinin wrote:
> Current limit of 7 is too restrictive for modern systems where one GSI
> could be shared by potentially many PCI INTx sources, each of which
> corresponds to a device passed through to its own guest. Some systems do not
> apply due diligence in swizzling INTx links in case e.g. INTA is declared as
> interrupt pin for the majority of PCI devices behind a single router,
> resulting in overuse of a GSI.
>
> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
> ---
>
> If people think that would make sense - I can rework the array to a list of
> domain pointers to avoid the limit.

Not sure about this. What is the biggest number you've found on any
system? (I assume the chosen value of 31 has some headroom.)

Instead I'm wondering whether this wouldn't better be a Kconfig
setting (or even command line controllable). There don't look to be
any restrictions on the precise value chosen (i.e. 2**n-1 like is
the case for old and new values here, for whatever reason), so a
simple permitted range of like 4...64 would seem fine to specify.
Whether the default then would want to be 8 (close to the current
7) or higher (around the actually observed maximum) is a different
question.

Jan
On 02/12/2020 09:25, Jan Beulich wrote:
> On 01.12.2020 00:59, Igor Druzhinin wrote:
>> Current limit of 7 is too restrictive for modern systems where one GSI
>> could be shared by potentially many PCI INTx sources, each of which
>> corresponds to a device passed through to its own guest. Some systems do not
>> apply due diligence in swizzling INTx links in case e.g. INTA is declared as
>> interrupt pin for the majority of PCI devices behind a single router,
>> resulting in overuse of a GSI.
>>
>> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
>> ---
>>
>> If people think that would make sense - I can rework the array to a list of
>> domain pointers to avoid the limit.
>
> Not sure about this. What is the biggest number you've found on any
> system? (I assume the chosen value of 31 has some headroom.)

The value 31 was taken as a practical maximum for one specific HP system
if all of the PCI slots in all of its riser cards are occupied with GPUs.
The value was obtained by reverse engineering their ACPI tables. Currently
we're only concerned with the number 8 (our graphics vendors do not
recommend installing more cards than that), but it's only a matter of time
before it grows. I'm also not sure why this routing scheme was chosen in
the first place: is it dictated by hardware restrictions, or by firmware
engineers being lazy? We're working with HP to determine that.

> Instead I'm wondering whether this wouldn't better be a Kconfig
> setting (or even command line controllable). There don't look to be
> any restrictions on the precise value chosen (i.e. 2**n-1 like is
> the case for old and new values here, for whatever reason), so a
> simple permitted range of like 4...64 would seem fine to specify.
> Whether the default then would want to be 8 (close to the current
> 7) or higher (around the actually observed maximum) is a different
> question.

I'm in favor of a command line argument here - it would be much less
trouble if a higher limit were suddenly necessary in the field. The
default IMO should definitely be higher than 8 - I'd stick with 32, which
to me should cover our real-world scenarios and leave some headroom for
the future.

Igor
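For illustration, the kind of command line control being discussed might look like the sketch below. Xen's `integer_param()` registration macro is real; the parameter name `irq-max-guests`, the default of 16, and the clamping helper are assumptions for the sketch, not part of the patch under review:

```c
/* Sketch: a boot-time-adjustable limit, e.g. in xen/arch/x86/irq.c. */
static unsigned int __read_mostly irq_max_guests = 16;
integer_param("irq-max-guests", irq_max_guests);

/*
 * Clamp to a permitted range along the lines of the 4...64 suggested
 * above; this could be called once during IRQ initialisation.
 */
static void __init clamp_irq_max_guests(void)
{
    if ( irq_max_guests < 4 || irq_max_guests > 64 )
        irq_max_guests = 16;
}
```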
On 02.12.2020 15:53, Igor Druzhinin wrote:
> On 02/12/2020 09:25, Jan Beulich wrote:
>> On 01.12.2020 00:59, Igor Druzhinin wrote:
>>> Current limit of 7 is too restrictive for modern systems where one GSI
>>> could be shared by potentially many PCI INTx sources, each of which
>>> corresponds to a device passed through to its own guest. Some systems do not
>>> apply due diligence in swizzling INTx links in case e.g. INTA is declared as
>>> interrupt pin for the majority of PCI devices behind a single router,
>>> resulting in overuse of a GSI.
>>>
>>> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
>>> ---
>>>
>>> If people think that would make sense - I can rework the array to a list of
>>> domain pointers to avoid the limit.
>>
>> Not sure about this. What is the biggest number you've found on any
>> system? (I assume the chosen value of 31 has some headroom.)
>
> The value 31 was taken as a practical maximum for one specific HP system
> if all of the PCI slots in all of its riser cards are occupied with GPUs.
> The value was obtained by reverse engineering their ACPI tables. Currently
> we're only concerned with the number 8 (our graphics vendors do not
> recommend installing more cards than that), but it's only a matter of time
> before it grows. I'm also not sure why this routing scheme was chosen in
> the first place: is it dictated by hardware restrictions, or by firmware
> engineers being lazy? We're working with HP to determine that.

Thanks for the insight.

>> Instead I'm wondering whether this wouldn't better be a Kconfig
>> setting (or even command line controllable). There don't look to be
>> any restrictions on the precise value chosen (i.e. 2**n-1 like is
>> the case for old and new values here, for whatever reason), so a
>> simple permitted range of like 4...64 would seem fine to specify.
>> Whether the default then would want to be 8 (close to the current
>> 7) or higher (around the actually observed maximum) is a different
>> question.
>
> I'm in favor of a command line argument here - it would be much less
> trouble if a higher limit were suddenly necessary in the field. The
> default IMO should definitely be higher than 8 - I'd stick with 32, which
> to me should cover our real-world scenarios and leave some headroom for
> the future.

Well, I'm concerned about the extra memory overhead. Every IRQ,
sharable or not, will get the extra slots allocated with the
current scheme. Perhaps a prereq change then would be to only
allocate multi-guest arrays for sharable IRQs, effectively
shrinking the overhead in particular for all MSI ones?

Jan
On 02/12/2020 15:21, Jan Beulich wrote:
> On 02.12.2020 15:53, Igor Druzhinin wrote:
>> On 02/12/2020 09:25, Jan Beulich wrote:
>>> Instead I'm wondering whether this wouldn't better be a Kconfig
>>> setting (or even command line controllable). There don't look to be
>>> any restrictions on the precise value chosen (i.e. 2**n-1 like is
>>> the case for old and new values here, for whatever reason), so a
>>> simple permitted range of like 4...64 would seem fine to specify.
>>> Whether the default then would want to be 8 (close to the current
>>> 7) or higher (around the actually observed maximum) is a different
>>> question.
>>
>> I'm in favor of a command line argument here - it would be much less
>> trouble if a higher limit were suddenly necessary in the field. The
>> default IMO should definitely be higher than 8 - I'd stick with 32, which
>> to me should cover our real-world scenarios and leave some headroom for
>> the future.
>
> Well, I'm concerned about the extra memory overhead. Every IRQ,
> sharable or not, will get the extra slots allocated with the
> current scheme. Perhaps a prereq change then would be to only
> allocate multi-guest arrays for sharable IRQs, effectively
> shrinking the overhead in particular for all MSI ones?

That's one way to improve overall system scalability, but in that area
there is certainly much bigger fish to fry elsewhere. With 32 elements in
the array we get 200 bytes of overhead per structure; with 16 it's just 72
extra bytes, which even in the unattainable worst-case scenario of every
single vector being taken in a 512-CPU machine would only account for
several MB of overhead.

I'd start with dynamic array allocation first and set the limit to 16,
which should be enough for now. Then, if that default value needs to be
raised, we can consider further improvements.

Igor
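Spelled out, the arithmetic behind those figures, assuming 8-byte `struct domain` pointers on x86-64; the total IRQ count below is a rough order-of-magnitude bound, not a measured value:

```c
/*
 * Extra guest[] slots relative to the current 7, at 8 bytes each:
 *   32 slots: (32 - 7) * 8 = 200 extra bytes per irq_guest_action_t
 *   16 slots: (16 - 7) * 8 =  72 extra bytes
 *
 * Worst case sketched above - every vector of a 512-CPU machine bound
 * to guests, i.e. on the order of 100k IRQs - comes to roughly
 * 100000 * 72 bytes, i.e. around 7 MB.
 */
```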
On 02.12.2020 17:34, Igor Druzhinin wrote:
> On 02/12/2020 15:21, Jan Beulich wrote:
>> On 02.12.2020 15:53, Igor Druzhinin wrote:
>>> On 02/12/2020 09:25, Jan Beulich wrote:
>>>> Instead I'm wondering whether this wouldn't better be a Kconfig
>>>> setting (or even command line controllable). There don't look to be
>>>> any restrictions on the precise value chosen (i.e. 2**n-1 like is
>>>> the case for old and new values here, for whatever reason), so a
>>>> simple permitted range of like 4...64 would seem fine to specify.
>>>> Whether the default then would want to be 8 (close to the current
>>>> 7) or higher (around the actually observed maximum) is a different
>>>> question.
>>>
>>> I'm in favor of a command line argument here - it would be much less
>>> trouble if a higher limit were suddenly necessary in the field. The
>>> default IMO should definitely be higher than 8 - I'd stick with 32, which
>>> to me should cover our real-world scenarios and leave some headroom for
>>> the future.
>>
>> Well, I'm concerned about the extra memory overhead. Every IRQ,
>> sharable or not, will get the extra slots allocated with the
>> current scheme. Perhaps a prereq change then would be to only
>> allocate multi-guest arrays for sharable IRQs, effectively
>> shrinking the overhead in particular for all MSI ones?
>
> That's one way to improve overall system scalability, but in that area
> there is certainly much bigger fish to fry elsewhere. With 32 elements in
> the array we get 200 bytes of overhead per structure; with 16 it's just 72
> extra bytes, which even in the unattainable worst-case scenario of every
> single vector being taken in a 512-CPU machine would only account for
> several MB of overhead.

I'm generally unhappy with this way of thinking, as it is what has been
leading to unnecessary growth of all sorts of software and its resource
needs. Yes, there surely are larger gains to be had elsewhere, but that's
imo still no excuse to grow memory allocations "blindly" despite it being
clear that in a fair share of cases a fair part of the allocated memory
won't be used. This said, ...

> I'd start with dynamic array allocation first and set the limit to 16,
> which should be enough for now. Then, if that default value needs to be
> raised, we can consider further improvements.

... I'm puzzled by this plan of yours, because unless I'm misunderstanding,
dynamic array allocation is effectively what I've been asking for. Now that
we have xmalloc_flex_struct(), this should even be relatively
straightforward, i.e. in particular with no need to open-code complex
expressions.

Jan
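A minimal sketch of the shape this could take, using the `xmalloc_flex_struct()` helper Jan mentions. The structure layout is abridged from irq.c, and `irq_max_guests` plus the per-IRQ sizing policy are assumptions carried over from the discussion above, not an actual follow-up patch:

```c
typedef struct {
    u8 nr_guests;
    u8 in_flight;
    u8 shareable;
    u8 ack_type;
    cpumask_var_t cpu_eoi_map;
    struct timer eoi_timer;
    struct domain *guest[];   /* was: struct domain *guest[IRQ_MAX_GUESTS] */
} irq_guest_action_t;

/*
 * At the allocation site (pirq_guest_bind() has a will_share parameter):
 * size the array per IRQ, so that non-sharable IRQs - notably all MSI
 * ones - carry only a single slot.
 */
action = xmalloc_flex_struct(irq_guest_action_t, guest,
                             will_share ? irq_max_guests : 1);
if ( action == NULL )
    return -ENOMEM;
```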
```diff
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 8d1f9a9..194f660 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1028,7 +1028,7 @@ int __init setup_irq(unsigned int irq, unsigned int irqflags,
  * HANDLING OF GUEST-BOUND PHYSICAL IRQS
  */
 
-#define IRQ_MAX_GUESTS 7
+#define IRQ_MAX_GUESTS 31
 typedef struct {
     u8 nr_guests;
     u8 in_flight;
```
Current limit of 7 is too restrictive for modern systems where one GSI
could be shared by potentially many PCI INTx sources, each of which
corresponds to a device passed through to its own guest. Some systems do not
apply due diligence in swizzling INTx links in case e.g. INTA is declared as
interrupt pin for the majority of PCI devices behind a single router,
resulting in overuse of a GSI.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
---
If people think that would make sense - I can rework the array to a list of
domain pointers to avoid the limit.
---
 xen/arch/x86/irq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)