Message ID: 20210319161956.2838291-2-boqun.feng@gmail.com
State:      New, archived
Series:     PCI: Introduce pci_ops::use_arch_sysdata
[+cc Arnd (author of 37d6a0a6f470 ("PCI: Add pci_register_host_bridge()
interface"), which I think would make my idea below possible), Marc
(IRQ domains maintainer)]

On Sat, Mar 20, 2021 at 12:19:55AM +0800, Boqun Feng wrote:
> Currently, if an architecture selects CONFIG_PCI_DOMAINS_GENERIC, the
> ->sysdata in bus and bridge will be treated as struct pci_config_window,
> which is created by generic ECAM using the data from ACPI.

It might be a mistake that we put the struct pci_config_window
pointer, which is really arch-independent, in the ->sysdata element,
which normally contains a pointer to arch- or host bridge-dependent
data.

> However, for a virtualized PCI bus, there might not be enough data in
> the OF or ACPI tables to create a pci_config_window. This is similar
> to the case where CONFIG_PCI_DOMAINS_GENERIC=n; IOW, architectures
> use their own structure for sysdata, so no ACPI table lookup is
> required.
>
> In order to enable Hyper-V's virtual PCI (which doesn't have an ACPI
> table entry for PCI) on ARM64 (which selects
> CONFIG_PCI_DOMAINS_GENERIC), we introduce arch-specific PCI sysdata
> (similar to the one for x86) for ARM64, and allow the core PCI code
> to detect the type of sysdata at runtime. The latter is achieved by
> adding a pci_ops::use_arch_sysdata field.
>
> Originally-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> Signed-off-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com>
> ---
>  arch/arm64/include/asm/pci.h | 29 +++++++++++++++++++++++++++++
>  arch/arm64/kernel/pci.c      | 15 ++++++++++++---
>  include/linux/pci.h          |  3 +++
>  3 files changed, 44 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
> index b33ca260e3c9..dade061a0658 100644
> --- a/arch/arm64/include/asm/pci.h
> +++ b/arch/arm64/include/asm/pci.h
> @@ -22,6 +22,16 @@
>
>  extern int isa_dma_bridge_buggy;
>
> +struct pci_sysdata {
> +        int domain;                     /* PCI domain */
> +        int node;                       /* NUMA Node */
> +#ifdef CONFIG_ACPI
> +        struct acpi_device *companion;  /* ACPI companion device */
> +#endif
> +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> +        void *fwnode;                   /* IRQ domain for MSI assignment */
> +#endif
> +};

Our PCI domain code is really a mess (mostly my fault) and I hate to
make it even more complicated by adding more switches, e.g.,
->use_arch_sysdata.

I think the design problem is that PCI host bridge drivers should
supply the PCI domain up front instead of having callbacks to extract
it.

We could put "int domain_nr" in struct pci_host_bridge, and the arch
code or host bridge driver (pcibios_init_hw(), *_pcie_probe(), VMD,
HV, etc) could fill in pci_host_bridge.domain_nr before calling
pci_scan_root_bus_bridge() or pci_host_probe().

Then maybe we could get rid of pci_bus_find_domain_nr() and some of
the needlessly arch-specific implementations of pci_domain_nr(). I
think we likely could get rid of CONFIG_PCI_DOMAINS_GENERIC, too,
eventually.

>  #ifdef CONFIG_PCI
>  static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
>  {
> @@ -31,8 +41,27 @@ static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
>
>  static inline int pci_proc_domain(struct pci_bus *bus)
>  {
> +        if (bus->ops->use_arch_sysdata)
> +                return pci_domain_nr(bus);
>          return 1;

I don't understand this. pci_proc_domain() returns a boolean and
determines whether the /proc/bus/pci/ directory contains, e.g.,

  /proc/bus/pci/00 or
  /proc/bus/pci/0000:00

On arm64, pci_proc_domain() currently always returns 1, so the
directory contains "0000:00".

After these patches, pci_proc_domain() returns 0 if
CONFIG_PCI_DOMAINS_GENERIC=y and "bus" is in domain 0, so buses in
domain 0 will be "00" instead of "0000:00". This doesn't make sense to
me, but at the very least, this user-visible change needs to be
explained.

>  }
>
> +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> +static inline void *_pci_root_bus_fwnode(struct pci_bus *bus)
> +{
> +        struct pci_sysdata *sd = bus->sysdata;
> +
> +        if (bus->ops->use_arch_sysdata)
> +                return sd->fwnode;
> +
> +        /*
> +         * bus->sysdata is not struct pci_sysdata, fwnode should be able to
> +         * be queried from of/acpi.
> +         */
> +        return NULL;
> +}
> +#define pci_root_bus_fwnode        _pci_root_bus_fwnode

Ugh. pci_root_bus_fwnode() is another callback to find the irq_domain.
Only one call, from pci_host_bridge_msi_domain(), which itself is only
called from pci_set_bus_msi_domain(). This feels like another case
where we could simplify things by having the host bridge driver figure
out the irq_domain explicitly when it creates the pci_host_bridge. It
seems like that's where we have the most information about how to find
the irq_domain.

> +#endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
> +
>  #endif  /* CONFIG_PCI */
>
>  #endif  /* __ASM_PCI_H */
>
> diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
> index 1006ed2d7c60..63d420d57e63 100644
> --- a/arch/arm64/kernel/pci.c
> +++ b/arch/arm64/kernel/pci.c
> @@ -74,15 +74,24 @@ struct acpi_pci_generic_root_info {
>  int acpi_pci_bus_find_domain_nr(struct pci_bus *bus)
>  {
>          struct pci_config_window *cfg = bus->sysdata;
> -        struct acpi_device *adev = to_acpi_device(cfg->parent);
> -        struct acpi_pci_root *root = acpi_driver_data(adev);
> +        struct pci_sysdata *sd = bus->sysdata;
> +        struct acpi_device *adev;
> +        struct acpi_pci_root *root;
> +
> +        /* struct pci_sysdata has domain nr in it */
> +        if (bus->ops->use_arch_sysdata)
> +                return sd->domain;
> +
> +        /* or pci_config_window is used as sysdata */
> +        adev = to_acpi_device(cfg->parent);
> +        root = acpi_driver_data(adev);

My comments above are a lot of hand-waving without a very clear way
forward. Would it simplify things to just add a "struct
pci_config_window *ecam_info" to pci_host_bridge, so we wouldn't have
to overload sysdata?

>          return root->segment;
>  }
>
>  int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
>  {
> -        if (!acpi_disabled) {
> +        if (!acpi_disabled && bridge->ops->use_arch_sysdata) {
>                  struct pci_config_window *cfg = bridge->bus->sysdata;
>                  struct acpi_device *adev = to_acpi_device(cfg->parent);
>                  struct device *bus_dev = &bridge->bus->dev;
>
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 86c799c97b77..4036aac40361 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -740,6 +740,9 @@ struct pci_ops {
>          void __iomem *(*map_bus)(struct pci_bus *bus, unsigned int devfn, int where);
>          int (*read)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *val);
>          int (*write)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 val);
> +#ifdef CONFIG_PCI_DOMAINS_GENERIC
> +        int use_arch_sysdata;        /* ->sysdata is arch-specific */
> +#endif
>  };
>
>  /*
> --
> 2.30.2
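[Illustrative sketch of the direction Bjorn describes above; not code
from any posted patch. The "domain_nr" field and the example driver
are hypothetical, while devm_pci_alloc_host_bridge(),
of_get_pci_domain_nr() and pci_host_probe() are existing interfaces.]

static int example_pcie_probe(struct platform_device *pdev)
{
        struct pci_host_bridge *bridge;

        bridge = devm_pci_alloc_host_bridge(&pdev->dev, 0);
        if (!bridge)
                return -ENOMEM;

        /*
         * Hypothetical: the driver (or arch code) supplies the PCI
         * domain up front in a new pci_host_bridge.domain_nr field,
         * instead of the core calling back into arch code for it.
         */
        bridge->domain_nr = of_get_pci_domain_nr(pdev->dev.of_node);

        return pci_host_probe(bridge);
}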
On Fri, Mar 19, 2021 at 5:22 PM Boqun Feng <boqun.feng@gmail.com> wrote:
>
> Currently, if an architecture selects CONFIG_PCI_DOMAINS_GENERIC, the
> ->sysdata in bus and bridge will be treated as struct pci_config_window,
> which is created by generic ECAM using the data from ACPI.
>
> However, for a virtualized PCI bus, there might not be enough data in
> the OF or ACPI tables to create a pci_config_window. This is similar
> to the case where CONFIG_PCI_DOMAINS_GENERIC=n; IOW, architectures
> use their own structure for sysdata, so no ACPI table lookup is
> required.
>
> In order to enable Hyper-V's virtual PCI (which doesn't have an ACPI
> table entry for PCI) on ARM64 (which selects
> CONFIG_PCI_DOMAINS_GENERIC), we introduce arch-specific PCI sysdata
> (similar to the one for x86) for ARM64, and allow the core PCI code
> to detect the type of sysdata at runtime. The latter is achieved by
> adding a pci_ops::use_arch_sysdata field.
>
> Originally-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> Signed-off-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com>

I think this takes it in the opposite direction of where it should be
going.

> ---
>  arch/arm64/include/asm/pci.h | 29 +++++++++++++++++++++++++++++
>  arch/arm64/kernel/pci.c      | 15 ++++++++++++---
>  include/linux/pci.h          |  3 +++
>  3 files changed, 44 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
> index b33ca260e3c9..dade061a0658 100644
> --- a/arch/arm64/include/asm/pci.h
> +++ b/arch/arm64/include/asm/pci.h
> @@ -22,6 +22,16 @@
>
>  extern int isa_dma_bridge_buggy;
>
> +struct pci_sysdata {
> +        int domain;                     /* PCI domain */
> +        int node;                       /* NUMA Node */
> +#ifdef CONFIG_ACPI
> +        struct acpi_device *companion;  /* ACPI companion device */
> +#endif
> +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> +        void *fwnode;                   /* IRQ domain for MSI assignment */
> +#endif
> +};

I think none of these members belong in sysdata or in architecture-
specific code. The fact that a pci_host_bridge belongs to a particular
NUMA node or is associated with a firmware description is neither
specific to a host bridge implementation nor to a CPU instruction set!

Moreover, you cannot assume that all PCI host bridges on any given
architecture can share the pci_sysdata pointer; it is purely specific
to the bridge driver.

A good start would be to move the members (one at a time) into struct
pci_host_bridge and out of the sysdata of individual host bridge
drivers.

> diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
> index 1006ed2d7c60..63d420d57e63 100644
> --- a/arch/arm64/kernel/pci.c
> +++ b/arch/arm64/kernel/pci.c
> @@ -74,15 +74,24 @@ struct acpi_pci_generic_root_info {
>  int acpi_pci_bus_find_domain_nr(struct pci_bus *bus)
>  {
>          struct pci_config_window *cfg = bus->sysdata;
> -        struct acpi_device *adev = to_acpi_device(cfg->parent);
> -        struct acpi_pci_root *root = acpi_driver_data(adev);
> +        struct pci_sysdata *sd = bus->sysdata;
> +        struct acpi_device *adev;
> +        struct acpi_pci_root *root;

There should be no reason to add even more code to this file; it
should in fact become empty as it gets generalized more.

        Arnd
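[Sketch of where Arnd's suggestion leads: once a member such as the
domain number lives in struct pci_host_bridge, the per-arch
pci_domain_nr() variants could collapse into one generic accessor.
The domain_nr field is an assumption, not an existing member;
pci_find_host_bridge() is an existing helper.]

static inline int pci_domain_nr(struct pci_bus *bus)
{
        struct pci_host_bridge *bridge = pci_find_host_bridge(bus);

        /* No sysdata cast and no arch hook needed any more. */
        return bridge->domain_nr;
}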
Thanks Bjorn for looping me in.

On Fri, 19 Mar 2021 21:12:46 +0000,
Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc Arnd (author of 37d6a0a6f470 ("PCI: Add
> pci_register_host_bridge() interface"), which I think would make my
> idea below possible), Marc (IRQ domains maintainer)]
>
> On Sat, Mar 20, 2021 at 12:19:55AM +0800, Boqun Feng wrote:
> > Currently, if an architecture selects CONFIG_PCI_DOMAINS_GENERIC,
> > the ->sysdata in bus and bridge will be treated as struct
> > pci_config_window, which is created by generic ECAM using the data
> > from ACPI.
>
> It might be a mistake that we put the struct pci_config_window
> pointer, which is really arch-independent, in the ->sysdata element,
> which normally contains a pointer to arch- or host bridge-dependent
> data.
>
> > However, for a virtualized PCI bus, there might not be enough data
> > in the OF or ACPI tables to create a pci_config_window. This is
> > similar to the case where CONFIG_PCI_DOMAINS_GENERIC=n; IOW,
> > architectures use their own structure for sysdata, so no ACPI table
> > lookup is required.
> >
> > In order to enable Hyper-V's virtual PCI (which doesn't have an
> > ACPI table entry for PCI) on ARM64 (which selects
> > CONFIG_PCI_DOMAINS_GENERIC), we introduce arch-specific PCI sysdata
> > (similar to the one for x86) for ARM64, and allow the core PCI code
> > to detect the type of sysdata at runtime. The latter is achieved by
> > adding a pci_ops::use_arch_sysdata field.
> >
> > Originally-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> > Signed-off-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com>
> > ---
> >  arch/arm64/include/asm/pci.h | 29 +++++++++++++++++++++++++++++
> >  arch/arm64/kernel/pci.c      | 15 ++++++++++++---
> >  include/linux/pci.h          |  3 +++
> >  3 files changed, 44 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
> > index b33ca260e3c9..dade061a0658 100644
> > --- a/arch/arm64/include/asm/pci.h
> > +++ b/arch/arm64/include/asm/pci.h
> > @@ -22,6 +22,16 @@
> >
> >  extern int isa_dma_bridge_buggy;
> >
> > +struct pci_sysdata {
> > +        int domain;                     /* PCI domain */
> > +        int node;                       /* NUMA Node */
> > +#ifdef CONFIG_ACPI
> > +        struct acpi_device *companion;  /* ACPI companion device */
> > +#endif
> > +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> > +        void *fwnode;                   /* IRQ domain for MSI assignment */

Why isn't this more strongly typed? pci_host_bridge_msi_domain()
definitely expects this to be the real thing. And the comment is
wrong.

[...]

> > +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> > +static inline void *_pci_root_bus_fwnode(struct pci_bus *bus)
> > +{
> > +        struct pci_sysdata *sd = bus->sysdata;
> > +
> > +        if (bus->ops->use_arch_sysdata)
> > +                return sd->fwnode;
> > +
> > +        /*
> > +         * bus->sysdata is not struct pci_sysdata, fwnode should be able to
> > +         * be queried from of/acpi.
> > +         */
> > +        return NULL;
> > +}
> > +#define pci_root_bus_fwnode        _pci_root_bus_fwnode
>
> Ugh. pci_root_bus_fwnode() is another callback to find the
> irq_domain. Only one call, from pci_host_bridge_msi_domain(), which
> itself is only called from pci_set_bus_msi_domain(). This feels like
> another case where we could simplify things by having the host bridge
> driver figure out the irq_domain explicitly when it creates the
> pci_host_bridge. It seems like that's where we have the most
> information about how to find the irq_domain.

Urgh. This is a perfect copy paste of the x86 horror, warts and all. I
can't say I'm thrilled (another way to say "Gawd, Noes! Never!").

One thing I am sure of is that I do not want to add more custom
indirection to build the MSI topology. We barely got rid of the
msi_controller structure, and this is the same thing by another name.
Probably worse, actually.

In this case, I don't see the point in going via a fwnode indirection
given that there are no firmware tables in the first place.

As for finding the irq domain from the host bridge, that's not doable
in most cases on arm64, as it is pretty likely that the host bridge
knows nothing about MSIs when they are implemented in the GIC (see my
recent msi_controller removal series that has a few patches about
that).

Having an optional callback to host bridges to obtain the MSI domain
may be possible in some cases though (there might be a chicken/egg
problem for some drivers though...).

Thanks,

        M.
On Fri, Mar 19, 2021 at 10:14 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > However, for a virtualized PCI bus, there might not be enough data
> > in the OF or ACPI tables to create a pci_config_window. This is
> > similar to the case where CONFIG_PCI_DOMAINS_GENERIC=n; IOW,
> > architectures use their own structure for sysdata, so no ACPI table
> > lookup is required.
> >
> > In order to enable Hyper-V's virtual PCI (which doesn't have an
> > ACPI table entry for PCI) on ARM64 (which selects
> > CONFIG_PCI_DOMAINS_GENERIC), we introduce arch-specific PCI sysdata
> > (similar to the one for x86) for ARM64, and allow the core PCI code
> > to detect the type of sysdata at runtime. The latter is achieved by
> > adding a pci_ops::use_arch_sysdata field.
> >
> > Originally-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> > Signed-off-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com>
> > ---
> >  arch/arm64/include/asm/pci.h | 29 +++++++++++++++++++++++++++++
> >  arch/arm64/kernel/pci.c      | 15 ++++++++++++---
> >  include/linux/pci.h          |  3 +++
> >  3 files changed, 44 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
> > index b33ca260e3c9..dade061a0658 100644
> > --- a/arch/arm64/include/asm/pci.h
> > +++ b/arch/arm64/include/asm/pci.h
> > @@ -22,6 +22,16 @@
> >
> >  extern int isa_dma_bridge_buggy;
> >
> > +struct pci_sysdata {
> > +        int domain;                     /* PCI domain */
> > +        int node;                       /* NUMA Node */
> > +#ifdef CONFIG_ACPI
> > +        struct acpi_device *companion;  /* ACPI companion device */
> > +#endif
> > +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> > +        void *fwnode;                   /* IRQ domain for MSI assignment */
> > +#endif
> > +};
>
> Our PCI domain code is really a mess (mostly my fault) and I hate to
> make it even more complicated by adding more switches, e.g.,
> ->use_arch_sysdata.
>
> I think the design problem is that PCI host bridge drivers should
> supply the PCI domain up front instead of having callbacks to extract
> it.
>
> We could put "int domain_nr" in struct pci_host_bridge, and the arch
> code or host bridge driver (pcibios_init_hw(), *_pcie_probe(), VMD,
> HV, etc) could fill in pci_host_bridge.domain_nr before calling
> pci_scan_root_bus_bridge() or pci_host_probe().
>
> Then maybe we could get rid of pci_bus_find_domain_nr() and some of
> the needlessly arch-specific implementations of pci_domain_nr(). I
> think we likely could get rid of CONFIG_PCI_DOMAINS_GENERIC, too,
> eventually.

Agreed. I actually still have a (not really tested) patch series to
clean up the pci host bridge registration, and this should make this a
lot easier to add on top.

I should dig that out of my backlog and post for review.

        Arnd
On Sat, Mar 20, 2021 at 1:54 PM Marc Zyngier <maz@kernel.org> wrote:
> On Fri, 19 Mar 2021 21:12:46 +0000,
> >
> > Ugh. pci_root_bus_fwnode() is another callback to find the
> > irq_domain. Only one call, from pci_host_bridge_msi_domain(), which
> > itself is only called from pci_set_bus_msi_domain(). This feels
> > like another case where we could simplify things by having the host
> > bridge driver figure out the irq_domain explicitly when it creates
> > the pci_host_bridge. It seems like that's where we have the most
> > information about how to find the irq_domain.
>
> Urgh. This is a perfect copy paste of the x86 horror, warts and all.
> I can't say I'm thrilled (another way to say "Gawd, Noes! Never!").
>
> One thing I am sure of is that I do not want to add more custom
> indirection to build the MSI topology. We barely got rid of the
> msi_controller structure, and this is the same thing by another name.
> Probably worse, actually.
>
> In this case, I don't see the point in going via a fwnode indirection
> given that there are no firmware tables in the first place.
>
> As for finding the irq domain from the host bridge, that's not doable
> in most cases on arm64, as it is pretty likely that the host bridge
> knows nothing about MSIs when they are implemented in the GIC (see my
> recent msi_controller removal series that has a few patches about
> that).
>
> Having an optional callback to host bridges to obtain the MSI domain
> may be possible in some cases though (there might be a chicken/egg
> problem for some drivers though...).

I would expect that the host bridge driver can find the MSI domain at
probe time and just add a pointer into the pci_host_bridge structure.

        Arnd
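[Sketch of Arnd's suggestion here, assuming a hypothetical msi_domain
pointer in struct pci_host_bridge that the bridge driver fills in at
probe time; the core would consult it before falling back to the
firmware description. pci_host_bridge_of_msi_domain() is the existing
OF-based fallback in drivers/pci/probe.c.]

static struct irq_domain *pci_host_bridge_msi_domain(struct pci_bus *bus)
{
        struct pci_host_bridge *bridge = pci_find_host_bridge(bus);

        /* Hypothetical: set by the bridge driver before probing. */
        if (bridge->msi_domain)
                return bridge->msi_domain;

        /* Otherwise derive the domain from the firmware description. */
        return pci_host_bridge_of_msi_domain(bus);
}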
On Sat, 20 Mar 2021 13:03:13 +0000,
Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Sat, Mar 20, 2021 at 1:54 PM Marc Zyngier <maz@kernel.org> wrote:
> > On Fri, 19 Mar 2021 21:12:46 +0000,
> > >
> > > Ugh. pci_root_bus_fwnode() is another callback to find the
> > > irq_domain. Only one call, from pci_host_bridge_msi_domain(),
> > > which itself is only called from pci_set_bus_msi_domain(). This
> > > feels like another case where we could simplify things by having
> > > the host bridge driver figure out the irq_domain explicitly when
> > > it creates the pci_host_bridge. It seems like that's where we
> > > have the most information about how to find the irq_domain.
> >
> > Urgh. This is a perfect copy paste of the x86 horror, warts and
> > all. I can't say I'm thrilled (another way to say "Gawd, Noes!
> > Never!").
> >
> > One thing I am sure of is that I do not want to add more custom
> > indirection to build the MSI topology. We barely got rid of the
> > msi_controller structure, and this is the same thing by another
> > name. Probably worse, actually.
> >
> > In this case, I don't see the point in going via a fwnode
> > indirection given that there are no firmware tables in the first
> > place.
> >
> > As for finding the irq domain from the host bridge, that's not
> > doable in most cases on arm64, as it is pretty likely that the host
> > bridge knows nothing about MSIs when they are implemented in the
> > GIC (see my recent msi_controller removal series that has a few
> > patches about that).
> >
> > Having an optional callback to host bridges to obtain the MSI
> > domain may be possible in some cases though (there might be a
> > chicken/egg problem for some drivers though...).
>
> I would expect that the host bridge driver can find the MSI domain
> at probe time and just add a pointer into the pci_host_bridge
> structure.

In most cases, it doesn't implement it itself, and I'd be reluctant to
duplicate information that can already be retrieved from somewhere
else in a generic way (i.e., not PCI-specific).

        M.
On Sat, Mar 20, 2021 at 2:23 PM Marc Zyngier <maz@kernel.org> wrote:
>
> On Sat, 20 Mar 2021 13:03:13 +0000,
> Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > On Sat, Mar 20, 2021 at 1:54 PM Marc Zyngier <maz@kernel.org> wrote:
> > > On Fri, 19 Mar 2021 21:12:46 +0000,
> > >
> > > Having an optional callback to host bridges to obtain the MSI
> > > domain may be possible in some cases though (there might be a
> > > chicken/egg problem for some drivers though...).
> >
> > I would expect that the host bridge driver can find the MSI domain
> > at probe time and just add a pointer into the pci_host_bridge
> > structure.
>
> In most cases, it doesn't implement it itself, and I'd be reluctant
> to duplicate information that can already be retrieved from somewhere
> else in a generic way (i.e., not PCI-specific).

At the moment, the information is retrieved through a maze of
different functions, and already duplicated in both the
pci_host_bridge and the pci_bus structures. If we can change
everything to use CONFIG_GENERIC_MSI_IRQ_DOMAIN, then most of that
code can probably just go away, leaving only the part in the phb.

        Arnd
On Sat, Mar 20, 2021 at 1:54 PM Arnd Bergmann <arnd@arndb.de> wrote:
> I actually still have a (not really tested) patch series to clean up
> the pci host bridge registration, and this should make this a lot
> easier to add on top.
>
> I should dig that out of my backlog and post for review.

I've uploaded my series to

  https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git
  pci-probe-rework-20210320

The purpose of this series is mostly to simplify what variations of
host probe methods exist, towards using pci_host_probe() as the only
method. It does provide some simplifications based on that, including
a way to universally have access to the pci_host_bridge pointer during
the probe function.

        Arnd
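[For reference, the probe pattern such a rework converges on looks
roughly like this; a sketch using the existing pci_host_probe() API,
not code taken from the branch. "example_pcie" and its ops are made
up.]

struct example_pcie {
        void __iomem *cfg_base;         /* driver-private state */
};

static struct pci_ops example_pcie_ops; /* config accessors elided */

static int example_pcie_probe(struct platform_device *pdev)
{
        struct pci_host_bridge *bridge;
        struct example_pcie *pcie;

        /* Private data is allocated together with the bridge... */
        bridge = devm_pci_alloc_host_bridge(&pdev->dev, sizeof(*pcie));
        if (!bridge)
                return -ENOMEM;

        /* ...so the bridge pointer is at hand throughout probe. */
        pcie = pci_host_bridge_priv(bridge);

        bridge->sysdata = pcie;
        bridge->ops = &example_pcie_ops;

        return pci_host_probe(bridge);
}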
On Sat, 20 Mar 2021 14:24:06 +0000,
Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Sat, Mar 20, 2021 at 2:23 PM Marc Zyngier <maz@kernel.org> wrote:
> >
> > On Sat, 20 Mar 2021 13:03:13 +0000,
> > Arnd Bergmann <arnd@arndb.de> wrote:
> > >
> > > On Sat, Mar 20, 2021 at 1:54 PM Marc Zyngier <maz@kernel.org> wrote:
> > > > On Fri, 19 Mar 2021 21:12:46 +0000,
> > > >
> > > > Having an optional callback to host bridges to obtain the MSI
> > > > domain may be possible in some cases though (there might be a
> > > > chicken/egg problem for some drivers though...).
> > >
> > > I would expect that the host bridge driver can find the MSI
> > > domain at probe time and just add a pointer into the
> > > pci_host_bridge structure.
> >
> > In most cases, it doesn't implement it itself, and I'd be reluctant
> > to duplicate information that can already be retrieved from
> > somewhere else in a generic way (i.e., not PCI-specific).
>
> At the moment, the information is retrieved through a maze of
> different functions, and already duplicated in both the
> pci_host_bridge and the pci_bus structures. If we can change
> everything to use CONFIG_GENERIC_MSI_IRQ_DOMAIN, then most of that
> code can probably just go away, leaving only the part in the phb.

Fine by me, as long as you don't assume that there is a single MSI
domain per PHB (both OF and IORT mandate that you can segment the RID
space to hit multiple controllers).

        M.
Hi Arnd,

On Sat, Mar 20, 2021 at 05:09:10PM +0100, Arnd Bergmann wrote:
> On Sat, Mar 20, 2021 at 1:54 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > I actually still have a (not really tested) patch series to clean
> > up the pci host bridge registration, and this should make this a
> > lot easier to add on top.
> >
> > I should dig that out of my backlog and post for review.
>
> I've uploaded my series to
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git
>   pci-probe-rework-20210320
>
> The purpose of this series is mostly to simplify what variations of
> host probe methods exist, towards using pci_host_probe() as the only
> method. It does provide some simplifications based on that, including
> a way to universally have access to the pci_host_bridge pointer
> during the probe function.

Thanks for the suggestion and code. I spent some time catching up.
Yes, Bjorn and you are correct: the better way is having a 'domain_nr'
in the 'pci_host_bridge' and making sure every driver fills that
correctly before probe. I definitely will use this approach.

However, I may start small: I plan to introduce 'domain_nr' and only
fill the field at probe time for PCI_DOMAINS_GENERIC=y archs, and
leave other archs and drivers alone. (Honestly, I was shocked by the
number of pci_scan_root_bus_bridge() and pci_host_probe() call sites
that I would need to adjust if I really wanted to unify the
'domain_nr' handling for every arch and driver ;-)) This will fulfil
my requirement for the Hyper-V PCI controller on ARM64. And later on,
we can switch each arch to this approach one by one and keep the rest
still working.

Thoughts?

Regards,
Boqun

> Arnd
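[Sketch of the incremental step Boqun outlines, assuming a
hypothetical domain_nr field in struct pci_host_bridge initialized to
a "not set" sentinel, so existing PCI_DOMAINS_GENERIC users keep their
current OF/ACPI lookup. The sentinel name is an assumption;
of_pci_bus_find_domain_nr() and acpi_pci_bus_find_domain_nr() are the
existing lookups.]

#define PCI_DOMAIN_NR_NOT_SET   (-1)    /* hypothetical sentinel */

static int pci_bus_find_domain_nr(struct pci_bus *bus, struct device *parent)
{
        struct pci_host_bridge *bridge = to_pci_host_bridge(bus->bridge);

        /* Drivers such as Hyper-V vPCI fill this in before probing. */
        if (bridge->domain_nr != PCI_DOMAIN_NR_NOT_SET)
                return bridge->domain_nr;

        /* Everyone else keeps the existing generic OF/ACPI lookup. */
        return acpi_disabled ? of_pci_bus_find_domain_nr(parent) :
                               acpi_pci_bus_find_domain_nr(bus);
}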
On Mon, Mar 29, 2021 at 4:32 PM Boqun Feng <boqun.feng@gmail.com> wrote:
>
> Hi Arnd,
>
> On Sat, Mar 20, 2021 at 05:09:10PM +0100, Arnd Bergmann wrote:
> > On Sat, Mar 20, 2021 at 1:54 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > > I actually still have a (not really tested) patch series to clean
> > > up the pci host bridge registration, and this should make this a
> > > lot easier to add on top.
> > >
> > > I should dig that out of my backlog and post for review.
> >
> > I've uploaded my series to
> >
> >   https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git
> >   pci-probe-rework-20210320
> >
> > The purpose of this series is mostly to simplify what variations of
> > host probe methods exist, towards using pci_host_probe() as the
> > only method. It does provide some simplifications based on that,
> > including a way to universally have access to the pci_host_bridge
> > pointer during the probe function.
>
> Thanks for the suggestion and code. I spent some time catching up.
> Yes, Bjorn and you are correct: the better way is having a
> 'domain_nr' in the 'pci_host_bridge' and making sure every driver
> fills that correctly before probe. I definitely will use this
> approach.
>
> However, I may start small: I plan to introduce 'domain_nr' and only
> fill the field at probe time for PCI_DOMAINS_GENERIC=y archs, and
> leave other archs and drivers alone. (Honestly, I was shocked by the
> number of pci_scan_root_bus_bridge() and pci_host_probe() call sites
> that I would need to adjust if I really wanted to unify the
> 'domain_nr' handling for every arch and driver ;-)) This will fulfil
> my requirement for the Hyper-V PCI controller on ARM64. And later on,
> we can switch each arch to this approach one by one and keep the rest
> still working.
>
> Thoughts?

That sounds reasonable to me, yes.

I would also suggest you look at my hyperv patch from the branch I
mentioned [1] and try to integrate that first. I suspect this makes it
easier to do the domain rework and possibly other changes on top.

        Arnd

[1] https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git/commit/?h=pci-probe-rework-20210320&id=44db8df9d729d
diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
index b33ca260e3c9..dade061a0658 100644
--- a/arch/arm64/include/asm/pci.h
+++ b/arch/arm64/include/asm/pci.h
@@ -22,6 +22,16 @@

 extern int isa_dma_bridge_buggy;

+struct pci_sysdata {
+        int domain;                     /* PCI domain */
+        int node;                       /* NUMA Node */
+#ifdef CONFIG_ACPI
+        struct acpi_device *companion;  /* ACPI companion device */
+#endif
+#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
+        void *fwnode;                   /* IRQ domain for MSI assignment */
+#endif
+};
 #ifdef CONFIG_PCI
 static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
 {
@@ -31,8 +41,27 @@ static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)

 static inline int pci_proc_domain(struct pci_bus *bus)
 {
+        if (bus->ops->use_arch_sysdata)
+                return pci_domain_nr(bus);
         return 1;
 }

+#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
+static inline void *_pci_root_bus_fwnode(struct pci_bus *bus)
+{
+        struct pci_sysdata *sd = bus->sysdata;
+
+        if (bus->ops->use_arch_sysdata)
+                return sd->fwnode;
+
+        /*
+         * bus->sysdata is not struct pci_sysdata, fwnode should be able to
+         * be queried from of/acpi.
+         */
+        return NULL;
+}
+#define pci_root_bus_fwnode        _pci_root_bus_fwnode
+#endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
+
 #endif  /* CONFIG_PCI */

 #endif  /* __ASM_PCI_H */
diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
index 1006ed2d7c60..63d420d57e63 100644
--- a/arch/arm64/kernel/pci.c
+++ b/arch/arm64/kernel/pci.c
@@ -74,15 +74,24 @@ struct acpi_pci_generic_root_info {
 int acpi_pci_bus_find_domain_nr(struct pci_bus *bus)
 {
         struct pci_config_window *cfg = bus->sysdata;
-        struct acpi_device *adev = to_acpi_device(cfg->parent);
-        struct acpi_pci_root *root = acpi_driver_data(adev);
+        struct pci_sysdata *sd = bus->sysdata;
+        struct acpi_device *adev;
+        struct acpi_pci_root *root;
+
+        /* struct pci_sysdata has domain nr in it */
+        if (bus->ops->use_arch_sysdata)
+                return sd->domain;
+
+        /* or pci_config_window is used as sysdata */
+        adev = to_acpi_device(cfg->parent);
+        root = acpi_driver_data(adev);

         return root->segment;
 }

 int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
 {
-        if (!acpi_disabled) {
+        if (!acpi_disabled && bridge->ops->use_arch_sysdata) {
                 struct pci_config_window *cfg = bridge->bus->sysdata;
                 struct acpi_device *adev = to_acpi_device(cfg->parent);
                 struct device *bus_dev = &bridge->bus->dev;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 86c799c97b77..4036aac40361 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -740,6 +740,9 @@ struct pci_ops {
         void __iomem *(*map_bus)(struct pci_bus *bus, unsigned int devfn, int where);
         int (*read)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *val);
         int (*write)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 val);
+#ifdef CONFIG_PCI_DOMAINS_GENERIC
+        int use_arch_sysdata;        /* ->sysdata is arch-specific */
+#endif
 };

 /*
Currently, if an architecture selects CONFIG_PCI_DOMAINS_GENERIC, the
->sysdata in bus and bridge will be treated as struct pci_config_window,
which is created by generic ECAM using the data from ACPI.

However, for a virtualized PCI bus, there might not be enough data in
the OF or ACPI tables to create a pci_config_window. This is similar
to the case where CONFIG_PCI_DOMAINS_GENERIC=n; IOW, architectures use
their own structure for sysdata, so no ACPI table lookup is required.

In order to enable Hyper-V's virtual PCI (which doesn't have an ACPI
table entry for PCI) on ARM64 (which selects
CONFIG_PCI_DOMAINS_GENERIC), we introduce arch-specific PCI sysdata
(similar to the one for x86) for ARM64, and allow the core PCI code to
detect the type of sysdata at runtime. The latter is achieved by
adding a pci_ops::use_arch_sysdata field.

Originally-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com>
---
 arch/arm64/include/asm/pci.h | 29 +++++++++++++++++++++++++++++
 arch/arm64/kernel/pci.c      | 15 ++++++++++++---
 include/linux/pci.h          |  3 +++
 3 files changed, 44 insertions(+), 3 deletions(-)