Message ID | 20180918235702.26573-2-keith.busch@intel.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
Series | PCI error handling | expand |
On Tue, Sep 18, 2018 at 05:56:53PM -0600, Keith Busch wrote: > The PCI port driver saves the PCI state after initializing all the > service devices. This was, however, before the service drivers were even > registered. The config space state that the service drivers were setting > up were not being saved. > > This patch fixes this by changing the service drivers use the > subsys_init, which gets the service drivers registered after the pci bus > system is initialized, but before the pci devices are probed. This gets > the state saved as expected. I agree this is a problem. What are the user-visible symptoms of it? Incorrect service behavior after a resume? Nice debugging and fix! I think the ordering here is pretty obscure. We have a lot of initcalls here and they all have to line up exactly right. If I understand correctly, the flow of the required pieces (after this patch) is like this: pci_driver_init # postcore_initcall (2) bus_register(&pcie_port_bus_type) pcied_init (pciehp) # subsys_initcall (4) pcie_port_service_register(&hpdriver_portdrv) new->driver.bus = &pcie_port_bus_type # depends on above # bus_register() driver_register(&new->driver) pcie_portdrv_init # device_initcall (6) pci_register_driver(&pcie_portdriver) pcie_portdrv_probe # pcie_portdriver.probe pcie_port_device_register pcie_device_init device_register device_add bus_probe_device ... pciehp_probe # <-- critical init # depends on above # service_register() and # eager probing pci_save_state # <-- critical save The problem used to be that both pcied_init() (for pciehp) and pcie_portdrv_init() were device initcalls and pcie_portdrv_init() was called before pcied_init() because of link order, so the pci_save_state() happened before the pciehp init. Since none of the service drivers can be modules, I don't think it buys us much to make their init functions initcalls. Can we explicitly call them from the pcie_portdrv_probe() path? > Note, 70626d88385 ("PCI: pciehp: Make explicitly non-modular") already > mentioned this exact change should be done, but declined to at the time > until there was a real need for it. > > Signed-off-by: Keith Busch <keith.busch@intel.com> > --- > drivers/pci/hotplug/pciehp_core.c | 2 +- > drivers/pci/pcie/aer.c | 2 +- > drivers/pci/pcie/dpc.c | 2 +- > drivers/pci/pcie/pme.c | 2 +- > 4 files changed, 4 insertions(+), 4 deletions(-) > > diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c > index ccaf01e6eced..334044814dbe 100644 > --- a/drivers/pci/hotplug/pciehp_core.c > +++ b/drivers/pci/hotplug/pciehp_core.c > @@ -322,4 +322,4 @@ static int __init pcied_init(void) > > return retval; > } > -device_initcall(pcied_init); > +subsys_initcall(pcied_init); > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c > index 83180edd6ed4..1d2159409b01 100644 > --- a/drivers/pci/pcie/aer.c > +++ b/drivers/pci/pcie/aer.c > @@ -1575,4 +1575,4 @@ static int __init aer_service_init(void) > return -ENXIO; > return pcie_port_service_register(&aerdriver); > } > -device_initcall(aer_service_init); > +subsys_initcall(aer_service_init); > diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c > index f03279fc87cd..aacfb50eccfc 100644 > --- a/drivers/pci/pcie/dpc.c > +++ b/drivers/pci/pcie/dpc.c > @@ -286,4 +286,4 @@ static int __init dpc_service_init(void) > { > return pcie_port_service_register(&dpcdriver); > } > -device_initcall(dpc_service_init); > +subsys_initcall(dpc_service_init); > diff --git a/drivers/pci/pcie/pme.c b/drivers/pci/pcie/pme.c > index 3ed67676ea2a..cd8c1adb9b0a 100644 > --- a/drivers/pci/pcie/pme.c > +++ b/drivers/pci/pcie/pme.c > @@ -450,4 +450,4 @@ static int __init pcie_pme_service_init(void) > { > return pcie_port_service_register(&pcie_pme_driver); > } > -device_initcall(pcie_pme_service_init); > +subsys_initcall(pcie_pme_service_init); > -- > 2.14.4 >
On Wed, Sep 19, 2018 at 11:28:46AM -0500, Bjorn Helgaas wrote: > On Tue, Sep 18, 2018 at 05:56:53PM -0600, Keith Busch wrote: > > The PCI port driver saves the PCI state after initializing all the > > service devices. This was, however, before the service drivers were even > > registered. The config space state that the service drivers were setting > > up were not being saved. > > > > This patch fixes this by changing the service drivers use the > > subsys_init, which gets the service drivers registered after the pci bus > > system is initialized, but before the pci devices are probed. This gets > > the state saved as expected. > > I agree this is a problem. What are the user-visible symptoms of it? > Incorrect service behavior after a resume? Nice debugging and fix! I'll look at the suspend/resume case too, but I noticed the incorrect behavior after a bus reset: future DPC or HPC events downstream a port were lost after an AER recovery because the control registers were never restored. The very next patch is required too, since that's what actually restores the registers to the saved state. > I think the ordering here is pretty obscure. We have a lot of > initcalls here and they all have to line up exactly right. If I > understand correctly, the flow of the required pieces (after this > patch) is like this: > > pci_driver_init # postcore_initcall (2) > bus_register(&pcie_port_bus_type) > > pcied_init (pciehp) # subsys_initcall (4) > pcie_port_service_register(&hpdriver_portdrv) > new->driver.bus = &pcie_port_bus_type # depends on above > # bus_register() > driver_register(&new->driver) > > pcie_portdrv_init # device_initcall (6) > pci_register_driver(&pcie_portdriver) > > pcie_portdrv_probe # pcie_portdriver.probe > pcie_port_device_register > pcie_device_init > device_register > device_add > bus_probe_device > ... > pciehp_probe # <-- critical init > # depends on above > # service_register() and > # eager probing > pci_save_state # <-- critical save > > The problem used to be that both pcied_init() (for pciehp) and > pcie_portdrv_init() were device initcalls and pcie_portdrv_init() was > called before pcied_init() because of link order, so the > pci_save_state() happened before the pciehp init. > > Since none of the service drivers can be modules, I don't think it > buys us much to make their init functions initcalls. Can we > explicitly call them from the pcie_portdrv_probe() path? That sounds good to me. The portdrv isn't all that abstracted from the child services anyway.
On Wed, Sep 19, 2018 at 11:28:46AM -0500, Bjorn Helgaas wrote: > On Tue, Sep 18, 2018 at 05:56:53PM -0600, Keith Busch wrote: > Since none of the service drivers can be modules, I don't think it > buys us much to make their init functions initcalls. Can we > explicitly call them from the pcie_portdrv_probe() path? It's actually during pcie_portdrv_init that the services need to be initialized if we're going this way. Do you think the following is better? The initialization order should be more clear to the reader at the cost of more code than init call magic, but I'm okay having this done either way. --- diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c index 68b20e667764..944047976569 100644 --- a/drivers/pci/hotplug/pciehp_core.c +++ b/drivers/pci/hotplug/pciehp_core.c @@ -287,7 +287,7 @@ static struct pcie_port_service_driver hpdriver_portdrv = { #endif /* PM */ }; -static int __init pcied_init(void) +int __init pcie_hp_init(void) { int retval = 0; @@ -298,4 +298,3 @@ static int __init pcied_init(void) return retval; } -device_initcall(pcied_init); diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c index 83180edd6ed4..637d638f73da 100644 --- a/drivers/pci/pcie/aer.c +++ b/drivers/pci/pcie/aer.c @@ -1569,10 +1569,9 @@ static struct pcie_port_service_driver aerdriver = { * * Invoked when AER root service driver is loaded. */ -static int __init aer_service_init(void) +int __init pcie_aer_init(void) { if (!pci_aer_available() || aer_acpi_firmware_first()) return -ENXIO; return pcie_port_service_register(&aerdriver); } -device_initcall(aer_service_init); diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c index f03279fc87cd..a1fd16bf1cab 100644 --- a/drivers/pci/pcie/dpc.c +++ b/drivers/pci/pcie/dpc.c @@ -282,8 +282,7 @@ static struct pcie_port_service_driver dpcdriver = { .reset_link = dpc_reset_link, }; -static int __init dpc_service_init(void) +int __init pcie_dpc_init(void) { return pcie_port_service_register(&dpcdriver); } -device_initcall(dpc_service_init); diff --git a/drivers/pci/pcie/pme.c b/drivers/pci/pcie/pme.c index 3ed67676ea2a..1a8b85051b1b 100644 --- a/drivers/pci/pcie/pme.c +++ b/drivers/pci/pcie/pme.c @@ -446,8 +446,7 @@ static struct pcie_port_service_driver pcie_pme_driver = { /** * pcie_pme_service_init - Register the PCIe PME service driver. */ -static int __init pcie_pme_service_init(void) +int __init pcie_pme_init(void) { return pcie_port_service_register(&pcie_pme_driver); } -device_initcall(pcie_pme_service_init); diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h index d59afa42fc14..2498b2d34009 100644 --- a/drivers/pci/pcie/portdrv.h +++ b/drivers/pci/pcie/portdrv.h @@ -23,6 +23,30 @@ #define PCIE_PORT_DEVICE_MAXSERVICES 4 +#ifdef CONFIG_PCIEAER +int pcie_aer_init(void); +#else +static inline int pcie_aer_init(void) { return 0; } +#endif + +#ifdef CONFIG_HOTPLUG_PCI_PCIE +int pcie_hp_init(void); +#else +static inline int pcie_hp_init(void) { return 0; } +#endif + +#ifdef CONFIG_PCIE_PME +int pcie_pme_init(void); +#else +static inline int pcie_pme_init(void) { return 0; } +#endif + +#ifdef CONFIG_PCIE_DPC +int pcie_dpc_init(void); +#else +static inline int pcie_dpc_init(void) { return 0; } +#endif + /* Port Type */ #define PCIE_ANY_PORT (~0) diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c index eef22dc29140..23a5a0c2c3fe 100644 --- a/drivers/pci/pcie/portdrv_pci.c +++ b/drivers/pci/pcie/portdrv_pci.c @@ -226,11 +226,20 @@ static const struct dmi_system_id pcie_portdrv_dmi_table[] __initconst = { {} }; +static void __init pcie_init_services(void) +{ + pcie_aer_init(); + pcie_pme_init(); + pcie_dpc_init(); + pcie_hp_init(); +} + static int __init pcie_portdrv_init(void) { if (pcie_ports_disabled) return -EACCES; + pcie_init_services(); dmi_check_system(pcie_portdrv_dmi_table); return pci_register_driver(&pcie_portdriver); -- > > --- > > diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c > > index ccaf01e6eced..334044814dbe 100644 > > --- a/drivers/pci/hotplug/pciehp_core.c > > +++ b/drivers/pci/hotplug/pciehp_core.c > > @@ -322,4 +322,4 @@ static int __init pcied_init(void) > > > > return retval; > > } > > -device_initcall(pcied_init); > > +subsys_initcall(pcied_init); > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c > > index 83180edd6ed4..1d2159409b01 100644 > > --- a/drivers/pci/pcie/aer.c > > +++ b/drivers/pci/pcie/aer.c > > @@ -1575,4 +1575,4 @@ static int __init aer_service_init(void) > > return -ENXIO; > > return pcie_port_service_register(&aerdriver); > > } > > -device_initcall(aer_service_init); > > +subsys_initcall(aer_service_init); > > diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c > > index f03279fc87cd..aacfb50eccfc 100644 > > --- a/drivers/pci/pcie/dpc.c > > +++ b/drivers/pci/pcie/dpc.c > > @@ -286,4 +286,4 @@ static int __init dpc_service_init(void) > > { > > return pcie_port_service_register(&dpcdriver); > > } > > -device_initcall(dpc_service_init); > > +subsys_initcall(dpc_service_init); > > diff --git a/drivers/pci/pcie/pme.c b/drivers/pci/pcie/pme.c > > index 3ed67676ea2a..cd8c1adb9b0a 100644 > > --- a/drivers/pci/pcie/pme.c > > +++ b/drivers/pci/pcie/pme.c > > @@ -450,4 +450,4 @@ static int __init pcie_pme_service_init(void) > > { > > return pcie_port_service_register(&pcie_pme_driver); > > } > > -device_initcall(pcie_pme_service_init); > > +subsys_initcall(pcie_pme_service_init); > > --
On Wed, Sep 19, 2018 at 12:00:03PM -0600, Keith Busch wrote: > On Wed, Sep 19, 2018 at 11:28:46AM -0500, Bjorn Helgaas wrote: > > On Tue, Sep 18, 2018 at 05:56:53PM -0600, Keith Busch wrote: > > Since none of the service drivers can be modules, I don't think it > > buys us much to make their init functions initcalls. Can we > > explicitly call them from the pcie_portdrv_probe() path? > > It's actually during pcie_portdrv_init that the services need to be > initialized if we're going this way. Do you think the following is > better? The initialization order should be more clear to the reader at > the cost of more code than init call magic, but I'm okay having this > done either way. Yes. More code, less magic seems like the right tradeoff to me. > --- > diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c > index 68b20e667764..944047976569 100644 > --- a/drivers/pci/hotplug/pciehp_core.c > +++ b/drivers/pci/hotplug/pciehp_core.c > @@ -287,7 +287,7 @@ static struct pcie_port_service_driver hpdriver_portdrv = { > #endif /* PM */ > }; > > -static int __init pcied_init(void) > +int __init pcie_hp_init(void) > { > int retval = 0; > > @@ -298,4 +298,3 @@ static int __init pcied_init(void) > > return retval; > } > -device_initcall(pcied_init); > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c > index 83180edd6ed4..637d638f73da 100644 > --- a/drivers/pci/pcie/aer.c > +++ b/drivers/pci/pcie/aer.c > @@ -1569,10 +1569,9 @@ static struct pcie_port_service_driver aerdriver = { > * > * Invoked when AER root service driver is loaded. > */ > -static int __init aer_service_init(void) > +int __init pcie_aer_init(void) > { > if (!pci_aer_available() || aer_acpi_firmware_first()) > return -ENXIO; > return pcie_port_service_register(&aerdriver); > } > -device_initcall(aer_service_init); > diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c > index f03279fc87cd..a1fd16bf1cab 100644 > --- a/drivers/pci/pcie/dpc.c > +++ b/drivers/pci/pcie/dpc.c > @@ -282,8 +282,7 @@ static struct pcie_port_service_driver dpcdriver = { > .reset_link = dpc_reset_link, > }; > > -static int __init dpc_service_init(void) > +int __init pcie_dpc_init(void) > { > return pcie_port_service_register(&dpcdriver); > } > -device_initcall(dpc_service_init); > diff --git a/drivers/pci/pcie/pme.c b/drivers/pci/pcie/pme.c > index 3ed67676ea2a..1a8b85051b1b 100644 > --- a/drivers/pci/pcie/pme.c > +++ b/drivers/pci/pcie/pme.c > @@ -446,8 +446,7 @@ static struct pcie_port_service_driver pcie_pme_driver = { > /** > * pcie_pme_service_init - Register the PCIe PME service driver. > */ > -static int __init pcie_pme_service_init(void) > +int __init pcie_pme_init(void) > { > return pcie_port_service_register(&pcie_pme_driver); > } > -device_initcall(pcie_pme_service_init); > diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h > index d59afa42fc14..2498b2d34009 100644 > --- a/drivers/pci/pcie/portdrv.h > +++ b/drivers/pci/pcie/portdrv.h > @@ -23,6 +23,30 @@ > > #define PCIE_PORT_DEVICE_MAXSERVICES 4 > > +#ifdef CONFIG_PCIEAER > +int pcie_aer_init(void); > +#else > +static inline int pcie_aer_init(void) { return 0; } > +#endif > + > +#ifdef CONFIG_HOTPLUG_PCI_PCIE > +int pcie_hp_init(void); > +#else > +static inline int pcie_hp_init(void) { return 0; } > +#endif > + > +#ifdef CONFIG_PCIE_PME > +int pcie_pme_init(void); > +#else > +static inline int pcie_pme_init(void) { return 0; } > +#endif > + > +#ifdef CONFIG_PCIE_DPC > +int pcie_dpc_init(void); > +#else > +static inline int pcie_dpc_init(void) { return 0; } > +#endif > + > /* Port Type */ > #define PCIE_ANY_PORT (~0) > > diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c > index eef22dc29140..23a5a0c2c3fe 100644 > --- a/drivers/pci/pcie/portdrv_pci.c > +++ b/drivers/pci/pcie/portdrv_pci.c > @@ -226,11 +226,20 @@ static const struct dmi_system_id pcie_portdrv_dmi_table[] __initconst = { > {} > }; > > +static void __init pcie_init_services(void) > +{ > + pcie_aer_init(); > + pcie_pme_init(); > + pcie_dpc_init(); > + pcie_hp_init(); > +} > + > static int __init pcie_portdrv_init(void) > { > if (pcie_ports_disabled) > return -EACCES; > > + pcie_init_services(); > dmi_check_system(pcie_portdrv_dmi_table); > > return pci_register_driver(&pcie_portdriver); > --
On Wed, Sep 19, 2018 at 02:40:29PM -0500, Bjorn Helgaas wrote: > On Wed, Sep 19, 2018 at 12:00:03PM -0600, Keith Busch wrote: > > On Wed, Sep 19, 2018 at 11:28:46AM -0500, Bjorn Helgaas wrote: > > > On Tue, Sep 18, 2018 at 05:56:53PM -0600, Keith Busch wrote: > > > Since none of the service drivers can be modules, I don't think it > > > buys us much to make their init functions initcalls. Can we > > > explicitly call them from the pcie_portdrv_probe() path? > > > > It's actually during pcie_portdrv_init that the services need to be > > initialized if we're going this way. Do you think the following is > > better? The initialization order should be more clear to the reader at > > the cost of more code than init call magic, but I'm okay having this > > done either way. > > Yes. More code, less magic seems like the right tradeoff to me. Sounds good. Will send a new version of the set using this method, plus merging up to the current pci/hotplug and the documentation updates.
diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c index ccaf01e6eced..334044814dbe 100644 --- a/drivers/pci/hotplug/pciehp_core.c +++ b/drivers/pci/hotplug/pciehp_core.c @@ -322,4 +322,4 @@ static int __init pcied_init(void) return retval; } -device_initcall(pcied_init); +subsys_initcall(pcied_init); diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c index 83180edd6ed4..1d2159409b01 100644 --- a/drivers/pci/pcie/aer.c +++ b/drivers/pci/pcie/aer.c @@ -1575,4 +1575,4 @@ static int __init aer_service_init(void) return -ENXIO; return pcie_port_service_register(&aerdriver); } -device_initcall(aer_service_init); +subsys_initcall(aer_service_init); diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c index f03279fc87cd..aacfb50eccfc 100644 --- a/drivers/pci/pcie/dpc.c +++ b/drivers/pci/pcie/dpc.c @@ -286,4 +286,4 @@ static int __init dpc_service_init(void) { return pcie_port_service_register(&dpcdriver); } -device_initcall(dpc_service_init); +subsys_initcall(dpc_service_init); diff --git a/drivers/pci/pcie/pme.c b/drivers/pci/pcie/pme.c index 3ed67676ea2a..cd8c1adb9b0a 100644 --- a/drivers/pci/pcie/pme.c +++ b/drivers/pci/pcie/pme.c @@ -450,4 +450,4 @@ static int __init pcie_pme_service_init(void) { return pcie_port_service_register(&pcie_pme_driver); } -device_initcall(pcie_pme_service_init); +subsys_initcall(pcie_pme_service_init);
The PCI port driver saves the PCI state after initializing all the service devices. This was, however, before the service drivers were even registered. The config space state that the service drivers were setting up were not being saved. This patch fixes this by changing the service drivers use the subsys_init, which gets the service drivers registered after the pci bus system is initialized, but before the pci devices are probed. This gets the state saved as expected. Note, 70626d88385 ("PCI: pciehp: Make explicitly non-modular") already mentioned this exact change should be done, but declined to at the time until there was a real need for it. Signed-off-by: Keith Busch <keith.busch@intel.com> --- drivers/pci/hotplug/pciehp_core.c | 2 +- drivers/pci/pcie/aer.c | 2 +- drivers/pci/pcie/dpc.c | 2 +- drivers/pci/pcie/pme.c | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-)