Message ID | 20230606162321.34222-1-mario.limonciello@amd.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Bjorn Helgaas |
Headers | show |
Series | [v2] PCI: Call _REG when saving/restoring PCI state | expand |
On Tue, Jun 6, 2023 at 6:23 PM Mario Limonciello <mario.limonciello@amd.com> wrote: > > ASMedia PCIe GPIO controllers fail functional tests after returning from > suspend (S3 or s2idle). This is because the BIOS checks whether the > OSPM has called the `_REG` method to determine whether it can interact with > the OperationRegion assigned to the device. > > As described in 6.5.4 in the APCI spec, `_REG` is used to inform the AML > code on the availability of an operation region. > > To fix this issue, call acpi_evaluate_reg() when saving and restoring the > state of PCI devices. > > Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html#reg-region > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> > --- > v1->v2: > * Handle case of no CONFIG_ACPI > * Rename function > * Update commit message > * Move ACPI calling code into pci-acpi.c instead > * Cite the ACPI spec > --- > drivers/pci/pci-acpi.c | 10 ++++++++++ > drivers/pci/pci.c | 14 ++++++++++++++ > drivers/pci/pci.h | 2 ++ > 3 files changed, 26 insertions(+) > > diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c > index 1698205dd73c..abc8bcfc2c71 100644 > --- a/drivers/pci/pci-acpi.c > +++ b/drivers/pci/pci-acpi.c > @@ -1209,6 +1209,16 @@ void acpi_pci_remove_bus(struct pci_bus *bus) > acpi_pci_slot_remove(bus); > } > > +void acpi_pci_set_register_access(struct pci_dev *dev, bool enable) > +{ > + int val = enable ? ACPI_REG_CONNECT : ACPI_REG_DISCONNECT; > + int ret = acpi_evaluate_reg(ACPI_HANDLE(&dev->dev), > + ACPI_ADR_SPACE_PCI_CONFIG, val); > + if (ret) > + pci_dbg(dev, "ACPI _REG %s evaluation failed (%d)\n", > + val ? "connect" : "disconnect", ret); s/val/enable/ ? Then I don't have to remember that ACPI_REG_DISCONNECT is 0. <bikeshedding> I would call this function something like acpi_pci_config_space_access(), because technically it is about allowing AML to access the PCI configuration space. </bikeshedding> > +} > + > /* ACPI bus type */ > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index e38c2f6eebd4..b2f1f603ec62 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -1068,6 +1068,14 @@ static inline bool platform_pci_bridge_d3(struct pci_dev *dev) > return acpi_pci_bridge_d3(dev); > } > > +static inline void platform_set_register_access(struct pci_dev *dev, bool en) > +{ > + if (pci_use_mid_pm()) > + return; > + > + acpi_pci_set_register_access(dev, en); > +} > + > /** > * pci_update_current_state - Read power state of given device and cache it > * @dev: PCI device to handle. > @@ -1645,6 +1653,9 @@ static void pci_restore_ltr_state(struct pci_dev *dev) > int pci_save_state(struct pci_dev *dev) > { > int i; > + > + platform_set_register_access(dev, false); > + > /* XXX: 100% dword access ok here? */ > for (i = 0; i < 16; i++) { > pci_read_config_dword(dev, i * 4, &dev->saved_config_space[i]); > @@ -1790,6 +1801,8 @@ void pci_restore_state(struct pci_dev *dev) > pci_enable_acs(dev); > pci_restore_iov_state(dev); > > + platform_set_register_access(dev, true); > + > dev->state_saved = false; > } > EXPORT_SYMBOL(pci_restore_state); > @@ -3203,6 +3216,7 @@ void pci_pm_init(struct pci_dev *dev) > pci_read_config_word(dev, PCI_STATUS, &status); > if (status & PCI_STATUS_IMM_READY) > dev->imm_ready = 1; > + platform_set_register_access(dev, true); > } > > static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop) > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h > index ffccb03933e2..78961505aae2 100644 > --- a/drivers/pci/pci.h > +++ b/drivers/pci/pci.h > @@ -703,6 +703,7 @@ void acpi_pci_refresh_power_state(struct pci_dev *dev); > int acpi_pci_wakeup(struct pci_dev *dev, bool enable); > bool acpi_pci_need_resume(struct pci_dev *dev); > pci_power_t acpi_pci_choose_state(struct pci_dev *pdev); > +void acpi_pci_set_register_access(struct pci_dev *dev, bool enable); > #else > static inline int pci_dev_acpi_reset(struct pci_dev *dev, bool probe) > { > @@ -742,6 +743,7 @@ static inline pci_power_t acpi_pci_choose_state(struct pci_dev *pdev) > { > return PCI_POWER_ERROR; > } > +static inline void acpi_pci_set_register_access(struct pci_dev *dev, bool enable) {} > #endif > > #ifdef CONFIG_PCIEASPM > -- > 2.34.1 >
On 6/6/2023 1:21 PM, Rafael J. Wysocki wrote: > On Tue, Jun 6, 2023 at 6:23 PM Mario Limonciello > <mario.limonciello@amd.com> wrote: >> ASMedia PCIe GPIO controllers fail functional tests after returning from >> suspend (S3 or s2idle). This is because the BIOS checks whether the >> OSPM has called the `_REG` method to determine whether it can interact with >> the OperationRegion assigned to the device. >> >> As described in 6.5.4 in the APCI spec, `_REG` is used to inform the AML >> code on the availability of an operation region. >> >> To fix this issue, call acpi_evaluate_reg() when saving and restoring the >> state of PCI devices. >> >> Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html#reg-region >> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> >> --- >> v1->v2: >> * Handle case of no CONFIG_ACPI >> * Rename function >> * Update commit message >> * Move ACPI calling code into pci-acpi.c instead >> * Cite the ACPI spec >> --- >> drivers/pci/pci-acpi.c | 10 ++++++++++ >> drivers/pci/pci.c | 14 ++++++++++++++ >> drivers/pci/pci.h | 2 ++ >> 3 files changed, 26 insertions(+) >> >> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c >> index 1698205dd73c..abc8bcfc2c71 100644 >> --- a/drivers/pci/pci-acpi.c >> +++ b/drivers/pci/pci-acpi.c >> @@ -1209,6 +1209,16 @@ void acpi_pci_remove_bus(struct pci_bus *bus) >> acpi_pci_slot_remove(bus); >> } >> >> +void acpi_pci_set_register_access(struct pci_dev *dev, bool enable) >> +{ >> + int val = enable ? ACPI_REG_CONNECT : ACPI_REG_DISCONNECT; >> + int ret = acpi_evaluate_reg(ACPI_HANDLE(&dev->dev), >> + ACPI_ADR_SPACE_PCI_CONFIG, val); >> + if (ret) >> + pci_dbg(dev, "ACPI _REG %s evaluation failed (%d)\n", >> + val ? "connect" : "disconnect", ret); > s/val/enable/ ? > Then I don't have to remember that ACPI_REG_DISCONNECT is 0. Sure, no problem. > > <bikeshedding> > > I would call this function something like > acpi_pci_config_space_access(), because technically it is about > allowing AML to access the PCI configuration space. > > </bikeshedding> Presumably platform_pci_config_space_access() below too. That's fine by me, Bjorn? >> +} >> + >> /* ACPI bus type */ >> >> >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c >> index e38c2f6eebd4..b2f1f603ec62 100644 >> --- a/drivers/pci/pci.c >> +++ b/drivers/pci/pci.c >> @@ -1068,6 +1068,14 @@ static inline bool platform_pci_bridge_d3(struct pci_dev *dev) >> return acpi_pci_bridge_d3(dev); >> } >> >> +static inline void platform_set_register_access(struct pci_dev *dev, bool en) >> +{ >> + if (pci_use_mid_pm()) >> + return; >> + >> + acpi_pci_set_register_access(dev, en); >> +} >> + >> /** >> * pci_update_current_state - Read power state of given device and cache it >> * @dev: PCI device to handle. >> @@ -1645,6 +1653,9 @@ static void pci_restore_ltr_state(struct pci_dev *dev) >> int pci_save_state(struct pci_dev *dev) >> { >> int i; >> + >> + platform_set_register_access(dev, false); >> + >> /* XXX: 100% dword access ok here? */ >> for (i = 0; i < 16; i++) { >> pci_read_config_dword(dev, i * 4, &dev->saved_config_space[i]); >> @@ -1790,6 +1801,8 @@ void pci_restore_state(struct pci_dev *dev) >> pci_enable_acs(dev); >> pci_restore_iov_state(dev); >> >> + platform_set_register_access(dev, true); >> + >> dev->state_saved = false; >> } >> EXPORT_SYMBOL(pci_restore_state); >> @@ -3203,6 +3216,7 @@ void pci_pm_init(struct pci_dev *dev) >> pci_read_config_word(dev, PCI_STATUS, &status); >> if (status & PCI_STATUS_IMM_READY) >> dev->imm_ready = 1; >> + platform_set_register_access(dev, true); >> } >> >> static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop) >> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h >> index ffccb03933e2..78961505aae2 100644 >> --- a/drivers/pci/pci.h >> +++ b/drivers/pci/pci.h >> @@ -703,6 +703,7 @@ void acpi_pci_refresh_power_state(struct pci_dev *dev); >> int acpi_pci_wakeup(struct pci_dev *dev, bool enable); >> bool acpi_pci_need_resume(struct pci_dev *dev); >> pci_power_t acpi_pci_choose_state(struct pci_dev *pdev); >> +void acpi_pci_set_register_access(struct pci_dev *dev, bool enable); >> #else >> static inline int pci_dev_acpi_reset(struct pci_dev *dev, bool probe) >> { >> @@ -742,6 +743,7 @@ static inline pci_power_t acpi_pci_choose_state(struct pci_dev *pdev) >> { >> return PCI_POWER_ERROR; >> } >> +static inline void acpi_pci_set_register_access(struct pci_dev *dev, bool enable) {} >> #endif >> >> #ifdef CONFIG_PCIEASPM >> -- >> 2.34.1 >>
On Tue, Jun 06, 2023 at 11:23:21AM -0500, Mario Limonciello wrote: > ASMedia PCIe GPIO controllers fail functional tests after returning from > suspend (S3 or s2idle). This is because the BIOS checks whether the > OSPM has called the `_REG` method to determine whether it can interact with > the OperationRegion assigned to the device. > > As described in 6.5.4 in the APCI spec, `_REG` is used to inform the AML > code on the availability of an operation region. > > To fix this issue, call acpi_evaluate_reg() when saving and restoring the > state of PCI devices. > > Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html#reg-region > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> > --- > v1->v2: > * Handle case of no CONFIG_ACPI > * Rename function > * Update commit message > * Move ACPI calling code into pci-acpi.c instead > * Cite the ACPI spec Thanks for the spec reference (s/APCI/ACPI/ and add the revision if you rev this (r6.5 is the latest, AFAIK) if you rev this). I don't see text in that section that connects S3 with _REG. If it's there, you might have to quote the relevant sentence or two in the commit log. You mentioned _REG being sort of a mutex to synchronize OSPM vs platform access; if there's spec language to that effect, let's cite it. Ideally we should have been able to read the PCI and ACPI specs and implement this without tripping over problem on this particular hardware. I'm looking for the text that enables that "clean-room" implementation. If the spec doesn't have that text, it's either a hole in the spec or a BIOS defect that depends on something the spec doesn't require. Doing this in pci_save_state() still seems wrong to me. For example, e1000_probe() calls pci_save_state(), but this is not part of suspend. IIUC, this patch will disconnect the opregion when we probe an e1000 NIC. Is that what you intend? > --- > drivers/pci/pci-acpi.c | 10 ++++++++++ > drivers/pci/pci.c | 14 ++++++++++++++ > drivers/pci/pci.h | 2 ++ > 3 files changed, 26 insertions(+) > > diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c > index 1698205dd73c..abc8bcfc2c71 100644 > --- a/drivers/pci/pci-acpi.c > +++ b/drivers/pci/pci-acpi.c > @@ -1209,6 +1209,16 @@ void acpi_pci_remove_bus(struct pci_bus *bus) > acpi_pci_slot_remove(bus); > } > > +void acpi_pci_set_register_access(struct pci_dev *dev, bool enable) > +{ > + int val = enable ? ACPI_REG_CONNECT : ACPI_REG_DISCONNECT; > + int ret = acpi_evaluate_reg(ACPI_HANDLE(&dev->dev), > + ACPI_ADR_SPACE_PCI_CONFIG, val); > + if (ret) > + pci_dbg(dev, "ACPI _REG %s evaluation failed (%d)\n", > + val ? "connect" : "disconnect", ret); > +} > + > /* ACPI bus type */ > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index e38c2f6eebd4..b2f1f603ec62 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -1068,6 +1068,14 @@ static inline bool platform_pci_bridge_d3(struct pci_dev *dev) > return acpi_pci_bridge_d3(dev); > } > > +static inline void platform_set_register_access(struct pci_dev *dev, bool en) > +{ > + if (pci_use_mid_pm()) > + return; > + > + acpi_pci_set_register_access(dev, en); > +} > + > /** > * pci_update_current_state - Read power state of given device and cache it > * @dev: PCI device to handle. > @@ -1645,6 +1653,9 @@ static void pci_restore_ltr_state(struct pci_dev *dev) > int pci_save_state(struct pci_dev *dev) > { > int i; > + > + platform_set_register_access(dev, false); > + > /* XXX: 100% dword access ok here? */ > for (i = 0; i < 16; i++) { > pci_read_config_dword(dev, i * 4, &dev->saved_config_space[i]); > @@ -1790,6 +1801,8 @@ void pci_restore_state(struct pci_dev *dev) > pci_enable_acs(dev); > pci_restore_iov_state(dev); > > + platform_set_register_access(dev, true); > + > dev->state_saved = false; > } > EXPORT_SYMBOL(pci_restore_state); > @@ -3203,6 +3216,7 @@ void pci_pm_init(struct pci_dev *dev) > pci_read_config_word(dev, PCI_STATUS, &status); > if (status & PCI_STATUS_IMM_READY) > dev->imm_ready = 1; > + platform_set_register_access(dev, true); > } > > static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop) > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h > index ffccb03933e2..78961505aae2 100644 > --- a/drivers/pci/pci.h > +++ b/drivers/pci/pci.h > @@ -703,6 +703,7 @@ void acpi_pci_refresh_power_state(struct pci_dev *dev); > int acpi_pci_wakeup(struct pci_dev *dev, bool enable); > bool acpi_pci_need_resume(struct pci_dev *dev); > pci_power_t acpi_pci_choose_state(struct pci_dev *pdev); > +void acpi_pci_set_register_access(struct pci_dev *dev, bool enable); > #else > static inline int pci_dev_acpi_reset(struct pci_dev *dev, bool probe) > { > @@ -742,6 +743,7 @@ static inline pci_power_t acpi_pci_choose_state(struct pci_dev *pdev) > { > return PCI_POWER_ERROR; > } > +static inline void acpi_pci_set_register_access(struct pci_dev *dev, bool enable) {} > #endif > > #ifdef CONFIG_PCIEASPM > -- > 2.34.1 >
On 6/6/2023 2:23 PM, Bjorn Helgaas wrote: > On Tue, Jun 06, 2023 at 11:23:21AM -0500, Mario Limonciello wrote: >> ASMedia PCIe GPIO controllers fail functional tests after returning from >> suspend (S3 or s2idle). This is because the BIOS checks whether the >> OSPM has called the `_REG` method to determine whether it can interact with >> the OperationRegion assigned to the device. >> >> As described in 6.5.4 in the APCI spec, `_REG` is used to inform the AML >> code on the availability of an operation region. >> >> To fix this issue, call acpi_evaluate_reg() when saving and restoring the >> state of PCI devices. >> >> Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html#reg-region >> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> >> --- >> v1->v2: >> * Handle case of no CONFIG_ACPI >> * Rename function >> * Update commit message >> * Move ACPI calling code into pci-acpi.c instead >> * Cite the ACPI spec > Thanks for the spec reference (s/APCI/ACPI/ and add the revision if > you rev this (r6.5 is the latest, AFAIK) if you rev this). > > I don't see text in that section that connects S3 with _REG. If it's > there, you might have to quote the relevant sentence or two in the > commit log. I don't think there is anything the spec connecting this with S3. At least from my perspective S3 is the reason this was exposed but there is a deficiency that exists that _REG is not being called by Linux. I intend to re-word the commit message something to the effect of explaining what _REG does and why _REG should be called, along with citations. Then in another paragraph "Fixing this resolves an issue ...". > > You mentioned _REG being sort of a mutex to synchronize OSPM vs > platform access; if there's spec language to that effect, let's cite > it. That sentence I included was cited from the spec. > Ideally we should have been able to read the PCI and ACPI specs and > implement this without tripping over problem on this particular > hardware. I'm looking for the text that enables that "clean-room" > implementation. If the spec doesn't have that text, it's either a > hole in the spec or a BIOS defect that depends on something the spec > doesn't require. IMO both the spec and BIOS are correct, it's a Linux issue that _REG wasn't used. Hopefully disconnecting the issue as just an example will achieve what you're looking for. > > Doing this in pci_save_state() still seems wrong to me. For example, > e1000_probe() calls pci_save_state(), but this is not part of suspend. > IIUC, this patch will disconnect the opregion when we probe an e1000 > NIC. Is that what you intend? Thanks for pointing this one out. I was narrowly focused on callers in PCI core. This was a caller I wasn't aware of; I agree it doesn't make sense. I think pci_set_power_state() might be another good candidate to use. What do you think of this? Or can you suggest another call site? I'm hesitant to put this into PCI suspend/resume code, because I think this same issue could happen at runtime too. >> --- >> drivers/pci/pci-acpi.c | 10 ++++++++++ >> drivers/pci/pci.c | 14 ++++++++++++++ >> drivers/pci/pci.h | 2 ++ >> 3 files changed, 26 insertions(+) >> >> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c >> index 1698205dd73c..abc8bcfc2c71 100644 >> --- a/drivers/pci/pci-acpi.c >> +++ b/drivers/pci/pci-acpi.c >> @@ -1209,6 +1209,16 @@ void acpi_pci_remove_bus(struct pci_bus *bus) >> acpi_pci_slot_remove(bus); >> } >> >> +void acpi_pci_set_register_access(struct pci_dev *dev, bool enable) >> +{ >> + int val = enable ? ACPI_REG_CONNECT : ACPI_REG_DISCONNECT; >> + int ret = acpi_evaluate_reg(ACPI_HANDLE(&dev->dev), >> + ACPI_ADR_SPACE_PCI_CONFIG, val); >> + if (ret) >> + pci_dbg(dev, "ACPI _REG %s evaluation failed (%d)\n", >> + val ? "connect" : "disconnect", ret); >> +} >> + >> /* ACPI bus type */ >> >> >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c >> index e38c2f6eebd4..b2f1f603ec62 100644 >> --- a/drivers/pci/pci.c >> +++ b/drivers/pci/pci.c >> @@ -1068,6 +1068,14 @@ static inline bool platform_pci_bridge_d3(struct pci_dev *dev) >> return acpi_pci_bridge_d3(dev); >> } >> >> +static inline void platform_set_register_access(struct pci_dev *dev, bool en) >> +{ >> + if (pci_use_mid_pm()) >> + return; >> + >> + acpi_pci_set_register_access(dev, en); >> +} >> + >> /** >> * pci_update_current_state - Read power state of given device and cache it >> * @dev: PCI device to handle. >> @@ -1645,6 +1653,9 @@ static void pci_restore_ltr_state(struct pci_dev *dev) >> int pci_save_state(struct pci_dev *dev) >> { >> int i; >> + >> + platform_set_register_access(dev, false); >> + >> /* XXX: 100% dword access ok here? */ >> for (i = 0; i < 16; i++) { >> pci_read_config_dword(dev, i * 4, &dev->saved_config_space[i]); >> @@ -1790,6 +1801,8 @@ void pci_restore_state(struct pci_dev *dev) >> pci_enable_acs(dev); >> pci_restore_iov_state(dev); >> >> + platform_set_register_access(dev, true); >> + >> dev->state_saved = false; >> } >> EXPORT_SYMBOL(pci_restore_state); >> @@ -3203,6 +3216,7 @@ void pci_pm_init(struct pci_dev *dev) >> pci_read_config_word(dev, PCI_STATUS, &status); >> if (status & PCI_STATUS_IMM_READY) >> dev->imm_ready = 1; >> + platform_set_register_access(dev, true); >> } >> >> static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop) >> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h >> index ffccb03933e2..78961505aae2 100644 >> --- a/drivers/pci/pci.h >> +++ b/drivers/pci/pci.h >> @@ -703,6 +703,7 @@ void acpi_pci_refresh_power_state(struct pci_dev *dev); >> int acpi_pci_wakeup(struct pci_dev *dev, bool enable); >> bool acpi_pci_need_resume(struct pci_dev *dev); >> pci_power_t acpi_pci_choose_state(struct pci_dev *pdev); >> +void acpi_pci_set_register_access(struct pci_dev *dev, bool enable); >> #else >> static inline int pci_dev_acpi_reset(struct pci_dev *dev, bool probe) >> { >> @@ -742,6 +743,7 @@ static inline pci_power_t acpi_pci_choose_state(struct pci_dev *pdev) >> { >> return PCI_POWER_ERROR; >> } >> +static inline void acpi_pci_set_register_access(struct pci_dev *dev, bool enable) {} >> #endif >> >> #ifdef CONFIG_PCIEASPM >> -- >> 2.34.1 >>
On Tue, Jun 06, 2023 at 02:40:45PM -0500, Limonciello, Mario wrote: > On 6/6/2023 2:23 PM, Bjorn Helgaas wrote: > > On Tue, Jun 06, 2023 at 11:23:21AM -0500, Mario Limonciello wrote: > > > ASMedia PCIe GPIO controllers fail functional tests after returning from > > > suspend (S3 or s2idle). This is because the BIOS checks whether the > > > OSPM has called the `_REG` method to determine whether it can interact with > > > the OperationRegion assigned to the device. > > > > > > As described in 6.5.4 in the APCI spec, `_REG` is used to inform the AML > > > code on the availability of an operation region. > > > > > > To fix this issue, call acpi_evaluate_reg() when saving and restoring the > > > state of PCI devices. > > > > > > Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html#reg-region > > > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> > > > --- > > > v1->v2: > > > * Handle case of no CONFIG_ACPI > > > * Rename function > > > * Update commit message > > > * Move ACPI calling code into pci-acpi.c instead > > > * Cite the ACPI spec > > > > Thanks for the spec reference (s/APCI/ACPI/ and add the revision if > > you rev this (r6.5 is the latest, AFAIK) if you rev this). > > > > I don't see text in that section that connects S3 with _REG. If it's > > there, you might have to quote the relevant sentence or two in the > > commit log. > > I don't think there is anything the spec connecting this > with S3. At least from my perspective S3 is the reason > this was exposed but there is a deficiency that exists > that _REG is not being called by Linux. > > I intend to re-word the commit message something to the > effect of explaining what _REG does and why _REG should be > called, along with citations. > > Then in another paragraph "Fixing this resolves an issue ...". > > > You mentioned _REG being sort of a mutex to synchronize OSPM vs > > platform access; if there's spec language to that effect, let's cite > > it. > > That sentence I included was cited from the spec. If it's necessary to justify the commit, include the citation in the commit log. > > Ideally we should have been able to read the PCI and ACPI specs and > > implement this without tripping over problem on this particular > > hardware. I'm looking for the text that enables that "clean-room" > > implementation. If the spec doesn't have that text, it's either a > > hole in the spec or a BIOS defect that depends on something the spec > > doesn't require. > > IMO both the spec and BIOS are correct, it's a Linux > issue that _REG wasn't used. What tells Linux that _REG needs to be used here? If there's nothing that tells Linux to use _REG here, I claim it's a BIOS defect. I'm happy to be convinced otherwise; the way to convince me is to point to the spec. If it's a BIOS defect, it's fine to work around it, but we need to understand that, own up to it, and make the exact requirements very clear. Otherwise we're likely to break this in the future because future developers and maintainers will rely on the specs. > > Doing this in pci_save_state() still seems wrong to me. For example, > > e1000_probe() calls pci_save_state(), but this is not part of suspend. > > IIUC, this patch will disconnect the opregion when we probe an e1000 > > NIC. Is that what you intend? > > Thanks for pointing this one out. I was narrowly focused > on callers in PCI core. This was a caller I wasn't > aware of; I agree it doesn't make sense. > > I think pci_set_power_state() might be another good > candidate to use. What do you think of this? I can't suggest a call site because (1) I'm not a power management person, and (2) I don't think we have a clear statement of when it is required. This must be expressed in terms of PCI power state transitions, or at least something discoverable from a pci_dev, not "s2idle" or even "S3" because those are meaningless in the PCI context. Bjorn
On 6/6/2023 2:58 PM, Bjorn Helgaas wrote: > On Tue, Jun 06, 2023 at 02:40:45PM -0500, Limonciello, Mario wrote: >> On 6/6/2023 2:23 PM, Bjorn Helgaas wrote: >>> On Tue, Jun 06, 2023 at 11:23:21AM -0500, Mario Limonciello wrote: >>>> ASMedia PCIe GPIO controllers fail functional tests after returning from >>>> suspend (S3 or s2idle). This is because the BIOS checks whether the >>>> OSPM has called the `_REG` method to determine whether it can interact with >>>> the OperationRegion assigned to the device. >>>> >>>> As described in 6.5.4 in the APCI spec, `_REG` is used to inform the AML >>>> code on the availability of an operation region. >>>> >>>> To fix this issue, call acpi_evaluate_reg() when saving and restoring the >>>> state of PCI devices. >>>> >>>> Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html#reg-region >>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> >>>> --- >>>> v1->v2: >>>> * Handle case of no CONFIG_ACPI >>>> * Rename function >>>> * Update commit message >>>> * Move ACPI calling code into pci-acpi.c instead >>>> * Cite the ACPI spec >>> Thanks for the spec reference (s/APCI/ACPI/ and add the revision if >>> you rev this (r6.5 is the latest, AFAIK) if you rev this). >>> >>> I don't see text in that section that connects S3 with _REG. If it's >>> there, you might have to quote the relevant sentence or two in the >>> commit log. >> I don't think there is anything the spec connecting this >> with S3. At least from my perspective S3 is the reason >> this was exposed but there is a deficiency that exists >> that _REG is not being called by Linux. >> >> I intend to re-word the commit message something to the >> effect of explaining what _REG does and why _REG should be >> called, along with citations. >> >> Then in another paragraph "Fixing this resolves an issue ...". >> >>> You mentioned _REG being sort of a mutex to synchronize OSPM vs >>> platform access; if there's spec language to that effect, let's cite >>> it. >> That sentence I included was cited from the spec. > If it's necessary to justify the commit, include the citation in the > commit log. > >>> Ideally we should have been able to read the PCI and ACPI specs and >>> implement this without tripping over problem on this particular >>> hardware. I'm looking for the text that enables that "clean-room" >>> implementation. If the spec doesn't have that text, it's either a >>> hole in the spec or a BIOS defect that depends on something the spec >>> doesn't require. >> IMO both the spec and BIOS are correct, it's a Linux >> issue that _REG wasn't used. > What tells Linux that _REG needs to be used here? If there's nothing > that tells Linux to use _REG here, I claim it's a BIOS defect. I'm > happy to be convinced otherwise; the way to convince me is to point to > the spec. From the spec it says "control methods must assume all operation regions are inaccessible until the _REG(RegionSpace, 1) method is executed" It also points out the opposite: "Conversely, control methods must not access fields in operation regions when _REG method execution has not indicated that the operation region handler is ready." The ACPI spec doesn't refer to D3 in this context, but it does make an allusion to power off in an example case. "Also, when the host controller or bridge controller is turned off or disabled, PCI Config Space Operation Regions for child devices are no longer available. As such, ETH0’s _REG method will be run when it is turned off and will again be run when PCI1 is turned off." > > If it's a BIOS defect, it's fine to work around it, but we need to > understand that, own up to it, and make the exact requirements very > clear. Otherwise we're likely to break this in the future because > future developers and maintainers will rely on the specs. From my discussions with BIOS developers, this is entirely intended behavior based on the _REG section in the spec. >>> Doing this in pci_save_state() still seems wrong to me. For example, >>> e1000_probe() calls pci_save_state(), but this is not part of suspend. >>> IIUC, this patch will disconnect the opregion when we probe an e1000 >>> NIC. Is that what you intend? >> Thanks for pointing this one out. I was narrowly focused >> on callers in PCI core. This was a caller I wasn't >> aware of; I agree it doesn't make sense. >> >> I think pci_set_power_state() might be another good >> candidate to use. What do you think of this? > I can't suggest a call site because (1) I'm not a power management > person, and (2) I don't think we have a clear statement of when it is > required. This must be expressed in terms of PCI power state > transitions, or at least something discoverable from a pci_dev, not > "s2idle" or even "S3" because those are meaningless in the PCI > context. > > Bjorn Right; I'm with you on not putting it with a suspend transition. The spec indicates that control methods can't access the regions until _REG is called, so my leaning is to keep the call at init time, and then add another call for the D3 and D0 transitions which is why I think pci_set_power_state() is probably best.
On Tue, Jun 6, 2023 at 10:26 PM Limonciello, Mario <mario.limonciello@amd.com> wrote: > > > On 6/6/2023 2:58 PM, Bjorn Helgaas wrote: > > On Tue, Jun 06, 2023 at 02:40:45PM -0500, Limonciello, Mario wrote: > >> On 6/6/2023 2:23 PM, Bjorn Helgaas wrote: > >>> On Tue, Jun 06, 2023 at 11:23:21AM -0500, Mario Limonciello wrote: > >>>> ASMedia PCIe GPIO controllers fail functional tests after returning from > >>>> suspend (S3 or s2idle). This is because the BIOS checks whether the > >>>> OSPM has called the `_REG` method to determine whether it can interact with > >>>> the OperationRegion assigned to the device. > >>>> > >>>> As described in 6.5.4 in the APCI spec, `_REG` is used to inform the AML > >>>> code on the availability of an operation region. > >>>> > >>>> To fix this issue, call acpi_evaluate_reg() when saving and restoring the > >>>> state of PCI devices. > >>>> > >>>> Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html#reg-region > >>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> > >>>> --- > >>>> v1->v2: > >>>> * Handle case of no CONFIG_ACPI > >>>> * Rename function > >>>> * Update commit message > >>>> * Move ACPI calling code into pci-acpi.c instead > >>>> * Cite the ACPI spec > >>> Thanks for the spec reference (s/APCI/ACPI/ and add the revision if > >>> you rev this (r6.5 is the latest, AFAIK) if you rev this). > >>> > >>> I don't see text in that section that connects S3 with _REG. If it's > >>> there, you might have to quote the relevant sentence or two in the > >>> commit log. > >> I don't think there is anything the spec connecting this > >> with S3. At least from my perspective S3 is the reason > >> this was exposed but there is a deficiency that exists > >> that _REG is not being called by Linux. > >> > >> I intend to re-word the commit message something to the > >> effect of explaining what _REG does and why _REG should be > >> called, along with citations. > >> > >> Then in another paragraph "Fixing this resolves an issue ...". > >> > >>> You mentioned _REG being sort of a mutex to synchronize OSPM vs > >>> platform access; if there's spec language to that effect, let's cite > >>> it. > >> That sentence I included was cited from the spec. > > If it's necessary to justify the commit, include the citation in the > > commit log. > > > >>> Ideally we should have been able to read the PCI and ACPI specs and > >>> implement this without tripping over problem on this particular > >>> hardware. I'm looking for the text that enables that "clean-room" > >>> implementation. If the spec doesn't have that text, it's either a > >>> hole in the spec or a BIOS defect that depends on something the spec > >>> doesn't require. > >> IMO both the spec and BIOS are correct, it's a Linux > >> issue that _REG wasn't used. > > What tells Linux that _REG needs to be used here? If there's nothing > > that tells Linux to use _REG here, I claim it's a BIOS defect. I'm > > happy to be convinced otherwise; the way to convince me is to point to > > the spec. > From the spec it says "control methods must assume > all operation regions are inaccessible until the > _REG(RegionSpace, 1) method is executed" > > It also points out the opposite: "Conversely, > control methods must not access fields in > operation regions when _REG method execution > has not indicated that the operation region > handler is ready." > > The ACPI spec doesn't refer to D3 in this context, but > it does make an allusion to power off in an example case. > > "Also, when the host controller or bridge controller > is turned off or disabled, PCI Config Space Operation > Regions for child devices are no longer available. > As such, ETH0’s _REG method will be run when it is > turned off and will again be run when PCI1 is > turned off." > > > > > If it's a BIOS defect, it's fine to work around it, but we need to > > understand that, own up to it, and make the exact requirements very > > clear. Otherwise we're likely to break this in the future because > > future developers and maintainers will rely on the specs. > From my discussions with BIOS developers, this is entirely > intended behavior based on the _REG section in the spec. > >>> Doing this in pci_save_state() still seems wrong to me. For example, > >>> e1000_probe() calls pci_save_state(), but this is not part of suspend. > >>> IIUC, this patch will disconnect the opregion when we probe an e1000 > >>> NIC. Is that what you intend? > >> Thanks for pointing this one out. I was narrowly focused > >> on callers in PCI core. This was a caller I wasn't > >> aware of; I agree it doesn't make sense. > >> > >> I think pci_set_power_state() might be another good > >> candidate to use. What do you think of this? > > I can't suggest a call site because (1) I'm not a power management > > person, and (2) I don't think we have a clear statement of when it is > > required. This must be expressed in terms of PCI power state > > transitions, or at least something discoverable from a pci_dev, not > > "s2idle" or even "S3" because those are meaningless in the PCI > > context. > > > > Bjorn > Right; I'm with you on not putting it with a suspend > transition. > > The spec indicates that control methods can't access > the regions until _REG is called, so > my leaning is to keep the call at init time, and > then add another call for the D3 and D0 transitions > which is why I think pci_set_power_state() is probably > best. Except that it is not used in all transitions; see the callers oif pci_power_up() for example. Technically, the config space should become accessible right after acpi_pci_set_power_state() has transitioned the device into D0 and it should be accessible still right before acpi_pci_set_power_state() attempts to transition the device into a low-power state, so it looks like _REG could be evaluated from there.
diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c index 1698205dd73c..abc8bcfc2c71 100644 --- a/drivers/pci/pci-acpi.c +++ b/drivers/pci/pci-acpi.c @@ -1209,6 +1209,16 @@ void acpi_pci_remove_bus(struct pci_bus *bus) acpi_pci_slot_remove(bus); } +void acpi_pci_set_register_access(struct pci_dev *dev, bool enable) +{ + int val = enable ? ACPI_REG_CONNECT : ACPI_REG_DISCONNECT; + int ret = acpi_evaluate_reg(ACPI_HANDLE(&dev->dev), + ACPI_ADR_SPACE_PCI_CONFIG, val); + if (ret) + pci_dbg(dev, "ACPI _REG %s evaluation failed (%d)\n", + val ? "connect" : "disconnect", ret); +} + /* ACPI bus type */ diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index e38c2f6eebd4..b2f1f603ec62 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -1068,6 +1068,14 @@ static inline bool platform_pci_bridge_d3(struct pci_dev *dev) return acpi_pci_bridge_d3(dev); } +static inline void platform_set_register_access(struct pci_dev *dev, bool en) +{ + if (pci_use_mid_pm()) + return; + + acpi_pci_set_register_access(dev, en); +} + /** * pci_update_current_state - Read power state of given device and cache it * @dev: PCI device to handle. @@ -1645,6 +1653,9 @@ static void pci_restore_ltr_state(struct pci_dev *dev) int pci_save_state(struct pci_dev *dev) { int i; + + platform_set_register_access(dev, false); + /* XXX: 100% dword access ok here? */ for (i = 0; i < 16; i++) { pci_read_config_dword(dev, i * 4, &dev->saved_config_space[i]); @@ -1790,6 +1801,8 @@ void pci_restore_state(struct pci_dev *dev) pci_enable_acs(dev); pci_restore_iov_state(dev); + platform_set_register_access(dev, true); + dev->state_saved = false; } EXPORT_SYMBOL(pci_restore_state); @@ -3203,6 +3216,7 @@ void pci_pm_init(struct pci_dev *dev) pci_read_config_word(dev, PCI_STATUS, &status); if (status & PCI_STATUS_IMM_READY) dev->imm_ready = 1; + platform_set_register_access(dev, true); } static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop) diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index ffccb03933e2..78961505aae2 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -703,6 +703,7 @@ void acpi_pci_refresh_power_state(struct pci_dev *dev); int acpi_pci_wakeup(struct pci_dev *dev, bool enable); bool acpi_pci_need_resume(struct pci_dev *dev); pci_power_t acpi_pci_choose_state(struct pci_dev *pdev); +void acpi_pci_set_register_access(struct pci_dev *dev, bool enable); #else static inline int pci_dev_acpi_reset(struct pci_dev *dev, bool probe) { @@ -742,6 +743,7 @@ static inline pci_power_t acpi_pci_choose_state(struct pci_dev *pdev) { return PCI_POWER_ERROR; } +static inline void acpi_pci_set_register_access(struct pci_dev *dev, bool enable) {} #endif #ifdef CONFIG_PCIEASPM
ASMedia PCIe GPIO controllers fail functional tests after returning from suspend (S3 or s2idle). This is because the BIOS checks whether the OSPM has called the `_REG` method to determine whether it can interact with the OperationRegion assigned to the device. As described in 6.5.4 in the APCI spec, `_REG` is used to inform the AML code on the availability of an operation region. To fix this issue, call acpi_evaluate_reg() when saving and restoring the state of PCI devices. Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html#reg-region Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> --- v1->v2: * Handle case of no CONFIG_ACPI * Rename function * Update commit message * Move ACPI calling code into pci-acpi.c instead * Cite the ACPI spec --- drivers/pci/pci-acpi.c | 10 ++++++++++ drivers/pci/pci.c | 14 ++++++++++++++ drivers/pci/pci.h | 2 ++ 3 files changed, 26 insertions(+)