Message ID:   1464188696-25782-1-git-send-email-mika.westerberg@linux.intel.com (mailing list archive)
State:        New, archived
Delegated to: Bjorn Helgaas
On Wed, May 25, 2016 at 5:04 PM, Mika Westerberg <mika.westerberg@linux.intel.com> wrote:
> The PCI PM 1.2 specification requires minimum of 50ms before function on a
> bus is accessed after the bus is transitioned from B2 to B0. Now that we
> actually power down bridges we should make sure the specification is
> followed accordingly.
>
> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>

ACK

> ---
> Hi Bjorn,
>
> Since this is only needed when bridges are powered down, I think it makes
> sense to put this to your pci/pm branch.
>
>  drivers/pci/pci.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index e785dc260e72..e645a3d4f2e0 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2361,7 +2361,12 @@ void pci_pm_init(struct pci_dev *dev)
>  	}
>
>  	dev->pm_cap = pm;
> -	dev->d3_delay = PCI_PM_D3_WAIT;
> +	/*
> +	 * PCI PM 1.2 specification requires minimum of 50ms before any
> +	 * function on the bus is accessed after the bus is transitioned
> +	 * from B2 to B0.
> +	 */
> +	dev->d3_delay = pci_is_bridge(dev) ? PCI_PM_BUS_WAIT : PCI_PM_D3_WAIT;
>  	dev->d3cold_delay = PCI_PM_D3COLD_WAIT;
>  	dev->bridge_d3 = pci_bridge_d3_possible(dev);
>  	dev->d3cold_allowed = true;
> --
> 2.8.1

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi,

On Wed, May 25, 2016 at 06:04:56PM +0300, Mika Westerberg wrote:
> The PCI PM 1.2 specification requires minimum of 50ms before function on a
> bus is accessed after the bus is transitioned from B2 to B0. Now that we
> actually power down bridges we should make sure the specification is
> followed accordingly.

This patch has the unfortunate side effect of increasing boot time on
Macs with Thunderbolt by 320 ms. Granted, it's not much, but still
noticeable:

[    2.358041] pcieport 0000:06:03.0: enabling device (0000 -> 0002)
[    2.358195] pcieport 0000:06:03.0: rpm_idle
[    2.358222] pcieport 0000:06:03.0: rpm_suspend
[    2.411664] pcieport 0000:06:04.0: enabling device (0000 -> 0003)
[    2.411821] pcieport 0000:06:04.0: rpm_idle
[    2.411848] pcieport 0000:06:04.0: rpm_suspend
[    2.467664] pcieport 0000:06:05.0: enabling device (0000 -> 0003)
[    2.467817] pcieport 0000:06:05.0: rpm_idle
[    2.467843] pcieport 0000:06:05.0: rpm_suspend
[    2.523664] pcieport 0000:06:06.0: enabling device (0000 -> 0002)
[    2.523822] pcieport 0000:06:06.0: rpm_idle
[    2.523848] pcieport 0000:06:06.0: rpm_suspend
[    2.579685] pci_hotplug: PCI Hot Plug PCI Core version: 0.5
[    2.579722] pcieport 0000:06:03.0: rpm_resume
[    2.635750] pciehp 0000:06:03.0:pcie204: Slot #3 AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+ Interlock- NoCompl- LLActRep+
[    2.635853] pciehp 0000:06:03.0:pcie204: service driver pciehp loaded
[    2.635858] pcieport 0000:06:03.0: rpm_idle
[    2.635886] pcieport 0000:06:04.0: rpm_resume
[    2.647645] pcieport 0000:06:03.0: rpm_suspend
[    2.691747] pciehp 0000:06:04.0:pcie204: Slot #4 AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+ Interlock- NoCompl- LLActRep+
[    2.691856] pciehp 0000:06:04.0:pcie204: service driver pciehp loaded
[    2.691859] pcieport 0000:06:04.0: rpm_idle
[    2.691888] pcieport 0000:06:05.0: rpm_resume
[    2.703649] pcieport 0000:06:04.0: rpm_suspend
[    2.747748] pciehp 0000:06:05.0:pcie204: Slot #5 AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+ Interlock- NoCompl- LLActRep+
[    2.747842] pciehp 0000:06:05.0:pcie204: service driver pciehp loaded
[    2.747845] pcieport 0000:06:06.0: rpm_resume
[    2.749213] pcieport 0000:06:05.0: rpm_idle
[    2.759650] pcieport 0000:06:05.0: rpm_suspend
[    2.805049] pciehp 0000:06:06.0:pcie204: Slot #6 AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+ Interlock- NoCompl- LLActRep+
[    2.806464] pciehp 0000:06:06.0:pcie204: service driver pciehp loaded
[    2.806471] pciehp: PCI Express Hot Plug Controller Driver version: 0.4
[    2.807876] intel_idle: MWAIT substates: 0x21120
[    2.807878] intel_idle: v0.4.1 model 0x3A
[    2.808170] intel_idle: lapic_timer_reliable_states 0xffffffff
[    2.808201] GHES: HEST is not enabled!
[    2.809613] pcieport 0000:06:06.0: rpm_idle
[    2.809644] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[    2.810096] Linux agpgart interface v0.103
[    2.810158] AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
[    2.810158] AMD IOMMUv2 functionality not available on this system
[    2.816468] pcieport 0000:06:06.0: rpm_suspend

I've added debug messages whenever ->runtime_idle / _suspend / _resume
is called for a device.

As can be seen, the PCI core now waits 50 ms after ->runtime_suspend.
Also, the ports are resumed when the pciehp service driver is loaded,
adding another 50 ms delay. For four hotplug ports, that's a total of
400 ms (versus 80 ms without the patch).

I'm wondering if the delay after ->runtime_suspend is really needed. Is
this mandated by the spec? We could perhaps increase the autosuspend_delay
in pcie_portdrv_probe() slightly to prevent the port from going to sleep
between pci_enable_device() and loading the pciehp service driver.

Thanks,

Lukas

>
> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> ---
> Hi Bjorn,
>
> Since this is only needed when bridges are powered down, I think it makes
> sense to put this to your pci/pm branch.
>
>  drivers/pci/pci.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index e785dc260e72..e645a3d4f2e0 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2361,7 +2361,12 @@ void pci_pm_init(struct pci_dev *dev)
>  	}
>
>  	dev->pm_cap = pm;
> -	dev->d3_delay = PCI_PM_D3_WAIT;
> +	/*
> +	 * PCI PM 1.2 specification requires minimum of 50ms before any
> +	 * function on the bus is accessed after the bus is transitioned
> +	 * from B2 to B0.
> +	 */
> +	dev->d3_delay = pci_is_bridge(dev) ? PCI_PM_BUS_WAIT : PCI_PM_D3_WAIT;
>  	dev->d3cold_delay = PCI_PM_D3COLD_WAIT;
>  	dev->bridge_d3 = pci_bridge_d3_possible(dev);
>  	dev->d3cold_allowed = true;
> --
> 2.8.1
On Thu, May 26, 2016 at 12:10:06PM +0200, Lukas Wunner wrote:
> Hi,
>
> On Wed, May 25, 2016 at 06:04:56PM +0300, Mika Westerberg wrote:
> > The PCI PM 1.2 specification requires minimum of 50ms before function on a
> > bus is accessed after the bus is transitioned from B2 to B0. Now that we
> > actually power down bridges we should make sure the specification is
> > followed accordingly.
>
> This patch has the unfortunate side effect of increasing boot time on
> Macs with Thunderbolt by 320 ms. Granted, it's not much, but still
> noticeable:

Yes, that's the drawback.

> [    2.358041] pcieport 0000:06:03.0: enabling device (0000 -> 0002)
> [    2.358195] pcieport 0000:06:03.0: rpm_idle
> [    2.358222] pcieport 0000:06:03.0: rpm_suspend
> [    2.411664] pcieport 0000:06:04.0: enabling device (0000 -> 0003)
> [    2.411821] pcieport 0000:06:04.0: rpm_idle
> [    2.411848] pcieport 0000:06:04.0: rpm_suspend
> [    2.467664] pcieport 0000:06:05.0: enabling device (0000 -> 0003)
> [    2.467817] pcieport 0000:06:05.0: rpm_idle
> [    2.467843] pcieport 0000:06:05.0: rpm_suspend
> [    2.523664] pcieport 0000:06:06.0: enabling device (0000 -> 0002)
> [    2.523822] pcieport 0000:06:06.0: rpm_idle
> [    2.523848] pcieport 0000:06:06.0: rpm_suspend
> [    2.579685] pci_hotplug: PCI Hot Plug PCI Core version: 0.5
> [    2.579722] pcieport 0000:06:03.0: rpm_resume
> [    2.635750] pciehp 0000:06:03.0:pcie204: Slot #3 AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+ Interlock- NoCompl- LLActRep+
> [    2.635853] pciehp 0000:06:03.0:pcie204: service driver pciehp loaded
> [    2.635858] pcieport 0000:06:03.0: rpm_idle
> [    2.635886] pcieport 0000:06:04.0: rpm_resume
> [    2.647645] pcieport 0000:06:03.0: rpm_suspend
> [    2.691747] pciehp 0000:06:04.0:pcie204: Slot #4 AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+ Interlock- NoCompl- LLActRep+
> [    2.691856] pciehp 0000:06:04.0:pcie204: service driver pciehp loaded
> [    2.691859] pcieport 0000:06:04.0: rpm_idle
> [    2.691888] pcieport 0000:06:05.0: rpm_resume
> [    2.703649] pcieport 0000:06:04.0: rpm_suspend
> [    2.747748] pciehp 0000:06:05.0:pcie204: Slot #5 AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+ Interlock- NoCompl- LLActRep+
> [    2.747842] pciehp 0000:06:05.0:pcie204: service driver pciehp loaded
> [    2.747845] pcieport 0000:06:06.0: rpm_resume
> [    2.749213] pcieport 0000:06:05.0: rpm_idle
> [    2.759650] pcieport 0000:06:05.0: rpm_suspend
> [    2.805049] pciehp 0000:06:06.0:pcie204: Slot #6 AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+ Interlock- NoCompl- LLActRep+
> [    2.806464] pciehp 0000:06:06.0:pcie204: service driver pciehp loaded
> [    2.806471] pciehp: PCI Express Hot Plug Controller Driver version: 0.4
> [    2.807876] intel_idle: MWAIT substates: 0x21120
> [    2.807878] intel_idle: v0.4.1 model 0x3A
> [    2.808170] intel_idle: lapic_timer_reliable_states 0xffffffff
> [    2.808201] GHES: HEST is not enabled!
> [    2.809613] pcieport 0000:06:06.0: rpm_idle
> [    2.809644] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> [    2.810096] Linux agpgart interface v0.103
> [    2.810158] AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
> [    2.810158] AMD IOMMUv2 functionality not available on this system
> [    2.816468] pcieport 0000:06:06.0: rpm_suspend
>
> I've added debug messages whenever ->runtime_idle / _suspend / _resume
> is called for a device.
>
> As can be seen, the PCI core now waits 50 ms after ->runtime_suspend.
> Also, the ports are resumed when the pciehp service driver is loaded,
> adding another 50 ms delay. For four hotplug ports, that's a total of
> 400 ms (versus 80 ms without the patch).
>
> I'm wondering if the delay after ->runtime_suspend is really needed. Is
> this mandated by the spec? We could perhaps increase the autosuspend_delay
> in pcie_portdrv_probe() slightly to prevent the port from going to sleep
> between pci_enable_device() and loading the pciehp service driver.

The PCI PM specification version 1.2 says this in chapter 4.2 (page 37):

	There is a minimum time requirement of 50 ms which must be provided by
	system software between when the bus is switched from B2 to B0 and
	when a function on the bus is accessed to allow time for the clock to
	start up and the bus to settle.

Not sure how much of that still applies to modern hardware.

I don't see why we cannot increase autosuspend delay time, though.
On Thu, May 26, 2016 at 01:25:20PM +0300, Mika Westerberg wrote:
> On Thu, May 26, 2016 at 12:10:06PM +0200, Lukas Wunner wrote:
> > On Wed, May 25, 2016 at 06:04:56PM +0300, Mika Westerberg wrote:
> > > The PCI PM 1.2 specification requires minimum of 50ms before function on a
> > > bus is accessed after the bus is transitioned from B2 to B0. Now that we
> > > actually power down bridges we should make sure the specification is
> > > followed accordingly.
> >
> > This patch has the unfortunate side effect of increasing boot time on
> > Macs with Thunderbolt by 320 ms. Granted, it's not much, but still
> > noticeable:
>
> Yes, that's the drawback.
>
> > [    2.358041] pcieport 0000:06:03.0: enabling device (0000 -> 0002)
> > [    2.358195] pcieport 0000:06:03.0: rpm_idle
> > [    2.358222] pcieport 0000:06:03.0: rpm_suspend
> > [    2.411664] pcieport 0000:06:04.0: enabling device (0000 -> 0003)
> > [    2.411821] pcieport 0000:06:04.0: rpm_idle
> > [    2.411848] pcieport 0000:06:04.0: rpm_suspend
> > [    2.467664] pcieport 0000:06:05.0: enabling device (0000 -> 0003)
> > [    2.467817] pcieport 0000:06:05.0: rpm_idle
> > [    2.467843] pcieport 0000:06:05.0: rpm_suspend
> > [    2.523664] pcieport 0000:06:06.0: enabling device (0000 -> 0002)
> > [    2.523822] pcieport 0000:06:06.0: rpm_idle
> > [    2.523848] pcieport 0000:06:06.0: rpm_suspend
> > [    2.579685] pci_hotplug: PCI Hot Plug PCI Core version: 0.5
> > [    2.579722] pcieport 0000:06:03.0: rpm_resume
> > [    2.635750] pciehp 0000:06:03.0:pcie204: Slot #3 AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+ Interlock- NoCompl- LLActRep+
> > [    2.635853] pciehp 0000:06:03.0:pcie204: service driver pciehp loaded
> > [    2.635858] pcieport 0000:06:03.0: rpm_idle
> > [    2.635886] pcieport 0000:06:04.0: rpm_resume
> > [    2.647645] pcieport 0000:06:03.0: rpm_suspend
> > [    2.691747] pciehp 0000:06:04.0:pcie204: Slot #4 AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+ Interlock- NoCompl- LLActRep+
> > [    2.691856] pciehp 0000:06:04.0:pcie204: service driver pciehp loaded
> > [    2.691859] pcieport 0000:06:04.0: rpm_idle
> > [    2.691888] pcieport 0000:06:05.0: rpm_resume
> > [    2.703649] pcieport 0000:06:04.0: rpm_suspend
> > [    2.747748] pciehp 0000:06:05.0:pcie204: Slot #5 AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+ Interlock- NoCompl- LLActRep+
> > [    2.747842] pciehp 0000:06:05.0:pcie204: service driver pciehp loaded
> > [    2.747845] pcieport 0000:06:06.0: rpm_resume
> > [    2.749213] pcieport 0000:06:05.0: rpm_idle
> > [    2.759650] pcieport 0000:06:05.0: rpm_suspend
> > [    2.805049] pciehp 0000:06:06.0:pcie204: Slot #6 AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+ Interlock- NoCompl- LLActRep+
> > [    2.806464] pciehp 0000:06:06.0:pcie204: service driver pciehp loaded
> > [    2.806471] pciehp: PCI Express Hot Plug Controller Driver version: 0.4
> > [    2.807876] intel_idle: MWAIT substates: 0x21120
> > [    2.807878] intel_idle: v0.4.1 model 0x3A
> > [    2.808170] intel_idle: lapic_timer_reliable_states 0xffffffff
> > [    2.808201] GHES: HEST is not enabled!
> > [    2.809613] pcieport 0000:06:06.0: rpm_idle
> > [    2.809644] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> > [    2.810096] Linux agpgart interface v0.103
> > [    2.810158] AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
> > [    2.810158] AMD IOMMUv2 functionality not available on this system
> > [    2.816468] pcieport 0000:06:06.0: rpm_suspend
> >
> > I've added debug messages whenever ->runtime_idle / _suspend / _resume
> > is called for a device.
> >
> > As can be seen, the PCI core now waits 50 ms after ->runtime_suspend.
> > Also, the ports are resumed when the pciehp service driver is loaded,
> > adding another 50 ms delay. For four hotplug ports, that's a total of
> > 400 ms (versus 80 ms without the patch).
> >
> > I'm wondering if the delay after ->runtime_suspend is really needed. Is
> > this mandated by the spec? We could perhaps increase the autosuspend_delay
> > in pcie_portdrv_probe() slightly to prevent the port from going to sleep
> > between pci_enable_device() and loading the pciehp service driver.
>
> The PCI PM specification version 1.2 says this in chapter 4.2 (page 37):
>
> 	There is a minimum time requirement of 50 ms which must be provided by
> 	system software between when the bus is switched from B2 to B0 and
> 	when a function on the bus is accessed to allow time for the clock to
> 	start up and the bus to settle.

But why do we wait 50 ms when *suspending*, i.e. going from B0 to B2?

(Assuming B2 is the state when the bridge goes to D3hot, which I'm not
sure of. The spec says that the bus state may optionally be B3 if the
bridge is in D3hot.)

> Not sure how much of that still applies to modern hardware.

Could you ask hardware engineers at Intel what the bus state is on
modern chipsets (say, ILK or newer) and Thunderbolt ports to clarify
this?

Thanks,

Lukas

>
> I don't see why we cannot increase autosuspend delay time, though.
On Thu, May 26, 2016 at 12:45:57PM +0200, Lukas Wunner wrote:
> > The PCI PM specification version 1.2 says this in chapter 4.2 (page 37):
> >
> > 	There is a minimum time requirement of 50 ms which must be provided by
> > 	system software between when the bus is switched from B2 to B0 and
> > 	when a function on the bus is accessed to allow time for the clock to
> > 	start up and the bus to settle.
>
> But why do we wait 50 ms when *suspending*, i.e. going from B0 to B2?

I guess because PCI requires delays of 10ms for both directions D0 <->
D3hot (see pci_raw_set_power_state()).

> (Assuming B2 is the state when the bridge goes to D3hot, which I'm not
> sure of. The spec says that the bus state may optionally be B3 if the
> bridge is in D3hot.)

B3 is the state where the bus goes when its power is removed so I would
expect that to require the 50ms as well, even though the spec does not
explicitly say so.

> > Not sure how much of that still applies to modern hardware.
>
> Could you ask hardware engineers at Intel what the bus state is on
> modern chipsets (say, ILK or newer) and Thunderbolt ports to clarify
> this?

I can try but it is not always easy to find the right person in a
company as big as Intel.
On Thursday, May 26, 2016 02:03:08 PM Mika Westerberg wrote:
> On Thu, May 26, 2016 at 12:45:57PM +0200, Lukas Wunner wrote:
> > > The PCI PM specification version 1.2 says this in chapter 4.2 (page 37):
> > >
> > > 	There is a minimum time requirement of 50 ms which must be provided by
> > > 	system software between when the bus is switched from B2 to B0 and
> > > 	when a function on the bus is accessed to allow time for the clock to
> > > 	start up and the bus to settle.
> >
> > But why do we wait 50 ms when *suspending*, i.e. going from B0 to B2?
>
> I guess because PCI requires delays of 10ms for both directions D0 <->
> D3hot (see pci_raw_set_power_state()).
>
> > (Assuming B2 is the state when the bridge goes to D3hot, which I'm not
> > sure of. The spec says that the bus state may optionally be B3 if the
> > bridge is in D3hot.)
>
> B3 is the state where the bus goes when it's power is removed so I would
> expect that to require also the 50ms even though the spec does not
> explicitly say so.
>
> > > Not sure how much of that still applies to modern hardware.
> >
> > Could you ask hardware engineers at Intel what the bus state is on
> > modern chipsets (say, ILK or newer) and Thunderbolt ports to clarify
> > this?
>
> I can try but it is not always easy to find the right person in company
> as big as Intel.

Well, even if you find someone to tell you that, what about non-Intel?

Are we going to ever know a value that's going to work for everybody unless
that value is clearly stated in the spec?

Thanks,
Rafael
On Sat, May 28, 2016 at 02:29:06PM +0200, Rafael J. Wysocki wrote:
> On Thursday, May 26, 2016 02:03:08 PM Mika Westerberg wrote:
> > On Thu, May 26, 2016 at 12:45:57PM +0200, Lukas Wunner wrote:
> > > > The PCI PM specification version 1.2 says this in chapter 4.2 (page 37):
> > > >
> > > > 	There is a minimum time requirement of 50 ms which must be provided by
> > > > 	system software between when the bus is switched from B2 to B0 and
> > > > 	when a function on the bus is accessed to allow time for the clock to
> > > > 	start up and the bus to settle.
> > >
> > > But why do we wait 50 ms when *suspending*, i.e. going from B0 to B2?
> >
> > I guess because PCI requires delays of 10ms for both directions D0 <->
> > D3hot (see pci_raw_set_power_state()).
> >
> > > (Assuming B2 is the state when the bridge goes to D3hot, which I'm not
> > > sure of. The spec says that the bus state may optionally be B3 if the
> > > bridge is in D3hot.)
> >
> > B3 is the state where the bus goes when it's power is removed so I would
> > expect that to require also the 50ms even though the spec does not
> > explicitly say so.
> >
> > > > Not sure how much of that still applies to modern hardware.
> > >
> > > Could you ask hardware engineers at Intel what the bus state is on
> > > modern chipsets (say, ILK or newer) and Thunderbolt ports to clarify
> > > this?
> >
> > I can try but it is not always easy to find the right person in company
> > as big as Intel.
>
> Well, even if you find someone to tell you that, what about non-Intel?

Indeed, I was hoping to find someone who knows more about PCIe in
general and it turns out I found one :-)

> Are we going to ever know a value that's going to work for everybody unless
> that value is clearly stated in the spec?

That is a good question and the person I managed to find tells me that
the values are already in the spec.

So there are really no B-states in PCIe. The closest thing is the
L-states, which are defined in the PCIe spec.

The spec has two timing limitations:

 - After bringing the device to D0/L0 from D3hot/L1 there is a required
   10ms delay after the write to PMCSR before any other config space
   access (5.3.1.4 in PCIe spec 3.1a).
 - After bringing the device out of reset, which is the path to D0/L0
   from D3cold, there is a required 100ms delay before accessing the
   device's config space (6.6.1 in PCIe spec 3.1a).

I also learned that both of these can be shortened with the following
mechanisms:

 - PCIe readiness notifications (6.23 in PCIe spec 3.1a)
 - ACPI _DSM method returning readiness durations (4.6.9 in PCI
   firmware spec 3.2).

I think it is actually worth looking into these two mechanisms and
trying to get Linux to support them.
On Mon, May 30, 2016 at 12:33:26PM +0300, Mika Westerberg wrote:
> I also learned that both of these can be shortened with the following
> mechanisms:
>
>  - PCIe readiness notifications (6.23 in PCIe spec 3.1a)
>  - ACPI _DSM method returning readiness durations (4.6.9 in PCI
>    firmware spec 3.2).

BTW, looks like the latter is already implemented in
pci_acpi_optimize_delay().
On Mon, May 30, 2016 at 4:44 PM, Mika Westerberg
<mika.westerberg@linux.intel.com> wrote:
> On Mon, May 30, 2016 at 12:33:26PM +0300, Mika Westerberg wrote:
>> I also learned that both of these can be shortened with the following
>> mechanisms:
>>
>>  - PCIe readiness notifications (6.23 in PCIe spec 3.1a)
>>  - ACPI _DSM method returning readiness durations (4.6.9 in PCI
>>    firmware spec 3.2).
>
> BTW, looks like the latter is already implemented in
> pci_acpi_optimize_delay().

Hmm, Lukas's trace suggests a few independent problems (or optimisations):

1. Those devices are siblings, we should wake all of them in parallel
   and then wait once instead of one by one.
2. Why are we waiting after _suspend at all? It seems we should only
   wait before the next access. Sleeping after _suspend looks like a
   stop-gap measure to guarantee that no accesses take place?
3. The idle timeout is too low, there should be no suspend between
   discovery and probing.
4. Even if the spec says 50ms, we might still want to have shorter
   sleep times for known good devices. Thunderbolt can produce PCI
   hierarchies which are 7 levels deep with 4 hp bridges on each level.
   Waking all of that would take 50ms * 7 * 4 = 1400ms (not counting
   upstream bridges which have the normal d3 delay).

Does TB support PCIe readiness notifications?

Andreas
On Mon, May 30, 2016 at 05:19:53PM +0200, Andreas Noever wrote:
> On Mon, May 30, 2016 at 4:44 PM, Mika Westerberg
> <mika.westerberg@linux.intel.com> wrote:
> > On Mon, May 30, 2016 at 12:33:26PM +0300, Mika Westerberg wrote:
> >> I also learned that both of these can be shortened with the following
> >> mechanisms:
> >>
> >>  - PCIe readiness notifications (6.23 in PCIe spec 3.1a)
> >>  - ACPI _DSM method returning readiness durations (4.6.9 in PCI
> >>    firmware spec 3.2).
> >
> > BTW, looks like the latter is already implemented in
> > pci_acpi_optimize_delay().
>
> Hmm, Lukas's trace suggests a few independent problems (or optimisations):
> 1. Those devices are siblings, we should wake all of them in parallel
>    and then wait once instead of one by one.
> 2. Why are we waiting after _suspend at all? It seems we should only
>    wait before the next access. Sleeping after _suspend looks like a
>    stop-gap measure to guarantee that no accesses take place?

I guess it is because the PCI PM spec says that the delay needs to be in
both directions. See table 5-6 in PCI PM spec v1.2.

The code in question is in pci_raw_set_power_state():

	/* Mandatory power management transition delays */
	/* see PCI PM 1.1 5.6.1 table 18 */
	if (state == PCI_D3hot || dev->current_state == PCI_D3hot)
		pci_dev_d3_sleep(dev);
	else if (state == PCI_D2 || dev->current_state == PCI_D2)
		udelay(PCI_PM_D2_DELAY);

> 3. The idle timeout is too low, there should be no suspend between
>    discovery and probing.

I agree. We took the 10ms from the original patch but I don't see why we
could not increase it to 100ms or even 500ms.

> 4. Even if the spec says 50ms, we might still want to have shorter
>    sleep times for known good devices. Thunderbolt can produce PCI
>    hierarchies which are 7 levels deep with 4 hp bridges on each level.
>    Waking all of that would take 50ms * 7 * 4 = 1400ms (not counting
>    upstream bridges which have the normal d3 delay).

The PCIe spec wants a 100ms delay when transitioning from D3cold to D0
and we already do that in __pci_start_power_transition().

In other words we should have all necessary delays for PCIe in place
already. This patch should not be needed.

For decreasing delays we already support the ACPI _DSM method. What is
missing is PCIe readiness notification support, which can be added
separately.

> Does TB support PCIe readiness notifications?

It seems to be a pretty recent feature so it might not be supported by
the existing TBT hardware.
On Tue, May 31, 2016 at 11:33:49AM +0300, Mika Westerberg wrote:
> On Mon, May 30, 2016 at 05:19:53PM +0200, Andreas Noever wrote:
> > On Mon, May 30, 2016 at 4:44 PM, Mika Westerberg
> > <mika.westerberg@linux.intel.com> wrote:
> > > On Mon, May 30, 2016 at 12:33:26PM +0300, Mika Westerberg wrote:
> > >> I also learned that both of these can be shortened with the following
> > >> mechanisms:
> > >>
> > >>  - PCIe readiness notifications (6.23 in PCIe spec 3.1a)
> > >>  - ACPI _DSM method returning readiness durations (4.6.9 in PCI
> > >>    firmware spec 3.2).
> > >
> > > BTW, looks like the latter is already implemented in
> > > pci_acpi_optimize_delay().
> >
> > Hmm, Lukas's trace suggests a few independent problems (or optimisations):
> > 1. Those devices are siblings, we should wake all of them in parallel
> >    and then wait once instead of one by one.
> > 2. Why are we waiting after _suspend at all? It seems we should only
> >    wait before the next access. Sleeping after _suspend looks like a
> >    stop-gap measure to guarantee that no accesses take place?
>
> I guess it is because the PCI PM spec says that the delay needs to be in
> both directions. See table 5-6 in PCI PM spec v1.2.
>
> The code in question is in pci_raw_set_power_state():
>
> 	/* Mandatory power management transition delays */
> 	/* see PCI PM 1.1 5.6.1 table 18 */
> 	if (state == PCI_D3hot || dev->current_state == PCI_D3hot)
> 		pci_dev_d3_sleep(dev);
> 	else if (state == PCI_D2 || dev->current_state == PCI_D2)
> 		udelay(PCI_PM_D2_DELAY);
>
> > 3. The idle timeout is too low, there should be no suspend between
> >    discovery and probing.
>
> I agree. We took the 10ms from the original patch but I don't see why we
> could not increase it to 100ms or even 500ms.
>
> > 4. Even if the spec says 50ms, we might still want to have shorter
> >    sleep times for known good devices. Thunderbolt can produce PCI
> >    hierarchies which are 7 levels deep with 4 hp bridges on each level.
> >    Waking all of that would take 50ms * 7 * 4 = 1400ms (not counting
> >    upstream bridges which have the normal d3 delay).
>
> The PCIe spec wants a 100ms delay when transitioning from D3cold to D0
> and we already do that in __pci_start_power_transition().
>
> In other words we should have all necessary delays for PCIe in place
> already. This patch should not be needed.

To summarize the next steps, I will send a new version of the PCI PM
patches with the following changes:

 - Drop this 50ms patch, we should have the PCIe 100ms delay already
   covered.
 - Increase the runtime PM autosuspend time from 10ms to 500ms (or
   whatever is the preferred default).
 - Add a new version of the ACPI hotplug patch where pm_runtime_get/put()
   is moved into acpiphp_check_bridge().

Let me know if I'm forgetting something.

Bjorn, is this ok for you? It would be nice to get the updated series to
linux-next for wider testing :)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e785dc260e72..e645a3d4f2e0 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2361,7 +2361,12 @@ void pci_pm_init(struct pci_dev *dev)
 	}
 
 	dev->pm_cap = pm;
-	dev->d3_delay = PCI_PM_D3_WAIT;
+	/*
+	 * PCI PM 1.2 specification requires minimum of 50ms before any
+	 * function on the bus is accessed after the bus is transitioned
+	 * from B2 to B0.
+	 */
+	dev->d3_delay = pci_is_bridge(dev) ? PCI_PM_BUS_WAIT : PCI_PM_D3_WAIT;
 	dev->d3cold_delay = PCI_PM_D3COLD_WAIT;
 	dev->bridge_d3 = pci_bridge_d3_possible(dev);
 	dev->d3cold_allowed = true;
The PCI PM 1.2 specification requires a minimum of 50ms before a function
on a bus is accessed after the bus is transitioned from B2 to B0. Now that
we actually power down bridges we should make sure the specification is
followed accordingly.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
---
Hi Bjorn,

Since this is only needed when bridges are powered down, I think it makes
sense to put this to your pci/pm branch.

 drivers/pci/pci.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)