Message ID | 200903210003.55161.rjw@sisk.pl (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
On Saturday 21 March 2009, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki <rjw@sisk.pl> > > The story in http://bugzilla.kernel.org/show_bug.cgi?id=12846 shows > that setting the power state of a PCI device by > pci_raw_set_power_state() may sometimes fail. For this reason, > pci_raw_set_power_state() should not assume that the power state of > the device has actually changed after writing into its PMCSR. > Instead, it should read the value from there and use it to update > dev->current_state. It also is useful to print a warning if the > device's power state hasn't changed as expected. > > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> > --- OK, since the Ben's radeonfb fix for bug #12846 has been merged, I'd like to do something a bit different. Patch 1/2 introduces __pci_set_power_state() that will allow the radeonfb driver not to open code PCI PM operations. Patch 2/2 makes the driver use __pci_set_power_state(). Comments welcome. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, 2009-03-22 at 22:08 +0100, Rafael J. Wysocki wrote: > On Saturday 21 March 2009, Rafael J. Wysocki wrote: > > From: Rafael J. Wysocki <rjw@sisk.pl> > > > > The story in http://bugzilla.kernel.org/show_bug.cgi?id=12846 shows > > that setting the power state of a PCI device by > > pci_raw_set_power_state() may sometimes fail. For this reason, > > pci_raw_set_power_state() should not assume that the power state of > > the device has actually changed after writing into its PMCSR. > > Instead, it should read the value from there and use it to update > > dev->current_state. It also is useful to print a warning if the > > device's power state hasn't changed as expected. > > > > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> > > --- > OK, since the Ben's radeonfb fix for bug #12846 has been merged, I'd like > to do something a bit different. > > Patch 1/2 introduces __pci_set_power_state() that will allow the radeonfb > driver not to open code PCI PM operations. > > Patch 2/2 makes the driver use __pci_set_power_state(). > > Comments welcome. No objection. Ben. -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sunday 22 March 2009, Rafael J. Wysocki wrote: > On Saturday 21 March 2009, Rafael J. Wysocki wrote: > > From: Rafael J. Wysocki <rjw@sisk.pl> > > > > The story in http://bugzilla.kernel.org/show_bug.cgi?id=12846 shows > > that setting the power state of a PCI device by > > pci_raw_set_power_state() may sometimes fail. For this reason, > > pci_raw_set_power_state() should not assume that the power state of > > the device has actually changed after writing into its PMCSR. > > Instead, it should read the value from there and use it to update > > dev->current_state. It also is useful to print a warning if the > > device's power state hasn't changed as expected. > > > > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> > > --- > OK, since the Ben's radeonfb fix for bug #12846 has been merged, I'd like > to do something a bit different. > > Patch 1/2 introduces __pci_set_power_state() that will allow the radeonfb > driver not to open code PCI PM operations. > > Patch 2/2 makes the driver use __pci_set_power_state(). > > Comments welcome. Well, Jesse doesn't like these patches very much, so here's an alternative. 1/2 changes platform_pci_set_power_state() into an exported function (and uses it to simplify pci_set_power_state() a bit) so that the radeonfb driver can use it. 2/2 modifies the radeonfb driver itself. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 23 Mar 2009 22:30:09 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote: > On Sunday 22 March 2009, Rafael J. Wysocki wrote: > > On Saturday 21 March 2009, Rafael J. Wysocki wrote: > > > From: Rafael J. Wysocki <rjw@sisk.pl> > > > > > > The story in http://bugzilla.kernel.org/show_bug.cgi?id=12846 > > > shows that setting the power state of a PCI device by > > > pci_raw_set_power_state() may sometimes fail. For this reason, > > > pci_raw_set_power_state() should not assume that the power state > > > of the device has actually changed after writing into its PMCSR. > > > Instead, it should read the value from there and use it to update > > > dev->current_state. It also is useful to print a warning if the > > > device's power state hasn't changed as expected. > > > > > > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> > > > --- > > OK, since the Ben's radeonfb fix for bug #12846 has been merged, > > I'd like to do something a bit different. > > > > Patch 1/2 introduces __pci_set_power_state() that will allow the > > radeonfb driver not to open code PCI PM operations. > > > > Patch 2/2 makes the driver use __pci_set_power_state(). > > > > Comments welcome. > > Well, Jesse doesn't like these patches very much, so here's an > alternative. > > 1/2 changes platform_pci_set_power_state() into an exported function > (and uses it to simplify pci_set_power_state() a bit) so that the > radeonfb driver can use it. > > 2/2 modifies the radeonfb driver itself. The thing I didn't like was that it made the radeon driver use an internal interface; I'd really prefer a proper return value from pci_set_power_state, which in turn means auditing all its current callers. But that doesn't seem worth it unless we see other drivers needing something similar... And if we did go with something like your first patch, I'd still rather see the timeout done in the driver, rather than having the attempts & delay included in the function...
On Monday 23 March 2009, Benjamin Herrenschmidt wrote: > On Sun, 2009-03-22 at 22:08 +0100, Rafael J. Wysocki wrote: > > On Saturday 21 March 2009, Rafael J. Wysocki wrote: > > > From: Rafael J. Wysocki <rjw@sisk.pl> > > > > > > The story in http://bugzilla.kernel.org/show_bug.cgi?id=12846 shows > > > that setting the power state of a PCI device by > > > pci_raw_set_power_state() may sometimes fail. For this reason, > > > pci_raw_set_power_state() should not assume that the power state of > > > the device has actually changed after writing into its PMCSR. > > > Instead, it should read the value from there and use it to update > > > dev->current_state. It also is useful to print a warning if the > > > device's power state hasn't changed as expected. > > > > > > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> > > > --- > > OK, since the Ben's radeonfb fix for bug #12846 has been merged, I'd like > > to do something a bit different. > > > > Patch 1/2 introduces __pci_set_power_state() that will allow the radeonfb > > driver not to open code PCI PM operations. > > > > Patch 2/2 makes the driver use __pci_set_power_state(). > > > > Comments welcome. > > No objection. Great, thanks. I also discussed the patchset with Jesse on IRC and we agreed it was better than the alternative approaches. (So, please disregard the patches sent earlier today). Would you mind if I put 1/2 and 2/2 into the suspend tree? Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2009-03-23 at 15:23 -0700, Jesse Barnes wrote: > The thing I didn't like was that it made the radeon driver use an > internal interface; I'd really prefer a proper return value from > pci_set_power_state, which in turn means auditing all its current > callers. But that doesn't seem worth it unless we see other drivers > needing something similar... > > And if we did go with something like your first patch, I'd still rather > see the timeout done in the driver, rather than having the attempts & > delay included in the function... So what ? The driver would call pci_set_power_state() until it stops failing ? I'm not too fan of that, because it will change the access pattern to the chip: - write PM to 2 - short delay - read PM, see 0, return error - driver does big delay - write PM to 2 - short delay - read PM .... vs. the current sequence which is - write PM to 2 - long delay - read PM, be happy Which -seems- to be pretty much what happens in practice, though on that chip, I don't know for sure about others. I'm worried of the possible side effects of the first sequence that you propose since it would do 2 things potentially confusing to the HW: - read PM after a short delay... it -should- be harmless but you know HW as well as I do ... - write PM to 2 a second time after the long delay. Again, it -should- be harmless since the chip at this stage should already be in D2 state but god knows how the HW will react. I'm especially worried about the later in fact. Maybe we can minimize it by having pci_set_power_state() dbl check the content of the PM reg before writing to it... Ben. -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 24 Mar 2009 11:57:19 +1100 Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote: > On Mon, 2009-03-23 at 15:23 -0700, Jesse Barnes wrote: > > The thing I didn't like was that it made the radeon driver use an > > internal interface; I'd really prefer a proper return value from > > pci_set_power_state, which in turn means auditing all its current > > callers. But that doesn't seem worth it unless we see other drivers > > needing something similar... > > > > And if we did go with something like your first patch, I'd still > > rather see the timeout done in the driver, rather than having the > > attempts & delay included in the function... > > So what ? The driver would call pci_set_power_state() until it stops > failing ? Yeah, that's what I had in mind. > I'm not too fan of that, because it will change the access pattern > to the chip: > > - write PM to 2 > - short delay > - read PM, see 0, return error > - driver does big delay > - write PM to 2 > - short delay > - read PM .... > > vs. the current sequence which is > > - write PM to 2 > - long delay > - read PM, be happy > > Which -seems- to be pretty much what happens in practice, though on > that chip, I don't know for sure about others. > > I'm worried of the possible side effects of the first sequence that > you propose since it would do 2 things potentially confusing to the > HW: > > - read PM after a short delay... it -should- be harmless but you know > HW as well as I do ... > > - write PM to 2 a second time after the long delay. Again, it > -should- be harmless since the chip at this stage should already be > in D2 state but god knows how the HW will react. > > I'm especially worried about the later in fact. Maybe we can minimize > it by having pci_set_power_state() dbl check the content of the PM > reg before writing to it... Honestly I'm not too happy about any of the approaches, but yeah I see your point. The main thing is to prevent any config space access for a specified time after the first D-state transition, which I think we do correctly in the core. Beyond that, we just have to make sure the core state is updated correctly; Rafael's first patch does that correctly I think. Actually now that I think of it, maybe Alex can get us details on the errata here. If we just need a longer wait between a D-state transition and the next config space access for this chip (or set of chips), we could add that as a proper quirk to the core... Alex, any thoughts about the bug & patch in http://bugzilla.kernel.org/show_bug.cgi?id=12846 ? Looks like old radeon chips need a workaround when transitioning from D0 -> D2... Thanks,
On 3/23/09, Jesse Barnes <jbarnes@virtuousgeek.org> wrote: > On Tue, 24 Mar 2009 11:57:19 +1100 > Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote: > > > On Mon, 2009-03-23 at 15:23 -0700, Jesse Barnes wrote: > > > The thing I didn't like was that it made the radeon driver use an > > > internal interface; I'd really prefer a proper return value from > > > pci_set_power_state, which in turn means auditing all its current > > > callers. But that doesn't seem worth it unless we see other drivers > > > needing something similar... > > > > > > And if we did go with something like your first patch, I'd still > > > rather see the timeout done in the driver, rather than having the > > > attempts & delay included in the function... > > > > So what ? The driver would call pci_set_power_state() until it stops > > failing ? > > Yeah, that's what I had in mind. > > > I'm not too fan of that, because it will change the access pattern > > to the chip: > > > > - write PM to 2 > > - short delay > > - read PM, see 0, return error > > - driver does big delay > > - write PM to 2 > > - short delay > > - read PM .... > > > > vs. the current sequence which is > > > > - write PM to 2 > > - long delay > > - read PM, be happy > > > > Which -seems- to be pretty much what happens in practice, though on > > that chip, I don't know for sure about others. > > > > I'm worried of the possible side effects of the first sequence that > > you propose since it would do 2 things potentially confusing to the > > HW: > > > > - read PM after a short delay... it -should- be harmless but you know > > HW as well as I do ... > > > > - write PM to 2 a second time after the long delay. Again, it > > -should- be harmless since the chip at this stage should already be > > in D2 state but god knows how the HW will react. > > > > I'm especially worried about the later in fact. Maybe we can minimize > > it by having pci_set_power_state() dbl check the content of the PM > > reg before writing to it... > > Honestly I'm not too happy about any of the approaches, but yeah I see > your point. The main thing is to prevent any config space access for > a specified time after the first D-state transition, which I think we > do correctly in the core. Beyond that, we just have to make sure the > core state is updated correctly; Rafael's first patch does that > correctly I think. > > Actually now that I think of it, maybe Alex can get us details on the > errata here. If we just need a longer wait between a D-state transition > and the next config space access for this chip (or set of chips), we > could add that as a proper quirk to the core... > > Alex, any thoughts about the bug & patch in > http://bugzilla.kernel.org/show_bug.cgi?id=12846 ? Looks like old > radeon chips need a workaround when transitioning from D0 -> D2... I don't see any errata for this, but it's hard to find stuff as chips get older. Alex -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Index: linux-2.6/drivers/pci/pci.c =================================================================== --- linux-2.6.orig/drivers/pci/pci.c +++ linux-2.6/drivers/pci/pci.c @@ -436,7 +436,7 @@ static inline int platform_pci_sleep_wak */ static int pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state) { - u16 pmcsr; + u16 pmcsr, new_pmcsr; bool need_restore = false; /* Check if we're already there */ @@ -498,7 +498,15 @@ static int pci_raw_set_power_state(struc else if (state == PCI_D2 || dev->current_state == PCI_D2) udelay(PCI_PM_D2_DELAY); - dev->current_state = state; + pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &new_pmcsr); + if (new_pmcsr == pmcsr) { + dev->current_state = state; + } else { + dev->current_state = (new_pmcsr & PCI_PM_CTRL_STATE_MASK); + dev_warn(&dev->dev, + "failed to set power state to D%d, is D%d\n", state, + dev->current_state); + } /* According to section 5.4.1 of the "PCI BUS POWER MANAGEMENT * INTERFACE SPECIFICATION, REV. 1.2", a device transitioning