Message ID | 20161111143723.5818-1-emil.l.velikov@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Nov 11, 2016 at 02:37:23PM +0000, Emil Velikov wrote: > From: Emil Velikov <emil.velikov@collabora.com> > > Currently the revision isn't available via sysfs/libudev thus if one > wants to know the value they need to read through the config file. > > This in itself wakes/powers up the device, causing unwanted delay > since it can be quite costly. > > There are at least two userspace components which could make use the new > file libpciaccess and libdrm. The former [used in various places] wakes > up _every_ PCI device, which can be observed via glxinfo [when using > Mesa 10.0+ drivers]. While the latter [in association with Mesa 13.0] > can lead to 2-3 second delays while starting firefox, thunderbird or > chromium. > > Expose the revision as a separate file, just like we do for the device, > vendor, their subsystem version and class. > > Cc: Bjorn Helgaas <bhelgaas@google.com> > Cc: linux-pci@vger.kernel.org > Cc: Greg KH <gregkh@linuxfoundation.org> > Link: https://bugs.freedesktop.org/show_bug.cgi?id=98502 > Tested-by: Mauro Santos <registo.mailling@gmail.com> > Reviewed-by: Alex Deucher <alexander.deucher@amd.com> > Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Given that waking a gpu can take somewhere between ages and forever, and that we read the pci revisions everytime we launch a new gl app I think this is the correct approach. Of course we could just patch libdrm and everyone to not look at the pci revision, but that just leads to every pci-based driver having a driver-private ioctl/getparam thing to expose it. Which also doesn't make much sense. Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Björn, if you're all ok with this we'd like to start landing at least libdrm patches before this patch hits a released kernel, just to shorten the pain window for users waiting for upgrades. Thanks, Daniel > --- > v2: Add r-b/t-b tags, slim down CC list, add note about userspace. > > v3: Add Documentation/ bits (Greg KH) > > Gents, please keep me in the CC list. > > Thanks > Emil > --- > Documentation/ABI/testing/sysfs-bus-pci | 7 +++++++ > Documentation/filesystems/sysfs-pci.txt | 2 ++ > drivers/pci/pci-sysfs.c | 2 ++ > 3 files changed, 11 insertions(+) > > diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci > index b3bc50f..5a1732b 100644 > --- a/Documentation/ABI/testing/sysfs-bus-pci > +++ b/Documentation/ABI/testing/sysfs-bus-pci > @@ -294,3 +294,10 @@ Description: > a firmware bug to the system vendor. Writing to this file > taints the kernel with TAINT_FIRMWARE_WORKAROUND, which > reduces the supportability of your system. > + > +What: /sys/bus/pci/devices/.../revision > +Date: November 2016 > +Contact: Emil Velikov <emil.l.velikov@gmail.com> > +Description: > + This file contains the revision field of the the PCI device. > + The value comes from device config space. The file is read only. > diff --git a/Documentation/filesystems/sysfs-pci.txt b/Documentation/filesystems/sysfs-pci.txt > index 74eaac2..6ea1ced 100644 > --- a/Documentation/filesystems/sysfs-pci.txt > +++ b/Documentation/filesystems/sysfs-pci.txt > @@ -17,6 +17,7 @@ that support it. For example, a given bus might look like this: > | |-- resource0 > | |-- resource1 > | |-- resource2 > + | |-- revision > | |-- rom > | |-- subsystem_device > | |-- subsystem_vendor > @@ -41,6 +42,7 @@ files, each with their own function. > resource PCI resource host addresses (ascii, ro) > resource0..N PCI resource N, if present (binary, mmap, rw[1]) > resource0_wc..N_wc PCI WC map resource N, if prefetchable (binary, mmap) > + revision PCI revision (ascii, ro) > rom PCI ROM resource, if present (binary, ro) > subsystem_device PCI subsystem device (ascii, ro) > subsystem_vendor PCI subsystem vendor (ascii, ro) > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > index bcd10c7..0666287 100644 > --- a/drivers/pci/pci-sysfs.c > +++ b/drivers/pci/pci-sysfs.c > @@ -50,6 +50,7 @@ pci_config_attr(vendor, "0x%04x\n"); > pci_config_attr(device, "0x%04x\n"); > pci_config_attr(subsystem_vendor, "0x%04x\n"); > pci_config_attr(subsystem_device, "0x%04x\n"); > +pci_config_attr(revision, "0x%02x\n"); > pci_config_attr(class, "0x%06x\n"); > pci_config_attr(irq, "%u\n"); > > @@ -568,6 +569,7 @@ static struct attribute *pci_dev_attrs[] = { > &dev_attr_device.attr, > &dev_attr_subsystem_vendor.attr, > &dev_attr_subsystem_device.attr, > + &dev_attr_revision.attr, > &dev_attr_class.attr, > &dev_attr_irq.attr, > &dev_attr_local_cpus.attr, > -- > 2.9.3 > > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel
[+cc Sinan, Lukas] Hi Daniel, On Mon, Nov 14, 2016 at 07:40:03PM +0100, Daniel Vetter wrote: > On Fri, Nov 11, 2016 at 02:37:23PM +0000, Emil Velikov wrote: > > From: Emil Velikov <emil.velikov@collabora.com> > > > > Currently the revision isn't available via sysfs/libudev thus if one > > wants to know the value they need to read through the config file. > > > > This in itself wakes/powers up the device, causing unwanted delay > > since it can be quite costly. > > > > There are at least two userspace components which could make use the new > > file libpciaccess and libdrm. The former [used in various places] wakes > > up _every_ PCI device, which can be observed via glxinfo [when using > > Mesa 10.0+ drivers]. While the latter [in association with Mesa 13.0] > > can lead to 2-3 second delays while starting firefox, thunderbird or > > chromium. > > > > Expose the revision as a separate file, just like we do for the device, > > vendor, their subsystem version and class. > > > > Cc: Bjorn Helgaas <bhelgaas@google.com> > > Cc: linux-pci@vger.kernel.org > > Cc: Greg KH <gregkh@linuxfoundation.org> > > Link: https://bugs.freedesktop.org/show_bug.cgi?id=98502 > > Tested-by: Mauro Santos <registo.mailling@gmail.com> > > Reviewed-by: Alex Deucher <alexander.deucher@amd.com> > > Signed-off-by: Emil Velikov <emil.velikov@collabora.com> > > Given that waking a gpu can take somewhere between ages and forever, and > that we read the pci revisions everytime we launch a new gl app I think > this is the correct approach. Of course we could just patch libdrm and > everyone to not look at the pci revision, but that just leads to every > pci-based driver having a driver-private ioctl/getparam thing to expose > it. Which also doesn't make much sense. This re-asserts what has already been said, but doesn't address any of my questions in the v2 discussion, so I'm still looking to continue that thread. I am curious about this long wakeup issue, though. Are we talking about a D3cold -> D0 transition? I assume so, since config space is generally accessible in all power states except D3cold. From the device's point of view this is basically like a power-on. I think the gist of PCIe r3.0, sec 6.6.1 is that we need to wait 100ms, e.g., PCI_PM_D3COLD_WAIT, before doing config accesses. We do support Configuration Request Retry Status Software Visibility (pci_enable_crs()), so a device *can* take longer than 100ms after power-up to respond to a config read, but I think that only applies to reads of the Vendor ID. I cc'd Sinan because we do have some issues with our CRS support, and maybe he can shed some light on this. I'm not surprised if a GPU takes longer than 100ms to do device- specific, driver-managed, non-PCI things like detect and wake up monitors. But I *am* surprised if generic PCI bus-level things like config reads take longer than that. I also cc'd Lukas because he knows a lot more about PCI PM than I do. Bjorn
On 11/16/2016 3:58 PM, Bjorn Helgaas wrote: > [+cc Sinan, Lukas] > > Hi Daniel, > > On Mon, Nov 14, 2016 at 07:40:03PM +0100, Daniel Vetter wrote: >> On Fri, Nov 11, 2016 at 02:37:23PM +0000, Emil Velikov wrote: >>> From: Emil Velikov <emil.velikov@collabora.com> >>> >>> Currently the revision isn't available via sysfs/libudev thus if one >>> wants to know the value they need to read through the config file. >>> >>> This in itself wakes/powers up the device, causing unwanted delay >>> since it can be quite costly. >>> >>> There are at least two userspace components which could make use the new >>> file libpciaccess and libdrm. The former [used in various places] wakes >>> up _every_ PCI device, which can be observed via glxinfo [when using >>> Mesa 10.0+ drivers]. While the latter [in association with Mesa 13.0] >>> can lead to 2-3 second delays while starting firefox, thunderbird or >>> chromium. >>> >>> Expose the revision as a separate file, just like we do for the device, >>> vendor, their subsystem version and class. >>> >>> Cc: Bjorn Helgaas <bhelgaas@google.com> >>> Cc: linux-pci@vger.kernel.org >>> Cc: Greg KH <gregkh@linuxfoundation.org> >>> Link: https://bugs.freedesktop.org/show_bug.cgi?id=98502 >>> Tested-by: Mauro Santos <registo.mailling@gmail.com> >>> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> >>> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> >> >> Given that waking a gpu can take somewhere between ages and forever, and >> that we read the pci revisions everytime we launch a new gl app I think >> this is the correct approach. Of course we could just patch libdrm and >> everyone to not look at the pci revision, but that just leads to every >> pci-based driver having a driver-private ioctl/getparam thing to expose >> it. Which also doesn't make much sense. > > This re-asserts what has already been said, but doesn't address any of > my questions in the v2 discussion, so I'm still looking to continue > that thread. > > I am curious about this long wakeup issue, though. Are we talking > about a D3cold -> D0 transition? I assume so, since config space is > generally accessible in all power states except D3cold. From the > device's point of view this is basically like a power-on. I think the > gist of PCIe r3.0, sec 6.6.1 is that we need to wait 100ms, e.g., > PCI_PM_D3COLD_WAIT, before doing config accesses. > > We do support Configuration Request Retry Status Software Visibility > (pci_enable_crs()), so a device *can* take longer than 100ms after > power-up to respond to a config read, but I think that only applies to > reads of the Vendor ID. I cc'd Sinan because we do have some issues > with our CRS support, and maybe he can shed some light on this. This applies to all config requests including vendor ID. It is just the vendor ID returns a special code (device id = 0x0001 vendor id =0xffff) so that CRS aware software can understand the difference between CRS and an actual failure. It is recommended to always read the vendor ID following a reset/power on to understand if the device is sending a CRS. As Bjorn mentioned, we do enable CRS visibility in the kernel but only for the root port in pci_enable_crs function. When CRS visibility is disabled, root port needs to retry the request until a good response is received. CRS polling happens behind the scenes in pci_bus_read_dev_vendor_id function. In order for reads to take longer than 100ms, the CRS must be disabled at a lower level in the bus. I wonder if you have a bridge between your device and the root port. Bridge Configuration Retry Enable needs to be programmed if we want kernel to know about retry responses behind a bridge. "Bridge Configuration Retry Enable – When Set, this bit enables PCI Express to PCI/PCI-X bridges to return Configuration Request Retry Status (CRS) in response to Configuration Requests that target devices below the bridge. Refer to the PCI Express to PCI/PCI-X Bridge Specification, Revision 1.0 for further details. Default value of this bit is 0b." Kernel is currently not setting this. > > I'm not surprised if a GPU takes longer than 100ms to do device- > specific, driver-managed, non-PCI things like detect and wake up > monitors. But I *am* surprised if generic PCI bus-level things like > config reads take longer than that. I also cc'd Lukas because he > knows a lot more about PCI PM than I do. > > Bjorn >
On Wed, Nov 16, 2016 at 3:58 PM, Bjorn Helgaas <helgaas@kernel.org> wrote: > [+cc Sinan, Lukas] > > Hi Daniel, > > On Mon, Nov 14, 2016 at 07:40:03PM +0100, Daniel Vetter wrote: >> On Fri, Nov 11, 2016 at 02:37:23PM +0000, Emil Velikov wrote: >> > From: Emil Velikov <emil.velikov@collabora.com> >> > >> > Currently the revision isn't available via sysfs/libudev thus if one >> > wants to know the value they need to read through the config file. >> > >> > This in itself wakes/powers up the device, causing unwanted delay >> > since it can be quite costly. >> > >> > There are at least two userspace components which could make use the new >> > file libpciaccess and libdrm. The former [used in various places] wakes >> > up _every_ PCI device, which can be observed via glxinfo [when using >> > Mesa 10.0+ drivers]. While the latter [in association with Mesa 13.0] >> > can lead to 2-3 second delays while starting firefox, thunderbird or >> > chromium. >> > >> > Expose the revision as a separate file, just like we do for the device, >> > vendor, their subsystem version and class. >> > >> > Cc: Bjorn Helgaas <bhelgaas@google.com> >> > Cc: linux-pci@vger.kernel.org >> > Cc: Greg KH <gregkh@linuxfoundation.org> >> > Link: https://bugs.freedesktop.org/show_bug.cgi?id=98502 >> > Tested-by: Mauro Santos <registo.mailling@gmail.com> >> > Reviewed-by: Alex Deucher <alexander.deucher@amd.com> >> > Signed-off-by: Emil Velikov <emil.velikov@collabora.com> >> >> Given that waking a gpu can take somewhere between ages and forever, and >> that we read the pci revisions everytime we launch a new gl app I think >> this is the correct approach. Of course we could just patch libdrm and >> everyone to not look at the pci revision, but that just leads to every >> pci-based driver having a driver-private ioctl/getparam thing to expose >> it. Which also doesn't make much sense. > > This re-asserts what has already been said, but doesn't address any of > my questions in the v2 discussion, so I'm still looking to continue > that thread. > > I am curious about this long wakeup issue, though. Are we talking > about a D3cold -> D0 transition? I assume so, since config space is > generally accessible in all power states except D3cold. From the > device's point of view this is basically like a power-on. I think the > gist of PCIe r3.0, sec 6.6.1 is that we need to wait 100ms, e.g., > PCI_PM_D3COLD_WAIT, before doing config accesses. > > We do support Configuration Request Retry Status Software Visibility > (pci_enable_crs()), so a device *can* take longer than 100ms after > power-up to respond to a config read, but I think that only applies to > reads of the Vendor ID. I cc'd Sinan because we do have some issues > with our CRS support, and maybe he can shed some light on this. > > I'm not surprised if a GPU takes longer than 100ms to do device- > specific, driver-managed, non-PCI things like detect and wake up > monitors. But I *am* surprised if generic PCI bus-level things like > config reads take longer than that. I also cc'd Lukas because he > knows a lot more about PCI PM than I do. FWIW, If you run lspci on a GPU that is in the powered off state (either D3 cold if supported or the older vendor specific power controls that pre-dated D3 cold), any fields that were not previously cached return all 1s. So for example the pci revision would be 0xff rather than whatever it's supposed to be. Alex
On Thu, Nov 17, 2016 at 08:28:50AM -0500, Alex Deucher wrote: > On Wed, Nov 16, 2016 at 3:58 PM, Bjorn Helgaas <helgaas@kernel.org> wrote: > > On Mon, Nov 14, 2016 at 07:40:03PM +0100, Daniel Vetter wrote: > >> Given that waking a gpu can take somewhere between ages and forever, and > >> that we read the pci revisions everytime we launch a new gl app I think > >> this is the correct approach. Of course we could just patch libdrm and > >> everyone to not look at the pci revision, but that just leads to every > >> pci-based driver having a driver-private ioctl/getparam thing to expose > >> it. Which also doesn't make much sense. > > > > I am curious about this long wakeup issue, though. Are we talking > > about a D3cold -> D0 transition? I assume so, since config space is > > generally accessible in all power states except D3cold. From the > > device's point of view this is basically like a power-on. I think the > > gist of PCIe r3.0, sec 6.6.1 is that we need to wait 100ms, e.g., > > PCI_PM_D3COLD_WAIT, before doing config accesses. > > > > We do support Configuration Request Retry Status Software Visibility > > (pci_enable_crs()), so a device *can* take longer than 100ms after > > power-up to respond to a config read, but I think that only applies to > > reads of the Vendor ID. I cc'd Sinan because we do have some issues > > with our CRS support, and maybe he can shed some light on this. > > > > I'm not surprised if a GPU takes longer than 100ms to do device- > > specific, driver-managed, non-PCI things like detect and wake up > > monitors. But I *am* surprised if generic PCI bus-level things like > > config reads take longer than that. I also cc'd Lukas because he > > knows a lot more about PCI PM than I do. > > FWIW, If you run lspci on a GPU that is in the powered off state > (either D3 cold if supported or the older vendor specific power > controls that pre-dated D3 cold), any fields that were not previously > cached return all 1s. So for example the pci revision would be 0xff > rather than whatever it's supposed to be. That doesn't feel like the right behavior to me -- I would have expected lspci to either wake up the device and show valid data or maybe complain "this device is powered off" (this seems hard to do without races). Showing a mix of cached valid data and all 1s data seems like a strange user experience to me. Caching the revision would fix that particular piece, of course, but not the overall experience. Am I expecting too much? Maybe my expectations are out of line. I think in this particular case (reading config space of a powered-off device), CRS doesn't apply because the device is powered off, and there's probably no delay: the read would probably complete immediately because the link to the device is down, and the Root Port or Downstream Port would probably generate an Unsupported Request completion, which the Root Complex might handle by fabricating all 1s data for the CPU. Bjorn
On Wed, Nov 16, 2016 at 02:58:11PM -0600, Bjorn Helgaas wrote: > On Mon, Nov 14, 2016 at 07:40:03PM +0100, Daniel Vetter wrote: > > On Fri, Nov 11, 2016 at 02:37:23PM +0000, Emil Velikov wrote: > > > From: Emil Velikov <emil.velikov@collabora.com> > > > > > > Currently the revision isn't available via sysfs/libudev thus if one > > > wants to know the value they need to read through the config file. > > > > > > This in itself wakes/powers up the device, causing unwanted delay > > > since it can be quite costly. > > > > > > There are at least two userspace components which could make use the new > > > file libpciaccess and libdrm. The former [used in various places] wakes > > > up _every_ PCI device, which can be observed via glxinfo [when using > > > Mesa 10.0+ drivers]. While the latter [in association with Mesa 13.0] > > > can lead to 2-3 second delays while starting firefox, thunderbird or > > > chromium. > > > > > > Expose the revision as a separate file, just like we do for the device, > > > vendor, their subsystem version and class. > > > > > > Cc: Bjorn Helgaas <bhelgaas@google.com> > > > Cc: linux-pci@vger.kernel.org > > > Cc: Greg KH <gregkh@linuxfoundation.org> > > > Link: https://bugs.freedesktop.org/show_bug.cgi?id=98502 > > > Tested-by: Mauro Santos <registo.mailling@gmail.com> > > > Reviewed-by: Alex Deucher <alexander.deucher@amd.com> > > > Signed-off-by: Emil Velikov <emil.velikov@collabora.com> > > > > Given that waking a gpu can take somewhere between ages and forever, and > > that we read the pci revisions everytime we launch a new gl app I think > > this is the correct approach. Of course we could just patch libdrm and > > everyone to not look at the pci revision, but that just leads to every > > pci-based driver having a driver-private ioctl/getparam thing to expose > > it. Which also doesn't make much sense. > > This re-asserts what has already been said, but doesn't address any of > my questions in the v2 discussion, so I'm still looking to continue > that thread. > > I am curious about this long wakeup issue, though. Are we talking > about a D3cold -> D0 transition? Yes, this is about a D3cold -> D0 transition, the bugzilla entry Emil linked talks about a dual GPU notebook. E.g. the Nvidia Kepler GPU in my MacBook Pro has 5 different power rails which must be brought up in a specific sequence (3.3V, 1.8V, 5V for the GPU, 1.35V for the video RAM and 1.05V for PCIe), something on the order of half a second to one second is reasonable for that. And the DRM driver needs a lot of time as well, half a second in the case of nouveau: [ 1500.780052] nouveau 0000:01:00.0: DRM: resuming kernel object tree... [ 1501.176616] nouveau 0000:01:00.0: DRM: resuming client object trees... [ 1501.176672] nouveau 0000:01:00.0: DRM: resuming display... [ 1501.298985] nouveau 0000:01:00.0: DRM: resuming console... There are no special PCIe things happening here. And since Sinan asked, these discrete GPUs usually aren't located behind a bridge, just a regular root port. Thanks, Lukas
On Thu, Nov 17, 2016 at 9:35 AM, Bjorn Helgaas <helgaas@kernel.org> wrote: > On Thu, Nov 17, 2016 at 08:28:50AM -0500, Alex Deucher wrote: >> On Wed, Nov 16, 2016 at 3:58 PM, Bjorn Helgaas <helgaas@kernel.org> wrote: >> > On Mon, Nov 14, 2016 at 07:40:03PM +0100, Daniel Vetter wrote: >> >> Given that waking a gpu can take somewhere between ages and forever, and >> >> that we read the pci revisions everytime we launch a new gl app I think >> >> this is the correct approach. Of course we could just patch libdrm and >> >> everyone to not look at the pci revision, but that just leads to every >> >> pci-based driver having a driver-private ioctl/getparam thing to expose >> >> it. Which also doesn't make much sense. >> > >> > I am curious about this long wakeup issue, though. Are we talking >> > about a D3cold -> D0 transition? I assume so, since config space is >> > generally accessible in all power states except D3cold. From the >> > device's point of view this is basically like a power-on. I think the >> > gist of PCIe r3.0, sec 6.6.1 is that we need to wait 100ms, e.g., >> > PCI_PM_D3COLD_WAIT, before doing config accesses. >> > >> > We do support Configuration Request Retry Status Software Visibility >> > (pci_enable_crs()), so a device *can* take longer than 100ms after >> > power-up to respond to a config read, but I think that only applies to >> > reads of the Vendor ID. I cc'd Sinan because we do have some issues >> > with our CRS support, and maybe he can shed some light on this. >> > >> > I'm not surprised if a GPU takes longer than 100ms to do device- >> > specific, driver-managed, non-PCI things like detect and wake up >> > monitors. But I *am* surprised if generic PCI bus-level things like >> > config reads take longer than that. I also cc'd Lukas because he >> > knows a lot more about PCI PM than I do. >> >> FWIW, If you run lspci on a GPU that is in the powered off state >> (either D3 cold if supported or the older vendor specific power >> controls that pre-dated D3 cold), any fields that were not previously >> cached return all 1s. So for example the pci revision would be 0xff >> rather than whatever it's supposed to be. > > That doesn't feel like the right behavior to me -- I would have > expected lspci to either wake up the device and show valid data or > maybe complain "this device is powered off" (this seems hard to do > without races). Showing a mix of cached valid data and all 1s data > seems like a strange user experience to me. > > Caching the revision would fix that particular piece, of course, but > not the overall experience. Am I expecting too much? Maybe my > expectations are out of line. I agree with you, I'm just saying that that is the current behavior. The GPU power is controlled by runtimepm, perhaps there is a corner case we are missing when pci config space is accessed on a powered off device? Alex > > I think in this particular case (reading config space of a powered-off > device), CRS doesn't apply because the device is powered off, and > there's probably no delay: the read would probably complete > immediately because the link to the device is down, and the Root Port > or Downstream Port would probably generate an Unsupported Request > completion, which the Root Complex might handle by fabricating all 1s > data for the CPU. > > Bjorn
On Thu, Nov 17, 2016 at 03:44:02PM +0100, Lukas Wunner wrote: > On Wed, Nov 16, 2016 at 02:58:11PM -0600, Bjorn Helgaas wrote: > > On Mon, Nov 14, 2016 at 07:40:03PM +0100, Daniel Vetter wrote: > > > On Fri, Nov 11, 2016 at 02:37:23PM +0000, Emil Velikov wrote: > > > > From: Emil Velikov <emil.velikov@collabora.com> > > > > > > > > Currently the revision isn't available via sysfs/libudev thus if one > > > > wants to know the value they need to read through the config file. > > > > > > > > This in itself wakes/powers up the device, causing unwanted delay > > > > since it can be quite costly. > > > > > > > > There are at least two userspace components which could make use the new > > > > file libpciaccess and libdrm. The former [used in various places] wakes > > > > up _every_ PCI device, which can be observed via glxinfo [when using > > > > Mesa 10.0+ drivers]. While the latter [in association with Mesa 13.0] > > > > can lead to 2-3 second delays while starting firefox, thunderbird or > > > > chromium. > > > > > > > > Expose the revision as a separate file, just like we do for the device, > > > > vendor, their subsystem version and class. > > > > > > > > Cc: Bjorn Helgaas <bhelgaas@google.com> > > > > Cc: linux-pci@vger.kernel.org > > > > Cc: Greg KH <gregkh@linuxfoundation.org> > > > > Link: https://bugs.freedesktop.org/show_bug.cgi?id=98502 > > > > Tested-by: Mauro Santos <registo.mailling@gmail.com> > > > > Reviewed-by: Alex Deucher <alexander.deucher@amd.com> > > > > Signed-off-by: Emil Velikov <emil.velikov@collabora.com> > > > > > > Given that waking a gpu can take somewhere between ages and forever, and > > > that we read the pci revisions everytime we launch a new gl app I think > > > this is the correct approach. Of course we could just patch libdrm and > > > everyone to not look at the pci revision, but that just leads to every > > > pci-based driver having a driver-private ioctl/getparam thing to expose > > > it. Which also doesn't make much sense. > > > > This re-asserts what has already been said, but doesn't address any of > > my questions in the v2 discussion, so I'm still looking to continue > > that thread. > > > > I am curious about this long wakeup issue, though. Are we talking > > about a D3cold -> D0 transition? > > Yes, this is about a D3cold -> D0 transition, the bugzilla entry > Emil linked talks about a dual GPU notebook. > > E.g. the Nvidia Kepler GPU in my MacBook Pro has 5 different power > rails which must be brought up in a specific sequence (3.3V, 1.8V, 5V > for the GPU, 1.35V for the video RAM and 1.05V for PCIe), something > on the order of half a second to one second is reasonable for that. Ah, OK. That makes a lot more sense. Much of the time is actually for bringing up other power rails, so the PCIe part I've been considering is only the time between applying the 1.05V PCIe power and the config read, which is just a fraction of the whole powerup time. Popping the stack all the way back to Emil's Nov 8 message: When using the Mesa drivers alongside firefox [1] (since Mesa 13.0), glxinfo (Mesa 10.0) and others, all the GPUs* will be awaken, causing unwanted delays and increased power usage. [1] https://bugs.freedesktop.org/show_bug.cgi?id=98502 The bug is about a delay in starting firefox, thunderbird, or chromium. I assume the browser starts on the current, powered-up, GPU. I don't understand why we care about the revision of other, powered-off, GPUs. The fact that it's slow might be telling us "we need to optimize some code that is necessary but slow," or it might be telling us "we're doing some unnecessary work that we should stop doing." > And the DRM driver needs a lot of time as well, half a second in > the case of nouveau: > > [ 1500.780052] nouveau 0000:01:00.0: DRM: resuming kernel object tree... > [ 1501.176616] nouveau 0000:01:00.0: DRM: resuming client object trees... > [ 1501.176672] nouveau 0000:01:00.0: DRM: resuming display... > [ 1501.298985] nouveau 0000:01:00.0: DRM: resuming console... I assume this DRM time is unrelated to the sysfs "revision" file question. Bjorn
On 18/11/16 08:48 AM, Bjorn Helgaas wrote: > > Popping the stack all the way back to Emil's Nov 8 message: > > When using the Mesa drivers alongside firefox [1] (since Mesa 13.0), > glxinfo (Mesa 10.0) and others, all the GPUs* will be awaken, > causing unwanted delays and increased power usage. > > [1] https://bugs.freedesktop.org/show_bug.cgi?id=98502 > > The bug is about a delay in starting firefox, thunderbird, or > chromium. I assume the browser starts on the current, powered-up, > GPU. I don't understand why we care about the revision of other, > powered-off, GPUs. We don't. The problem is that the current libdrm API unconditionally provides the revision. The plan is to address this in two ways: * Add new libdrm API which allows the caller to say "I don't need the revision", and make Mesa use that. Users having those changes will not run into the problem even on older kernels. * Add the separate revision file in sysfs and make libdrm use that for its current API. This means that even callers of the current libdrm API will not run into the problem with newer kernels.
On Fri, Nov 18, 2016 at 10:42:20AM +0900, Michel Dänzer wrote: > On 18/11/16 08:48 AM, Bjorn Helgaas wrote: > > > > Popping the stack all the way back to Emil's Nov 8 message: > > > > When using the Mesa drivers alongside firefox [1] (since Mesa 13.0), > > glxinfo (Mesa 10.0) and others, all the GPUs* will be awaken, > > causing unwanted delays and increased power usage. > > > > [1] https://bugs.freedesktop.org/show_bug.cgi?id=98502 > > > > The bug is about a delay in starting firefox, thunderbird, or > > chromium. I assume the browser starts on the current, powered-up, > > GPU. I don't understand why we care about the revision of other, > > powered-off, GPUs. > > We don't. The problem is that the current libdrm API unconditionally > provides the revision. The plan is to address this in two ways: > > * Add new libdrm API which allows the caller to say "I don't need the > revision", and make Mesa use that. Users having those changes will not > run into the problem even on older kernels. > > * Add the separate revision file in sysfs and make libdrm use that for > its current API. This means that even callers of the current libdrm API > will not run into the problem with newer kernels. Why do we care about *anything* for the other, powered-off, GPUs? Even users of the new libdrm API who say "I don't need the revision" are still getting the vendor/device/etc for those other GPUs.
On Thu, Nov 17, 2016 at 08:40:10PM -0600, Bjorn Helgaas wrote: > On Fri, Nov 18, 2016 at 10:42:20AM +0900, Michel Dänzer wrote: > > On 18/11/16 08:48 AM, Bjorn Helgaas wrote: > > > > > > Popping the stack all the way back to Emil's Nov 8 message: > > > > > > When using the Mesa drivers alongside firefox [1] (since Mesa 13.0), > > > glxinfo (Mesa 10.0) and others, all the GPUs* will be awaken, > > > causing unwanted delays and increased power usage. > > > > > > [1] https://bugs.freedesktop.org/show_bug.cgi?id=98502 > > > > > > The bug is about a delay in starting firefox, thunderbird, or > > > chromium. I assume the browser starts on the current, powered-up, > > > GPU. I don't understand why we care about the revision of other, > > > powered-off, GPUs. > > > > We don't. The problem is that the current libdrm API unconditionally > > provides the revision. The plan is to address this in two ways: > > > > * Add new libdrm API which allows the caller to say "I don't need the > > revision", and make Mesa use that. Users having those changes will not > > run into the problem even on older kernels. > > > > * Add the separate revision file in sysfs and make libdrm use that for > > its current API. This means that even callers of the current libdrm API > > will not run into the problem with newer kernels. > > Why do we care about *anything* for the other, powered-off, GPUs? > Even users of the new libdrm API who say "I don't need the revision" > are still getting the vendor/device/etc for those other GPUs. egl device enumeration, and for that you need to know what gpus you have and load their drivers. Afaik. Yes by default they'll all select the online gpu, but they can opt for the faster/other one on multi-gpu machines. -Daniel
On Fri, Nov 18, 2016 at 10:42:07AM +0100, Daniel Vetter wrote: > On Thu, Nov 17, 2016 at 08:40:10PM -0600, Bjorn Helgaas wrote: > > On Fri, Nov 18, 2016 at 10:42:20AM +0900, Michel Dänzer wrote: > > > On 18/11/16 08:48 AM, Bjorn Helgaas wrote: > > > > > > > > Popping the stack all the way back to Emil's Nov 8 message: > > > > > > > > When using the Mesa drivers alongside firefox [1] (since Mesa 13.0), > > > > glxinfo (Mesa 10.0) and others, all the GPUs* will be awaken, > > > > causing unwanted delays and increased power usage. > > > > > > > > [1] https://bugs.freedesktop.org/show_bug.cgi?id=98502 > > > > > > > > The bug is about a delay in starting firefox, thunderbird, or > > > > chromium. I assume the browser starts on the current, powered-up, > > > > GPU. I don't understand why we care about the revision of other, > > > > powered-off, GPUs. > > > > > > We don't. The problem is that the current libdrm API unconditionally > > > provides the revision. The plan is to address this in two ways: > > > > > > * Add new libdrm API which allows the caller to say "I don't need the > > > revision", and make Mesa use that. Users having those changes will not > > > run into the problem even on older kernels. > > > > > > * Add the separate revision file in sysfs and make libdrm use that for > > > its current API. This means that even callers of the current libdrm API > > > will not run into the problem with newer kernels. > > > > Why do we care about *anything* for the other, powered-off, GPUs? > > Even users of the new libdrm API who say "I don't need the revision" > > are still getting the vendor/device/etc for those other GPUs. > > egl device enumeration, and for that you need to know what gpus you have > and load their drivers. Afaik. Yes by default they'll all select the > online gpu, but they can opt for the faster/other one on multi-gpu > machines. Now the only reason we can avoid parsing the revision flag is that most drivers don't care, and the one that does (i915) is generally the one that's on and only needs it after driver init. But fundamentally we can't avoid the enumeration, and imo it's better to cache this in the kernel than implement some userspace thingy (require udev or whatever). I guess in the end I don't really get why caching the revision in the kernel is a problem ... -Daniel
On Fri, Nov 18, 2016 at 10:48:46AM +0100, Daniel Vetter wrote: > On Fri, Nov 18, 2016 at 10:42:07AM +0100, Daniel Vetter wrote: > > On Thu, Nov 17, 2016 at 08:40:10PM -0600, Bjorn Helgaas wrote: > > > On Fri, Nov 18, 2016 at 10:42:20AM +0900, Michel Dänzer wrote: > > > > On 18/11/16 08:48 AM, Bjorn Helgaas wrote: > > > > > > > > > > Popping the stack all the way back to Emil's Nov 8 message: > > > > > > > > > > When using the Mesa drivers alongside firefox [1] (since Mesa 13.0), > > > > > glxinfo (Mesa 10.0) and others, all the GPUs* will be awaken, > > > > > causing unwanted delays and increased power usage. > > > > > > > > > > [1] https://bugs.freedesktop.org/show_bug.cgi?id=98502 > > > > > > > > > > The bug is about a delay in starting firefox, thunderbird, or > > > > > chromium. I assume the browser starts on the current, powered-up, > > > > > GPU. I don't understand why we care about the revision of other, > > > > > powered-off, GPUs. > > > > > > > > We don't. The problem is that the current libdrm API unconditionally > > > > provides the revision. The plan is to address this in two ways: > > > > > > > > * Add new libdrm API which allows the caller to say "I don't need the > > > > revision", and make Mesa use that. Users having those changes will not > > > > run into the problem even on older kernels. > > > > > > > > * Add the separate revision file in sysfs and make libdrm use that for > > > > its current API. This means that even callers of the current libdrm API > > > > will not run into the problem with newer kernels. > > > > > > Why do we care about *anything* for the other, powered-off, GPUs? > > > Even users of the new libdrm API who say "I don't need the revision" > > > are still getting the vendor/device/etc for those other GPUs. > > > > egl device enumeration, and for that you need to know what gpus you have > > and load their drivers. Afaik. Yes by default they'll all select the > > online gpu, but they can opt for the faster/other one on multi-gpu > > machines. > > Now the only reason we can avoid parsing the revision flag is that most > drivers don't care, and the one that does (i915) is generally the one > that's on and only needs it after driver init. But fundamentally we can't > avoid the enumeration, and imo it's better to cache this in the kernel > than implement some userspace thingy (require udev or whatever). Sigh. AFAIK, this userspace enumeration is really unique to X, and it's been a PITA for years because it puts too much arch knowledge in userspace, e.g., PCI segment/domain knowledge, PCI BAR details, hotplug events, etc. I personally wouldn't object to udev or whatever, since that's the supported mechanism everybody else uses. > I guess in the end I don't really get why caching the revision in the > kernel is a problem ... It's not that it's such a problem, I just object to adding things to enable what seems like a broken design. But I guess I'm trying to boil the ocean here, and that energy would be better spent elsewhere, so I'll apply it. Bjorn
On Fri, Nov 18, 2016 at 4:48 AM, Daniel Vetter <daniel@ffwll.ch> wrote: > On Fri, Nov 18, 2016 at 10:42:07AM +0100, Daniel Vetter wrote: >> On Thu, Nov 17, 2016 at 08:40:10PM -0600, Bjorn Helgaas wrote: >> > On Fri, Nov 18, 2016 at 10:42:20AM +0900, Michel Dänzer wrote: >> > > On 18/11/16 08:48 AM, Bjorn Helgaas wrote: >> > > > >> > > > Popping the stack all the way back to Emil's Nov 8 message: >> > > > >> > > > When using the Mesa drivers alongside firefox [1] (since Mesa 13.0), >> > > > glxinfo (Mesa 10.0) and others, all the GPUs* will be awaken, >> > > > causing unwanted delays and increased power usage. >> > > > >> > > > [1] https://bugs.freedesktop.org/show_bug.cgi?id=98502 >> > > > >> > > > The bug is about a delay in starting firefox, thunderbird, or >> > > > chromium. I assume the browser starts on the current, powered-up, >> > > > GPU. I don't understand why we care about the revision of other, >> > > > powered-off, GPUs. >> > > >> > > We don't. The problem is that the current libdrm API unconditionally >> > > provides the revision. The plan is to address this in two ways: >> > > >> > > * Add new libdrm API which allows the caller to say "I don't need the >> > > revision", and make Mesa use that. Users having those changes will not >> > > run into the problem even on older kernels. >> > > >> > > * Add the separate revision file in sysfs and make libdrm use that for >> > > its current API. This means that even callers of the current libdrm API >> > > will not run into the problem with newer kernels. >> > >> > Why do we care about *anything* for the other, powered-off, GPUs? >> > Even users of the new libdrm API who say "I don't need the revision" >> > are still getting the vendor/device/etc for those other GPUs. >> >> egl device enumeration, and for that you need to know what gpus you have >> and load their drivers. Afaik. Yes by default they'll all select the >> online gpu, but they can opt for the faster/other one on multi-gpu >> machines. > > Now the only reason we can avoid parsing the revision flag is that most > drivers don't care, and the one that does (i915) is generally the one > that's on and only needs it after driver init. But fundamentally we can't > avoid the enumeration, and imo it's better to cache this in the kernel > than implement some userspace thingy (require udev or whatever). > We need the revision id as well. Alex > I guess in the end I don't really get why caching the revision in the > kernel is a problem ... > -Daniel > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel
On Fri, Nov 11, 2016 at 02:37:23PM +0000, Emil Velikov wrote: > From: Emil Velikov <emil.velikov@collabora.com> > > Currently the revision isn't available via sysfs/libudev thus if one > wants to know the value they need to read through the config file. > > This in itself wakes/powers up the device, causing unwanted delay > since it can be quite costly. > > There are at least two userspace components which could make use the new > file libpciaccess and libdrm. The former [used in various places] wakes > up _every_ PCI device, which can be observed via glxinfo [when using > Mesa 10.0+ drivers]. While the latter [in association with Mesa 13.0] > can lead to 2-3 second delays while starting firefox, thunderbird or > chromium. > > Expose the revision as a separate file, just like we do for the device, > vendor, their subsystem version and class. > > Cc: Bjorn Helgaas <bhelgaas@google.com> > Cc: linux-pci@vger.kernel.org > Cc: Greg KH <gregkh@linuxfoundation.org> > Link: https://bugs.freedesktop.org/show_bug.cgi?id=98502 > Tested-by: Mauro Santos <registo.mailling@gmail.com> > Reviewed-by: Alex Deucher <alexander.deucher@amd.com> > Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Applied with Daniel's reviewed-by to pci/misc for v4.10, thanks. > --- > v2: Add r-b/t-b tags, slim down CC list, add note about userspace. > > v3: Add Documentation/ bits (Greg KH) > > Gents, please keep me in the CC list. > > Thanks > Emil > --- > Documentation/ABI/testing/sysfs-bus-pci | 7 +++++++ > Documentation/filesystems/sysfs-pci.txt | 2 ++ > drivers/pci/pci-sysfs.c | 2 ++ > 3 files changed, 11 insertions(+) > > diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci > index b3bc50f..5a1732b 100644 > --- a/Documentation/ABI/testing/sysfs-bus-pci > +++ b/Documentation/ABI/testing/sysfs-bus-pci > @@ -294,3 +294,10 @@ Description: > a firmware bug to the system vendor. Writing to this file > taints the kernel with TAINT_FIRMWARE_WORKAROUND, which > reduces the supportability of your system. > + > +What: /sys/bus/pci/devices/.../revision > +Date: November 2016 > +Contact: Emil Velikov <emil.l.velikov@gmail.com> > +Description: > + This file contains the revision field of the the PCI device. > + The value comes from device config space. The file is read only. > diff --git a/Documentation/filesystems/sysfs-pci.txt b/Documentation/filesystems/sysfs-pci.txt > index 74eaac2..6ea1ced 100644 > --- a/Documentation/filesystems/sysfs-pci.txt > +++ b/Documentation/filesystems/sysfs-pci.txt > @@ -17,6 +17,7 @@ that support it. For example, a given bus might look like this: > | |-- resource0 > | |-- resource1 > | |-- resource2 > + | |-- revision > | |-- rom > | |-- subsystem_device > | |-- subsystem_vendor > @@ -41,6 +42,7 @@ files, each with their own function. > resource PCI resource host addresses (ascii, ro) > resource0..N PCI resource N, if present (binary, mmap, rw[1]) > resource0_wc..N_wc PCI WC map resource N, if prefetchable (binary, mmap) > + revision PCI revision (ascii, ro) > rom PCI ROM resource, if present (binary, ro) > subsystem_device PCI subsystem device (ascii, ro) > subsystem_vendor PCI subsystem vendor (ascii, ro) > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > index bcd10c7..0666287 100644 > --- a/drivers/pci/pci-sysfs.c > +++ b/drivers/pci/pci-sysfs.c > @@ -50,6 +50,7 @@ pci_config_attr(vendor, "0x%04x\n"); > pci_config_attr(device, "0x%04x\n"); > pci_config_attr(subsystem_vendor, "0x%04x\n"); > pci_config_attr(subsystem_device, "0x%04x\n"); > +pci_config_attr(revision, "0x%02x\n"); > pci_config_attr(class, "0x%06x\n"); > pci_config_attr(irq, "%u\n"); > > @@ -568,6 +569,7 @@ static struct attribute *pci_dev_attrs[] = { > &dev_attr_device.attr, > &dev_attr_subsystem_vendor.attr, > &dev_attr_subsystem_device.attr, > + &dev_attr_revision.attr, > &dev_attr_class.attr, > &dev_attr_irq.attr, > &dev_attr_local_cpus.attr, > -- > 2.9.3 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci index b3bc50f..5a1732b 100644 --- a/Documentation/ABI/testing/sysfs-bus-pci +++ b/Documentation/ABI/testing/sysfs-bus-pci @@ -294,3 +294,10 @@ Description: a firmware bug to the system vendor. Writing to this file taints the kernel with TAINT_FIRMWARE_WORKAROUND, which reduces the supportability of your system. + +What: /sys/bus/pci/devices/.../revision +Date: November 2016 +Contact: Emil Velikov <emil.l.velikov@gmail.com> +Description: + This file contains the revision field of the the PCI device. + The value comes from device config space. The file is read only. diff --git a/Documentation/filesystems/sysfs-pci.txt b/Documentation/filesystems/sysfs-pci.txt index 74eaac2..6ea1ced 100644 --- a/Documentation/filesystems/sysfs-pci.txt +++ b/Documentation/filesystems/sysfs-pci.txt @@ -17,6 +17,7 @@ that support it. For example, a given bus might look like this: | |-- resource0 | |-- resource1 | |-- resource2 + | |-- revision | |-- rom | |-- subsystem_device | |-- subsystem_vendor @@ -41,6 +42,7 @@ files, each with their own function. resource PCI resource host addresses (ascii, ro) resource0..N PCI resource N, if present (binary, mmap, rw[1]) resource0_wc..N_wc PCI WC map resource N, if prefetchable (binary, mmap) + revision PCI revision (ascii, ro) rom PCI ROM resource, if present (binary, ro) subsystem_device PCI subsystem device (ascii, ro) subsystem_vendor PCI subsystem vendor (ascii, ro) diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c index bcd10c7..0666287 100644 --- a/drivers/pci/pci-sysfs.c +++ b/drivers/pci/pci-sysfs.c @@ -50,6 +50,7 @@ pci_config_attr(vendor, "0x%04x\n"); pci_config_attr(device, "0x%04x\n"); pci_config_attr(subsystem_vendor, "0x%04x\n"); pci_config_attr(subsystem_device, "0x%04x\n"); +pci_config_attr(revision, "0x%02x\n"); pci_config_attr(class, "0x%06x\n"); pci_config_attr(irq, "%u\n"); @@ -568,6 +569,7 @@ static struct attribute *pci_dev_attrs[] = { &dev_attr_device.attr, &dev_attr_subsystem_vendor.attr, &dev_attr_subsystem_device.attr, + &dev_attr_revision.attr, &dev_attr_class.attr, &dev_attr_irq.attr, &dev_attr_local_cpus.attr,