[v2] PCI: create revision file in sysfs

Message ID 20161109165607.26322-1-emil.l.velikov@gmail.com (mailing list archive)
State New, archived

Commit Message

Emil Velikov Nov. 9, 2016, 4:56 p.m. UTC
From: Emil Velikov <emil.velikov@collabora.com>

Currently the revision isn't available via sysfs/libudev, so anyone who
wants the value has to read it from the config file.

This in itself wakes/powers up the device, causing unwanted delays.

There are at least two userspace components which could make use of the new
file - libpciaccess and libdrm. At the moment the former will wake up
_every_ PCI device for simple invocation of glxinfo [when using Mesa
10.0+ drivers]. While the latter [in association with Mesa 13.0] can
lead to 2-3 second delays while starting firefox, thunderbird or
chromium.

Expose the revision as a separate file, just like we do for the device,
vendor, their subsystem counterparts and class.

Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
Link: https://bugs.freedesktop.org/show_bug.cgi?id=98502
Tested-by: Mauro Santos <registo.mailling@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
---
v2:
 - Add r-b/t-b tags
 - Slim down CC list
 - Add note about userspace.

As before, please keep me in the CC list. Additionally if there's
anything else I can do to get things going please let me know.

Thanks
Emil
---
 drivers/pci/pci-sysfs.c | 2 ++
 1 file changed, 2 insertions(+)
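
For illustration, a minimal userspace sketch of consuming the proposed
file. It assumes the attribute shows up as "revision" next to "vendor" and
"device" in the device's sysfs directory and carries the "0x%02x\n" format
used in the patch. The function names are hypothetical and are not part of
libdrm or libpciaccess.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse the text of a "revision" attribute ("0x%02x\n" in the patch).
 * Returns 0 on success and stores the value in *rev, -1 on malformed input.
 */
static int parse_pci_revision(const char *text, unsigned int *rev)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(text, &end, 16);	/* base 16 accepts the "0x" prefix */
	if (errno || end == text || val > 0xff)
		return -1;

	*rev = (unsigned int)val;
	return 0;
}

/*
 * Read the revision from a sysfs attribute file, for example
 * /sys/bus/pci/devices/<address>/revision (path is an assumption based on
 * where the existing vendor/device attributes live).
 */
static int read_pci_revision(const char *path, unsigned int *rev)
{
	char buf[16];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;	/* e.g. older kernel without the file */
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	return parse_pci_revision(buf, rev);
}
```

A caller would treat a -1 return as "file not available" and decide for
itself whether to fall back to the config file.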

Comments

Greg KH Nov. 10, 2016, 7:13 a.m. UTC | #1
On Wed, Nov 09, 2016 at 04:56:07PM +0000, Emil Velikov wrote:
> From: Emil Velikov <emil.velikov@collabora.com>
> 
> Currently the revision isn't available via sysfs/libudev thus if one
> wants to know the value they need to read through the config file.
> 
> This in itself wakes/powers up the device, causing unwanted delays.
> 
> There are at least two userspace components which could make use the new
> file - libpciaccess and libdrm. At the moment the former will wake up
> _every_ PCI device for simple invocation of glxinfo [when using Mesa
> 10.0+ drivers]. While the latter [in association with Mesa 13.0] can
> lead to 2-3 second delays while starting firefox, thunderbird or
> chromium.
> 
> Expose the revision as a separate file, just like we do for the device,
> vendor, their subsystem version and class.
> 
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: linux-pci@vger.kernel.org
> Link: https://bugs.freedesktop.org/show_bug.cgi?id=98502
> Tested-by: Mauro Santos <registo.mailling@gmail.com>
> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
> ---
> v2:
>  - Add r-b/t-b tags
>  - Slim down CC list
>  - Add note about userspace.
> 
> As before, please keep me in the CC list. Additionally if there's
> anything else I can do to get things going please let me know.
> 
> Thanks
> Emil
> ---
>  drivers/pci/pci-sysfs.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index bcd10c7..0666287 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -50,6 +50,7 @@ pci_config_attr(vendor, "0x%04x\n");
>  pci_config_attr(device, "0x%04x\n");
>  pci_config_attr(subsystem_vendor, "0x%04x\n");
>  pci_config_attr(subsystem_device, "0x%04x\n");
> +pci_config_attr(revision, "0x%02x\n");
>  pci_config_attr(class, "0x%06x\n");
>  pci_config_attr(irq, "%u\n");

Shouldn't we get a Documentation/ABI/ update for this as well?

thanks,

greg k-h
Emil Velikov Nov. 10, 2016, 1:14 p.m. UTC | #2
On 10 November 2016 at 07:13, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Wed, Nov 09, 2016 at 04:56:07PM +0000, Emil Velikov wrote:
>> From: Emil Velikov <emil.velikov@collabora.com>
>>
>> Currently the revision isn't available via sysfs/libudev thus if one
>> wants to know the value they need to read through the config file.
>>
>> This in itself wakes/powers up the device, causing unwanted delays.
>>
>> There are at least two userspace components which could make use the new
>> file - libpciaccess and libdrm. At the moment the former will wake up
>> _every_ PCI device for simple invocation of glxinfo [when using Mesa
>> 10.0+ drivers]. While the latter [in association with Mesa 13.0] can
>> lead to 2-3 second delays while starting firefox, thunderbird or
>> chromium.
>>
>> Expose the revision as a separate file, just like we do for the device,
>> vendor, their subsystem version and class.
>>
>> Cc: Bjorn Helgaas <bhelgaas@google.com>
>> Cc: linux-pci@vger.kernel.org
>> Link: https://bugs.freedesktop.org/show_bug.cgi?id=98502
>> Tested-by: Mauro Santos <registo.mailling@gmail.com>
>> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
>> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
>> ---
>> v2:
>>  - Add r-b/t-b tags
>>  - Slim down CC list
>>  - Add note about userspace.
>>
>> As before, please keep me in the CC list. Additionally if there's
>> anything else I can do to get things going please let me know.
>>
>> Thanks
>> Emil
>> ---
>>  drivers/pci/pci-sysfs.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
>> index bcd10c7..0666287 100644
>> --- a/drivers/pci/pci-sysfs.c
>> +++ b/drivers/pci/pci-sysfs.c
>> @@ -50,6 +50,7 @@ pci_config_attr(vendor, "0x%04x\n");
>>  pci_config_attr(device, "0x%04x\n");
>>  pci_config_attr(subsystem_vendor, "0x%04x\n");
>>  pci_config_attr(subsystem_device, "0x%04x\n");
>> +pci_config_attr(revision, "0x%02x\n");
>>  pci_config_attr(class, "0x%06x\n");
>>  pci_config_attr(irq, "%u\n");
>
> Shouldn't we get a Documentation/ABI/ update for this as well?
>
Definitely, we should.

I've updated Documentation/filesystems/sysfs-pci.txt [locally] yet
looking through ABI/ there is only a 'testing' one -
Documentation/ABI/testing/sysfs-bus-pci.

Feels a bit strange there is no stable one, guess I should/could start one ?

Thanks
Emil
Bjorn Helgaas Nov. 10, 2016, 11:59 p.m. UTC | #3
Hi Emil,

On Thu, Nov 10, 2016 at 01:14:35PM +0000, Emil Velikov wrote:
> On 10 November 2016 at 07:13, Greg KH <gregkh@linuxfoundation.org> wrote:
> > On Wed, Nov 09, 2016 at 04:56:07PM +0000, Emil Velikov wrote:
> >> From: Emil Velikov <emil.velikov@collabora.com>
> >>
> >> Currently the revision isn't available via sysfs/libudev thus if one
> >> wants to know the value they need to read through the config file.
> >>
> >> This in itself wakes/powers up the device, causing unwanted delays.
> >>
> >> There are at least two userspace components which could make use the new
> >> file - libpciaccess and libdrm. At the moment the former will wake up
> >> _every_ PCI device for simple invocation of glxinfo [when using Mesa
> >> 10.0+ drivers]. While the latter [in association with Mesa 13.0] can
> >> lead to 2-3 second delays while starting firefox, thunderbird or
> >> chromium.

I agree, these unwanted delays are completely unacceptable.  My
question is whether we should fix them by exporting more information
from the kernel, or by changing the way the userspace components work.

It should not take anywhere near 2 seconds to wake up a PCI device.
That makes me think there's a more serious problem than just a lack of
caching for the revision field, e.g., maybe we're looking at far more
PCI devices than we need to, or we're doing it many times to the same
device, or ...

If I understand correctly, the delay was bisected to
https://cgit.freedesktop.org/mesa/mesa/commit/?id=be239326aa4f, which
removed a bunch of code that looked up the vendor and device IDs, and
replaced it with drmGetDevice().  And apparently drmGetDevice(), in
this path:

  drmGetDevice
    drmProcessPciDevice
      drmParsePciDeviceInfo

is a little more thorough in that it looks up the *revision* in
addition to the vendor and device IDs.  So we pay the cost for the
revision even though in this instance we don't care about the revision
at all.

drmParsePciDeviceInfo() currently reads the whole config header from
sysfs (https://cgit.freedesktop.org/drm/libdrm/tree/xf86drm.c#n2949),
but I think you're extending that to try the vendor, device,
subsystem_vendor, subsystem_device, and (if present) revision sysfs
files first (http://www.spinics.net/lists/dri-devel/msg122319.html).
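
For reference, the "read the whole config header" path boils down to
decoding the standard type 0 configuration header. A rough sketch follows;
the struct and function names are hypothetical (this is not the libdrm
code), while the offsets are the standard ones, matching the kernel's
pci_regs.h:

```c
#include <stddef.h>
#include <stdint.h>

/* Standard type 0 configuration header offsets. */
#define CFG_VENDOR_ID		0x00
#define CFG_DEVICE_ID		0x02
#define CFG_REVISION_ID		0x08
#define CFG_SUBSYS_VENDOR_ID	0x2c
#define CFG_SUBSYS_ID		0x2e
#define CFG_HEADER_SIZE		64

/* Hypothetical container for the fields discussed in this thread. */
struct pci_ids {
	uint16_t vendor;
	uint16_t device;
	uint16_t subsystem_vendor;
	uint16_t subsystem_device;
	uint8_t revision;
};

/* Config space is little-endian regardless of host byte order. */
static uint16_t cfg_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Decode the IDs from a buffer holding at least the first 64 bytes of
 * config space. Returns 0 on success, -1 if the buffer is too short.
 */
static int parse_pci_config_header(const uint8_t *cfg, size_t len,
				   struct pci_ids *out)
{
	if (len < CFG_HEADER_SIZE)
		return -1;

	out->vendor = cfg_le16(cfg + CFG_VENDOR_ID);
	out->device = cfg_le16(cfg + CFG_DEVICE_ID);
	out->revision = cfg[CFG_REVISION_ID];
	out->subsystem_vendor = cfg_le16(cfg + CFG_SUBSYS_VENDOR_ID);
	out->subsystem_device = cfg_le16(cfg + CFG_SUBSYS_ID);
	return 0;
}
```

Everything except the revision is already available from separate sysfs
files, which is why only that one field forces the config read.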

Bottom line, I guess I'm not super opposed to this, but I do feel like
we're making a kernel change to cover up a userspace problem, and I
think it would be better to push on that userspace problem a little
more.

> >> Expose the revision as a separate file, just like we do for the device,
> >> vendor, their subsystem version and class.
> >>
> >> Cc: Bjorn Helgaas <bhelgaas@google.com>
> >> Cc: linux-pci@vger.kernel.org
> >> Link: https://bugs.freedesktop.org/show_bug.cgi?id=98502
> >> Tested-by: Mauro Santos <registo.mailling@gmail.com>
> >> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
> >> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
> >> ---
> >> v2:
> >>  - Add r-b/t-b tags
> >>  - Slim down CC list
> >>  - Add note about userspace.
> >>
> >> As before, please keep me in the CC list. Additionally if there's
> >> anything else I can do to get things going please let me know.
> >>
> >> Thanks
> >> Emil
> >> ---
> >>  drivers/pci/pci-sysfs.c | 2 ++
> >>  1 file changed, 2 insertions(+)
> >>
> >> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> >> index bcd10c7..0666287 100644
> >> --- a/drivers/pci/pci-sysfs.c
> >> +++ b/drivers/pci/pci-sysfs.c
> >> @@ -50,6 +50,7 @@ pci_config_attr(vendor, "0x%04x\n");
> >>  pci_config_attr(device, "0x%04x\n");
> >>  pci_config_attr(subsystem_vendor, "0x%04x\n");
> >>  pci_config_attr(subsystem_device, "0x%04x\n");
> >> +pci_config_attr(revision, "0x%02x\n");
> >>  pci_config_attr(class, "0x%06x\n");
> >>  pci_config_attr(irq, "%u\n");
> >
> > Shouldn't we get a Documentation/ABI/ update for this as well?
> >
> Definitely, we should.
> 
> I've updated Documentation/filesystems/sysfs-pci.txt [locally] yet
> looking through ABI/ there is only a 'testing' one -
> Documentation/ABI/testing/sysfs-bus-pci.
> 
> Feels a bit strange there is no stable one, guess I should/could start one ?

I wouldn't jump to the conclusion that this new ABI is "stable" when
all the existing ones are only "testing".  I'd just leave it in
testing along with all the others.

Bjorn
Emil Velikov Nov. 11, 2016, 12:31 a.m. UTC | #4
On 10 November 2016 at 23:59, Bjorn Helgaas <helgaas@kernel.org> wrote:
> Hi Emil,
>
> On Thu, Nov 10, 2016 at 01:14:35PM +0000, Emil Velikov wrote:
>> On 10 November 2016 at 07:13, Greg KH <gregkh@linuxfoundation.org> wrote:
>> > On Wed, Nov 09, 2016 at 04:56:07PM +0000, Emil Velikov wrote:
>> >> From: Emil Velikov <emil.velikov@collabora.com>
>> >>
>> >> Currently the revision isn't available via sysfs/libudev thus if one
>> >> wants to know the value they need to read through the config file.
>> >>
>> >> This in itself wakes/powers up the device, causing unwanted delays.
>> >>
>> >> There are at least two userspace components which could make use the new
>> >> file - libpciaccess and libdrm. At the moment the former will wake up
>> >> _every_ PCI device for simple invocation of glxinfo [when using Mesa
>> >> 10.0+ drivers]. While the latter [in association with Mesa 13.0] can
>> >> lead to 2-3 second delays while starting firefox, thunderbird or
>> >> chromium.
>
> I agree, these unwanted delays are completely unacceptable.  My
> question is whether we should fix them by exporting more information
> from the kernel, or by changing the way the userspace components work.
>
> It should not take anywhere near 2 seconds to wake up a PCI device.
> That makes me think there's a more serious problem than just a lack of
> caching for the revision field, e.g., maybe we're looking at far more
> PCI devices than we need to, or we're doing it many times to the same
> device, or ...
>
> If I understand correctly, the delay was bisected to
> https://cgit.freedesktop.org/mesa/mesa/commit/?id=be239326aa4f, which
> removed a bunch of code that looked up the vendor and device IDs, and
> replaced it with drmGetDevice().  And apparently drmGetDevice(), in
> this path:
>
>   drmGetDevice
>     drmProcessPciDevice
>       drmParsePciDeviceInfo
>
> is a little more thorough in that it looks up the *revision* in
> addition to the vendor and device IDs.  So we pay the cost for the
> revision even though in this instance we don't care about the revision
> at all.
>
Above all, apologies for all the "lovely" code that you had to go
through for these.
And yes, you've got it spot on.

> drmParsePciDeviceInfo() currently reads the whole config header from
> sysfs (https://cgit.freedesktop.org/drm/libdrm/tree/xf86drm.c#n2949),
> but I think you're extending that to try the vendor, device,
> subsystem_vendor, subsystem_device, and (if present) revision sysfs
> files first (http://www.spinics.net/lists/dri-devel/msg122319.html).
>
Yes, making the revision file optional and "faking it" was my first
thought, esp. since we don't have any users of it (yet).
Although people are not too keen on it, so we'll likely opt for a
revision-less API.

> Bottom line, I guess I'm not super opposed to this, but I do feel like
> we're making a kernel change to cover up a userspace problem, and I
> think it would be better to push on that userspace problem a little
> more.
>
Yes, definitely we can beat some sense into userspace. Yet that
shouldn't be a deterrent for exposing the revision.

As hinted before, the other prominent user, libpciaccess, wakes up and
probes _every_ PCI device.
Atm that library is used by Xorg, Spice, libvirt and a few others.
Amongst which are the Intel GL drivers (via libdrm_intel.so), [only]
when GLX_MESA_query_renderer is used.

Or in other words - if Firefox/other GL app wants to use the
extension, they'll get similar delays.
We should look into that one as well, but it will be trickier to
address (read "slower to reach end users").

>> I've updated Documentation/filesystems/sysfs-pci.txt [locally] yet
>> looking through ABI/ there is only a 'testing' one -
>> Documentation/ABI/testing/sysfs-bus-pci.
>>
>> Feels a bit strange there is no stable one, guess I should/could start one ?
>
> I wouldn't jump to the conclusion that this new ABI is "stable" when
> all the existing ones are only "testing".  I'd just leave it in
> testing along with all the others.
>

Agreed. Thank you !
Emil
Bjorn Helgaas Nov. 11, 2016, 2:49 p.m. UTC | #5
On Fri, Nov 11, 2016 at 12:31:47AM +0000, Emil Velikov wrote:
> On 10 November 2016 at 23:59, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > Hi Emil,
> >
> > On Thu, Nov 10, 2016 at 01:14:35PM +0000, Emil Velikov wrote:
> >> On 10 November 2016 at 07:13, Greg KH <gregkh@linuxfoundation.org> wrote:
> >> > On Wed, Nov 09, 2016 at 04:56:07PM +0000, Emil Velikov wrote:
> >> >> From: Emil Velikov <emil.velikov@collabora.com>
> >> >>
> >> >> Currently the revision isn't available via sysfs/libudev thus if one
> >> >> wants to know the value they need to read through the config file.
> >> >>
> >> >> This in itself wakes/powers up the device, causing unwanted delays.
> >> >>
> >> >> There are at least two userspace components which could make use the new
> >> >> file - libpciaccess and libdrm. At the moment the former will wake up
> >> >> _every_ PCI device for simple invocation of glxinfo [when using Mesa
> >> >> 10.0+ drivers]. While the latter [in association with Mesa 13.0] can
> >> >> lead to 2-3 second delays while starting firefox, thunderbird or
> >> >> chromium.
> >
> > I agree, these unwanted delays are completely unacceptable.  My
> > question is whether we should fix them by exporting more information
> > from the kernel, or by changing the way the userspace components work.
> >
> > It should not take anywhere near 2 seconds to wake up a PCI device.
> > That makes me think there's a more serious problem than just a lack of
> > caching for the revision field, e.g., maybe we're looking at far more
> > PCI devices than we need to, or we're doing it many times to the same
> > device, or ...
> >
> > If I understand correctly, the delay was bisected to
> > https://cgit.freedesktop.org/mesa/mesa/commit/?id=be239326aa4f, which
> > removed a bunch of code that looked up the vendor and device IDs, and
> > replaced it with drmGetDevice().  And apparently drmGetDevice(), in
> > this path:
> >
> >   drmGetDevice
> >     drmProcessPciDevice
> >       drmParsePciDeviceInfo
> >
> > is a little more thorough in that it looks up the *revision* in
> > addition to the vendor and device IDs.  So we pay the cost for the
> > revision even though in this instance we don't care about the revision
> > at all.
> >
> Above all, apologies for all the "lovely" code that you had to go
> through for these.
> And yes, you've got it spot on.
> 
> > drmParsePciDeviceInfo() currently reads the whole config header from
> > sysfs (https://cgit.freedesktop.org/drm/libdrm/tree/xf86drm.c#n2949),
> > but I think you're extending that to try the vendor, device,
> > subsystem_vendor, subsystem_device, and (if present) revision sysfs
> > files first (http://www.spinics.net/lists/dri-devel/msg122319.html).
> >
> Yes, making the revision file optional and "faking it" was my first
> thought, esp. since we don't have any users of it (yet).
> Although people are not too keen on it, so we'll likely opt for
> revision-less API.
> 
> > Bottom line, I guess I'm not super opposed to this, but I do feel like
> > we're making a kernel change to cover up a userspace problem, and I
> > think it would be better to push on that userspace problem a little
> > more.
> >
> Yes, definitely we can beat some sense into userspace. Yet that
> shouldn't be a deterrent for exposing the revision.

Maybe.  If we speed things up by extending this kernel ABI, there's
much less incentive to optimize the userspace stuff.  I feel a little
bit like an enabler for undesirable userspace behavior :)

> As hinted before the other prominent user libpciaccess wakes up probes
> _every_ pci device.

Is it really necessary to probe *every* PCI device?  That doesn't
sound like a scalable design.

As you can tell, the argument that "we should add this kernel ABI to
make suboptimal userspace algorithms go faster" doesn't feel very
compelling to me.

> Atm that library is used by Xorg, Spice, libvirt and a few others.
> Amongst which are the Intel GL drivers (via libdrm_intel.so), [only]
> when GLX_MESA_query_renderer is used.
> 
> Or in other words - if Firefox/other GL app wants to use the
> extension, they'll get similar delays.
> We should look into that one as well, but it will be more picky to
> address (read "slower to reach end users").
Emil Velikov Nov. 11, 2016, 6:56 p.m. UTC | #6
Hi Bjorn,

On 11 November 2016 at 14:49, Bjorn Helgaas <helgaas@kernel.org> wrote:
> On Fri, Nov 11, 2016 at 12:31:47AM +0000, Emil Velikov wrote:
>> On 10 November 2016 at 23:59, Bjorn Helgaas <helgaas@kernel.org> wrote:
>> > Hi Emil,
>> >
>> > On Thu, Nov 10, 2016 at 01:14:35PM +0000, Emil Velikov wrote:
>> >> On 10 November 2016 at 07:13, Greg KH <gregkh@linuxfoundation.org> wrote:
>> >> > On Wed, Nov 09, 2016 at 04:56:07PM +0000, Emil Velikov wrote:
>> >> >> From: Emil Velikov <emil.velikov@collabora.com>
>> >> >>
>> >> >> Currently the revision isn't available via sysfs/libudev thus if one
>> >> >> wants to know the value they need to read through the config file.
>> >> >>
>> >> >> This in itself wakes/powers up the device, causing unwanted delays.
>> >> >>
>> >> >> There are at least two userspace components which could make use the new
>> >> >> file - libpciaccess and libdrm. At the moment the former will wake up
>> >> >> _every_ PCI device for simple invocation of glxinfo [when using Mesa
>> >> >> 10.0+ drivers]. While the latter [in association with Mesa 13.0] can
>> >> >> lead to 2-3 second delays while starting firefox, thunderbird or
>> >> >> chromium.
>> >
>> > I agree, these unwanted delays are completely unacceptable.  My
>> > question is whether we should fix them by exporting more information
>> > from the kernel, or by changing the way the userspace components work.
>> >
>> > It should not take anywhere near 2 seconds to wake up a PCI device.
>> > That makes me think there's a more serious problem than just a lack of
>> > caching for the revision field, e.g., maybe we're looking at far more
>> > PCI devices than we need to, or we're doing it many times to the same
>> > device, or ...
>> >
>> > If I understand correctly, the delay was bisected to
>> > https://cgit.freedesktop.org/mesa/mesa/commit/?id=be239326aa4f, which
>> > removed a bunch of code that looked up the vendor and device IDs, and
>> > replaced it with drmGetDevice().  And apparently drmGetDevice(), in
>> > this path:
>> >
>> >   drmGetDevice
>> >     drmProcessPciDevice
>> >       drmParsePciDeviceInfo
>> >
>> > is a little more thorough in that it looks up the *revision* in
>> > addition to the vendor and device IDs.  So we pay the cost for the
>> > revision even though in this instance we don't care about the revision
>> > at all.
>> >
>> Above all, apologies for all the "lovely" code that you had to go
>> through for these.
>> And yes, you've got it spot on.
>>
>> > drmParsePciDeviceInfo() currently reads the whole config header from
>> > sysfs (https://cgit.freedesktop.org/drm/libdrm/tree/xf86drm.c#n2949),
>> > but I think you're extending that to try the vendor, device,
>> > subsystem_vendor, subsystem_device, and (if present) revision sysfs
>> > files first (http://www.spinics.net/lists/dri-devel/msg122319.html).
>> >
>> Yes, making the revision file optional and "faking it" was my first
>> thought, esp. since we don't have any users of it (yet).
>> Although people are not too keen on it, so we'll likely opt for
>> revision-less API.
>>
>> > Bottom line, I guess I'm not super opposed to this, but I do feel like
>> > we're making a kernel change to cover up a userspace problem, and I
>> > think it would be better to push on that userspace problem a little
>> > more.
>> >
>> Yes, definitely we can beat some sense into userspace. Yet that
>> shouldn't be a deterrent for exposing the revision.
>
> Maybe.  If we speed things up by extending this kernel ABI, there's
> much less incentive to optimize the userspace stuff.  I feel a little
> bit like an enabler for undesirable userspace behavior :)
>
Yes, fixing userspace to not do silly things is the goal. But at the
same time even if userspace is perfect, there is no reason to power on
the device just to get the revision field, is there ?
Especially since everything else is readily available.

>> As hinted before the other prominent user libpciaccess wakes up probes
>> _every_ pci device.
>
> Is it really necessary to probe *every* PCI device?  That doesn't
> sound like a scalable design.
>
> As you can tell, the argument that "we should add this kernel ABI to
> make suboptimal userspace algorithms go faster" doesn't feel very
> compelling to me.
>
"Don't shoot the messenger" comes to mind. I'm just the stupid^Wnice
person who's trying to untangle unfortunate design decisions - don't
force me to rewrite more than a dozen pieces of software, please ?
Even then, I wonder how long it'll take for those to hit end users.

Yes I see your concern - userspace does do stupid stuff. Yet it
[sometimes] must know the information and the current way of
retrieving it (waking up the device) is quite sub-optimal.

Thanks
Emil
P.S. Some drivers have custom ioctls to retrieve the device info
(incl. revision). Surely we won't want to continue promoting/assisting
that ?
Michel Dänzer Nov. 14, 2016, 3:35 a.m. UTC | #7
On 11/11/16 08:59 AM, Bjorn Helgaas wrote:
> On Thu, Nov 10, 2016 at 01:14:35PM +0000, Emil Velikov wrote:
>> On 10 November 2016 at 07:13, Greg KH <gregkh@linuxfoundation.org> wrote:
>>> On Wed, Nov 09, 2016 at 04:56:07PM +0000, Emil Velikov wrote:
>>>> From: Emil Velikov <emil.velikov@collabora.com>
>>>>
>>>> Currently the revision isn't available via sysfs/libudev thus if one
>>>> wants to know the value they need to read through the config file.
>>>>
>>>> This in itself wakes/powers up the device, causing unwanted delays.
>>>>
>>>> There are at least two userspace components which could make use the new
>>>> file - libpciaccess and libdrm. At the moment the former will wake up
>>>> _every_ PCI device for simple invocation of glxinfo [when using Mesa
>>>> 10.0+ drivers]. While the latter [in association with Mesa 13.0] can
>>>> lead to 2-3 second delays while starting firefox, thunderbird or
>>>> chromium.
> 
> I agree, these unwanted delays are completely unacceptable.  My
> question is whether we should fix them by exporting more information
> from the kernel, or by changing the way the userspace components work.
> 
> It should not take anywhere near 2 seconds to wake up a PCI device.

The DRM drivers for AMD/ATI GPUs can take on the order of that to
initialize, so a single wakeup might be sufficient for the described
symptoms.
Bjorn Helgaas Nov. 14, 2016, 5:20 p.m. UTC | #8
On Fri, Nov 11, 2016 at 06:56:51PM +0000, Emil Velikov wrote:
> Hi Bjorn,
> 
> On 11 November 2016 at 14:49, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Fri, Nov 11, 2016 at 12:31:47AM +0000, Emil Velikov wrote:
> >> On 10 November 2016 at 23:59, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >> > Hi Emil,
> >> >
> >> > On Thu, Nov 10, 2016 at 01:14:35PM +0000, Emil Velikov wrote:
> >> >> On 10 November 2016 at 07:13, Greg KH <gregkh@linuxfoundation.org> wrote:
> >> >> > On Wed, Nov 09, 2016 at 04:56:07PM +0000, Emil Velikov wrote:
> >> >> >> From: Emil Velikov <emil.velikov@collabora.com>
> >> >> >>
> >> >> >> Currently the revision isn't available via sysfs/libudev thus if one
> >> >> >> wants to know the value they need to read through the config file.
> >> >> >>
> >> >> >> This in itself wakes/powers up the device, causing unwanted delays.
> >> >> >>
> >> >> >> There are at least two userspace components which could make use the new
> >> >> >> file - libpciaccess and libdrm. At the moment the former will wake up
> >> >> >> _every_ PCI device for simple invocation of glxinfo [when using Mesa
> >> >> >> 10.0+ drivers]. While the latter [in association with Mesa 13.0] can
> >> >> >> lead to 2-3 second delays while starting firefox, thunderbird or
> >> >> >> chromium.
> >> >
> >> > I agree, these unwanted delays are completely unacceptable.  My
> >> > question is whether we should fix them by exporting more information
> >> > from the kernel, or by changing the way the userspace components work.
> >> >
> >> > It should not take anywhere near 2 seconds to wake up a PCI device.
> >> > That makes me think there's a more serious problem than just a lack of
> >> > caching for the revision field, e.g., maybe we're looking at far more
> >> > PCI devices than we need to, or we're doing it many times to the same
> >> > device, or ...
> >> >
> >> > If I understand correctly, the delay was bisected to
> >> > https://cgit.freedesktop.org/mesa/mesa/commit/?id=be239326aa4f, which
> >> > removed a bunch of code that looked up the vendor and device IDs, and
> >> > replaced it with drmGetDevice().  And apparently drmGetDevice(), in
> >> > this path:
> >> >
> >> >   drmGetDevice
> >> >     drmProcessPciDevice
> >> >       drmParsePciDeviceInfo
> >> >
> >> > is a little more thorough in that it looks up the *revision* in
> >> > addition to the vendor and device IDs.  So we pay the cost for the
> >> > revision even though in this instance we don't care about the revision
> >> > at all.
> >> >
> >> Above all, apologies for all the "lovely" code that you had to go
> >> through for these.
> >> And yes, you've got it spot on.
> >>
> >> > drmParsePciDeviceInfo() currently reads the whole config header from
> >> > sysfs (https://cgit.freedesktop.org/drm/libdrm/tree/xf86drm.c#n2949),
> >> > but I think you're extending that to try the vendor, device,
> >> > subsystem_vendor, subsystem_device, and (if present) revision sysfs
> >> > files first (http://www.spinics.net/lists/dri-devel/msg122319.html).
> >> >
> >> Yes, making the revision file optional and "faking it" was my first
> >> thought, esp. since we don't have any users of it (yet).
> >> Although people are not too keen on it, so we'll likely opt for
> >> revision-less API.
> >>
> >> > Bottom line, I guess I'm not super opposed to this, but I do feel like
> >> > we're making a kernel change to cover up a userspace problem, and I
> >> > think it would be better to push on that userspace problem a little
> >> > more.
> >> >
> >> Yes, definitely we can beat some sense into userspace. Yet that
> >> shouldn't be a deterrent for exposing the revision.
> >
> > Maybe.  If we speed things up by extending this kernel ABI, there's
> > much less incentive to optimize the userspace stuff.  I feel a little
> > bit like an enabler for undesirable userspace behavior :)
> >
> Yes, fixing userspace to not do silly things is the goal. But at the
> same time even if userspace is perfect, there is no reason to power on
> the device just to get the revision field, is it ?
> Especially since everything else is readily available.
> 
> >> As hinted before the other prominent user libpciaccess wakes up probes
> >> _every_ pci device.
> >
> > Is it really necessary to probe *every* PCI device?  That doesn't
> > sound like a scalable design.
> >
> > As you can tell, the argument that "we should add this kernel ABI to
> > make suboptimal userspace algorithms go faster" doesn't feel very
> > compelling to me.
> >
> "Don't shoot the messenger" comes to mind. I'm just the stupid^Wnice
> person who's trying to untangle unfortunate design decisions - don't
> force me to rewrite more than a dozen pieces of software, please ?
> Even then, I wonder how long it'll take for those to hit end users.

Pre-be239326aa4f, you had:
  int libudev_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id)
  int sysfs_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id)
  int drm_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id)
None of them returned the revision.  There was some duplicated code,
but it was apparently functional and fast.

be239326aa4f removed libudev_get_pci_id_for_fd() and
sysfs_get_pci_id_for_fd(), which made the code prettier.  It also
changed drm_get_pci_id_for_fd() to use drmGetDevice() instead of the
awful hard-coding of vendor/device IDs based on drmGetVersion()->name.
But drmGetDevice() also returns the revision, which we don't need.

If you applied http://www.spinics.net/lists/dri-devel/msg122319.html,
you'd have code that's fast but unreliable (as you pointed out, it
returns the revision on new kernels, but 0 on old kernels, with no
hint to the caller about whether the revision is accurate).

If the caller can say "I don't care about the revision", e.g.,
http://www.spinics.net/lists/dri-devel/msg123013.html, you can make
drm_get_pci_id_for_fd() fast again.  But it will be fast and
functional even if the kernel doesn't export a "revision" sysfs file.
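
To make the "I don't care about the revision" idea concrete, here is a
small sketch of what such an interface could look like. All names here
(the flag, the structs, the backend callbacks) are hypothetical and are
not the API from the patch linked above; the stub backend exists only to
show that the expensive read is skipped when the caller opts out.

```c
/* Hypothetical flag: caller does not need the revision. */
#define PCI_INFO_SKIP_REVISION	(1u << 0)

struct pci_info {
	unsigned int vendor;
	unsigned int device;
	unsigned int revision;
	int revision_valid;	/* tells the caller whether revision is real */
};

/*
 * Hypothetical backend: ids() stands for the cheap sysfs attribute reads,
 * revision() for the config space read that may wake the device.
 */
struct pci_backend {
	int (*ids)(unsigned int *vendor, unsigned int *device);
	int (*revision)(unsigned int *revision);
};

static int pci_info_get(const struct pci_backend *be, unsigned int flags,
			struct pci_info *out)
{
	if (be->ids(&out->vendor, &out->device))
		return -1;

	out->revision = 0;
	out->revision_valid = 0;
	if (flags & PCI_INFO_SKIP_REVISION)
		return 0;	/* fast path: device is left alone */

	if (be->revision(&out->revision))
		return -1;
	out->revision_valid = 1;
	return 0;
}

/* Stub backend with placeholder IDs, counting the expensive reads. */
static int expensive_reads;

static int stub_ids(unsigned int *vendor, unsigned int *device)
{
	*vendor = 0x1234;	/* placeholder values, not real hardware */
	*device = 0x5678;
	return 0;
}

static int stub_revision(unsigned int *revision)
{
	expensive_reads++;
	*revision = 0x81;	/* placeholder */
	return 0;
}

static const struct pci_backend stub_backend = { stub_ids, stub_revision };
```

The revision_valid field is one way to give the caller the missing hint
about whether the returned revision is accurate.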

So what's the benefit of adding it?  This seems like a long circular
chain of making things simpler in one area but having to add new
complications in another to compensate.

Bjorn

Patch

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index bcd10c7..0666287 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -50,6 +50,7 @@  pci_config_attr(vendor, "0x%04x\n");
 pci_config_attr(device, "0x%04x\n");
 pci_config_attr(subsystem_vendor, "0x%04x\n");
 pci_config_attr(subsystem_device, "0x%04x\n");
+pci_config_attr(revision, "0x%02x\n");
 pci_config_attr(class, "0x%06x\n");
 pci_config_attr(irq, "%u\n");
 
@@ -568,6 +569,7 @@  static struct attribute *pci_dev_attrs[] = {
 	&dev_attr_device.attr,
 	&dev_attr_subsystem_vendor.attr,
 	&dev_attr_subsystem_device.attr,
+	&dev_attr_revision.attr,
 	&dev_attr_class.attr,
 	&dev_attr_irq.attr,
 	&dev_attr_local_cpus.attr,