
pci-driver: Add driver load messages

Message ID 20210125194138.1103155-1-prarit@redhat.com (mailing list archive)
State Not Applicable
Delegated to: Bjorn Helgaas
Series pci-driver: Add driver load messages

Commit Message

Prarit Bhargava Jan. 25, 2021, 7:41 p.m. UTC
There are two situations where driver load messages are helpful.

1) Some drivers load silently on devices, and debugging driver or system
failures in these cases is difficult.  While some drivers (networking, for
example) may not have completely initialized by the time the PCI driver's
probe() function returns, it is still useful to have some indication that
the probe completed.

2) Storage and network device vendors support some of their hardware for
relatively short periods.  Some devices may continue to function but are
problematic due to out-of-date firmware or other issues.  Maintaining a
database of such hardware in the kernel is out of the question, as it would
require constant updating.  Outputting a message in the log would allow
different OSes to determine whether the problem hardware is truly supported.

Add optional driver load messages from the PCI core that indicate which
driver was loaded, in which slot, and on which device.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Myron Stowe <mstowe@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-doc@vger.kernel.org
---
 Documentation/admin-guide/kernel-parameters.txt |  2 ++
 drivers/pci/pci-driver.c                        | 14 +++++++++++++-
 drivers/pci/pci.c                               |  2 ++
 drivers/pci/pci.h                               |  1 +
 4 files changed, 18 insertions(+), 1 deletion(-)

Comments

Leon Romanovsky Jan. 26, 2021, 6:39 a.m. UTC | #1
On Mon, Jan 25, 2021 at 02:41:38PM -0500, Prarit Bhargava wrote:
> There are two situations where driver load messages are helpful.
>
> 1) Some drivers silently load on devices and debugging driver or system
> failures in these cases is difficult.  While some drivers (networking
> for example) may not completely initialize when the PCI driver probe() function
> has returned, it is still useful to have some idea of driver completion.

Sorry, it is probably just me, but I don't understand this use case.
Are you adding a global, kernel-wide command line boot argument to debug
what, exactly, and when?

During boot:
If the device succeeds, you will see it in /sys/bus/pci/[drivers|devices]/*.
If the device fails, you should get an error from that device (fix the
device to return an error), or something immediately won't work and
you won't see it in sysfs.

During run:
We have many other solutions for getting debug prints at runtime, for
example tracing, which can be toggled dynamically.

Right now, my laptop would print 34 messages on boot and an endless
amount during day-to-day usage.

➜  kernel git:(rdma-next) ✗ lspci |wc -l
34

>
> 2) Storage and Network device vendors have relatively short lives for
> some of their hardware.  Some devices may continue to function but are
> problematic due to out-of-date firmware or other issues.  Maintaining
> a database of the hardware is out-of-the-question in the kernel as it would
> require constant updating.  Outputting a message in the log would allow
> different OSes to determine if the problem hardware was truly supported or not.

And rely on dmesg output as the source of truth for supported/not-supported,
making it an ABI that needs a command-line knob?

>
> Add optional driver load messages from the PCI core that indicates which
> driver was loaded, on which slot, and on which device.

Why don't you add a simple pr_debug() without any knob? You would be able
to enable/disable it through the dynamic debug facility.
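
For example (a rough sketch, not tested): assuming CONFIG_DYNAMIC_DEBUG is
enabled and debugfs is mounted, a dev_dbg()/pci_dbg() in the probe path can
be toggled at runtime without any new command line option at all:

  # enable all pr_debug()/dev_dbg() callsites in drivers/pci/pci-driver.c
  echo 'file pci-driver.c +p' > /sys/kernel/debug/dynamic_debug/control

  # and turn them off again
  echo 'file pci-driver.c -p' > /sys/kernel/debug/dynamic_debug/control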

Thanks
Prarit Bhargava Jan. 26, 2021, 12:54 p.m. UTC | #2
Leon Romanovsky <leon@kernel.org> wrote:
> On Mon, Jan 25, 2021 at 02:41:38PM -0500, Prarit Bhargava wrote:
> > There are two situations where driver load messages are helpful.
> >
> > 1) Some drivers silently load on devices and debugging driver or system
> > failures in these cases is difficult.  While some drivers (networking
> > for example) may not completely initialize when the PCI driver probe() function
> > has returned, it is still useful to have some idea of driver completion.
> 
> Sorry, probably it is me, but I don't understand this use case.
> Are you adding global to whole kernel command line boot argument to debug
> what and when?
> 
> During boot:
> If device success, you will see it in /sys/bus/pci/[drivers|devices]/*.
> If device fails, you should get an error from that device (fix the
> device to return an error), or something immediately won't work and
> you won't see it in sysfs.
> 

What if there is a panic during boot?  There's no way to get to sysfs.
That's the case where this is helpful.

> During run:
> We have many other solutions to get debug prints during run, for example
> tracing, which is possible to toggle dynamically.
> 
> Right now, my laptop will print 34 prints on boot and endless amount during
> day-to-day usage.
> 
> ➜  kernel git:(rdma-next) ✗ lspci |wc -l
> 34
> 
> >
> > 2) Storage and Network device vendors have relatively short lives for
> > some of their hardware.  Some devices may continue to function but are
> > problematic due to out-of-date firmware or other issues.  Maintaining
> > a database of the hardware is out-of-the-question in the kernel as it would
> > require constant updating.  Outputting a message in the log would allow
> > different OSes to determine if the problem hardware was truly supported or not.
> 
> And rely on some dmesg output as a true source of supported/not supported and
> making this ABI which needs knob in command line. ?

Yes.  A saved console log would serve as a source of truth for load
messages to be interpreted by an OS tool.  But I see your point about the
knob below...

> 
> >
> > Add optional driver load messages from the PCI core that indicates which
> > driver was loaded, on which slot, and on which device.
> 
> Why don't you add simple pr_debug(..) without any knob? You will be able
> to enable/disable it through dynamic prints facility.

Good point.  I'll wait for more feedback and submit a v2 with pr_debug.

P.

> 
> Thanks
Leon Romanovsky Jan. 26, 2021, 1:14 p.m. UTC | #3
On Tue, Jan 26, 2021 at 07:54:46AM -0500, Prarit Bhargava wrote:
>   Leon Romanovsky <leon@kernel.org> wrote:
> > On Mon, Jan 25, 2021 at 02:41:38PM -0500, Prarit Bhargava wrote:
> > > There are two situations where driver load messages are helpful.
> > >
> > > 1) Some drivers silently load on devices and debugging driver or system
> > > failures in these cases is difficult.  While some drivers (networking
> > > for example) may not completely initialize when the PCI driver probe() function
> > > has returned, it is still useful to have some idea of driver completion.
> >
> > Sorry, probably it is me, but I don't understand this use case.
> > Are you adding global to whole kernel command line boot argument to debug
> > what and when?
> >
> > During boot:
> > If device success, you will see it in /sys/bus/pci/[drivers|devices]/*.
> > If device fails, you should get an error from that device (fix the
> > device to return an error), or something immediately won't work and
> > you won't see it in sysfs.
> >
>
> What if there is a panic during boot?  There's no way to get to sysfs.
> That's the case where this is helpful.

How? If you have a kernel panic, you have a much worse problem than an
unsupported device. If the kernel panic was caused by the driver, you
will see a call trace related to it. If the kernel panic was caused by
something else, supported/not-supported won't help here.

>
> > During run:
> > We have many other solutions to get debug prints during run, for example
> > tracing, which is possible to toggle dynamically.
> >
> > Right now, my laptop will print 34 prints on boot and endless amount during
> > day-to-day usage.
> >
> > ➜  kernel git:(rdma-next) ✗ lspci |wc -l
> > 34
> >
> > >
> > > 2) Storage and Network device vendors have relatively short lives for
> > > some of their hardware.  Some devices may continue to function but are
> > > problematic due to out-of-date firmware or other issues.  Maintaining
> > > a database of the hardware is out-of-the-question in the kernel as it would
> > > require constant updating.  Outputting a message in the log would allow
> > > different OSes to determine if the problem hardware was truly supported or not.
> >
> > And rely on some dmesg output as a true source of supported/not supported and
> > making this ABI which needs knob in command line. ?
>
> Yes.  The console log being saved would work as a true source of load
> messages to be interpreted by an OS tool.  But I see your point about the
> knob below...

You will need a much stronger claim than the above if you want to pursue
an ABI path through dmesg prints.

>
> >
> > >
> > > Add optional driver load messages from the PCI core that indicates which
> > > driver was loaded, on which slot, and on which device.
> >
> > Why don't you add simple pr_debug(..) without any knob? You will be able
> > to enable/disable it through dynamic prints facility.
>
> Good point.  I'll wait for more feedback and submit a v2 with pr_debug.

Just to be clear, none of this can be ABI, and any kernel print can
be changed or removed at any time without any announcement.

Thanks

>
> P.
>
> >
> > Thanks
>
Prarit Bhargava Jan. 26, 2021, 1:42 p.m. UTC | #4
On 1/26/21 8:14 AM, Leon Romanovsky wrote:
> On Tue, Jan 26, 2021 at 07:54:46AM -0500, Prarit Bhargava wrote:
>>   Leon Romanovsky <leon@kernel.org> wrote:
>>> On Mon, Jan 25, 2021 at 02:41:38PM -0500, Prarit Bhargava wrote:
>>>> There are two situations where driver load messages are helpful.
>>>>
>>>> 1) Some drivers silently load on devices and debugging driver or system
>>>> failures in these cases is difficult.  While some drivers (networking
>>>> for example) may not completely initialize when the PCI driver probe() function
>>>> has returned, it is still useful to have some idea of driver completion.
>>>
>>> Sorry, probably it is me, but I don't understand this use case.
>>> Are you adding global to whole kernel command line boot argument to debug
>>> what and when?
>>>
>>> During boot:
>>> If device success, you will see it in /sys/bus/pci/[drivers|devices]/*.
>>> If device fails, you should get an error from that device (fix the
>>> device to return an error), or something immediately won't work and
>>> you won't see it in sysfs.
>>>
>>
>> What if there is a panic during boot?  There's no way to get to sysfs.
>> That's the case where this is helpful.
> 
> How? If you have kernel panic, it means you have much more worse problem
> than not-supported device. If kernel panic was caused by the driver, you
> will see call trace related to it. If kernel panic was caused by
> something else, supported/not supported won't help here.

I still have no idea *WHICH* device it was that the panic occurred on.
> 
>>
>>> During run:
>>> We have many other solutions to get debug prints during run, for example
>>> tracing, which is possible to toggle dynamically.
>>>
>>> Right now, my laptop will print 34 prints on boot and endless amount during
>>> day-to-day usage.
>>>
>>> ➜  kernel git:(rdma-next) ✗ lspci |wc -l
>>> 34
>>>
>>>>
>>>> 2) Storage and Network device vendors have relatively short lives for
>>>> some of their hardware.  Some devices may continue to function but are
>>>> problematic due to out-of-date firmware or other issues.  Maintaining
>>>> a database of the hardware is out-of-the-question in the kernel as it would
>>>> require constant updating.  Outputting a message in the log would allow
>>>> different OSes to determine if the problem hardware was truly supported or not.
>>>
>>> And rely on some dmesg output as a true source of supported/not supported and
>>> making this ABI which needs knob in command line. ?
>>
>> Yes.  The console log being saved would work as a true source of load
>> messages to be interpreted by an OS tool.  But I see your point about the
>> knob below...
> 
> You will need much more stronger claim than the above if you want to proceed
> ABI path through dmesg prints.
> 

See my answer below.  I agree with you on the ABI statement.

>>
>>>
>>>>
>>>> Add optional driver load messages from the PCI core that indicates which
>>>> driver was loaded, on which slot, and on which device.
>>>
>>> Why don't you add simple pr_debug(..) without any knob? You will be able
>>> to enable/disable it through dynamic prints facility.
>>
>> Good point.  I'll wait for more feedback and submit a v2 with pr_debug.
> 
> Just to be clear, none of this can be ABI and any kernel print can
> be changed or removed any minute without any announcement.

Yes, that's absolutely the case, and I agree with you that nothing can
guarantee the ABI of those pr_debug() statements.  They are *debug* after all.

P.

> 
> Thanks
> 
>>
>> P.
>>
>>>
>>> Thanks
>>
>
Leon Romanovsky Jan. 26, 2021, 1:53 p.m. UTC | #5
On Tue, Jan 26, 2021 at 08:42:12AM -0500, Prarit Bhargava wrote:
>
>
> On 1/26/21 8:14 AM, Leon Romanovsky wrote:
> > On Tue, Jan 26, 2021 at 07:54:46AM -0500, Prarit Bhargava wrote:
> >>   Leon Romanovsky <leon@kernel.org> wrote:
> >>> On Mon, Jan 25, 2021 at 02:41:38PM -0500, Prarit Bhargava wrote:
> >>>> There are two situations where driver load messages are helpful.
> >>>>
> >>>> 1) Some drivers silently load on devices and debugging driver or system
> >>>> failures in these cases is difficult.  While some drivers (networking
> >>>> for example) may not completely initialize when the PCI driver probe() function
> >>>> has returned, it is still useful to have some idea of driver completion.
> >>>
> >>> Sorry, probably it is me, but I don't understand this use case.
> >>> Are you adding global to whole kernel command line boot argument to debug
> >>> what and when?
> >>>
> >>> During boot:
> >>> If device success, you will see it in /sys/bus/pci/[drivers|devices]/*.
> >>> If device fails, you should get an error from that device (fix the
> >>> device to return an error), or something immediately won't work and
> >>> you won't see it in sysfs.
> >>>
> >>
> >> What if there is a panic during boot?  There's no way to get to sysfs.
> >> That's the case where this is helpful.
> >
> > How? If you have kernel panic, it means you have much more worse problem
> > than not-supported device. If kernel panic was caused by the driver, you
> > will see call trace related to it. If kernel panic was caused by
> > something else, supported/not supported won't help here.
>
> I still have no idea *WHICH* device it was that the panic occurred on.

The kernel panic is printed from the driver. There is one driver loaded
for all of the identical PCI devices, and they are probed without regard
to how many there are.

If you have a host with ten of the same card, you will see one driver, and
that is where the problem lies, not in whether the device is supported.

> >
> >>
> >>> During run:
> >>> We have many other solutions to get debug prints during run, for example
> >>> tracing, which is possible to toggle dynamically.
> >>>
> >>> Right now, my laptop will print 34 prints on boot and endless amount during
> >>> day-to-day usage.
> >>>
> >>> ➜  kernel git:(rdma-next) ✗ lspci |wc -l
> >>> 34
> >>>
> >>>>
> >>>> 2) Storage and Network device vendors have relatively short lives for
> >>>> some of their hardware.  Some devices may continue to function but are
> >>>> problematic due to out-of-date firmware or other issues.  Maintaining
> >>>> a database of the hardware is out-of-the-question in the kernel as it would
> >>>> require constant updating.  Outputting a message in the log would allow
> >>>> different OSes to determine if the problem hardware was truly supported or not.
> >>>
> >>> And rely on some dmesg output as a true source of supported/not supported and
> >>> making this ABI which needs knob in command line. ?
> >>
> >> Yes.  The console log being saved would work as a true source of load
> >> messages to be interpreted by an OS tool.  But I see your point about the
> >> knob below...
> >
> > You will need much more stronger claim than the above if you want to proceed
> > ABI path through dmesg prints.
> >
>
> See my answer below.  I agree with you on the ABI statement.
>
> >>
> >>>
> >>>>
> >>>> Add optional driver load messages from the PCI core that indicates which
> >>>> driver was loaded, on which slot, and on which device.
> >>>
> >>> Why don't you add simple pr_debug(..) without any knob? You will be able
> >>> to enable/disable it through dynamic prints facility.
> >>
> >> Good point.  I'll wait for more feedback and submit a v2 with pr_debug.
> >
> > Just to be clear, none of this can be ABI and any kernel print can
> > be changed or removed any minute without any announcement.
>
> Yes, that's absolutely the case and I agree with you that nothing can guarantee
> ABI of those pr_debug() statements.  They are *debug* after all.

You missed the point. ALL pr_*() prints are non-ABI, regardless of their level.

Thanks

>
> P.
>
> >
> > Thanks
> >
> >>
> >> P.
> >>
> >>>
> >>> Thanks
> >>
> >
>
Prarit Bhargava Jan. 26, 2021, 2:05 p.m. UTC | #6
On 1/26/21 8:53 AM, Leon Romanovsky wrote:
> On Tue, Jan 26, 2021 at 08:42:12AM -0500, Prarit Bhargava wrote:
>>
>>
>> On 1/26/21 8:14 AM, Leon Romanovsky wrote:
>>> On Tue, Jan 26, 2021 at 07:54:46AM -0500, Prarit Bhargava wrote:
>>>>   Leon Romanovsky <leon@kernel.org> wrote:
>>>>> On Mon, Jan 25, 2021 at 02:41:38PM -0500, Prarit Bhargava wrote:
>>>>>> There are two situations where driver load messages are helpful.
>>>>>>
>>>>>> 1) Some drivers silently load on devices and debugging driver or system
>>>>>> failures in these cases is difficult.  While some drivers (networking
>>>>>> for example) may not completely initialize when the PCI driver probe() function
>>>>>> has returned, it is still useful to have some idea of driver completion.
>>>>>
>>>>> Sorry, probably it is me, but I don't understand this use case.
>>>>> Are you adding global to whole kernel command line boot argument to debug
>>>>> what and when?
>>>>>
>>>>> During boot:
>>>>> If device success, you will see it in /sys/bus/pci/[drivers|devices]/*.
>>>>> If device fails, you should get an error from that device (fix the
>>>>> device to return an error), or something immediately won't work and
>>>>> you won't see it in sysfs.
>>>>>
>>>>
>>>> What if there is a panic during boot?  There's no way to get to sysfs.
>>>> That's the case where this is helpful.
>>>
>>> How? If you have kernel panic, it means you have much more worse problem
>>> than not-supported device. If kernel panic was caused by the driver, you
>>> will see call trace related to it. If kernel panic was caused by
>>> something else, supported/not supported won't help here.
>>
>> I still have no idea *WHICH* device it was that the panic occurred on.
> 
> The kernel panic is printed from the driver. There is one driver loaded
> for all same PCI devices which are probed without relation to their
> number.>
> If you have host with ten same cards, you will see one driver and this
> is where the problem and not in supported/not-supported device.

That's true, but you can also have different cards loading the same driver.
See, for example, the PCI ID table in any driver.

For example,

10:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 [Fury] (rev 02)
20:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3108 [Invader] (rev 02)

Both load the megaraid driver and have different profiles within the driver.  I
have no idea which one actually panicked until removing one card.

It's MUCH worse when debugging new hardware and getting a panic from, for
example, the uncore code, which binds to a PCI-mapped device.  One device might
work and the next one might not.  And then you can multiply that by seeing *many*
panics at once while trying to determine whether the problem was on one specific
socket, die, or core.

> 
>>>
>>>>
>>>>> During run:
>>>>> We have many other solutions to get debug prints during run, for example
>>>>> tracing, which is possible to toggle dynamically.
>>>>>
>>>>> Right now, my laptop will print 34 prints on boot and endless amount during
>>>>> day-to-day usage.
>>>>>
>>>>> ➜  kernel git:(rdma-next) ✗ lspci |wc -l
>>>>> 34
>>>>>
>>>>>>
>>>>>> 2) Storage and Network device vendors have relatively short lives for
>>>>>> some of their hardware.  Some devices may continue to function but are
>>>>>> problematic due to out-of-date firmware or other issues.  Maintaining
>>>>>> a database of the hardware is out-of-the-question in the kernel as it would
>>>>>> require constant updating.  Outputting a message in the log would allow
>>>>>> different OSes to determine if the problem hardware was truly supported or not.
>>>>>
>>>>> And rely on some dmesg output as a true source of supported/not supported and
>>>>> making this ABI which needs knob in command line. ?
>>>>
>>>> Yes.  The console log being saved would work as a true source of load
>>>> messages to be interpreted by an OS tool.  But I see your point about the
>>>> knob below...
>>>
>>> You will need much more stronger claim than the above if you want to proceed
>>> ABI path through dmesg prints.
>>>
>>
>> See my answer below.  I agree with you on the ABI statement.
>>
>>>>
>>>>>
>>>>>>
>>>>>> Add optional driver load messages from the PCI core that indicates which
>>>>>> driver was loaded, on which slot, and on which device.
>>>>>
>>>>> Why don't you add simple pr_debug(..) without any knob? You will be able
>>>>> to enable/disable it through dynamic prints facility.
>>>>
>>>> Good point.  I'll wait for more feedback and submit a v2 with pr_debug.
>>>
>>> Just to be clear, none of this can be ABI and any kernel print can
>>> be changed or removed any minute without any announcement.
>>
>> Yes, that's absolutely the case and I agree with you that nothing can guarantee
>> ABI of those pr_debug() statements.  They are *debug* after all.
> 
> You missed the point. ALL pr*() prints are not ABI, without relation to their level.
> 

Yes, I understood that.  I'm just emphasizing your ABI concern.

P.

> Thanks
> 
>>
>> P.
>>
>>>
>>> Thanks
>>>
>>>>
>>>> P.
>>>>
>>>>>
>>>>> Thanks
>>>>
>>>
>>
>
Bjorn Helgaas Jan. 26, 2021, 3:12 p.m. UTC | #7
Hi Prarit,

On Tue, Jan 26, 2021 at 09:05:23AM -0500, Prarit Bhargava wrote:
> On 1/26/21 8:53 AM, Leon Romanovsky wrote:
> > On Tue, Jan 26, 2021 at 08:42:12AM -0500, Prarit Bhargava wrote:
> >> On 1/26/21 8:14 AM, Leon Romanovsky wrote:
> >>> On Tue, Jan 26, 2021 at 07:54:46AM -0500, Prarit Bhargava wrote:
> >>>>   Leon Romanovsky <leon@kernel.org> wrote:
> >>>>> On Mon, Jan 25, 2021 at 02:41:38PM -0500, Prarit Bhargava wrote:
> >>>>>> There are two situations where driver load messages are helpful.
> >>>>>>
> >>>>>> 1) Some drivers silently load on devices and debugging driver or system
> >>>>>> failures in these cases is difficult.  While some drivers (networking
> >>>>>> for example) may not completely initialize when the PCI driver probe() function
> >>>>>> has returned, it is still useful to have some idea of driver completion.
> >>>>>
> >>>>> Sorry, probably it is me, but I don't understand this use case.
> >>>>> Are you adding global to whole kernel command line boot argument to debug
> >>>>> what and when?
> >>>>>
> >>>>> During boot:
> >>>>> If device success, you will see it in /sys/bus/pci/[drivers|devices]/*.
> >>>>> If device fails, you should get an error from that device (fix the
> >>>>> device to return an error), or something immediately won't work and
> >>>>> you won't see it in sysfs.
> >>>>
> >>>> What if there is a panic during boot?  There's no way to get to sysfs.
> >>>> That's the case where this is helpful.
> >>>
> >>> How? If you have kernel panic, it means you have much more worse problem
> >>> than not-supported device. If kernel panic was caused by the driver, you
> >>> will see call trace related to it. If kernel panic was caused by
> >>> something else, supported/not supported won't help here.
> >>
> >> I still have no idea *WHICH* device it was that the panic occurred on.
> > 
> > The kernel panic is printed from the driver. There is one driver loaded
> > for all same PCI devices which are probed without relation to their
> > number.>
> > If you have host with ten same cards, you will see one driver and this
> > is where the problem and not in supported/not-supported device.
> 
> That's true, but you can also have different cards loading the same driver.
> See, for example, any PCI_IDs list in a driver.
> 
> For example,
> 
> 10:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 [Fury] (rev 02)
> 20:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3108 [Invader] (rev 02)
> 
> Both load the megaraid driver and have different profiles within the
> driver.  I have no idea which one actually panicked until removing
> one card.
> 
> It's MUCH worse when debugging new hardware and getting a panic
> from, for example, the uncore code which binds to a PCI mapped
> device.  One device might work and the next one doesn't.  And then
> you can multiply that by seeing *many* panics at once and trying to
> determine if the problem was on one specific socket, die, or core.

Would a dev_panic() interface that identified the device and driver
help with this?

For driver_load_messages, it doesn't seem necessarily PCI-specific.
If we want a message like that, maybe it could be in
driver_probe_device() or similar?  There are already a few pr_debug()
calls in that path.  There are some enabled by initcall_debug that
include the return value from the probe; would those be close to what
you're looking for?
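
For reference (untested here), both of those can be turned on from the
kernel command line, e.g.:

  initcall_debug dyndbg="file dd.c +p"

where initcall_debug prints the return value of each probe and the dyndbg
query enables the existing pr_debug() calls in drivers/base/dd.c.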

Bjorn
Prarit Bhargava Jan. 29, 2021, 6:38 p.m. UTC | #8
On 1/26/21 10:12 AM, Bjorn Helgaas wrote:
> Hi Prarit,
>
> On Tue, Jan 26, 2021 at 09:05:23AM -0500, Prarit Bhargava wrote:
>> On 1/26/21 8:53 AM, Leon Romanovsky wrote:
>>> On Tue, Jan 26, 2021 at 08:42:12AM -0500, Prarit Bhargava wrote:
>>>> On 1/26/21 8:14 AM, Leon Romanovsky wrote:
>>>>> On Tue, Jan 26, 2021 at 07:54:46AM -0500, Prarit Bhargava wrote:
>>>>>>   Leon Romanovsky <leon@kernel.org> wrote:
>>>>>>> On Mon, Jan 25, 2021 at 02:41:38PM -0500, Prarit Bhargava wrote:
>>>>>>>> There are two situations where driver load messages are helpful.
>>>>>>>>
>>>>>>>> 1) Some drivers silently load on devices and debugging driver or system
>>>>>>>> failures in these cases is difficult.  While some drivers (networking
>>>>>>>> for example) may not completely initialize when the PCI driver probe() function
>>>>>>>> has returned, it is still useful to have some idea of driver completion.
>>>>>>>
>>>>>>> Sorry, probably it is me, but I don't understand this use case.
>>>>>>> Are you adding global to whole kernel command line boot argument to debug
>>>>>>> what and when?
>>>>>>>
>>>>>>> During boot:
>>>>>>> If device success, you will see it in /sys/bus/pci/[drivers|devices]/*.
>>>>>>> If device fails, you should get an error from that device (fix the
>>>>>>> device to return an error), or something immediately won't work and
>>>>>>> you won't see it in sysfs.
>>>>>>
>>>>>> What if there is a panic during boot?  There's no way to get to sysfs.
>>>>>> That's the case where this is helpful.
>>>>>
>>>>> How? If you have kernel panic, it means you have much more worse problem
>>>>> than not-supported device. If kernel panic was caused by the driver, you
>>>>> will see call trace related to it. If kernel panic was caused by
>>>>> something else, supported/not supported won't help here.
>>>>
>>>> I still have no idea *WHICH* device it was that the panic occurred on.
>>>
>>> The kernel panic is printed from the driver. There is one driver loaded
>>> for all same PCI devices which are probed without relation to their
>>> number.>
>>> If you have host with ten same cards, you will see one driver and this
>>> is where the problem and not in supported/not-supported device.
>>
>> That's true, but you can also have different cards loading the same driver.
>> See, for example, any PCI_IDs list in a driver.
>>
>> For example,
>>
>> 10:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 [Fury] (rev 02)
>> 20:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3108 [Invader] (rev 02)
>>
>> Both load the megaraid driver and have different profiles within the
>> driver.  I have no idea which one actually panicked until removing
>> one card.
>>
>> It's MUCH worse when debugging new hardware and getting a panic
>> from, for example, the uncore code which binds to a PCI mapped
>> device.  One device might work and the next one doesn't.  And then
>> you can multiply that by seeing *many* panics at once and trying to
>> determine if the problem was on one specific socket, die, or core.
>
> Would a dev_panic() interface that identified the device and driver
> help with this?

It would but,

>
> For driver_load_messages, it doesn't seem necessarily PCI-specific.
> If we want a message like that, maybe it could be in
> driver_probe_device() or similar?  There are already a few pr_debug()
> calls in that path.  There are some enabled by initcall_debug that
> include the return value from the probe; would those be close to what
> you're looking for?

I think drivers/base/dd.c:727 might suffice.  Let me try some tests
with that and get back to you.

Thanks for the pointers,

P.
>
> Bjorn
>
Prarit Bhargava Feb. 18, 2021, 6:36 p.m. UTC | #9
On 1/26/21 10:12 AM, Bjorn Helgaas wrote:
> Hi Prarit,
> 
> On Tue, Jan 26, 2021 at 09:05:23AM -0500, Prarit Bhargava wrote:
>> On 1/26/21 8:53 AM, Leon Romanovsky wrote:
>>> On Tue, Jan 26, 2021 at 08:42:12AM -0500, Prarit Bhargava wrote:
>>>> On 1/26/21 8:14 AM, Leon Romanovsky wrote:
>>>>> On Tue, Jan 26, 2021 at 07:54:46AM -0500, Prarit Bhargava wrote:
>>>>>>   Leon Romanovsky <leon@kernel.org> wrote:
>>>>>>> On Mon, Jan 25, 2021 at 02:41:38PM -0500, Prarit Bhargava wrote:
>>>>>>>> There are two situations where driver load messages are helpful.
>>>>>>>>
>>>>>>>> 1) Some drivers silently load on devices and debugging driver or system
>>>>>>>> failures in these cases is difficult.  While some drivers (networking
>>>>>>>> for example) may not completely initialize when the PCI driver probe() function
>>>>>>>> has returned, it is still useful to have some idea of driver completion.
>>>>>>>
>>>>>>> Sorry, probably it is me, but I don't understand this use case.
>>>>>>> Are you adding global to whole kernel command line boot argument to debug
>>>>>>> what and when?
>>>>>>>
>>>>>>> During boot:
>>>>>>> If device success, you will see it in /sys/bus/pci/[drivers|devices]/*.
>>>>>>> If device fails, you should get an error from that device (fix the
>>>>>>> device to return an error), or something immediately won't work and
>>>>>>> you won't see it in sysfs.
>>>>>>
>>>>>> What if there is a panic during boot?  There's no way to get to sysfs.
>>>>>> That's the case where this is helpful.
>>>>>
>>>>> How? If you have kernel panic, it means you have much more worse problem
>>>>> than not-supported device. If kernel panic was caused by the driver, you
>>>>> will see call trace related to it. If kernel panic was caused by
>>>>> something else, supported/not supported won't help here.
>>>>
>>>> I still have no idea *WHICH* device it was that the panic occurred on.
>>>
>>> The kernel panic is printed from the driver. There is one driver loaded
>>> for all same PCI devices which are probed without relation to their
>>> number.>
>>> If you have host with ten same cards, you will see one driver and this
>>> is where the problem and not in supported/not-supported device.
>>
>> That's true, but you can also have different cards loading the same driver.
>> See, for example, any PCI_IDs list in a driver.
>>
>> For example,
>>
>> 10:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 [Fury] (rev 02)
>> 20:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3108 [Invader] (rev 02)
>>
>> Both load the megaraid driver and have different profiles within the
>> driver.  I have no idea which one actually panicked until removing
>> one card.
>>
>> It's MUCH worse when debugging new hardware and getting a panic
>> from, for example, the uncore code which binds to a PCI mapped
>> device.  One device might work and the next one doesn't.  And then
>> you can multiply that by seeing *many* panics at once and trying to
>> determine if the problem was on one specific socket, die, or core.
> 
> Would a dev_panic() interface that identified the device and driver
> help with this?
> 

^^ The more I look at this problem, the more I think a dev_panic() that
would output a device-specific message at panic time is what I really need.

> For driver_load_messages, it doesn't seem necessarily PCI-specific.
> If we want a message like that, maybe it could be in
> driver_probe_device() or similar?  There are already a few pr_debug()
> calls in that path.  There are some enabled by initcall_debug that
> include the return value from the probe; would those be close to what
> you're looking for?

I took a look at those, and unfortunately they do not meet my requirements.
Ultimately, at panic time, I need to know that a driver was loaded on a device
at a specific location in the PCI space.

The driver_probe_device() pr_debug() calls tell me the location and the driver,
but nothing to uniquely identify the device (i.e., the PCI vendor and device
IDs).

It sounds like you've had some thoughts about a dev_panic() implementation.
Care to share them with me?  I'm more than willing to implement it but just want
to get your more experienced view of what is needed.

P.

> 
> Bjorn
>
Bjorn Helgaas Feb. 18, 2021, 7:06 p.m. UTC | #10
On Thu, Feb 18, 2021 at 01:36:35PM -0500, Prarit Bhargava wrote:
> On 1/26/21 10:12 AM, Bjorn Helgaas wrote:
> > On Tue, Jan 26, 2021 at 09:05:23AM -0500, Prarit Bhargava wrote:
> >> On 1/26/21 8:53 AM, Leon Romanovsky wrote:
> >>> On Tue, Jan 26, 2021 at 08:42:12AM -0500, Prarit Bhargava wrote:
> >>>> On 1/26/21 8:14 AM, Leon Romanovsky wrote:
> >>>>> On Tue, Jan 26, 2021 at 07:54:46AM -0500, Prarit Bhargava wrote:
> >>>>>>   Leon Romanovsky <leon@kernel.org> wrote:
> >>>>>>> On Mon, Jan 25, 2021 at 02:41:38PM -0500, Prarit Bhargava wrote:
> >>>>>>>> There are two situations where driver load messages are helpful.
> >>>>>>>>
> >>>>>>>> 1) Some drivers silently load on devices and debugging driver or system
> >>>>>>>> failures in these cases is difficult.  While some drivers (networking
> >>>>>>>> for example) may not completely initialize when the PCI driver probe() function
> >>>>>>>> has returned, it is still useful to have some idea of driver completion.
> >>>>>>>
> >>>>>>> Sorry, probably it is me, but I don't understand this use case.
> >>>>>>> Are you adding global to whole kernel command line boot argument to debug
> >>>>>>> what and when?
> >>>>>>>
> >>>>>>> During boot:
> >>>>>>> If device success, you will see it in /sys/bus/pci/[drivers|devices]/*.
> >>>>>>> If device fails, you should get an error from that device (fix the
> >>>>>>> device to return an error), or something immediately won't work and
> >>>>>>> you won't see it in sysfs.
> >>>>>>
> >>>>>> What if there is a panic during boot?  There's no way to get to sysfs.
> >>>>>> That's the case where this is helpful.
> >>>>>
> >>>>> How? If you have kernel panic, it means you have much more worse problem
> >>>>> than not-supported device. If kernel panic was caused by the driver, you
> >>>>> will see call trace related to it. If kernel panic was caused by
> >>>>> something else, supported/not supported won't help here.
> >>>>
> >>>> I still have no idea *WHICH* device it was that the panic occurred on.
> >>>
> >>> The kernel panic is printed from the driver. There is one driver loaded
> >>> for all same PCI devices which are probed without relation to their
> >>> number.>
> >>> If you have host with ten same cards, you will see one driver and this
> >>> is where the problem and not in supported/not-supported device.
> >>
> >> That's true, but you can also have different cards loading the same driver.
> >> See, for example, any PCI_IDs list in a driver.
> >>
> >> For example,
> >>
> >> 10:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 [Fury] (rev 02)
> >> 20:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3108 [Invader] (rev 02)
> >>
> >> Both load the megaraid driver and have different profiles within the
> >> driver.  I have no idea which one actually panicked until removing
> >> one card.
> >>
> >> It's MUCH worse when debugging new hardware and getting a panic
> >> from, for example, the uncore code which binds to a PCI mapped
> >> device.  One device might work and the next one doesn't.  And
> >> then you can multiply that by seeing *many* panics at once and
> >> trying to determine if the problem was on one specific socket,
> >> die, or core.
> > 
> > Would a dev_panic() interface that identified the device and
> > driver help with this?
> 
> ^^ the more I look at this problem, the more a dev_panic() that
> would output a device specific message at panic time is what I
> really need.
> 
> > For driver_load_messages, it doesn't seem necessarily
> > PCI-specific.  If we want a message like that, maybe it could be
> > in driver_probe_device() or similar?  There are already a few
> > pr_debug() calls in that path.  There are some enabled by
> > initcall_debug that include the return value from the probe; would
> > those be close to what you're looking for?
> 
> I took a look at those, and unfortunately they do not meet my
> requirements.  Ultimately, at panic time, I need to know that a
> driver was loaded on a device at a specific location in the PCI
> space.
> 
> The driver_probe_device() pr_debug calls tell me the location and
> the driver, but not anything to uniquely identify the device (ie,
> the PCI vendor and device IDs).
> 
> It sounds like you've had some thoughts about a dev_panic()
> implementation.  Care to share them with me?  I'm more than willing
> to implement it but just want to get your more experienced view of
> what is needed.

A dev_panic() might indeed be useful.  It would be nice to capture the
device information at the *first* crash instead of having to boot
again with initcall_debug and try to reproduce the crash.

include/linux/dev_printk.h has a whole mess of #defines for dev_info()
et al, and maybe that could be extended for dev_panic().  Or maybe the
dev_WARN() definition at the bottom, which basically just pastes the
driver and device info at the beginning of the string, would be
better.
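
Something along these lines, maybe (a completely untested sketch, only to
illustrate the dev_WARN()-style approach of pasting the driver and device
name into the format; there is no dev_panic() in the tree today):

  #define dev_panic(dev, fmt, ...)                                     \
          panic("%s %s: " fmt, dev_driver_string(dev), dev_name(dev),  \
                ##__VA_ARGS__)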

Even if you do add a dev_panic(), I would also take a look through
drivers/base/dd.c and similar files to see whether the
pr_warn()/pr_debug()/etc. calls they make could be converted to
dev_warn()/dev_dbg()/etc.

  $ git grep -A2 pr_ drivers/base | grep dev_name

finds several likely-looking candidates.  Here are some past patches
along that line:

  64b3eaf3715d ("xenbus: Use dev_printk() when possible")
  a88a7b3eb076 ("vfio: Use dev_printk() when possible")
  780da9e4f5bf ("iommu: Use dev_printk() when possible")
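
The conversion itself is usually mechanical, roughly (sketch only):

  -	pr_warn("%s: something failed\n", dev_name(dev));
  +	dev_warn(dev, "something failed\n");

which automatically prefixes the message with the device name (and the
driver name, when one is bound).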

Bjorn
Prarit Bhargava March 4, 2021, 2:42 p.m. UTC | #11
On 2/18/21 2:06 PM, Bjorn Helgaas wrote:
> On Thu, Feb 18, 2021 at 01:36:35PM -0500, Prarit Bhargava wrote:
>> On 1/26/21 10:12 AM, Bjorn Helgaas wrote:
>>> On Tue, Jan 26, 2021 at 09:05:23AM -0500, Prarit Bhargava wrote:
>>>> On 1/26/21 8:53 AM, Leon Romanovsky wrote:
>>>>> On Tue, Jan 26, 2021 at 08:42:12AM -0500, Prarit Bhargava wrote:
>>>>>> On 1/26/21 8:14 AM, Leon Romanovsky wrote:
>>>>>>> On Tue, Jan 26, 2021 at 07:54:46AM -0500, Prarit Bhargava wrote:
>>>>>>>>   Leon Romanovsky <leon@kernel.org> wrote:
>>>>>>>>> On Mon, Jan 25, 2021 at 02:41:38PM -0500, Prarit Bhargava wrote:
>>>>>>>>>> There are two situations where driver load messages are helpful.
>>>>>>>>>>
>>>>>>>>>> 1) Some drivers silently load on devices and debugging driver or system
>>>>>>>>>> failures in these cases is difficult.  While some drivers (networking
>>>>>>>>>> for example) may not completely initialize when the PCI driver probe() function
>>>>>>>>>> has returned, it is still useful to have some idea of driver completion.
>>>>>>>>>
>>>>>>>>> Sorry, probably it is me, but I don't understand this use case.
>>>>>>>>> Are you adding global to whole kernel command line boot argument to debug
>>>>>>>>> what and when?
>>>>>>>>>
>>>>>>>>> During boot:
>>>>>>>>> If device success, you will see it in /sys/bus/pci/[drivers|devices]/*.
>>>>>>>>> If device fails, you should get an error from that device (fix the
>>>>>>>>> device to return an error), or something immediately won't work and
>>>>>>>>> you won't see it in sysfs.
>>>>>>>>
>>>>>>>> What if there is a panic during boot?  There's no way to get to sysfs.
>>>>>>>> That's the case where this is helpful.
>>>>>>>
>>>>>>> How? If you have kernel panic, it means you have much more worse problem
>>>>>>> than not-supported device. If kernel panic was caused by the driver, you
>>>>>>> will see call trace related to it. If kernel panic was caused by
>>>>>>> something else, supported/not supported won't help here.
>>>>>>
>>>>>> I still have no idea *WHICH* device it was that the panic occurred on.
>>>>>
>>>>> The kernel panic is printed from the driver. There is one driver loaded
>>>>> for all same PCI devices which are probed without relation to their
>>>>> number.>
>>>>> If you have host with ten same cards, you will see one driver and this
>>>>> is where the problem and not in supported/not-supported device.
>>>>
>>>> That's true, but you can also have different cards loading the same driver.
>>>> See, for example, any PCI_IDs list in a driver.
>>>>
>>>> For example,
>>>>
>>>> 10:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 [Fury] (rev 02)
>>>> 20:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3108 [Invader] (rev 02)
>>>>
>>>> Both load the megaraid driver and have different profiles within the
>>>> driver.  I have no idea which one actually panicked until removing
>>>> one card.
>>>>
>>>> It's MUCH worse when debugging new hardware and getting a panic
>>>> from, for example, the uncore code which binds to a PCI mapped
>>>> device.  One device might work and the next one doesn't.  And
>>>> then you can multiply that by seeing *many* panics at once and
>>>> trying to determine if the problem was on one specific socket,
>>>> die, or core.
>>>
>>> Would a dev_panic() interface that identified the device and
>>> driver help with this?
>>
>> ^^ the more I look at this problem, the more a dev_panic() that
>> would output a device specific message at panic time is what I
>> really need.

Bjorn,

I went down this road a bit and had a realization.  The issue isn't with
printing something at panic time, but with the *data* that is output.  Each PCI
device is associated with a struct device, and that device's name is what gets
printed by dev_dbg() and friends.  The PCI subsystem sets the device name at
drivers/pci/probe.c:1799:

	        dev_set_name(&dev->dev, "%04x:%02x:%02x.%d", pci_domain_nr(dev->bus),
                     dev->bus->number, PCI_SLOT(dev->devfn),
                     PCI_FUNC(dev->devfn));

My real problem is that the above information is insufficient when I (or a
user) need to debug a system.  Debugging multiple broken driver loads would be
much easier if I didn't have to constantly add this output manually :).

Would you be okay with adding a *debug* parameter that expands the device name
to include the vendor & device ID pair?  FWIW, I'm somewhat against adding yet
another kernel option, but that's really the information I need.  I could then
add dev_dbg() statements in the local_pci_probe() function, as sketched below.
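
(Untested, and the message text is only an example; dev_dbg()/pci_dbg() is
already gated by dynamic debug, so it may not even need a new knob.)

	pci_dbg(pci_dev, "probing with driver %s [%04x:%04x]\n",
		pci_drv->name, pci_dev->vendor, pci_dev->device);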

Thoughts?

P.
Bjorn Helgaas March 4, 2021, 3:50 p.m. UTC | #12
On Thu, Mar 04, 2021 at 09:42:44AM -0500, Prarit Bhargava wrote:
> 
> 
> On 2/18/21 2:06 PM, Bjorn Helgaas wrote:
> > On Thu, Feb 18, 2021 at 01:36:35PM -0500, Prarit Bhargava wrote:
> >> On 1/26/21 10:12 AM, Bjorn Helgaas wrote:
> >>> On Tue, Jan 26, 2021 at 09:05:23AM -0500, Prarit Bhargava wrote:
> >>>> On 1/26/21 8:53 AM, Leon Romanovsky wrote:
> >>>>> On Tue, Jan 26, 2021 at 08:42:12AM -0500, Prarit Bhargava wrote:
> >>>>>> On 1/26/21 8:14 AM, Leon Romanovsky wrote:
> >>>>>>> On Tue, Jan 26, 2021 at 07:54:46AM -0500, Prarit Bhargava wrote:
> >>>>>>>>   Leon Romanovsky <leon@kernel.org> wrote:
> >>>>>>>>> On Mon, Jan 25, 2021 at 02:41:38PM -0500, Prarit Bhargava wrote:
> >>>>>>>>>> There are two situations where driver load messages are helpful.
> >>>>>>>>>>
> >>>>>>>>>> 1) Some drivers silently load on devices and debugging driver or system
> >>>>>>>>>> failures in these cases is difficult.  While some drivers (networking
> >>>>>>>>>> for example) may not completely initialize when the PCI driver probe() function
> >>>>>>>>>> has returned, it is still useful to have some idea of driver completion.
> >>>>>>>>>
> >>>>>>>>> Sorry, probably it is me, but I don't understand this use case.
> >>>>>>>>> Are you adding global to whole kernel command line boot argument to debug
> >>>>>>>>> what and when?
> >>>>>>>>>
> >>>>>>>>> During boot:
> >>>>>>>>> If device success, you will see it in /sys/bus/pci/[drivers|devices]/*.
> >>>>>>>>> If device fails, you should get an error from that device (fix the
> >>>>>>>>> device to return an error), or something immediately won't work and
> >>>>>>>>> you won't see it in sysfs.
> >>>>>>>>
> >>>>>>>> What if there is a panic during boot?  There's no way to get to sysfs.
> >>>>>>>> That's the case where this is helpful.
> >>>>>>>
> >>>>>>> How? If you have kernel panic, it means you have much more worse problem
> >>>>>>> than not-supported device. If kernel panic was caused by the driver, you
> >>>>>>> will see call trace related to it. If kernel panic was caused by
> >>>>>>> something else, supported/not supported won't help here.
> >>>>>>
> >>>>>> I still have no idea *WHICH* device it was that the panic occurred on.
> >>>>>
> >>>>> The kernel panic is printed from the driver. There is one driver loaded
> >>>>> for all same PCI devices which are probed without relation to their
> >>>>> number.>
> >>>>> If you have host with ten same cards, you will see one driver and this
> >>>>> is where the problem and not in supported/not-supported device.
> >>>>
> >>>> That's true, but you can also have different cards loading the same driver.
> >>>> See, for example, any PCI_IDs list in a driver.
> >>>>
> >>>> For example,
> >>>>
> >>>> 10:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 [Fury] (rev 02)
> >>>> 20:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3108 [Invader] (rev 02)
> >>>>
> >>>> Both load the megaraid driver and have different profiles within the
> >>>> driver.  I have no idea which one actually panicked until removing
> >>>> one card.
> >>>>
> >>>> It's MUCH worse when debugging new hardware and getting a panic
> >>>> from, for example, the uncore code which binds to a PCI mapped
> >>>> device.  One device might work and the next one doesn't.  And
> >>>> then you can multiply that by seeing *many* panics at once and
> >>>> trying to determine if the problem was on one specific socket,
> >>>> die, or core.
> >>>
> >>> Would a dev_panic() interface that identified the device and
> >>> driver help with this?
> >>
> >> ^^ the more I look at this problem, the more a dev_panic() that
> >> would output a device specific message at panic time is what I
> >> really need.
> 
> I went down this road a bit and had a realization.  The issue isn't
> with printing something at panic time, but the *data* that is
> output.  Each PCI device is associated with a struct device.  That
> device struct's name is output for dev_dbg(), etc., commands.  The
> PCI subsystem sets the device struct name at drivers/pci/probe.c:
> 1799
> 
> 	        dev_set_name(&dev->dev, "%04x:%02x:%02x.%d", pci_domain_nr(dev->bus),
>                      dev->bus->number, PCI_SLOT(dev->devfn),
>                      PCI_FUNC(dev->devfn));
> 
> My problem really is that the above information is insufficient when
> I (or a user) need to debug a system.  The complexities of debugging
> multiple broken driver loads would be much easier if I didn't have
> to constantly add this output manually :).

This *should* already be in the dmesg log:

  pci 0000:00:00.0: [8086:5910] type 00 class 0x060000
  pci 0000:00:01.0: [8086:1901] type 01 class 0x060400
  pci 0000:00:02.0: [8086:591b] type 00 class 0x030000

So if you had a dev_panic(), that message would include the
bus/device/function number, and that would be enough to find the
vendor/device ID from when the device was first enumerated.

Or are you saying you can't get the part of the dmesg log that
contains those vendor/device IDs?

> Would you be okay with adding a *debug* parameter to expand the
> device name to include the vendor & device ID pair?  FWIW, I'm
> somewhat against yet-another-kernel-option but that's really the
> information I need.  I could then add dev_dbg() statements in the
> local_pci_probe() function.
Prarit Bhargava March 5, 2021, 6:20 p.m. UTC | #13
On 3/4/21 10:50 AM, Bjorn Helgaas wrote:
> On Thu, Mar 04, 2021 at 09:42:44AM -0500, Prarit Bhargava wrote:
>>
>>
>> On 2/18/21 2:06 PM, Bjorn Helgaas wrote:
>>> On Thu, Feb 18, 2021 at 01:36:35PM -0500, Prarit Bhargava wrote:
>>>> On 1/26/21 10:12 AM, Bjorn Helgaas wrote:
>>>>> On Tue, Jan 26, 2021 at 09:05:23AM -0500, Prarit Bhargava wrote:
>>>>>> On 1/26/21 8:53 AM, Leon Romanovsky wrote:
>>>>>>> On Tue, Jan 26, 2021 at 08:42:12AM -0500, Prarit Bhargava wrote:
>>>>>>>> On 1/26/21 8:14 AM, Leon Romanovsky wrote:
>>>>>>>>> On Tue, Jan 26, 2021 at 07:54:46AM -0500, Prarit Bhargava wrote:
>>>>>>>>>>   Leon Romanovsky <leon@kernel.org> wrote:
>>>>>>>>>>> On Mon, Jan 25, 2021 at 02:41:38PM -0500, Prarit Bhargava wrote:
>>>>>>>>>>>> There are two situations where driver load messages are helpful.
>>>>>>>>>>>>
>>>>>>>>>>>> 1) Some drivers silently load on devices and debugging driver or system
>>>>>>>>>>>> failures in these cases is difficult.  While some drivers (networking
>>>>>>>>>>>> for example) may not completely initialize when the PCI driver probe() function
>>>>>>>>>>>> has returned, it is still useful to have some idea of driver completion.
>>>>>>>>>>>
>>>>>>>>>>> Sorry, probably it is me, but I don't understand this use case.
>>>>>>>>>>> Are you adding global to whole kernel command line boot argument to debug
>>>>>>>>>>> what and when?
>>>>>>>>>>>
>>>>>>>>>>> During boot:
>>>>>>>>>>> If device success, you will see it in /sys/bus/pci/[drivers|devices]/*.
>>>>>>>>>>> If device fails, you should get an error from that device (fix the
>>>>>>>>>>> device to return an error), or something immediately won't work and
>>>>>>>>>>> you won't see it in sysfs.
>>>>>>>>>>
>>>>>>>>>> What if there is a panic during boot?  There's no way to get to sysfs.
>>>>>>>>>> That's the case where this is helpful.
>>>>>>>>>
>>>>>>>>> How? If you have kernel panic, it means you have much more worse problem
>>>>>>>>> than not-supported device. If kernel panic was caused by the driver, you
>>>>>>>>> will see call trace related to it. If kernel panic was caused by
>>>>>>>>> something else, supported/not supported won't help here.
>>>>>>>>
>>>>>>>> I still have no idea *WHICH* device it was that the panic occurred on.
>>>>>>>
>>>>>>> The kernel panic is printed from the driver. There is one driver loaded
>>>>>>> for all same PCI devices which are probed without relation to their
>>>>>>> number.>
>>>>>>> If you have host with ten same cards, you will see one driver and this
>>>>>>> is where the problem and not in supported/not-supported device.
>>>>>>
>>>>>> That's true, but you can also have different cards loading the same driver.
>>>>>> See, for example, any PCI_IDs list in a driver.
>>>>>>
>>>>>> For example,
>>>>>>
>>>>>> 10:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 [Fury] (rev 02)
>>>>>> 20:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3108 [Invader] (rev 02)
>>>>>>
>>>>>> Both load the megaraid driver and have different profiles within the
>>>>>> driver.  I have no idea which one actually panicked until removing
>>>>>> one card.
>>>>>>
>>>>>> It's MUCH worse when debugging new hardware and getting a panic
>>>>>> from, for example, the uncore code which binds to a PCI mapped
>>>>>> device.  One device might work and the next one doesn't.  And
>>>>>> then you can multiply that by seeing *many* panics at once and
>>>>>> trying to determine if the problem was on one specific socket,
>>>>>> die, or core.
>>>>>
>>>>> Would a dev_panic() interface that identified the device and
>>>>> driver help with this?
>>>>
>>>> ^^ the more I look at this problem, the more a dev_panic() that
>>>> would output a device specific message at panic time is what I
>>>> really need.
>>
>> I went down this road a bit and had a realization.  The issue isn't
>> with printing something at panic time, but the *data* that is
>> output.  Each PCI device is associated with a struct device.  That
>> device struct's name is output for dev_dbg(), etc., commands.  The
>> PCI subsystem sets the device struct name at drivers/pci/probe.c:
>> 1799
>>
>> 	        dev_set_name(&dev->dev, "%04x:%02x:%02x.%d", pci_domain_nr(dev->bus),
>>                      dev->bus->number, PCI_SLOT(dev->devfn),
>>                      PCI_FUNC(dev->devfn));
>>
>> My problem really is that the above information is insufficient when
>> I (or a user) need to debug a system.  The complexities of debugging
>> multiple broken driver loads would be much easier if I didn't have
>> to constantly add this output manually :).
> 
> This *should* already be in the dmesg log:
> 
>   pci 0000:00:00.0: [8086:5910] type 00 class 0x060000
>   pci 0000:00:01.0: [8086:1901] type 01 class 0x060400
>   pci 0000:00:02.0: [8086:591b] type 00 class 0x030000
> 
> So if you had a dev_panic(), that message would include the
> bus/device/function number, and that would be enough to find the
> vendor/device ID from when the device was first enumerated.
> 
> Or are you saying you can't get the part of the dmesg log that
> contains those vendor/device IDs?

/me hangs head in shame

I didn't notice that until now. :)

Uh thanks for the polite hit with a cluebat :)  I *think* that will work.  Let
me try some additional driver failure tests.

P.

> 
>> Would you be okay with adding a *debug* parameter to expand the
>> device name to include the vendor & device ID pair?  FWIW, I'm
>> somewhat against yet-another-kernel-option but that's really the
>> information I need.  I could then add dev_dbg() statements in the
>> local_pci_probe() function.
>

Patch

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 23f209c8c19a..32ecee6a4ef0 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3793,6 +3793,8 @@ 
 		nomio		[S390] Do not use MIO instructions.
 		norid		[S390] ignore the RID field and force use of
 				one PCI domain per PCI function
+		driver_load_messages
+				Output driver load status messages.
 
 	pcie_aspm=	[PCIE] Forcibly enable or disable PCIe Active State Power
 			Management.
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 8b587fc97f7b..35d5b6973578 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -23,6 +23,8 @@ 
 #include "pci.h"
 #include "pcie/portdrv.h"
 
+bool driver_load_messages;
+
 struct pci_dynid {
 	struct list_head node;
 	struct pci_device_id id;
@@ -305,10 +307,20 @@  static long local_pci_probe(void *_ddi)
 	 */
 	pm_runtime_get_sync(dev);
 	pci_dev->driver = pci_drv;
+	if (driver_load_messages)
+		pci_info(pci_dev, "loading on device [%0x:%0x]\n",
+			 pci_dev->vendor, pci_dev->device);
 	rc = pci_drv->probe(pci_dev, ddi->id);
-	if (!rc)
+	if (!rc) {
+		if (driver_load_messages)
+			pci_info(pci_dev, "loaded on device [%0x:%0x]\n",
+				 pci_dev->vendor, pci_dev->device);
 		return rc;
+	}
 	if (rc < 0) {
+		if (driver_load_messages)
+			pci_info(pci_dev, "failed (%d) on device [%0x:%0x]\n",
+				 rc, pci_dev->vendor, pci_dev->device);
 		pci_dev->driver = NULL;
 		pm_runtime_put_sync(dev);
 		return rc;
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e578d34095e9..c64b3b6e1e8d 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6547,6 +6547,8 @@  static int __init pci_setup(char *str)
 				pci_add_flags(PCI_SCAN_ALL_PCIE_DEVS);
 			} else if (!strncmp(str, "disable_acs_redir=", 18)) {
 				disable_acs_redir_param = str + 18;
+			} else if (!strncmp(str, "driver_load_messages", 20)) {
+				driver_load_messages = true;
 			} else {
 				pr_err("PCI: Unknown option `%s'\n", str);
 			}
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index f86cae9aa1f4..db1218e188ac 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -690,4 +690,5 @@  static inline int pci_acpi_program_hp_params(struct pci_dev *dev)
 extern const struct attribute_group aspm_ctrl_attr_group;
 #endif
 
+extern bool driver_load_messages;
 #endif /* DRIVERS_PCI_H */
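
(Usage note, for reference: with this patch applied, the messages are enabled
through the existing "pci=" boot option parsed by pci_setup() above, i.e.
booting with

  pci=driver_load_messages

turns them on.)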