Message ID | 20210608054857.18963-8-ameynarkhede03@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Bjorn Helgaas |
Series | Expose and manage PCI device reset |
On Tue, Jun 08, 2021 at 11:18:56AM +0530, Amey Narkhede wrote:
> From: Shanker Donthineni <sdonthineni@nvidia.com>
>
> On select platforms, some Nvidia GPU devices do not work with SBR.

Interesting that you say "on select platforms." Apparently SBR does
work for some of these GPUs, but not on all platforms? If you have
any clarification here, I can still update the commit log.

> Triggering SBR would leave the device inoperable for the current
> system boot. It requires a system hard-reboot to get the GPU device
> back to normal operating condition post-SBR. For the affected
> devices, enable NO_BUS_RESET quirk to fix the issue.
>
> This issue will be fixed in the next generation of hardware.
>
> Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
> Reviewed-by: Sinan Kaya <okaya@kernel.org>

This patch doesn't seem to have any dependencies or particular
connection to the rest of the reset series, so I applied this patch by
itself to for-linus for v5.13 and marked it for stable.

If that's not right, let me know.

> ---
>  drivers/pci/quirks.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index e86cf4a3b..45a8c3caa 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3546,6 +3546,18 @@ static void quirk_no_bus_reset(struct pci_dev *dev)
>  	dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET;
>  }
>
> +/*
> + * Some Nvidia GPU devices do not work with bus reset, SBR needs to be
> + * prevented for those affected devices.
> + */
> +static void quirk_nvidia_no_bus_reset(struct pci_dev *dev)
> +{
> +	if ((dev->device & 0xffc0) == 0x2340)
> +		quirk_no_bus_reset(dev);
> +}
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
> +			 quirk_nvidia_no_bus_reset);
> +
>  /*
>   * Some Atheros AR9xxx and QCA988x chips do not behave after a bus reset.
>   * The device will throw a Link Down error on AER-capable systems and
> --
> 2.31.1
>
Hi Bjorn,

On 6/10/21 6:16 PM, Bjorn Helgaas wrote:
>> Triggering SBR would leave the device inoperable for the current
>> system boot. It requires a system hard-reboot to get the GPU device
>> back to normal operating condition post-SBR. For the affected
>> devices, enable NO_BUS_RESET quirk to fix the issue.
>>
>> This issue will be fixed in the next generation of hardware.
>>
>> Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
>> Reviewed-by: Sinan Kaya <okaya@kernel.org>
> This patch doesn't seem to have any dependencies or particular
> connection to the rest of the reset series, so I applied this patch by
> itself to for-linus for v5.13 and marked it for stable.
>
> If that's not right, let me know.
>
Yes, you're right, this patch has no dependency on the reset method series.
Hi Bjorn,

On 6/10/21 6:16 PM, Bjorn Helgaas wrote:
>> From: Shanker Donthineni <sdonthineni@nvidia.com>
>>
>> On select platforms, some Nvidia GPU devices do not work with SBR.
> Interesting that you say "on select platforms." Apparently SBR does
> work for some of these GPUs, but not on all platforms? If you have
> any clarification here, I can still update the commit log.
>
Yes, SBR works for some GPUs but GPUs which are listed in this quirk will
not work and these GPUs are available only on selected server platforms.
I believe commit text reflects the issue but please update if needed.
On Thu, Jun 10, 2021 at 06:43:26PM -0500, Shanker R Donthineni wrote:
> On 6/10/21 6:16 PM, Bjorn Helgaas wrote:
> >> From: Shanker Donthineni <sdonthineni@nvidia.com>
> >>
> >> On select platforms, some Nvidia GPU devices do not work with SBR.
> > Interesting that you say "on select platforms." Apparently SBR does
> > work for some of these GPUs, but not on all platforms? If you have
> > any clarification here, I can still update the commit log.
> >
> Yes, SBR works for some GPUs but GPUs which are listed in this quirk will
> not work and these GPUs are available only on selected server platforms.
> I believe commit text reflects the issue but please update if needed.

It sounds like there is no actual dependency on the platform. So even
though these GPUs are only available on certain platforms, if one were
to move one of them to a different, non-supported platform, SBR would
still not work.

So I think I'll remove the reference to "select platforms" since it
doesn't add any useful information and might suggest that SBR should
work on some platforms, if you could only find the right ones.

Bjorn
Hi Bjorn,

On 6/10/21 6:53 PM, Bjorn Helgaas wrote:
> It sounds like there is no actual dependency on the platform. So even
> though these GPUs are only available on certain platforms, if one were
> to move one of them to a different, non-supported platform, SBR would
> still not work.
>
> So I think I'll remove the reference to "select platforms" since it
> doesn't add any useful information and might suggest that SBR should
> work on some platforms, if you could only find the right ones.

I appreciate your time on the code review, the improved text, and picking
up the patch for v5.14. Please let us know if you have any code
improvements or suggestions for the remaining reset patch series to be
considered for v5.14.
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index e86cf4a3b..45a8c3caa 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3546,6 +3546,18 @@ static void quirk_no_bus_reset(struct pci_dev *dev)
 	dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET;
 }
 
+/*
+ * Some Nvidia GPU devices do not work with bus reset, SBR needs to be
+ * prevented for those affected devices.
+ */
+static void quirk_nvidia_no_bus_reset(struct pci_dev *dev)
+{
+	if ((dev->device & 0xffc0) == 0x2340)
+		quirk_no_bus_reset(dev);
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
+			 quirk_nvidia_no_bus_reset);
+
 /*
  * Some Atheros AR9xxx and QCA988x chips do not behave after a bus reset.
  * The device will throw a Link Down error on AER-capable systems and
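
The quirk is registered for every Nvidia device (PCI_ANY_ID) and then narrows the match itself with the test `(dev->device & 0xffc0) == 0x2340`. As a rough illustration of which device IDs that covers, here is a small user-space sketch, not kernel code. The helper names `nvidia_id_matches_quirk()` and `count_matching_ids()` are hypothetical and exist only for this example; only the mask and value come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Same comparison as in quirk_nvidia_no_bus_reset(): the mask 0xffc0
 * clears the low six bits of the 16-bit device ID, so every ID that
 * shares its upper ten bits with 0x2340 matches.
 * (Hypothetical helper, for illustration only.)
 */
static inline int nvidia_id_matches_quirk(uint16_t device)
{
	return (device & 0xffc0) == 0x2340;
}

/*
 * Walk the whole 16-bit device ID space and report how many IDs match,
 * along with the lowest and highest matching ID.
 * (Hypothetical helper, for illustration only.)
 */
static inline unsigned int count_matching_ids(uint16_t *first, uint16_t *last)
{
	unsigned int id, count = 0;

	for (id = 0; id <= 0xffff; id++) {
		if (!nvidia_id_matches_quirk((uint16_t)id))
			continue;
		if (count == 0)
			*first = (uint16_t)id;
		*last = (uint16_t)id;
		count++;
	}
	return count;
}

/* Sanity check of the boundaries implied by the mask. */
static inline void quirk_mask_self_check(void)
{
	assert(nvidia_id_matches_quirk(0x2340));
	assert(nvidia_id_matches_quirk(0x237f));
	assert(!nvidia_id_matches_quirk(0x233f));
	assert(!nvidia_id_matches_quirk(0x2380));
}
```

Under this reading, the check selects the contiguous block of 64 device IDs from 0x2340 through 0x237f, and any other Nvidia device passes through the fixup without having PCI_DEV_FLAGS_NO_BUS_RESET set by it.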