Message ID | 20230813001218.19716-1-decui@microsoft.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Lorenzo Pieralisi |
Headers | show |
Series | PCI: hv: Fix a crash in hv_pci_restore_msi_msg() during hibernation | expand |
From: Dexuan Cui <decui@microsoft.com> Sent: Saturday, August 12, 2023 5:12 PM > > For a Linux VM with a NVIDIA GPU running on Hyper-V, before the GPU driver > is installed, hibernating the VM will trigger a panic: if the GPU driver > is not installed and loaded, MSI-X/MSI is not enabled on the device, so > pdev->dev.msi.data is NULL, and msi_lock_descs(&pdev->dev) causes the > NULL pointer dereference. Fix this by checking pdev->dev.msi.data. Is the scenario here a little broader than just the NVIDIA GPU driver? For any virtual PCI device that is presented in the guest VM as a VMBus device, the driver might not be installed. There could have been some initial problem getting the driver installed, or it might have been manually uninstalled later in the life of the VM. Also the host might have rescinded the virtual PCI device and added it back later, creating another opportunity where the driver might not be loaded. In any case, it seems like we could have the VMBus aspects of the device setup, but not the driver for the device. This suspend/resume code in pci-hyperv.c is all about handling the VMBus aspects of the device anyway. Assuming my thinking is correct, is there some Hyper-V/VMBus setting owned by the pci-hyperv.c driver that would be better to test here than the low-level dev.msi.data pointer? The MSI code rework that added the descriptor lock encapsulates the internals with appropriate accessor functions, and reaching in to directly test dev.msi.data violates that encapsulation. That said, I think this code works as written. I'm picking at details. Michael > > Fixes: dc2b453290c4 ("PCI: hv: Rework MSI handling") > Signed-off-by: Dexuan Cui <decui@microsoft.com> > --- > drivers/pci/controller/pci-hyperv.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c > index 2d93d0c4f10d..fdd01bfb8e10 100644 > --- a/drivers/pci/controller/pci-hyperv.c > +++ b/drivers/pci/controller/pci-hyperv.c > @@ -3983,6 +3983,9 @@ static int hv_pci_restore_msi_msg(struct pci_dev *pdev, > void *arg) > struct msi_desc *entry; > int ret = 0; > > + if (!pdev->dev.msi.data) > + return 0; > + > msi_lock_descs(&pdev->dev); > msi_for_each_desc(entry, &pdev->dev, MSI_DESC_ASSOCIATED) { > irq_data = irq_get_irq_data(entry->irq); > -- > 2.25.1
> From: Michael Kelley (LINUX) <mikelley@microsoft.com> > Sent: Tuesday, August 15, 2023 5:35 PM > > [...] > > For a Linux VM with a NVIDIA GPU running on Hyper-V, before the GPU > driver > > is installed, hibernating the VM will trigger a panic: if the GPU driver > > is not installed and loaded, MSI-X/MSI is not enabled on the device, so > > pdev->dev.msi.data is NULL, and msi_lock_descs(&pdev->dev) causes the > > NULL pointer dereference. Fix this by checking pdev->dev.msi.data. > > Is the scenario here a little broader than just the NVIDIA GPU driver? For > any virtual PCI device that is presented in the guest VM as a VMBus device, > the driver might not be installed. There could have been some initial > problem getting the driver installed, or it might have been manually > uninstalled later in the life of the VM. Also the host might have rescinded > the virtual PCI device and added it back later, creating another opportunity > where the driver might not be loaded. In any case, it seems like we could > have the VMBus aspects of the device setup, but not the driver for the > device. This suspend/resume code in pci-hyperv.c is all about handling > the VMBus aspects of the device anyway. Good point! The bug also affects other PCI devices, e.g. if I unload mlx5_core and let the VM with a Mellanox VF NIC hibernate, I hit the same NULL pointer dereference. > Assuming my thinking is correct, is there some Hyper-V/VMBus setting > owned by the pci-hyperv.c driver that would be better to test here than > the low-level dev.msi.data pointer? The MSI code rework that added IMO there is no easy and reliable way in Hyper-V/VMBus/pci-hyperv to tell if MSI/MSI-X is enabled for a PCI device. We can potentially track the MSI/MSI-X irqs in hv_compose_msi_msg() and hv_irq_unmask(), but IMO that's not very easy and may be inaccurate. > the descriptor lock encapsulates the internals with appropriate accessor > functions, and reaching in to directly test dev.msi.data violates that > encapsulation. I agree. Compared with: if (!pdev->dev.msi.data) return 0; I think it's better to use this: if (!pdev->msi_enabled && !pdev->msix_enabled) return 0; pdev-> msix_enabled has been used in many drivers, e.g. "arch/x86/pci/xen.c": xen_pv_teardown_msi_irqs() "drivers/hid/intel-ish-hid/ipc/pci-ish.c": ish_probe() "drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c": pvrdma_intr0_handler() "drivers/scsi/vmw_pvscsi.c": pvscsi_probe() and more. So it looks like pdev-> msix_enabled is a legit and stable API. I'll post v2 with it. I'll update the changelog accordingly. Please let me know if you have concerns about it.
From: Dexuan Cui <decui@microsoft.com> Sent: Tuesday, August 15, 2023 9:00 PM > > > From: Michael Kelley (LINUX) <mikelley@microsoft.com> > > Sent: Tuesday, August 15, 2023 5:35 PM > > > [...] > > > For a Linux VM with a NVIDIA GPU running on Hyper-V, before the GPU driver > > > is installed, hibernating the VM will trigger a panic: if the GPU driver > > > is not installed and loaded, MSI-X/MSI is not enabled on the device, so > > > pdev->dev.msi.data is NULL, and msi_lock_descs(&pdev->dev) causes the > > > NULL pointer dereference. Fix this by checking pdev->dev.msi.data. > > > > Is the scenario here a little broader than just the NVIDIA GPU driver? For > > any virtual PCI device that is presented in the guest VM as a VMBus device, > > the driver might not be installed. There could have been some initial > > problem getting the driver installed, or it might have been manually > > uninstalled later in the life of the VM. Also the host might have rescinded > > the virtual PCI device and added it back later, creating another opportunity > > where the driver might not be loaded. In any case, it seems like we could > > have the VMBus aspects of the device setup, but not the driver for the > > device. This suspend/resume code in pci-hyperv.c is all about handling > > the VMBus aspects of the device anyway. > > Good point! The bug also affects other PCI devices, e.g. if I unload mlx5_core > and let the VM with a Mellanox VF NIC hibernate, I hit the same NULL > pointer dereference. > > > Assuming my thinking is correct, is there some Hyper-V/VMBus setting > > owned by the pci-hyperv.c driver that would be better to test here than > > the low-level dev.msi.data pointer? The MSI code rework that added > > IMO there is no easy and reliable way in Hyper-V/VMBus/pci-hyperv to > tell if MSI/MSI-X is enabled for a PCI device. We can potentially track the > MSI/MSI-X irqs in hv_compose_msi_msg() and hv_irq_unmask(), but > IMO that's not very easy and may be inaccurate. > > > the descriptor lock encapsulates the internals with appropriate accessor > > functions, and reaching in to directly test dev.msi.data violates that > > encapsulation. > > I agree. > > Compared with: > if (!pdev->dev.msi.data) > return 0; > > I think it's better to use this: > if (!pdev->msi_enabled && !pdev->msix_enabled) > return 0; > > pdev-> msix_enabled has been used in many drivers, e.g. > > "arch/x86/pci/xen.c": xen_pv_teardown_msi_irqs() > "drivers/hid/intel-ish-hid/ipc/pci-ish.c": ish_probe() > "drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c": pvrdma_intr0_handler() > "drivers/scsi/vmw_pvscsi.c": pvscsi_probe() > and more. > > So it looks like pdev-> msix_enabled is a legit and stable API. > I'll post v2 with it. I'll update the changelog accordingly. > Please let me know if you have concerns about it. Sounds good. I agree that what you propose is a better approach. Michael
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c index 2d93d0c4f10d..fdd01bfb8e10 100644 --- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -3983,6 +3983,9 @@ static int hv_pci_restore_msi_msg(struct pci_dev *pdev, void *arg) struct msi_desc *entry; int ret = 0; + if (!pdev->dev.msi.data) + return 0; + msi_lock_descs(&pdev->dev); msi_for_each_desc(entry, &pdev->dev, MSI_DESC_ASSOCIATED) { irq_data = irq_get_irq_data(entry->irq);
For a Linux VM with a NVIDIA GPU running on Hyper-V, before the GPU driver is installed, hibernating the VM will trigger a panic: if the GPU driver is not installed and loaded, MSI-X/MSI is not enabled on the device, so pdev->dev.msi.data is NULL, and msi_lock_descs(&pdev->dev) causes the NULL pointer dereference. Fix this by checking pdev->dev.msi.data. Fixes: dc2b453290c4 ("PCI: hv: Rework MSI handling") Signed-off-by: Dexuan Cui <decui@microsoft.com> --- drivers/pci/controller/pci-hyperv.c | 3 +++ 1 file changed, 3 insertions(+)