Message ID | 20230730191519.3124390-1-vidyas@nvidia.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Bjorn Helgaas |
Series | [V3] PCI: pciehp: Disable ACS Source Validation during hot-remove |
On Mon, Jul 31, 2023 at 12:45:19AM +0530, Vidya Sagar wrote:
> PCIe 6.0, 6.12.1.1 specifies that downstream devices are permitted to
> send upstream messages before they have been assigned a bus number and
> such messages have a Requester ID with Bus number set to 00h.
> If the Downstream port has ACS Source Validation enabled, these messages
> will be detected as ACS violation error.
>
> Hence, disable ACS Source Validation in the bridge device during
> hot-remove operation and re-enable it after enumeration of the
> downstream hierarchy but before binding the respective device drivers.

What are these messages that are sent before assignment of a bus number?

What's the user-visible issue that occurs when they're blocked?

Doesn't disabling Source Validation introduce a security hole because the
device may spoof messages before Source Validation is re-enabled?

PCIe r6.1 sec 6.12.1.1 does indeed point out that the downstream device
is *permitted* to send these messages but the Implementation Note
does *not* prescribe that Source Validation shall be disabled to let them
through. It merely points out that the messages may be filtered if
Source Validation is enabled.

Thanks,

Lukas
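To make the failure mode in the commit message concrete, here is a minimal user-space sketch of the Source Validation check, assuming the simplified rule that a Downstream Port accepts an upstream request only if the Bus Number in its Requester ID lies within the port's secondary-to-subordinate bus range. The `dsp_model` structure and both function names are hypothetical and exist only for this illustration; real hardware performs the check on TLPs, not software.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a Downstream Port; field names are illustrative. */
struct dsp_model {
	uint8_t secondary_bus;   /* Secondary Bus Number register */
	uint8_t subordinate_bus; /* Subordinate Bus Number register */
	bool acs_sv_enabled;     /* ACS Source Validation Enable bit */
};

/* Requester ID layout: bits 15:8 bus, bits 7:3 device, bits 2:0 function. */
static inline uint8_t rid_bus(uint16_t requester_id)
{
	return (uint8_t)(requester_id >> 8);
}

/*
 * Returns true if the modelled port would flag an upstream request with
 * this Requester ID as an ACS Violation.
 */
bool acs_sv_blocks(const struct dsp_model *port, uint16_t requester_id)
{
	uint8_t bus = rid_bus(requester_id);

	if (!port->acs_sv_enabled)
		return false;

	/* Outside the port's bus number aperture: treated as a violation. */
	return bus < port->secondary_bus || bus > port->subordinate_bus;
}
```

With a port covering buses 05h to 07h, a message from a device that has not yet captured a bus number (Requester ID bus 00h) falls outside the aperture and is blocked, which is the case the patch is trying to address.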
Thanks Lukas for the quick review.
I commented inline for the queries/concerns raised.

On 7/31/2023 1:10 AM, Lukas Wunner wrote:
> On Mon, Jul 31, 2023 at 12:45:19AM +0530, Vidya Sagar wrote:
>> PCIe 6.0, 6.12.1.1 specifies that downstream devices are permitted to
>> send upstream messages before they have been assigned a bus number and
>> such messages have a Requester ID with Bus number set to 00h.
>> If the Downstream port has ACS Source Validation enabled, these messages
>> will be detected as ACS violation error.
>>
>> Hence, disable ACS Source Validation in the bridge device during
>> hot-remove operation and re-enable it after enumeration of the
>> downstream hierarchy but before binding the respective device drivers.
>
> What are these messages that are sent before assignment of a bus number?

One example is the DRS (Device Readiness Status) message.

> What's the user-visible issue that occurs when they're blocked?

I'm not sure about the issue one can observe when they are blocked, but,
we have seen one issue when they are not blocked. When an endpoint sends
a DRS message and an ACS violation is raised for it, the system can
trigger DPC (Downstream Port Containment) if it is configured to do so
for ACS violations. Once the DPC is released after handling it, system
would go for link-up again, which results in root port receiving DRS once
again from the endpoint and the cycle continues.

> Doesn't disabling Source Validation introduce a security hole because the
> device may spoof messages before Source Validation is re-enabled?

Agree, but this concern is already/has always been there during boot
scenario where the link-up happens first and the ACS is enabled at a
later point and endpoint can spoof messages in between if it wishes so.

> PCIe r6.1 sec 6.12.1.1 does indeed point out that the downstream device
> is *permitted* to send these messages but the Implementation Note
> does *not* prescribe that Source Validation shall be disabled to let them
> through. It merely points out that the messages may be filtered if
> Source Validation is enabled.

Could you please elaborate on the filtering part. Do you expect this to
be implemented in the hardware or software?

> Thanks,
>
> Lukas
On Mon, Jul 31, 2023 at 01:32:27AM +0530, Vidya Sagar wrote:
> On 7/31/2023 1:10 AM, Lukas Wunner wrote:
> > On Mon, Jul 31, 2023 at 12:45:19AM +0530, Vidya Sagar wrote:
> > > PCIe 6.0, 6.12.1.1 specifies that downstream devices are permitted to
> > > send upstream messages before they have been assigned a bus number and
> > > such messages have a Requester ID with Bus number set to 00h.
> > > If the Downstream port has ACS Source Validation enabled, these messages
> > > will be detected as ACS violation error.
> > >
> > > Hence, disable ACS Source Validation in the bridge device during
> > > hot-remove operation and re-enable it after enumeration of the
> > > downstream hierarchy but before binding the respective device drivers.
> >
> > What are these messages that are sent before assignment of a bus number?
>
> One example is the DRS (Device Readiness Status) message.

Please mention that in the commit message.

> > What's the user-visible issue that occurs when they're blocked?
>
> I'm not sure about the issue one can observe when they are blocked, but, we
> have seen one issue when they are not blocked. When an endpoint sends a DRS
> message and an ACS violation is raised for it, the system can trigger DPC
> (Downstream Port Containment) if it is configured to do so for ACS
> violations. Once the DPC is released after handling it, system would go for
> link-up again, which results in root port receiving DRS once again from the
> endpoint and the cycle continues.

As an alternative to disabling ACS, have you explored masking ACS
Violations (PCI_ERR_UNC_ACSV) upon de-enumeration of a device and
unmasking them after assignment of a bus number?

That would alleviate concerns that disabling ACS Source Validation
weakens security (because it doesn't have to be disabled in the
first place).

You'd need to clear the ACS Violation Status bit in the Uncorrectable
Error Status Register though after assignment of a bus number,
in addition to unmasking it, because that bit is still set despite
the error being masked.

The kernel affords a generous 60 sec timeout for devices to become
ready (PCIE_RESET_READY_POLL_MS) and is not dependent on DRS messages
coming through, so blocking them with ACS shouldn't cause issues.

> > Doesn't disabling Source Validation introduce a security hole because the
> > device may spoof messages before Source Validation is re-enabled?
>
> Agree, but this concern is already/has always been there during boot
> scenario where the link-up happens first and the ACS is enabled at a later
> point and endpoint can spoof messages in between if it wishes so.

The problem is that devices may be removed only logically (via sysfs
"power" attribute or Attention Button) and still remain in the system
physically. They may spoof messages until they're physically removed
or the hotplug slot is brought up again.

> > PCIe r6.1 sec 6.12.1.1 does indeed point out that the downstream device
> > is *permitted* to send these messages but the Implementation Note
> > does *not* prescribe that Source Validation shall be disabled to let them
> > through. It merely points out that the messages may be filtered if
> > Source Validation is enabled.
>
> Could you please elaborate on the filtering part. Do you expect this to be
> implemented in the hardware or software?

By "filtered" I meant that TLPs are blocked by ACS. Sorry for the
confusing word choice.

Thanks,

Lukas
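The mask, clear and unmask sequence suggested here can be sketched in user space. This is only a model under stated assumptions: `aer_model` and the three functions are hypothetical names, the two registers are plain struct fields instead of config-space accesses, and the write-1-to-clear behaviour of the status register is emulated by clearing the bit directly. `PCI_ERR_UNC_ACSV` carries the value the kernel defines in `pci_regs.h`.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_ERR_UNC_ACSV 0x00200000 /* ACS Violation bit, as in pci_regs.h */

/* Hypothetical stand-in for a port's AER Uncorrectable Error registers. */
struct aer_model {
	uint32_t uncor_status; /* Uncorrectable Error Status (RW1C in hardware) */
	uint32_t uncor_mask;   /* Uncorrectable Error Mask */
};

/*
 * Hardware side of the model: an ACS Violation is detected. The status
 * bit is latched even while masked; the return value says whether the
 * error would be reported (and could therefore trigger DPC).
 */
bool aer_model_raise_acsv(struct aer_model *aer)
{
	aer->uncor_status |= PCI_ERR_UNC_ACSV;
	return !(aer->uncor_mask & PCI_ERR_UNC_ACSV);
}

/* On de-enumeration: mask ACS Violations, leave Source Validation on. */
void acsv_mask(struct aer_model *aer)
{
	aer->uncor_mask |= PCI_ERR_UNC_ACSV;
}

/* After bus number assignment: drop the stale status bit, then unmask. */
void acsv_clear_and_unmask(struct aer_model *aer)
{
	aer->uncor_status &= ~(uint32_t)PCI_ERR_UNC_ACSV; /* emulates a W1C write */
	aer->uncor_mask &= ~(uint32_t)PCI_ERR_UNC_ACSV;
}
```

The ordering in `acsv_clear_and_unmask()` reflects the point made above: the status bit is still set while the error is masked, so it has to be cleared in addition to unmasking.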
[ add linux-cxl ]

Hi Vidya,

Lukas highlighted this thread to me as we, in linux-cxl land, are also
seeing conflicts between ACS source validation and flows like CXL PM.

Lukas Wunner wrote:
> On Mon, Jul 31, 2023 at 01:32:27AM +0530, Vidya Sagar wrote:
> > On 7/31/2023 1:10 AM, Lukas Wunner wrote:
> > > On Mon, Jul 31, 2023 at 12:45:19AM +0530, Vidya Sagar wrote:
> > > > PCIe 6.0, 6.12.1.1 specifies that downstream devices are permitted to
> > > > send upstream messages before they have been assigned a bus number and
> > > > such messages have a Requester ID with Bus number set to 00h.
> > > > If the Downstream port has ACS Source Validation enabled, these messages
> > > > will be detected as ACS violation error.
> > > >
> > > > Hence, disable ACS Source Validation in the bridge device during
> > > > hot-remove operation and re-enable it after enumeration of the
> > > > downstream hierarchy but before binding the respective device drivers.
> > >
> > > What are these messages that are sent before assignment of a bus number?
> >
> > One example is the DRS (Device Readiness Status) message.
>
> Please mention that in the commit message.
>
> > > What's the user-visible issue that occurs when they're blocked?
> >
> > I'm not sure about the issue one can observe when they are blocked, but, we
> > have seen one issue when they are not blocked. When an endpoint sends a DRS
> > message and an ACS violation is raised for it, the system can trigger DPC
> > (Downstream Port Containment) if it is configured to do so for ACS
> > violations. Once the DPC is released after handling it, system would go for
> > link-up again, which results in root port receiving DRS once again from the
> > endpoint and the cycle continues.
>
> As an alternative to disabling ACS, have you explored masking ACS
> Violations (PCI_ERR_UNC_ACSV) upon de-enumeration of a device and
> unmasking them after assignment of a bus number?

The problem is that still prevents things like CXL PM negotiation from
completing. The conflict for CXL PM can hopefully be fixed in the spec
and future devices, but that is at least a full generation of CXL
devices that will fail to handle hotplug and secondary-bus resets.

One proposal I had for this was to enforce that the Downstream Port
disables bus-master-enable and enforces P2P to redirect upstream when
source validation is turned off. Then, when the device re-establishes
the link and is re-enabled source-validation can be turned back on
before downstream-port bus-master enable is set so that there is no
window to launch memory-cycle attacks while source-validation is turned
off.

Is that something you would be willing to investigate for the next round
of this patch Vidya?
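The ordering constraint in this proposal can be illustrated with a small user-space model. It is a sketch only: `port_state` and the function names are hypothetical, the P2P redirect part of the proposal is left out, and the two flags stand in for the Downstream Port's Bus Master Enable and ACS Source Validation Enable bits. The property being checked is that no step leaves bus mastering enabled while Source Validation is off.

```c
#include <stdbool.h>

/* Hypothetical model of the two Downstream Port controls discussed above. */
struct port_state {
	bool bus_master;        /* Bus Master Enable in the port's Command register */
	bool source_validation; /* ACS Source Validation Enable */
};

/* The window of concern: requests are forwarded upstream without validation. */
bool exposure_window_open(const struct port_state *p)
{
	return p->bus_master && !p->source_validation;
}

/* Hot-remove: stop forwarding first, only then drop Source Validation. */
bool hot_remove_sequence(struct port_state *p)
{
	bool safe = true;

	p->bus_master = false;
	safe = safe && !exposure_window_open(p);
	p->source_validation = false;
	safe = safe && !exposure_window_open(p);
	return safe; /* true if no intermediate state opened the window */
}

/* Hot-add: restore Source Validation first, only then allow forwarding. */
bool hot_add_sequence(struct port_state *p)
{
	bool safe = true;

	p->source_validation = true;
	safe = safe && !exposure_window_open(p);
	p->bus_master = true;
	safe = safe && !exposure_window_open(p);
	return safe;
}
```

Swapping the two assignments in either function makes the corresponding sequence return false, which is the window the proposal is meant to close.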
On 8/1/2023 1:29 AM, Lukas Wunner wrote:
> On Mon, Jul 31, 2023 at 01:32:27AM +0530, Vidya Sagar wrote:
>> On 7/31/2023 1:10 AM, Lukas Wunner wrote:
>>> On Mon, Jul 31, 2023 at 12:45:19AM +0530, Vidya Sagar wrote:
>>>> PCIe 6.0, 6.12.1.1 specifies that downstream devices are permitted to
>>>> send upstream messages before they have been assigned a bus number and
>>>> such messages have a Requester ID with Bus number set to 00h.
>>>> If the Downstream port has ACS Source Validation enabled, these messages
>>>> will be detected as ACS violation error.
>>>>
>>>> Hence, disable ACS Source Validation in the bridge device during
>>>> hot-remove operation and re-enable it after enumeration of the
>>>> downstream hierarchy but before binding the respective device drivers.
>>>
>>> What are these messages that are sent before assignment of a bus number?
>>
>> One example is the DRS (Device Readiness Status) message.
>
> Please mention that in the commit message.
>
>>> What's the user-visible issue that occurs when they're blocked?
>>
>> I'm not sure about the issue one can observe when they are blocked, but, we
>> have seen one issue when they are not blocked. When an endpoint sends a DRS
>> message and an ACS violation is raised for it, the system can trigger DPC
>> (Downstream Port Containment) if it is configured to do so for ACS
>> violations. Once the DPC is released after handling it, system would go for
>> link-up again, which results in root port receiving DRS once again from the
>> endpoint and the cycle continues.
>
> As an alternative to disabling ACS, have you explored masking ACS
> Violations (PCI_ERR_UNC_ACSV) upon de-enumeration of a device and
> unmasking them after assignment of a bus number?

Hi Lukas,
I explored this option and it seemed to work as expected. But, the issue
is that this works only if the AER registers are owned by the OS. If the
AER registers are owned by the firmware (i.e. Firmware-First approach of
handling the errors), OS is not supposed to access the AER registers and
there is no indication from the OS to the firmware as to when the
enumeration is completed and time is apt to unmask the ACSViolation
errors in the AER's Uncorrectable Error Mask register.
Any thoughts on accommodating the Firmware-First approach also?

> That would alleviate concerns that disabling ACS Source Validation
> weakens security (because it doesn't have to be disabled in the
> first place).
>
> You'd need to clear the ACS Violation Status bit in the Uncorrectable
> Error Status Register though after assignment of a bus number,
> in addition to unmasking it, because that bit is still set despite
> the error being masked.
>
> The kernel affords a generous 60 sec timeout for devices to become
> ready (PCIE_RESET_READY_POLL_MS) and is not dependent on DRS messages
> coming through, so blocking them with ACS shouldn't cause issues.
>
>>> Doesn't disabling Source Validation introduce a security hole because the
>>> device may spoof messages before Source Validation is re-enabled?
>>
>> Agree, but this concern is already/has always been there during boot
>> scenario where the link-up happens first and the ACS is enabled at a later
>> point and endpoint can spoof messages in between if it wishes so.
>
> The problem is that devices may be removed only logically (via sysfs
> "power" attribute or Attention Button) and still remain in the system
> physically. They may spoof messages until they're physically removed
> or the hotplug slot is brought up again.
>
>>> PCIe r6.1 sec 6.12.1.1 does indeed point out that the downstream device
>>> is *permitted* to send these messages but the Implementation Note
>>> does *not* prescribe that Source Validation shall be disabled to let them
>>> through. It merely points out that the messages may be filtered if
>>> Source Validation is enabled.
>>
>> Could you please elaborate on the filtering part. Do you expect this to be
>> implemented in the hardware or software?
>
> By "filtered" I meant that TLPs are blocked by ACS. Sorry for the
> confusing word choice.
>
> Thanks,
>
> Lukas
On Thu, Jan 04, 2024 at 08:01:06PM +0530, Vidya Sagar wrote:
> On 8/1/2023 1:29 AM, Lukas Wunner wrote:
> > As an alternative to disabling ACS, have you explored masking ACS
> > Violations (PCI_ERR_UNC_ACSV) upon de-enumeration of a device and
> > unmasking them after assignment of a bus number?
>
> I explored this option and it seemed to work as expected. But, the issue
> is that this works only if the AER registers are owned by the OS. If the
> AER registers are owned by the firmware (i.e. Firmware-First approach of
> handling the errors), OS is not supposed to access the AER registers and
> there is no indication from the OS to the firmware as to when the
> enumeration is completed and time is apt to unmask the ACSViolation
> errors in the AER's Uncorrectable Error Mask register.
> Any thoughts on accommodating the Firmware-First approach also?

Are you actually using firmware-controlled AER or is it a theoretical
question?

PCI Firmware Spec r3.3 sec 4.6.12 talks about a _DSM to disable DPC
on surprise-hotplug-capable ports. Maybe that would be an option?

BTW what happens if the system resumes from sleep and a device in
a hotplug-capable port doesn't have a bus number configured yet
(because it's been powered off and is now in D0uninitialized state)?
Could the ACS Violations then occur as well? Do we have to mask
ACS Violations *generally* on Root Ports and Downstream Ports when
going to system sleep and unmask them after setting a bus number
in the attached device on resume? And I suppose that would not
only be necessary for hotplug ports?

Thanks,

Lukas
On 1/8/2024 7:49 PM, Lukas Wunner wrote:
> On Thu, Jan 04, 2024 at 08:01:06PM +0530, Vidya Sagar wrote:
>> On 8/1/2023 1:29 AM, Lukas Wunner wrote:
>>> As an alternative to disabling ACS, have you explored masking ACS
>>> Violations (PCI_ERR_UNC_ACSV) upon de-enumeration of a device and
>>> unmasking them after assignment of a bus number?
>>
>> I explored this option and it seemed to work as expected. But, the issue
>> is that this works only if the AER registers are owned by the OS. If the
>> AER registers are owned by the firmware (i.e. Firmware-First approach of
>> handling the errors), OS is not supposed to access the AER registers and
>> there is no indication from the OS to the firmware as to when the
>> enumeration is completed and time is apt to unmask the ACSViolation
>> errors in the AER's Uncorrectable Error Mask register.
>> Any thoughts on accommodating the Firmware-First approach also?
>
> Are you actually using firmware-controlled AER or is it a theoretical
> question?

Yes. We indeed have a system with Firmware-Controlled AER.

> PCI Firmware Spec r3.3 sec 4.6.12 talks about a _DSM to disable DPC
> on surprise-hotplug-capable ports. Maybe that would be an option?

It looks like this _DSM is totally dependent on the port having SFI
capability implemented and unfortunately our system doesn't have
SFI implemented.

> BTW what happens if the system resumes from sleep and a device in
> a hotplug-capable port doesn't have a bus number configured yet
> (because it's been powered off and is now in D0uninitialized state)?

Theoretically the answer seems to be yes, but, since the platform we
have is a server platform, there is no support for sleep and resume on
this platform and hence can't really confirm this behavior though.

> Could the ACS Violations then occur as well? Do we have to mask
> ACS Violations *generally* on Root Ports and Downstream Ports when
> going to system sleep and unmask them after setting a bus number
> in the attached device on resume? And I suppose that would not
> only be necessary for hotplug ports?

Again, how to do that in a system where AER is not handled natively in
the OS? AFAIU, there is no mechanism for the OS to inform about the time
it updates the bus number.

> Thanks,
>
> Lukas
Hi Lukas/Bjorn, any thoughts on this?

On 1/11/2024 7:14 PM, Vidya Sagar wrote:
> [...]
On Thu, Jan 11, 2024 at 07:14:54PM +0530, Vidya Sagar wrote:
> On 1/8/2024 7:49 PM, Lukas Wunner wrote:
> > On Thu, Jan 04, 2024 at 08:01:06PM +0530, Vidya Sagar wrote:
> > > On 8/1/2023 1:29 AM, Lukas Wunner wrote:
> > > > As an alternative to disabling ACS, have you explored masking ACS
> > > > Violations (PCI_ERR_UNC_ACSV) upon de-enumeration of a device and
> > > > unmasking them after assignment of a bus number?
> > >
> > > I explored this option and it seemed to work as expected. But, the issue
> > > is that this works only if the AER registers are owned by the OS. If the
> > > AER registers are owned by the firmware (i.e. Firmware-First approach of
> > > handling the errors), OS is not supposed to access the AER registers and
> > > there is no indication from the OS to the firmware as to when the
> > > enumeration is completed and time is apt to unmask the ACSViolation
> > > errors in the AER's Uncorrectable Error Mask register.
> > > Any thoughts on accommodating the Firmware-First approach also?

I'm sorry, I don't have any good ideas. I just would like to avoid
disabling ACS Source Validation because it would diminish our
security posture.

I guess setting the secondary bus number in the hotplug port to 0
isn't a good solution either because it would allow hotplugged devices
to temporarily spoof TLPs from devices on the root bus, right?

One option might be to have separate code paths: If AER is owned
by the OS, mask PCI_ERR_UNC_ACSV on hot-removal, unmask on hot-add.
If AER is *not* owned by the OS, disable ACS Source Validation
on hot-removal, enable on hot-add, and warn loudly about the
security implications.

Another option might be to change error handling, i.e. ignore
ACS Source Validation errors if they occur before assignment of
a bus number. And temporarily disable DPC.

None of these options look pretty. I'm generally not a fan of
having the firmware own certain features. The user experience
is better if everything is owned by the OS. This is just one
more case in point. :(

Thanks,

Lukas
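The "separate code paths" option can be written down as a small decision function. This is a hypothetical sketch, not code from the patch or the kernel: the enum, the struct and `plan_hot_removal()` are invented names, and the `os_owns_aer` input stands for whatever mechanism the platform uses to tell the OS that it controls AER natively.

```c
#include <stdbool.h>

/* Hypothetical actions a hotplug driver could take on hot-removal. */
enum hotremove_action {
	HR_MASK_ACSV,      /* mask PCI_ERR_UNC_ACSV in AER, keep ACS SV enabled */
	HR_DISABLE_ACS_SV, /* clear ACS Source Validation Enable on the port */
};

struct hotremove_plan {
	enum hotremove_action action;
	bool warn_security; /* log a prominent warning about reduced protection */
};

/* Choose the hot-removal handling based on who owns the AER registers. */
struct hotremove_plan plan_hot_removal(bool os_owns_aer)
{
	struct hotremove_plan plan;

	if (os_owns_aer) {
		/* The OS may touch AER: masking keeps Source Validation active. */
		plan.action = HR_MASK_ACSV;
		plan.warn_security = false;
	} else {
		/* Firmware-first: AER is off limits, fall back to disabling SV. */
		plan.action = HR_DISABLE_ACS_SV;
		plan.warn_security = true;
	}
	return plan;
}
```

The hot-add side would mirror the choice: unmask (after clearing the stale status bit) in the first case, re-enable Source Validation in the second.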
diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
index ad12515a4a12..42d4328f2a9b 100644
--- a/drivers/pci/hotplug/pciehp_pci.c
+++ b/drivers/pci/hotplug/pciehp_pci.c
@@ -63,6 +63,7 @@ int pciehp_configure_device(struct controller *ctrl)
 
 	pci_assign_unassigned_bridge_resources(bridge);
 	pcie_bus_configure_settings(parent);
+	pci_configure_acs_sv(bridge, true);
 
 	/*
 	 * Release reset_lock during driver binding
@@ -132,6 +133,16 @@ void pciehp_unconfigure_device(struct controller *ctrl, bool presence)
 		}
 		pci_dev_put(dev);
 	}
-
+	/*
+	 * PCIe 6.0, 6.12.1.1 specifies that downstream devices are permitted
+	 * to send upstream messages before they have been assigned a bus
+	 * number and such messages have a Requester ID with Bus number
+	 * set to 00h. If the Downstream port has ACS Source Validation enabled,
+	 * these messages will be detected as ACS violation error.
+	 * Hence, disable ACS Source Validation here and re-enable it after
+	 * enumeration of the downstream hierarchy and before binding the
+	 * respective device drivers in pciehp_configure_device().
+	 */
+	pci_configure_acs_sv(ctrl->pcie->port, false);
 	pci_unlock_rescan_remove();
 }
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 60230da957e0..5a21640de355 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1004,6 +1004,27 @@ static void pci_enable_acs(struct pci_dev *dev)
 	pci_disable_acs_redir(dev);
 }
 
+#ifdef CONFIG_HOTPLUG_PCI_PCIE
+void pci_configure_acs_sv(struct pci_dev *dev, bool flag)
+{
+	u16 cap;
+	u16 ctrl;
+
+	if (!pci_acs_enable || !dev->acs_cap)
+		return;
+
+	pci_read_config_word(dev, dev->acs_cap + PCI_ACS_CAP, &cap);
+	pci_read_config_word(dev, dev->acs_cap + PCI_ACS_CTRL, &ctrl);
+
+	if (flag)
+		ctrl |= (cap & PCI_ACS_SV);
+	else
+		ctrl &= ~(cap & PCI_ACS_SV);
+
+	pci_write_config_word(dev, dev->acs_cap + PCI_ACS_CTRL, ctrl);
+}
+#endif
+
 /**
  * pci_restore_bars - restore a device's BAR values (e.g. after wake-up)
  * @dev: PCI device to have its BARs restored
diff --git a/include/linux/pci.h b/include/linux/pci.h
index eeb2e6f6130f..b6d53fe28371 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2350,6 +2350,12 @@ void pci_hp_create_module_link(struct pci_slot *pci_slot);
 void pci_hp_remove_module_link(struct pci_slot *pci_slot);
 #endif
 
+#ifdef CONFIG_HOTPLUG_PCI_PCIE
+void pci_configure_acs_sv(struct pci_dev *dev, bool flag);
+#else
+static inline void pci_configure_acs_sv(struct pci_dev *dev, bool flag) { }
+#endif
+
 /**
  * pci_pcie_cap - get the saved PCIe capability offset
  * @dev: PCI device
PCIe 6.0, 6.12.1.1 specifies that downstream devices are permitted to
send upstream messages before they have been assigned a bus number and
such messages have a Requester ID with Bus number set to 00h.
If the Downstream port has ACS Source Validation enabled, these messages
will be detected as ACS violation error.

Hence, disable ACS Source Validation in the bridge device during
hot-remove operation and re-enable it after enumeration of the
downstream hierarchy but before binding the respective device drivers.

Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
---
v3:
* Addressed review comments from Bjorn

v2:
* Fixed build issues

 drivers/pci/hotplug/pciehp_pci.c | 13 ++++++++++++-
 drivers/pci/pci.c                | 21 +++++++++++++++++++++
 include/linux/pci.h              |  6 ++++++
 3 files changed, 39 insertions(+), 1 deletion(-)
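The read-modify-write at the core of `pci_configure_acs_sv()` in the patch can be tried outside the kernel. The sketch below is an illustration under assumptions: `acs_ctrl_set_sv()` is a hypothetical helper that takes the ACS Capability and ACS Control register values as plain arguments instead of reading config space, and `PCI_ACS_SV` carries the value the kernel defines in `pci_regs.h`.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_ACS_SV 0x0001 /* Source Validation bit, as in pci_regs.h */

/*
 * Compute the new ACS Control value. The Source Validation bit is only
 * touched if the capability register advertises it, mirroring the
 * "cap & PCI_ACS_SV" masking in the patch; all other bits are preserved.
 */
uint16_t acs_ctrl_set_sv(uint16_t cap, uint16_t ctrl, bool enable)
{
	uint16_t sv = cap & PCI_ACS_SV;

	if (enable)
		ctrl |= sv;
	else
		ctrl &= (uint16_t)~sv;

	return ctrl;
}
```

For example, with a capability value of 005fh and a control value of 001ch, enabling yields 001dh and disabling returns it to 001ch; if the capability does not advertise Source Validation, the control value is returned unchanged in both directions.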