Message ID | 20220831081603.3415-1-rrichter@amd.com |
---|---|
Series | cxl: Add support for Restricted CXL hosts (RCD mode) |
On Wed, 31 Aug 2022 10:15:48 +0200 Robert Richter <rrichter@amd.com> wrote: > In Restricted CXL Device (RCD) mode (formerly referred to as CXL 1.1) > the PCIe enumeration hierarchy is different from CXL VH Enumeration > (formerly referred to as 2.0, for both modes see CXL spec 3.0: 9.11 > and 9.12, [1]). This series adds support for RCD mode. It implements > the detection of Restricted CXL Hosts (RCHs) and its corresponding > Restricted CXL Devices (RCDs). It does the necessary enumeration of > ports and connects the endpoints. With all the plumbing an RCH/RCD > pair is registered at the Linux CXL bus and becomes visible in sysfs > in the same way as CXL VH hosts and devices do already. RCDs are > brought up as CXL endpoints and bound to subsequent drivers such as > cxl_mem. > > For CXL VH the host driver (cxl_acpi) starts host bridge discovery > once the ACPI0017 CXL root device is detected and then searches for > ACPI0016 host bridges to enable CXL. In RCD mode an ACPI0017 device > might not necessarily exist and the host bridge can have a standard > PCIe host bridge PNP0A08 ID, there aren't any CXL port or switches in > the PCIe hierarchy visible. As such the RCD mode enumeration and host > discovery is very different from CXL VH. See patch #5 for > implementation details. > > This implementation expects the host's downstream and upstream port > RCRBs base address being reported by firmware using the optional CEDT > CHBS entry of the host bridge (see CXL spec 3.0, 9.17.1.2). > > RCD mode does not support hot-plug, so host discovery is at boot time > only. > > Patches #1 to #4 are prerequisites of the series with fixes needed and > a rework of debug messages for port enumeration. Those are general > patches and could be applied earlier and independently from the rest > assuming there are no objections with them. Patches #5 to #15 contain > the actual implementation of RCD mode support. 
> > [1] https://www.computeexpresslink.org/spec-landing Hi Robert, I'm curious on the aims of this work. Given expectation for RCDs is often that the host firmware has set them up before the OS loads, what functionality do you want to gain by mapping these into the CXL 2.0+ focused infrastructure? When I did some analysis a while back on CXL 1.1 I was pretty much assuming that there was no real reason to let the OS know about it because it couldn't do much of any use with the information. There are some corners like RAS where it might be useful or perhaps to enable some of the CXL 3.0 features that are allowed to be EP only and so could be relevant for an older host (e.g. CPMUs). With my QEMU hat on I wasn't planning to bother with anything pre 2.0 but it might be worth considering to let us exercise this code... Jonathan > > Robert Richter (15): > cxl/core: Remove duplicate declaration of devm_cxl_iomap_block() > cxl/core: Check physical address before mapping it in > devm_cxl_iomap_block() > cxl: Unify debug messages when calling devm_cxl_add_port() > cxl: Unify debug messages when calling devm_cxl_add_dport() > cxl/acpi: Add probe function to detect restricted CXL hosts in RCD > mode > PCI/ACPI: Link host bridge to its ACPI fw node > cxl/acpi: Check RCH's PCIe Host Bridge ACPI ID > cxl/acpi: Check RCH's CXL DVSEC capabilities > cxl/acpi: Determine PCI host bridge's ACPI UID > cxl/acpi: Extract the RCH's RCRB base address from CEDT > cxl/acpi: Extract the host's component register base address from RCRB > cxl/acpi: Skip devm_cxl_port_enumerate_dports() when in RCD mode > cxl/acpi: Rework devm_cxl_enumerate_ports() to support RCD mode > cxl/acpi: Enumerate ports in RCD mode to enable RCHs and RCDs > cxl/acpi: Specify module load order dependency for the cxl_acpi module > > drivers/acpi/pci_root.c | 1 + > drivers/cxl/acpi.c | 311 ++++++++++++++++++++++++++++++++++- > drivers/cxl/core/pci.c | 22 ++- > drivers/cxl/core/port.c | 103 ++++++++---- > 
drivers/cxl/core/regs.c | 3 + > drivers/cxl/cxl.h | 2 - > drivers/cxl/mem.c | 1 + > tools/testing/cxl/test/cxl.c | 8 +- > 8 files changed, 400 insertions(+), 51 deletions(-) >
Jonathan, On 31.08.22 13:23:29, Jonathan Cameron wrote: > On Wed, 31 Aug 2022 10:15:48 +0200 > Robert Richter <rrichter@amd.com> wrote: > > > In Restricted CXL Device (RCD) mode (formerly referred to as CXL 1.1) > > the PCIe enumeration hierarchy is different from CXL VH Enumeration > > (formerly referred to as 2.0, for both modes see CXL spec 3.0: 9.11 > > and 9.12, [1]). This series adds support for RCD mode. It implements > > the detection of Restricted CXL Hosts (RCHs) and its corresponding > > Restricted CXL Devices (RCDs). It does the necessary enumeration of > > ports and connects the endpoints. With all the plumbing an RCH/RCD > > pair is registered at the Linux CXL bus and becomes visible in sysfs > > in the same way as CXL VH hosts and devices do already. RCDs are > > brought up as CXL endpoints and bound to subsequent drivers such as > > cxl_mem. > > > > For CXL VH the host driver (cxl_acpi) starts host bridge discovery > > once the ACPI0017 CXL root device is detected and then searches for > > ACPI0016 host bridges to enable CXL. In RCD mode an ACPI0017 device > > might not necessarily exist and the host bridge can have a standard > > PCIe host bridge PNP0A08 ID, there aren't any CXL port or switches in > > the PCIe hierarchy visible. As such the RCD mode enumeration and host > > discovery is very different from CXL VH. See patch #5 for > > implementation details. > > > > This implementation expects the host's downstream and upstream port > > RCRBs base address being reported by firmware using the optional CEDT > > CHBS entry of the host bridge (see CXL spec 3.0, 9.17.1.2). > > > > RCD mode does not support hot-plug, so host discovery is at boot time > > only. > > > > Patches #1 to #4 are prerequisites of the series with fixes needed and > > a rework of debug messages for port enumeration. Those are general > > patches and could be applied earlier and independently from the rest > > assuming there are no objections with them. 
Patches #5 to #15 contain > > the actual implementation of RCD mode support. > > > > [1] https://www.computeexpresslink.org/spec-landing > > Hi Robert, > > I'm curious on the aims of this work. Given expectation for RCDs is often > that the host firmware has set them up before the OS loads, what functionality > do you want to gain by mapping these into the CXL 2.0+ focused infrastructure? > > When I did some analysis a while back on CXL 1.1 I was pretty much assuming > that there was no real reason to let the OS know about it because it > couldn't do much of any use with the information. There are some corners > like RAS where it might be useful or perhaps to enable some of the CXL 3.0 > features that are allowed to be EP only and so could be relevant for > an older host (e.g. CPMUs).

Though CXL RCD works with a legacy kernel, or one without any CXL functionality added, a CXL-aware kernel can also be useful for RCD mode. RAS is a topic here, but so is gathering device information such as status or topology: everything that requires access to the component register block or the mailbox interface. Another plus: the CXL hierarchy becomes visible for RCD mode in sysfs and the device hierarchy.

Reusing the existing infrastructure for this makes sense. Many features overlap in both modes (e.g. RAS, mailbox again, or topology information).

Thanks again for your review.

-Robert

> > With my QEMU hat on I wasn't planning to bother with anything pre 2.0 > but it might be worth considering to let us exercise this code... > > Jonathan
Apologies for the delay in getting to this. I had hoped to be able to finish up some other DAX work to focus on this, but time is getting short so I will need to do both in parallel.

Robert Richter wrote: > In Restricted CXL Device (RCD) mode (formerly referred to as CXL 1.1) > the PCIe enumeration hierarchy is different from CXL VH Enumeration > (formerly referred to as 2.0, for both modes see CXL spec 3.0: 9.11 > and 9.12, [1]). This series adds support for RCD mode. It implements > the detection of Restricted CXL Hosts (RCHs) and its corresponding > Restricted CXL Devices (RCDs). It does the necessary enumeration of > ports and connects the endpoints. With all the plumbing an RCH/RCD > pair is registered at the Linux CXL bus and becomes visible in sysfs > in the same way as CXL VH hosts and devices do already. RCDs are > brought up as CXL endpoints and bound to subsequent drivers such as > cxl_mem. > > For CXL VH the host driver (cxl_acpi) starts host bridge discovery > once the ACPI0017 CXL root device is detected and then searches for > ACPI0016 host bridges to enable CXL. In RCD mode an ACPI0017 device > might not necessarily exist

That's a broken BIOS as far as I can see. No ACPI0017 == no OS CXL services, and the CXL aspects of the device need to be 100% managed by the BIOS. You can still run the cxl_pci driver in that case for mailbox operation, but error handling must be firmware-first without ACPI0017.

> PCIe host bridge PNP0A08 ID, there aren't any CXL port or switches in > the PCIe hierarchy visible. As such the RCD mode enumeration and host > discovery is very different from CXL VH. See patch #5 for > implementation details. > > This implementation expects the host's downstream and upstream port > RCRBs base address being reported by firmware using the optional CEDT > CHBS entry of the host bridge (see CXL spec 3.0, 9.17.1.2). > > RCD mode does not support hot-plug, so host discovery is at boot time > only.
> > Patches #1 to #4 are prerequisites of the series with fixes needed and > a rework of debug messages for port enumeration. Those are general > patches and could be applied earlier and independently from the rest > assuming there are no objections with them. Patches #5 to #15 contain > the actual implementation of RCD mode support. > > [1] https://www.computeexpresslink.org/spec-landing > > Robert Richter (15): > cxl/core: Remove duplicate declaration of devm_cxl_iomap_block() > cxl/core: Check physical address before mapping it in > devm_cxl_iomap_block() > cxl: Unify debug messages when calling devm_cxl_add_port() > cxl: Unify debug messages when calling devm_cxl_add_dport() > cxl/acpi: Add probe function to detect restricted CXL hosts in RCD > mode > PCI/ACPI: Link host bridge to its ACPI fw node > cxl/acpi: Check RCH's PCIe Host Bridge ACPI ID > cxl/acpi: Check RCH's CXL DVSEC capabilities > cxl/acpi: Determine PCI host bridge's ACPI UID > cxl/acpi: Extract the RCH's RCRB base address from CEDT > cxl/acpi: Extract the host's component register base address from RCRB > cxl/acpi: Skip devm_cxl_port_enumerate_dports() when in RCD mode > cxl/acpi: Rework devm_cxl_enumerate_ports() to support RCD mode > cxl/acpi: Enumerate ports in RCD mode to enable RCHs and RCDs > cxl/acpi: Specify module load order dependency for the cxl_acpi module > > drivers/acpi/pci_root.c | 1 + > drivers/cxl/acpi.c | 311 ++++++++++++++++++++++++++++++++++- > drivers/cxl/core/pci.c | 22 ++- > drivers/cxl/core/port.c | 103 ++++++++---- > drivers/cxl/core/regs.c | 3 + > drivers/cxl/cxl.h | 2 - > drivers/cxl/mem.c | 1 + > tools/testing/cxl/test/cxl.c | 8 +- > 8 files changed, 400 insertions(+), 51 deletions(-) > > -- > 2.30.2 >
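As an illustration of the discovery rule argued in this reply, here is a minimal user-space sketch in C. It only classifies the ACPI IDs named in the thread (ACPI0017, ACPI0016, PNP0A08) and applies the "no ACPI0017, no OS CXL services" rule. The enum and both function names are hypothetical and are not taken from the patch series or from the kernel.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical labels for this sketch; the IDs are the ones named in the thread. */
enum hid_kind {
	HID_OTHER,
	HID_CXL_ROOT,         /* ACPI0017: CXL root device */
	HID_CXL_HOST_BRIDGE,  /* ACPI0016: CXL host bridge */
	HID_PCIE_HOST_BRIDGE  /* PNP0A08: standard PCIe host bridge */
};

/* Map an ACPI hardware ID string to one of the kinds above. */
enum hid_kind classify_hid(const char *hid)
{
	assert(hid != NULL);
	if (strcmp(hid, "ACPI0017") == 0)
		return HID_CXL_ROOT;
	if (strcmp(hid, "ACPI0016") == 0)
		return HID_CXL_HOST_BRIDGE;
	if (strcmp(hid, "PNP0A08") == 0)
		return HID_PCIE_HOST_BRIDGE;
	return HID_OTHER;
}

/*
 * Rule from the review comment (assumption of this sketch): the OS only
 * offers CXL services if firmware exposes an ACPI0017 device.
 * Returns 1 if such a device is present in the list, 0 otherwise.
 */
int os_cxl_services_possible(const char *const *hids, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (classify_hid(hids[i]) == HID_CXL_ROOT)
			return 1;
	return 0;
}
```

With this rule, a platform that only exposes a PNP0A08 host bridge would be left entirely to firmware, which is the position taken in the reply above.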
Robert Richter wrote: > Jonathan, > > On 31.08.22 13:23:29, Jonathan Cameron wrote: > > On Wed, 31 Aug 2022 10:15:48 +0200 > > Robert Richter <rrichter@amd.com> wrote: > > > > > In Restricted CXL Device (RCD) mode (formerly referred to as CXL 1.1) > > > the PCIe enumeration hierarchy is different from CXL VH Enumeration > > > (formerly referred to as 2.0, for both modes see CXL spec 3.0: 9.11 > > > and 9.12, [1]). This series adds support for RCD mode. It implements > > > the detection of Restricted CXL Hosts (RCHs) and its corresponding > > > Restricted CXL Devices (RCDs). It does the necessary enumeration of > > > ports and connects the endpoints. With all the plumbing an RCH/RCD > > > pair is registered at the Linux CXL bus and becomes visible in sysfs > > > in the same way as CXL VH hosts and devices do already. RCDs are > > > brought up as CXL endpoints and bound to subsequent drivers such as > > > cxl_mem. > > > > > > For CXL VH the host driver (cxl_acpi) starts host bridge discovery > > > once the ACPI0017 CXL root device is detected and then searches for > > > ACPI0016 host bridges to enable CXL. In RCD mode an ACPI0017 device > > > might not necessarily exist and the host bridge can have a standard > > > PCIe host bridge PNP0A08 ID, there aren't any CXL port or switches in > > > the PCIe hierarchy visible. As such the RCD mode enumeration and host > > > discovery is very different from CXL VH. See patch #5 for > > > implementation details. > > > > > > This implementation expects the host's downstream and upstream port > > > RCRBs base address being reported by firmware using the optional CEDT > > > CHBS entry of the host bridge (see CXL spec 3.0, 9.17.1.2). > > > > > > RCD mode does not support hot-plug, so host discovery is at boot time > > > only. > > > > > > Patches #1 to #4 are prerequisites of the series with fixes needed and > > > a rework of debug messages for port enumeration. 
Those are general > > > patches and could be applied earlier and independently from the rest > > > assuming there are no objections with them. Patches #5 to #15 contain > > > the actual implementation of RCD mode support. > > > > > > [1] https://www.computeexpresslink.org/spec-landing > > > > Hi Robert, > > > > I'm curious on the aims of this work. Given expectation for RCDs is often > > that the host firmware has set them up before the OS loads, what functionality > > do you want to gain by mapping these into the CXL 2.0+ focused infrastructure? > > > > When I did some analysis a while back on CXL 1.1 I was pretty much assuming > > that there was no real reason to let the OS know about it because it > > couldn't do much of any use with the information. There are some corners > > like RAS where it might be useful or perhaps to enable some of the CXL 3.0 > > features that are allowed to be EP only and so could be relevant for > > an older host (e.g. CPMUs). > > though CXL RCD works with a legacy kernel or without any CXL > functionality added, a CXL aware kernel can be useful also for RCD > mode. RAS is a topic here but also gathering device information such > as status or topology. Everything where access to the component > register block or mailbox interface is required.

Unless the BIOS is going to actively enable the standard CXL topology with ACPI0017, I think it should be hands off for the OS. The maintenance burden of some of the hacks to work around missing BIOS descriptions is non-trivial, and it is still early enough to encourage BIOS vendors to enable what is needed and to set end user expectations that these pre-requisites exist.

As far as I can see this enabling adds an additional CXL "root" device, and I do not think userspace should need to care whether a CXL 2.0 device is attached to an RCH or not.

> Another plus, the CXL hierarchy becomes visible for RCD mode in sysfs > and the device hierarchy. > > Reusing the existing infrastructure for this makes sense.
Many > features overlap in both modes (e.g. RAS, mailbox again, or topology > information).

RAS only if OS-first is supported by the BIOS. Mailbox support happens with or without a CXL root device. The topology information is certainly important in OS-first error handling, but if it's firmware-first it's going to have its own FRU id scheme. Much of the common case topology information for the RCH case (like which RCIEP is hosting which CXL address range) is covered by this pending lspci update:

https://github.com/pciutils/pciutils/pull/59

...although that needs some help to get over the goal line. Otherwise the topology information is mostly for describing all the degrees of freedom of a full blown CXL 2.0 topology with host bridge and switch interleaving.
> On Sep 7, 2022, at 10:43 PM, Dan Williams <dan.j.williams@intel.com> wrote: > > Apologies for the delay in getting to this I had hoped to be able to > finish up some other DAX work to focus on this, but time is getting > short so I will need to do both in parallel. > > Robert Richter wrote: >> In Restricted CXL Device (RCD) mode (formerly referred to as CXL 1.1) >> the PCIe enumeration hierarchy is different from CXL VH Enumeration >> (formerly referred to as 2.0, for both modes see CXL spec 3.0: 9.11 >> and 9.12, [1]). This series adds support for RCD mode. It implements >> the detection of Restricted CXL Hosts (RCHs) and its corresponding >> Restricted CXL Devices (RCDs). It does the necessary enumeration of >> ports and connects the endpoints. With all the plumbing an RCH/RCD >> pair is registered at the Linux CXL bus and becomes visible in sysfs >> in the same way as CXL VH hosts and devices do already. RCDs are >> brought up as CXL endpoints and bound to subsequent drivers such as >> cxl_mem. >> >> For CXL VH the host driver (cxl_acpi) starts host bridge discovery >> once the ACPI0017 CXL root device is detected and then searches for >> ACPI0016 host bridges to enable CXL. In RCD mode an ACPI0017 device >> might not necessarily exist > > That's a broken BIOS as far as I can see. No ACPI0017 == no OS CXL > services and the CXL aspects of the device need to be 100% managed by > the BIOS. You can still run the cxl_pci driver in that case for mailbox > operation, but error handling must be firmware-first without ACPI0017.

Firmware-first or OS-first applies to CXL protocol error handling. For CXL memory error handling, the device generates a DRAM error record, and the OS parses that record and acts accordingly. According to the CXL spec (section 8.2.9.2.1.2 DRAM Event Record), the record contains the DPA but not the HPA. The OS needs to translate that DPA into an HPA to act on it. I am taking this as an example to show that OS CXL services are needed.
Instead of using ACPI0016 to tell whether the system is under RCH mode, I suppose one way is to check “CXL version” field of CHBS structure in CEDT? > >> PCIe host bridge PNP0A08 ID, there aren't any CXL port or switches in >> the PCIe hierarchy visible. As such the RCD mode enumeration and host >> discovery is very different from CXL VH. See patch #5 for >> implementation details. >> >> This implementation expects the host's downstream and upstream port >> RCRBs base address being reported by firmware using the optional CEDT >> CHBS entry of the host bridge (see CXL spec 3.0, 9.17.1.2). >> >> RCD mode does not support hot-plug, so host discovery is at boot time >> only. >> >> Patches #1 to #4 are prerequisites of the series with fixes needed and >> a rework of debug messages for port enumeration. Those are general >> patches and could be applied earlier and independently from the rest >> assuming there are no objections with them. Patches #5 to #15 contain >> the actual implementation of RCD mode support. 
>> >> [1] https://www.computeexpresslink.org/spec-landing >> >> Robert Richter (15): >> cxl/core: Remove duplicate declaration of devm_cxl_iomap_block() >> cxl/core: Check physical address before mapping it in >> devm_cxl_iomap_block() >> cxl: Unify debug messages when calling devm_cxl_add_port() >> cxl: Unify debug messages when calling devm_cxl_add_dport() >> cxl/acpi: Add probe function to detect restricted CXL hosts in RCD >> mode >> PCI/ACPI: Link host bridge to its ACPI fw node >> cxl/acpi: Check RCH's PCIe Host Bridge ACPI ID >> cxl/acpi: Check RCH's CXL DVSEC capabilities >> cxl/acpi: Determine PCI host bridge's ACPI UID >> cxl/acpi: Extract the RCH's RCRB base address from CEDT >> cxl/acpi: Extract the host's component register base address from RCRB >> cxl/acpi: Skip devm_cxl_port_enumerate_dports() when in RCD mode >> cxl/acpi: Rework devm_cxl_enumerate_ports() to support RCD mode >> cxl/acpi: Enumerate ports in RCD mode to enable RCHs and RCDs >> cxl/acpi: Specify module load order dependency for the cxl_acpi module >> >> drivers/acpi/pci_root.c | 1 + >> drivers/cxl/acpi.c | 311 ++++++++++++++++++++++++++++++++++- >> drivers/cxl/core/pci.c | 22 ++- >> drivers/cxl/core/port.c | 103 ++++++++---- >> drivers/cxl/core/regs.c | 3 + >> drivers/cxl/cxl.h | 2 - >> drivers/cxl/mem.c | 1 + >> tools/testing/cxl/test/cxl.c | 8 +- >> 8 files changed, 400 insertions(+), 51 deletions(-) >> >> -- >> 2.30.2 >> > > >
Jonathan Zhang (Infra) wrote: > > > > On Sep 7, 2022, at 10:43 PM, Dan Williams <dan.j.williams@intel.com> wrote: > > > > Apologies for the delay in getting to this I had hoped to be able to > > finish up some other DAX work to focus on this, but time is getting > > short so I will need to do both in parallel. > > > > Robert Richter wrote: > >> In Restricted CXL Device (RCD) mode (formerly referred to as CXL 1.1) > >> the PCIe enumeration hierarchy is different from CXL VH Enumeration > >> (formerly referred to as 2.0, for both modes see CXL spec 3.0: 9.11 > >> and 9.12, [1]). This series adds support for RCD mode. It implements > >> the detection of Restricted CXL Hosts (RCHs) and its corresponding > >> Restricted CXL Devices (RCDs). It does the necessary enumeration of > >> ports and connects the endpoints. With all the plumbing an RCH/RCD > >> pair is registered at the Linux CXL bus and becomes visible in sysfs > >> in the same way as CXL VH hosts and devices do already. RCDs are > >> brought up as CXL endpoints and bound to subsequent drivers such as > >> cxl_mem. > >> > >> For CXL VH the host driver (cxl_acpi) starts host bridge discovery > >> once the ACPI0017 CXL root device is detected and then searches for > >> ACPI0016 host bridges to enable CXL. In RCD mode an ACPI0017 device > >> might not necessarily exist > > > > That's a broken BIOS as far as I can see. No ACPI0017 == no OS CXL > > services and the CXL aspects of the device need to be 100% managed by > > the BIOS. You can still run the cxl_pci driver in that case for mailbox > > operation, but error handling must be firmware-first without ACPI0017. > Firmware-first or OS-first applies to CXL protocol error handling. For CXL > memory error handling, the device generates a DRAM error record, the OS > parses such record and act accordingly. According to CXL spec (section > 8.2.9.2.1.2 DRAM Event Record), DPA but not HPA is in such record. The OS > needs to translate such DPA into HPA to act on. 
I am taking this as an example > to show that OS CXL services is needed. > Instead of using ACPI0016 to tell whether the system is under RCH mode, > I suppose one way is to check “CXL version” field of CHBS structure in CEDT? Unless the OS has negotiated CXL _OSC the BIOS owns the event retrieval and translating it from DPA to HPA. I do want to add OS CXL services to Linux, but only in the case when the BIOS is actively enabling OS native address translation which includes populating ACPI0017, CFMWS, and devices with the HDM decoder capability registers instead of DVSEC range registers. Everything else is early-gen CXL that is 100% BIOS supported, similar to DDR where a driver is not expected.
> On Sep 8, 2022, at 12:51 PM, Dan Williams <dan.j.williams@intel.com> wrote: > > Jonathan Zhang (Infra) wrote: >> >> >>> On Sep 7, 2022, at 10:43 PM, Dan Williams <dan.j.williams@intel.com> wrote: >>> >>> Apologies for the delay in getting to this I had hoped to be able to >>> finish up some other DAX work to focus on this, but time is getting >>> short so I will need to do both in parallel. >>> >>> Robert Richter wrote: >>>> In Restricted CXL Device (RCD) mode (formerly referred to as CXL 1.1) >>>> the PCIe enumeration hierarchy is different from CXL VH Enumeration >>>> (formerly referred to as 2.0, for both modes see CXL spec 3.0: 9.11 >>>> and 9.12, [1]). This series adds support for RCD mode. It implements >>>> the detection of Restricted CXL Hosts (RCHs) and its corresponding >>>> Restricted CXL Devices (RCDs). It does the necessary enumeration of >>>> ports and connects the endpoints. With all the plumbing an RCH/RCD >>>> pair is registered at the Linux CXL bus and becomes visible in sysfs >>>> in the same way as CXL VH hosts and devices do already. RCDs are >>>> brought up as CXL endpoints and bound to subsequent drivers such as >>>> cxl_mem. >>>> >>>> For CXL VH the host driver (cxl_acpi) starts host bridge discovery >>>> once the ACPI0017 CXL root device is detected and then searches for >>>> ACPI0016 host bridges to enable CXL. In RCD mode an ACPI0017 device >>>> might not necessarily exist >>> >>> That's a broken BIOS as far as I can see. No ACPI0017 == no OS CXL >>> services and the CXL aspects of the device need to be 100% managed by >>> the BIOS. You can still run the cxl_pci driver in that case for mailbox >>> operation, but error handling must be firmware-first without ACPI0017. >> Firmware-first or OS-first applies to CXL protocol error handling.
For CXL >> memory error handling, the device generates a DRAM error record, the OS >> parses such record and act accordingly. According to CXL spec (section >> 8.2.9.2.1.2 DRAM Event Record), DPA but not HPA is in such record. The OS >> needs to translate such DPA into HPA to act on. I am taking this as an example >> to show that OS CXL services is needed. >> Instead of using ACPI0016 to tell whether the system is under RCH mode, >> I suppose one way is to check “CXL version” field of CHBS structure in CEDT? > > Unless the OS has negotiated CXL _OSC the BIOS owns the event retrieval > and translating it from DPA to HPA. I do want to add OS CXL services to > Linux, but only in the case when the BIOS is actively enabling OS native > address translation which includes populating ACPI0017, CFMWS, and > devices with the HDM decoder capability registers instead of DVSEC range > registers. Everything else is early-gen CXL that is 100% BIOS supported, > similar to DDR where a driver is not expected.

It makes sense that the BIOS and OS need to negotiate CXL _OSC so that the OS takes care of address translation. That being said, only the DVSEC range register (and not the HDM decoder capability register) is available when the device is in RCRB mode (section 9.11.8, figure 9-7) and attached to an RCH. This type of configuration needs to be supported by OS CXL services.
Jonathan Zhang (Infra) wrote: > > > > On Sep 8, 2022, at 12:51 PM, Dan Williams <dan.j.williams@intel.com> wrote: > > > > Jonathan Zhang (Infra) wrote: > >> > >> > >>> On Sep 7, 2022, at 10:43 PM, Dan Williams <dan.j.williams@intel.com> wrote: > >>> > >>> Apologies for the delay in getting to this I had hoped to be able to > >>> finish up some other DAX work to focus on this, but time is getting > >>> short so I will need to do both in parallel. > >>> > >>> Robert Richter wrote: > >>>> In Restricted CXL Device (RCD) mode (formerly referred to as CXL 1.1) > >>>> the PCIe enumeration hierarchy is different from CXL VH Enumeration > >>>> (formerly referred to as 2.0, for both modes see CXL spec 3.0: 9.11 > >>>> and 9.12, [1]). This series adds support for RCD mode. It implements > >>>> the detection of Restricted CXL Hosts (RCHs) and its corresponding > >>>> Restricted CXL Devices (RCDs). It does the necessary enumeration of > >>>> ports and connects the endpoints. With all the plumbing an RCH/RCD > >>>> pair is registered at the Linux CXL bus and becomes visible in sysfs > >>>> in the same way as CXL VH hosts and devices do already. RCDs are > >>>> brought up as CXL endpoints and bound to subsequent drivers such as > >>>> cxl_mem. > >>>> > >>>> For CXL VH the host driver (cxl_acpi) starts host bridge discovery > >>>> once the ACPI0017 CXL root device is detected and then searches for > >>>> ACPI0016 host bridges to enable CXL. In RCD mode an ACPI0017 device > >>>> might not necessarily exist > >>> > >>> That's a broken BIOS as far as I can see. No ACPI0017 == no OS CXL > >>> services and the CXL aspects of the device need to be 100% managed by > >>> the BIOS.
You can still run the cxl_pci driver in that case for mailbox > >>> operation, but error handling must be firmware-first without ACPI0017. > >> Firmware-first or OS-first applies to CXL protocol error handling. For CXL > >> memory error handling, the device generates a DRAM error record, the OS > >> parses such record and act accordingly. According to CXL spec (section > >> 8.2.9.2.1.2 DRAM Event Record), DPA but not HPA is in such record. The OS > >> needs to translate such DPA into HPA to act on. I am taking this as an example > >> to show that OS CXL services is needed. > >> Instead of using ACPI0016 to tell whether the system is under RCH mode, > >> I suppose one way is to check “CXL version” field of CHBS structure in CEDT? > > > > Unless the OS has negotiated CXL _OSC the BIOS owns the event retrieval > > and translating it from DPA to HPA. I do want to add OS CXL services to > > Linux, but only in the case when the BIOS is actively enabling OS native > > address translation which includes populating ACPI0017, CFMWS, and > > devices with the HDM decoder capability registers instead of DVSEC range > > registers. Everything else is early-gen CXL that is 100% BIOS supported, > > similar to DDR where a driver is not expected. > > > It makes sense that the BIOS and OS need to negotiate CXL _OSC so that OS > would take care of address translation. That being said, only DVSEC range > register (but not HDM decoder capability register) is available when the device is in > RCRB mode (section 9.11.8 figure 9-7) attached to a RCH. This type of > configuration needs to be supported with OS CXL service. >

So that figure does have the HDM capability pictured in the RCD upstream port. However, Table 8-22 does seem to indicate that Type 3 D1 devices are not permitted to have an HDM Decoder Capability Structure. That then leaves me confused about figure 9-8, as it shows an HDM decoder capability in the BAR and not the RCRB. Is that picture wrong with respect to what Table 8-22 indicates?
Robert Richter wrote: > In Restricted CXL Device (RCD) mode (formerly referred to as CXL 1.1) > the PCIe enumeration hierarchy is different from CXL VH Enumeration > (formerly referred to as 2.0, for both modes see CXL spec 3.0: 9.11 > and 9.12, [1]). This series adds support for RCD mode. It implements > the detection of Restricted CXL Hosts (RCHs) and its corresponding > Restricted CXL Devices (RCDs). It does the necessary enumeration of > ports and connects the endpoints. With all the plumbing an RCH/RCD > pair is registered at the Linux CXL bus and becomes visible in sysfs > in the same way as CXL VH hosts and devices do already. RCDs are > brought up as CXL endpoints and bound to subsequent drivers such as > cxl_mem. > > For CXL VH the host driver (cxl_acpi) starts host bridge discovery > once the ACPI0017 CXL root device is detected and then searches for > ACPI0016 host bridges to enable CXL. In RCD mode an ACPI0017 device > might not necessarily exist and the host bridge can have a standard > PCIe host bridge PNP0A08 ID, there aren't any CXL port or switches in > the PCIe hierarchy visible. As such the RCD mode enumeration and host > discovery is very different from CXL VH. See patch #5 for > implementation details. > > This implementation expects the host's downstream and upstream port > RCRBs base address being reported by firmware using the optional CEDT > CHBS entry of the host bridge (see CXL spec 3.0, 9.17.1.2). > > RCD mode does not support hot-plug, so host discovery is at boot time > only. > > Patches #1 to #4 are prerequisites of the series with fixes needed and > a rework of debug messages for port enumeration. Those are general > patches and could be applied earlier and independently from the rest > assuming there are no objections with them. Patches #5 to #15 contain > the actual implementation of RCD mode support. 
Hi Robert,

I did not see a response to some of my feedback but wanted to summarize where I think the next version of this set needs to go:

1/ ACPI0017 is mandatory. If a BIOS does not provide ACPI0017 it is explicitly opting the OS out of managing anything other than the CXL.io side of memory expanders.

2/ Per table 8-22 in CXL 3.0 RCDs are not permitted to have HDM decoders so that assumption in the driver needs to be reworked.

3/ It's not even clear that the Register Locator DVSEC has any role to play in an RCD as every register the driver needs should be relative to the RCRB. So the assumptions in the driver need to consider RCRB located registers as a first class citizen.