mbox series

[v4,00/12] cxl: Add support for Restricted CXL hosts (RCD mode)

Message ID 166931487492.2104015.15204324083515120776.stgit@dwillia2-xfh.jf.intel.com
Headers show
Series cxl: Add support for Restricted CXL hosts (RCD mode) | expand

Message

Dan Williams Nov. 24, 2022, 6:34 p.m. UTC
Changes since v3 [1]:
- Rework / simplify CXL to LIBNVDIMM coordination to remove a
  flush_work() locking dependency from underneath the root device lock.
- Move the root device rescan to a workqueue
- Connect RCDs directly as endpoints reachable through a CXL host bridge
  as a dport, i.e. drop the extra dport indirection from v3
- Add unit test infrastructure for an RCD configuration

[1]: http://lore.kernel.org/r/20221109104059.766720-1-rrichter@amd.com/

---

>From [PATCH v4 10/12] cxl/port: Add RCD endpoint port enumeration

Unlike a CXL memory expander in a VH topology that has at least one
intervening 'struct cxl_port' instance between itself and the CXL root
device, an RCD attaches one-level higher. For example:

               VH
          ┌──────────┐
          │ ACPI0017 │
          │  root0   │
          └─────┬────┘
                │
          ┌─────┴────┐
          │  dport0  │
    ┌─────┤ ACPI0016 ├─────┐
    │     │  port1   │     │
    │     └────┬─────┘     │
    │          │           │
 ┌──┴───┐   ┌──┴───┐   ┌───┴──┐
 │dport0│   │dport1│   │dport2│
 │ RP0  │   │ RP1  │   │ RP2  │
 └──────┘   └──┬───┘   └──────┘
               │
           ┌───┴─────┐
           │endpoint0│
           │  port2  │
           └─────────┘

...vs:

              RCH
          ┌──────────┐
          │ ACPI0017 │
          │  root0   │
          └────┬─────┘
               │
           ┌───┴────┐
           │ dport0 │
           │ACPI0016│
           └───┬────┘
               │
          ┌────┴─────┐
          │endpoint0 │
          │  port1   │
          └──────────┘

So arrange for endpoint port in the RCH/RCD case to appear directly
connected to the host-bridge in its singular role as a dport. Compare
that to the VH case where the host-bridge serves a dual role as a
'cxl_dport' for the CXL root device *and* a 'cxl_port' upstream port for
the Root Ports in the Root Complex that are modeled as 'cxl_dport'
instances in the CXL topology.

Another deviation from the VH case is that RCDs may need to look up
their component registers from the Root Complex Register Block (RCRB).
That platform firmware specified RCRB area is cached by the cxl_acpi
driver and conveyed via the host-bridge dport to the cxl_mem driver to
perform the cxl_rcrb_to_component() lookup for the endpoint port
(See 9.11.8 CXL Devices Attached to an RCH for the lookup of the
upstream port component registers).

---

Dan Williams (9):
      cxl/acpi: Simplify cxl_nvdimm_bridge probing
      cxl/region: Drop redundant pmem region release handling
      cxl/pmem: Refactor nvdimm device registration, delete the workqueue
      cxl/pmem: Remove the cxl_pmem_wq and related infrastructure
      cxl/acpi: Move rescan to the workqueue
      tools/testing/cxl: Make mock CEDT parsing more robust
      cxl/mem: Move devm_cxl_add_endpoint() from cxl_core to cxl_mem
      cxl/port: Add RCD endpoint port enumeration
      tools/testing/cxl: Add an RCH topology

Robert Richter (2):
      cxl/ACPI: Register CXL host ports by bridge device
      cxl/acpi: Extract component registers of restricted hosts from RCRB

Terry Bowman (1):
      cxl/acpi: Set ACPI's CXL _OSC to indicate CXL1.1 support


 drivers/acpi/pci_root.c       |    1 
 drivers/cxl/acpi.c            |  105 +++++++++---
 drivers/cxl/core/core.h       |    8 -
 drivers/cxl/core/pmem.c       |   94 +++++++----
 drivers/cxl/core/port.c       |  111 +++++++------
 drivers/cxl/core/region.c     |   54 ++++++
 drivers/cxl/core/regs.c       |   56 +++++++
 drivers/cxl/cxl.h             |   46 +++--
 drivers/cxl/cxlmem.h          |   15 ++
 drivers/cxl/mem.c             |   72 ++++++++
 drivers/cxl/pci.c             |   13 +-
 drivers/cxl/pmem.c            |  351 +++++------------------------------------
 tools/testing/cxl/Kbuild      |    1 
 tools/testing/cxl/test/cxl.c  |  241 ++++++++++++++++++++++------
 tools/testing/cxl/test/mem.c  |   40 ++++-
 tools/testing/cxl/test/mock.c |   19 ++
 tools/testing/cxl/test/mock.h |    3 
 17 files changed, 712 insertions(+), 518 deletions(-)

base-commit: 3b39fd6cf12ceda2a2582dcb9b9ee9f4d197b857

Comments

Robert Richter Nov. 29, 2022, 9:26 p.m. UTC | #1
Dan,

On 24.11.22 10:34:35, Dan Williams wrote:
> Changes since v3 [1]:
> - Rework / simplify CXL to LIBNVDIMM coordination to remove a
>   flush_work() locking dependency from underneath the root device lock.
> - Move the root device rescan to a workqueue
> - Connect RCDs directly as endpoints reachable through a CXL host bridge
>   as a dport, i.e. drop the extra dport indirection from v3
> - Add unit test infrastructure for an RCD configuration

thank you for this posting.

Patches #1-#6 are not really prerequisites (except for a trivial
conflict), right? I only reviewed them starting with #6.

Thanks,

-Robert
Robert Richter Nov. 30, 2022, 5:14 p.m. UTC | #2
On 24.11.22 10:34:35, Dan Williams wrote:
> Changes since v3 [1]:
> - Rework / simplify CXL to LIBNVDIMM coordination to remove a
>   flush_work() locking dependency from underneath the root device lock.
> - Move the root device rescan to a workqueue
> - Connect RCDs directly as endpoints reachable through a CXL host bridge
>   as a dport, i.e. drop the extra dport indirection from v3
> - Add unit test infrastructure for an RCD configuration
> 
> [1]: http://lore.kernel.org/r/20221109104059.766720-1-rrichter@amd.com/
> 
> ---
> 
> >From [PATCH v4 10/12] cxl/port: Add RCD endpoint port enumeration

> ---
> 
> Dan Williams (9):
>       cxl/acpi: Simplify cxl_nvdimm_bridge probing
>       cxl/region: Drop redundant pmem region release handling
>       cxl/pmem: Refactor nvdimm device registration, delete the workqueue
>       cxl/pmem: Remove the cxl_pmem_wq and related infrastructure
>       cxl/acpi: Move rescan to the workqueue
>       tools/testing/cxl: Make mock CEDT parsing more robust
>       cxl/mem: Move devm_cxl_add_endpoint() from cxl_core to cxl_mem
>       cxl/port: Add RCD endpoint port enumeration
>       tools/testing/cxl: Add an RCH topology
> 
> Robert Richter (2):
>       cxl/ACPI: Register CXL host ports by bridge device
>       cxl/acpi: Extract component registers of restricted hosts from RCRB
> 
> Terry Bowman (1):
>       cxl/acpi: Set ACPI's CXL _OSC to indicate CXL1.1 support
> 
> 
>  drivers/acpi/pci_root.c       |    1 
>  drivers/cxl/acpi.c            |  105 +++++++++---
>  drivers/cxl/core/core.h       |    8 -
>  drivers/cxl/core/pmem.c       |   94 +++++++----
>  drivers/cxl/core/port.c       |  111 +++++++------
>  drivers/cxl/core/region.c     |   54 ++++++
>  drivers/cxl/core/regs.c       |   56 +++++++
>  drivers/cxl/cxl.h             |   46 +++--
>  drivers/cxl/cxlmem.h          |   15 ++
>  drivers/cxl/mem.c             |   72 ++++++++
>  drivers/cxl/pci.c             |   13 +-
>  drivers/cxl/pmem.c            |  351 +++++------------------------------------
>  tools/testing/cxl/Kbuild      |    1 
>  tools/testing/cxl/test/cxl.c  |  241 ++++++++++++++++++++++------
>  tools/testing/cxl/test/mem.c  |   40 ++++-
>  tools/testing/cxl/test/mock.c |   19 ++
>  tools/testing/cxl/test/mock.h |    3 
>  17 files changed, 712 insertions(+), 518 deletions(-)

I have tested this series and the enumeration is as expected (see
sysfs dump below). I see an HDM failure which I am investigating, but
that seems unrelated to this series.

You can add to this series my:

Tested-by: Robert Richter <rrichter@amd.com>

The drawback of this topology decision is that the endpoint's port is
a child of root0, which is the ACPI0017's device (not the CXL host
bridge):

 /sys/bus/cxl/devices/root0/endpoint1

I understand this is the CXL's RCD view on the pcie hierarchy here.
Logically there would be a cxl host port in between, like this:

 /sys/bus/cxl/devices/root0/port1/endpoint2

Esp. if there are multiple hosts in the system the port relations will
be lost and can only be determined using the pci dev's parent bridge.

port1 could then also hold the uport's component regs with the
upstream port capabilities to access e.g. status regs for RAS.

Anyway, let's see how this approach flies.

Thanks for looing into this.

-Robert

---

# find /sys/bus/cxl/devices/ -ls
   265293      0 drwxr-xr-x   2 root     root            0 Nov 30 14:18 /sys/bus/cxl/devices/
   265493      0 lrwxrwxrwx   1 root     root            0 Nov 30 14:21 /sys/bus/cxl/devices/root0 -> ../../../devices/platform/ACPI0017:00/root0
   265664      0 lrwxrwxrwx   1 root     root            0 Nov 30 14:18 /sys/bus/cxl/devices/endpoint1 -> ../../../devices/platform/ACPI0017:00/root0/endpoint1
   265584      0 lrwxrwxrwx   1 root     root            0 Nov 30 14:21 /sys/bus/cxl/devices/mem0 -> ../../../devices/pci0000:7f/0000:7f:00.0/mem0
# find /sys/bus/cxl/devices/*/ -ls 
   265660      0 drwxr-xr-x   2 root     root            0 Nov 30 14:18 /sys/bus/cxl/devices/endpoint1/
   265661      0 -rw-r--r--   1 root     root         4096 Nov 30 14:18 /sys/bus/cxl/devices/endpoint1/uevent
   265667      0 lrwxrwxrwx   1 root     root            0 Nov 30 14:18 /sys/bus/cxl/devices/endpoint1/driver -> ../../../../../bus/cxl/drivers/cxl_port
   265669      0 lrwxrwxrwx   1 root     root            0 Nov 30 14:21 /sys/bus/cxl/devices/endpoint1/uport -> ../../../../pci0000:7f/0000:7f:00.0/mem0
   265668      0 -r--------   1 root     root            0 Nov 30 14:21 /sys/bus/cxl/devices/endpoint1/CDAT
   265665      0 lrwxrwxrwx   1 root     root            0 Nov 30 14:18 /sys/bus/cxl/devices/endpoint1/subsystem -> ../../../../../bus/cxl
   265662      0 -r--r--r--   1 root     root         4096 Nov 30 14:21 /sys/bus/cxl/devices/endpoint1/devtype
   265663      0 -r--r--r--   1 root     root         4096 Nov 30 14:21 /sys/bus/cxl/devices/endpoint1/modalias
   265573      0 drwxr-xr-x   4 root     root            0 Nov 30 14:18 /sys/bus/cxl/devices/mem0/
   265574      0 -rw-r--r--   1 root     root         4096 Nov 30 14:18 /sys/bus/cxl/devices/mem0/uevent
   265578      0 -r--r--r--   1 root     root         4096 Nov 30 14:21 /sys/bus/cxl/devices/mem0/label_storage_size
   265582      0 drwxr-xr-x   2 root     root            0 Nov 30 14:21 /sys/bus/cxl/devices/mem0/pmem
   265583      0 -r--r--r--   1 root     root         4096 Nov 30 14:21 /sys/bus/cxl/devices/mem0/pmem/size
   265576      0 -r--r--r--   1 root     root         4096 Nov 30 14:21 /sys/bus/cxl/devices/mem0/firmware_version
   265579      0 -r--r--r--   1 root     root         4096 Nov 30 14:21 /sys/bus/cxl/devices/mem0/numa_node
   265586      0 -r--r--r--   1 root     root         4096 Nov 30 14:21 /sys/bus/cxl/devices/mem0/dev
   265659      0 lrwxrwxrwx   1 root     root            0 Nov 30 14:21 /sys/bus/cxl/devices/mem0/driver -> ../../../../bus/cxl/drivers/cxl_mem
   265580      0 drwxr-xr-x   2 root     root            0 Nov 30 14:21 /sys/bus/cxl/devices/mem0/ram
   265581      0 -r--r--r--   1 root     root         4096 Nov 30 14:21 /sys/bus/cxl/devices/mem0/ram/size
   265585      0 lrwxrwxrwx   1 root     root            0 Nov 30 14:18 /sys/bus/cxl/devices/mem0/subsystem -> ../../../../bus/cxl
   265575      0 -r--r--r--   1 root     root         4096 Nov 30 14:21 /sys/bus/cxl/devices/mem0/serial
   265577      0 -r--r--r--   1 root     root         4096 Nov 30 14:21 /sys/bus/cxl/devices/mem0/payload_max
   265489      0 drwxr-xr-x   3 root     root            0 Nov 30 14:18 /sys/bus/cxl/devices/root0/
   265490      0 -rw-r--r--   1 root     root         4096 Nov 30 14:18 /sys/bus/cxl/devices/root0/uevent
   265660      0 drwxr-xr-x   2 root     root            0 Nov 30 14:18 /sys/bus/cxl/devices/root0/endpoint1
   265661      0 -rw-r--r--   1 root     root         4096 Nov 30 14:18 /sys/bus/cxl/devices/root0/endpoint1/uevent
   265667      0 lrwxrwxrwx   1 root     root            0 Nov 30 14:18 /sys/bus/cxl/devices/root0/endpoint1/driver -> ../../../../../bus/cxl/drivers/cxl_port
   265669      0 lrwxrwxrwx   1 root     root            0 Nov 30 14:21 /sys/bus/cxl/devices/root0/endpoint1/uport -> ../../../../pci0000:7f/0000:7f:00.0/mem0
   265668      0 -r--------   1 root     root            0 Nov 30 14:21 /sys/bus/cxl/devices/root0/endpoint1/CDAT
   265665      0 lrwxrwxrwx   1 root     root            0 Nov 30 14:18 /sys/bus/cxl/devices/root0/endpoint1/subsystem -> ../../../../../bus/cxl
   265662      0 -r--r--r--   1 root     root         4096 Nov 30 14:21 /sys/bus/cxl/devices/root0/endpoint1/devtype
   265663      0 -r--r--r--   1 root     root         4096 Nov 30 14:21 /sys/bus/cxl/devices/root0/endpoint1/modalias
   265495      0 lrwxrwxrwx   1 root     root            0 Nov 30 14:21 /sys/bus/cxl/devices/root0/uport -> ../../ACPI0017:00
   265496      0 lrwxrwxrwx   1 root     root            0 Nov 30 14:21 /sys/bus/cxl/devices/root0/dport4 -> ../../../pci0000:7f
   265494      0 lrwxrwxrwx   1 root     root            0 Nov 30 14:18 /sys/bus/cxl/devices/root0/subsystem -> ../../../../bus/cxl
   265491      0 -r--r--r--   1 root     root         4096 Nov 30 14:21 /sys/bus/cxl/devices/root0/devtype
   265492      0 -r--r--r--   1 root     root         4096 Nov 30 14:21 /sys/bus/cxl/devices/root0/modalias
Dan Williams Nov. 30, 2022, 8:59 p.m. UTC | #3
Robert Richter wrote:
> Dan,
> 
> On 24.11.22 10:34:35, Dan Williams wrote:
> > Changes since v3 [1]:
> > - Rework / simplify CXL to LIBNVDIMM coordination to remove a
> >   flush_work() locking dependency from underneath the root device lock.
> > - Move the root device rescan to a workqueue
> > - Connect RCDs directly as endpoints reachable through a CXL host bridge
> >   as a dport, i.e. drop the extra dport indirection from v3
> > - Add unit test infrastructure for an RCD configuration
> 
> thank you for this posting.
> 
> Patches #1-#6 are not really prerequisites (except for a trivial
> conflict), right? I only reviewed them starting with #6.

In fact they are pre-requisites because of this hunk in:

[PATCH v4 10/12] cxl/port: Add RCD endpoint port enumeration

http://lore.kernel.org/r/166931493266.2104015.8062923429837042172.stgit@dwillia2-xfh.jf.intel.com/

@@ -119,17 +131,22 @@ static int cxl_mem_probe(struct device *dev)
                return -ENXIO;
        }
 
-       device_lock(&parent_port->dev);
-       if (!parent_port->dev.driver) {
+       if (dport->rch)
+               endpoint_parent = parent_port->uport;
+       else
+               endpoint_parent = &parent_port->dev;
+
+       device_lock(endpoint_parent);
+       if (!endpoint_parent->driver) {
                dev_err(dev, "CXL port topology %s not enabled\n",
                        dev_name(&parent_port->dev));
                rc = -ENXIO;
                goto unlock;
        }
 
-       rc = devm_cxl_add_endpoint(cxlmd, dport);
+       rc = devm_cxl_add_endpoint(endpoint_parent, cxlmd, dport);
 unlock:
-       device_unlock(&parent_port->dev);
+       device_unlock(endpoint_parent);
        put_device(&parent_port->dev);
        if (rc)
                return rc;

That device_lock(endpoint_parent) in the RCH case locks the ACPI0017
device so that a devm action can be serialized against ACPI0017 being
unbound from its driver. Before RCH support this path only locked
cxl_port objects to syncrhonize with the cxl_port driver. In the RCH
case the port that needs to be locked while the endpoint is attached is
the one owned by the cxl_acpi driver.

This all happens while holding device_lock(&cxlmd->dev). It means
lockdep complains about cxl_bus_rescan() being done under
device_lock(&ACPI0017->dev), since rescan may take
device_lock(&cxlmd->dev), and it complains about the flush_work() in
unregister_nvb() for a similar entanglement. So the initial patches in
this series are all about making it possible setup an unregistration
chain when the root CXL device goes through ->remove().
Robert Richter Dec. 1, 2022, 10:59 p.m. UTC | #4
On 30.11.22 12:59:28, Dan Williams wrote:
> Robert Richter wrote:
> > Dan,
> > 
> > On 24.11.22 10:34:35, Dan Williams wrote:
> > > Changes since v3 [1]:
> > > - Rework / simplify CXL to LIBNVDIMM coordination to remove a
> > >   flush_work() locking dependency from underneath the root device lock.
> > > - Move the root device rescan to a workqueue
> > > - Connect RCDs directly as endpoints reachable through a CXL host bridge
> > >   as a dport, i.e. drop the extra dport indirection from v3
> > > - Add unit test infrastructure for an RCD configuration
> > 
> > thank you for this posting.
> > 
> > Patches #1-#6 are not really prerequisites (except for a trivial
> > conflict), right? I only reviewed them starting with #6.
> 
> In fact they are pre-requisites because of this hunk in:

Thanks for the explanation of using device_lock() here.

-Robert