diff mbox series

[v4,2/3] cxl/region: Calculate a target position in a region interleave

Message ID 0ac32c75cf81dd8b86bf07d70ff139d33c2300bc.1698263080.git.alison.schofield@intel.com
State Accepted
Commit 9f3899fd1bb5cf809964e06d86f28fe8b7643a00
Headers show
Series cxl/region: Autodiscovery position repair | expand

Commit Message

Alison Schofield Oct. 25, 2023, 8:01 p.m. UTC
From: Alison Schofield <alison.schofield@intel.com>

Introduce a calculation to find a target's position in a region
interleave. Perform a self-test of the calculation on user-defined
regions.

The region driver uses the kernel sort() function to put region
targets in relative order. Positions are assigned based on each
target's index in that sorted list. That relative sort doesn't
consider the offset of a port into its parent port which causes
some auto-discovered regions to fail creation. In one failure case,
a 2 + 2 config (2 host bridges each with 2 endpoints), the sort
puts all the targets of one port ahead of another port when they
were expected to be interleaved.

In preparation for repairing the autodiscovery region assembly,
introduce a new method for discovering a target position in the
region interleave.

cxl_calc_interleave_pos() adds a method to find the target position by
ascending from an endpoint to a root decoder. The calculation starts
with the endpoint's local position and position in the parent port. It
traverses towards the root decoder and examines both position and ways
in order to allow the position to be refined all the way to the root
decoder.

This calculation: position = position * parent_ways + parent_pos;
applied iteratively yields the correct position.

Include a self-test that exercises this new position calculation against
every successfully configured user-defined region.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
---
 drivers/cxl/core/region.c | 122 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 122 insertions(+)

Comments

Dan Williams Oct. 26, 2023, 3:43 a.m. UTC | #1
alison.schofield@ wrote:
> From: Alison Schofield <alison.schofield@intel.com>
> 
> Introduce a calculation to find a target's position in a region
> interleave. Perform a self-test of the calculation on user-defined
> regions.
> 
> The region driver uses the kernel sort() function to put region
> targets in relative order. Positions are assigned based on each
> target's index in that sorted list. That relative sort doesn't
> consider the offset of a port into its parent port which causes
> some auto-discovered regions to fail creation. In one failure case,
> a 2 + 2 config (2 host bridges each with 2 endpoints), the sort
> puts all the targets of one port ahead of another port when they
> were expected to be interleaved.
> 
> In preparation for repairing the autodiscovery region assembly,
> introduce a new method for discovering a target position in the
> region interleave.
> 
> cxl_calc_interleave_pos() adds a method to find the target position by
> ascending from an endpoint to a root decoder. The calculation starts
> with the endpoint's local position and position in the parent port. It
> traverses towards the root decoder and examines both position and ways
> in order to allow the position to be refined all the way to the root
> decoder.
> 
> This calculation: position = position * parent_ways + parent_pos;
> applied iteratively yields the correct position.
> 
> Include a self-test that exercises this new position calculation against
> every successfully configured user-defined region.
> 
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> ---
[..]
> +/**
> + * cxl_calc_interleave_pos() - calculate an endpoint position in a region
> + * @cxled: the endpoint decoder
> + *
> + * The endpoint position is calculated by traversing from the endpoint to
> + * the root decoder and iteratively applying this calculation:
> + *	position = position * parent_ways + parent_pos;
> + *
> + * For example, the expected interleave order of the 4-way region shown
> + * below is: mem0, mem2, mem1, mem3
> + *
> + *		  root_port
> + *                 /      \
> + *      host_bridge_0    host_bridge_1
> + *        |    |           |    |
> + *       mem0 mem1        mem2 mem3
> + *
> + * In the example the calculator will iterate twice. The first iteration
> + * uses the mem position in the host-bridge and the ways of the host-
> + * bridge to generate the first, or local, position. The second iteration
> + * uses the host-bridge position in the root_port and the ways of the
> + * root_port to refine the position.
> + *
> + * A trace of the calculation per endpoint looks like this:
> + * mem0:	pos = 0 * 2 + 0		mem2:	pos = 0 * 2 + 0
> + *		pos = 0 * 2 + 0			pos = 0 * 2 + 1
> + *		pos: 0				pos: 1
> + *
> + * mem1:	pos = 0 * 2 + 1		mem3:	pos = 0 * 2 + 1
> + *		pos = 1 * 2 + 0			pos = 1 * 2 + 1
> + *		pos: 2				pos = 3
> + *
> + * Note that while this example is simple, the method applies to more
> + * complex topologies, including those with switches.
> + *
> + * Return: position >= 0 on success
> + *	   -ENXIO on failure
> + */
> +
> +static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
> +{

This needed a minor fixup for the RCD case, since RCDs are directly
integrated into the CXL-root with no intervening ports. I went ahead and
rolled this hunk:

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 0fee06660c04..62f1feb1781f 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1575,8 +1575,12 @@ static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
 	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
 	struct range *range = &cxled->cxld.hpa_range;
 	int parent_ways = 0, parent_pos = 0, pos = 0;
+	struct cxl_memdev_state = cxlmd->cxlds;
 	int rc;
 
+	if (cxlds->rcd)
+		goto out;
+
 	/* Iterate from endpoint to root_port refining the position */
 	for (iter = port; iter; iter = next_port(iter)) {
 		if (is_cxl_root(iter))
@@ -1589,6 +1593,7 @@ static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
 		pos = pos * parent_ways + parent_pos;
 	}
 
+out:
 	dev_dbg(&cxlmd->dev,
 		"decoder:%s parent:%s port:%s range:%#llx-%#llx pos:%d\n",
 		dev_name(&cxled->cxld.dev), dev_name(cxlmd->dev.parent),
Dan Williams Oct. 26, 2023, 5:18 a.m. UTC | #2
Dan Williams wrote:
> alison.schofield@ wrote:
> > From: Alison Schofield <alison.schofield@intel.com>
> > 
> > Introduce a calculation to find a target's position in a region
> > interleave. Perform a self-test of the calculation on user-defined
> > regions.
> > 
> > The region driver uses the kernel sort() function to put region
> > targets in relative order. Positions are assigned based on each
> > target's index in that sorted list. That relative sort doesn't
> > consider the offset of a port into its parent port which causes
> > some auto-discovered regions to fail creation. In one failure case,
> > a 2 + 2 config (2 host bridges each with 2 endpoints), the sort
> > puts all the targets of one port ahead of another port when they
> > were expected to be interleaved.
> > 
> > In preparation for repairing the autodiscovery region assembly,
> > introduce a new method for discovering a target position in the
> > region interleave.
> > 
> > cxl_calc_interleave_pos() adds a method to find the target position by
> > ascending from an endpoint to a root decoder. The calculation starts
> > with the endpoint's local position and position in the parent port. It
> > traverses towards the root decoder and examines both position and ways
> > in order to allow the position to be refined all the way to the root
> > decoder.
> > 
> > This calculation: position = position * parent_ways + parent_pos;
> > applied iteratively yields the correct position.
> > 
> > Include a self-test that exercises this new position calculation against
> > every successfully configured user-defined region.
> > 
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> > ---
> [..]
> > +/**
> > + * cxl_calc_interleave_pos() - calculate an endpoint position in a region
> > + * @cxled: the endpoint decoder
> > + *
> > + * The endpoint position is calculated by traversing from the endpoint to
> > + * the root decoder and iteratively applying this calculation:
> > + *	position = position * parent_ways + parent_pos;
> > + *
> > + * For example, the expected interleave order of the 4-way region shown
> > + * below is: mem0, mem2, mem1, mem3
> > + *
> > + *		  root_port
> > + *                 /      \
> > + *      host_bridge_0    host_bridge_1
> > + *        |    |           |    |
> > + *       mem0 mem1        mem2 mem3
> > + *
> > + * In the example the calculator will iterate twice. The first iteration
> > + * uses the mem position in the host-bridge and the ways of the host-
> > + * bridge to generate the first, or local, position. The second iteration
> > + * uses the host-bridge position in the root_port and the ways of the
> > + * root_port to refine the position.
> > + *
> > + * A trace of the calculation per endpoint looks like this:
> > + * mem0:	pos = 0 * 2 + 0		mem2:	pos = 0 * 2 + 0
> > + *		pos = 0 * 2 + 0			pos = 0 * 2 + 1
> > + *		pos: 0				pos: 1
> > + *
> > + * mem1:	pos = 0 * 2 + 1		mem3:	pos = 0 * 2 + 1
> > + *		pos = 1 * 2 + 0			pos = 1 * 2 + 1
> > + *		pos: 2				pos = 3
> > + *
> > + * Note that while this example is simple, the method applies to more
> > + * complex topologies, including those with switches.
> > + *
> > + * Return: position >= 0 on success
> > + *	   -ENXIO on failure
> > + */
> > +
> > +static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
> > +{
> 
> This needed a minor fixup for the RCD case, since RCDs are directly
> integrated into the CXL-root with no intervening ports. I went ahead and
> rolled this hunk:

Nope, that hunk was broken and did not fix the issue, but the below
does. The reason this was triggering on the RCH region test was due to
the fact that cxl_test defines a sub-window size region to
auto-assemble.

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index dbaea89dfa4d..6f8a50bf6201 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1500,6 +1500,8 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
 	cxlsd = to_cxl_switch_decoder(dev);
 	r1 = &cxlsd->cxld.hpa_range;
 
+	if (is_root_decoder(dev))
+		return range_contains(r1, r2);
 	return (r1->start == r2->start && r1->end == r2->end);
 }


I'll fold this into patch1. The root-decoders are a super-set of
switch-decoders and the range they cover is a super-set of the region
range.

The failure mode could intermittently trigger a crash which I need to
debug, but I am fairly certain it was due to causing auto-assembly to
fail in an awkward place:

 general protection fault, probably for non-canonical address 0x5a5a5a5a5a5a5ab2: 0000 [#1] PREEMPT SMP PTI
 CPU: 23 PID: 1468 Comm: cxl Tainted: G           OE    N 6.6.0-rc3+ #1120
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20230524-3.fc38 05/24/2023
 RIP: 0010:to_cxl_port+0x8/0x60 [cxl_core]
 Call Trace:
  <TASK>
  ? die_addr+0x32/0x80
  ? exc_general_protection+0x19b/0x450
  ? asm_exc_general_protection+0x22/0x30
  ? to_cxl_port+0x8/0x60 [cxl_core]
  cxl_region_detach+0x19/0x210 [cxl_core]
  detach_target.part.0+0x29/0x80 [cxl_core]
  unregister_region+0x42/0x70 [cxl_core]
  devres_release_all+0xb8/0x110
  device_unbind_cleanup+0xe/0x70
  device_release_driver_internal+0x1d2/0x210
  unbind_store+0x9d/0xb0
Dan Williams Oct. 27, 2023, 7:39 p.m. UTC | #3
alison.schofield@ wrote:
> From: Alison Schofield <alison.schofield@intel.com>
> 
> Introduce a calculation to find a target's position in a region
> interleave. Perform a self-test of the calculation on user-defined
> regions.
> 
> The region driver uses the kernel sort() function to put region
> targets in relative order. Positions are assigned based on each
> target's index in that sorted list. That relative sort doesn't
> consider the offset of a port into its parent port which causes
> some auto-discovered regions to fail creation. In one failure case,
> a 2 + 2 config (2 host bridges each with 2 endpoints), the sort
> puts all the targets of one port ahead of another port when they
> were expected to be interleaved.
> 
> In preparation for repairing the autodiscovery region assembly,
> introduce a new method for discovering a target position in the
> region interleave.
> 
> cxl_calc_interleave_pos() adds a method to find the target position by
> ascending from an endpoint to a root decoder. The calculation starts
> with the endpoint's local position and position in the parent port. It
> traverses towards the root decoder and examines both position and ways
> in order to allow the position to be refined all the way to the root
> decoder.
> 
> This calculation: position = position * parent_ways + parent_pos;
> applied iteratively yields the correct position.
> 
> Include a self-test that exercises this new position calculation against
> every successfully configured user-defined region.
> 
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
[..]
> +/**
> + * cxl_calc_interleave_pos() - calculate an endpoint position in a region
> + * @cxled: the endpoint decoder
> + *
> + * The endpoint position is calculated by traversing from the endpoint to
> + * the root decoder and iteratively applying this calculation:
> + *	position = position * parent_ways + parent_pos;
> + *
> + * For example, the expected interleave order of the 4-way region shown
> + * below is: mem0, mem2, mem1, mem3
> + *
> + *		  root_port
> + *                 /      \
> + *      host_bridge_0    host_bridge_1
> + *        |    |           |    |
> + *       mem0 mem1        mem2 mem3
> + *
> + * In the example the calculator will iterate twice. The first iteration
> + * uses the mem position in the host-bridge and the ways of the host-
> + * bridge to generate the first, or local, position. The second iteration
> + * uses the host-bridge position in the root_port and the ways of the
> + * root_port to refine the position.
> + *
> + * A trace of the calculation per endpoint looks like this:
> + * mem0:	pos = 0 * 2 + 0		mem2:	pos = 0 * 2 + 0
> + *		pos = 0 * 2 + 0			pos = 0 * 2 + 1
> + *		pos: 0				pos: 1
> + *
> + * mem1:	pos = 0 * 2 + 1		mem3:	pos = 0 * 2 + 1
> + *		pos = 1 * 2 + 0			pos = 1 * 2 + 1
> + *		pos: 2				pos = 3

As 0day reports, kdoc does not like formatted text, so I will fold in
this fixup:

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 9c714b53908d..f6be4164dfbe 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1534,43 +1534,19 @@ static int find_pos_and_ways(struct cxl_port *port, struct range *range,
 
 /**
  * cxl_calc_interleave_pos() - calculate an endpoint position in a region
- * @cxled: the endpoint decoder
+ * @cxled: endpoint decoder member of given region
  *
- * The endpoint position is calculated by traversing from the endpoint to
- * the root decoder and iteratively applying this calculation:
- *	position = position * parent_ways + parent_pos;
+ * The endpoint position is calculated by traversing the topology from
+ * the endpoint to the root decoder and iteratively applying this
+ * calculation:
  *
- * For example, the expected interleave order of the 4-way region shown
- * below is: mem0, mem2, mem1, mem3
+ *    position = position * parent_ways + parent_pos;
  *
- *		  root_port
- *                 /      \
- *      host_bridge_0    host_bridge_1
- *        |    |           |    |
- *       mem0 mem1        mem2 mem3
- *
- * In the example the calculator will iterate twice. The first iteration
- * uses the mem position in the host-bridge and the ways of the host-
- * bridge to generate the first, or local, position. The second iteration
- * uses the host-bridge position in the root_port and the ways of the
- * root_port to refine the position.
- *
- * A trace of the calculation per endpoint looks like this:
- * mem0:	pos = 0 * 2 + 0		mem2:	pos = 0 * 2 + 0
- *		pos = 0 * 2 + 0			pos = 0 * 2 + 1
- *		pos: 0				pos: 1
- *
- * mem1:	pos = 0 * 2 + 1		mem3:	pos = 0 * 2 + 1
- *		pos = 1 * 2 + 0			pos = 1 * 2 + 1
- *		pos: 2				pos = 3
- *
- * Note that while this example is simple, the method applies to more
- * complex topologies, including those with switches.
+ * ...where @position is inferred from switch and root decoder target lists.
  *
  * Return: position >= 0 on success
  *	   -ENXIO on failure
  */
-
 static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
 {
 	struct cxl_port *iter, *port = cxled_to_port(cxled);
@@ -1579,6 +1555,35 @@ static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
 	int parent_ways = 0, parent_pos = 0, pos = 0;
 	int rc;
 
+	/*
+	 * Example: the expected interleave order of the 4-way region shown
+	 * below is: mem0, mem2, mem1, mem3
+	 *
+	 *		  root_port
+	 *                 /      \
+	 *      host_bridge_0    host_bridge_1
+	 *        |    |           |    |
+	 *       mem0 mem1        mem2 mem3
+	 *
+	 * In the example the calculator will iterate twice. The first iteration
+	 * uses the mem position in the host-bridge and the ways of the host-
+	 * bridge to generate the first, or local, position. The second
+	 * iteration uses the host-bridge position in the root_port and the ways
+	 * of the root_port to refine the position.
+	 *
+	 * A trace of the calculation per endpoint looks like this:
+	 * mem0: pos = 0 * 2 + 0    mem2: pos = 0 * 2 + 0
+	 *       pos = 0 * 2 + 0          pos = 0 * 2 + 1
+	 *       pos: 0                   pos: 1
+	 *
+	 * mem1: pos = 0 * 2 + 1    mem3: pos = 0 * 2 + 1
+	 *       pos = 1 * 2 + 0          pos = 1 * 2 + 1
+	 *       pos: 2                   pos = 3
+	 *
+	 * Note that while this example is simple, the method applies to more
+	 * complex topologies, including those with switches.
+	 */
+
 	/* Iterate from endpoint to root_port refining the position */
 	for (iter = port; iter; iter = next_port(iter)) {
 		if (is_cxl_root(iter))
diff mbox series

Patch

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index eea8e89a6860..481aab3234bf 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1501,6 +1501,108 @@  static int match_switch_decoder_by_range(struct device *dev, void *data)
 	return (r1->start == r2->start && r1->end == r2->end);
 }
 
+static int find_pos_and_ways(struct cxl_port *port, struct range *range,
+			     int *pos, int *ways)
+{
+	struct cxl_switch_decoder *cxlsd;
+	struct cxl_port *parent;
+	struct device *dev;
+	int rc = -ENXIO;
+
+	parent = next_port(port);
+	if (!parent)
+		return rc;
+
+	dev = device_find_child(&parent->dev, range,
+				match_switch_decoder_by_range);
+	if (!dev) {
+		dev_err(port->uport_dev,
+			"failed to find decoder mapping %#llx-%#llx\n",
+			range->start, range->end);
+		return rc;
+	}
+	cxlsd = to_cxl_switch_decoder(dev);
+	*ways = cxlsd->cxld.interleave_ways;
+
+	for (int i = 0; i < *ways; i++) {
+		if (cxlsd->target[i] == port->parent_dport) {
+			*pos = i;
+			rc = 0;
+			break;
+		}
+	}
+	put_device(dev);
+
+	return rc;
+}
+
+/**
+ * cxl_calc_interleave_pos() - calculate an endpoint position in a region
+ * @cxled: the endpoint decoder
+ *
+ * The endpoint position is calculated by traversing from the endpoint to
+ * the root decoder and iteratively applying this calculation:
+ *	position = position * parent_ways + parent_pos;
+ *
+ * For example, the expected interleave order of the 4-way region shown
+ * below is: mem0, mem2, mem1, mem3
+ *
+ *		  root_port
+ *                 /      \
+ *      host_bridge_0    host_bridge_1
+ *        |    |           |    |
+ *       mem0 mem1        mem2 mem3
+ *
+ * In the example the calculator will iterate twice. The first iteration
+ * uses the mem position in the host-bridge and the ways of the host-
+ * bridge to generate the first, or local, position. The second iteration
+ * uses the host-bridge position in the root_port and the ways of the
+ * root_port to refine the position.
+ *
+ * A trace of the calculation per endpoint looks like this:
+ * mem0:	pos = 0 * 2 + 0		mem2:	pos = 0 * 2 + 0
+ *		pos = 0 * 2 + 0			pos = 0 * 2 + 1
+ *		pos: 0				pos: 1
+ *
+ * mem1:	pos = 0 * 2 + 1		mem3:	pos = 0 * 2 + 1
+ *		pos = 1 * 2 + 0			pos = 1 * 2 + 1
+ *		pos: 2				pos = 3
+ *
+ * Note that while this example is simple, the method applies to more
+ * complex topologies, including those with switches.
+ *
+ * Return: position >= 0 on success
+ *	   -ENXIO on failure
+ */
+
+static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
+{
+	struct cxl_port *iter, *port = cxled_to_port(cxled);
+	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
+	struct range *range = &cxled->cxld.hpa_range;
+	int parent_ways = 0, parent_pos = 0, pos = 0;
+	int rc;
+
+	/* Iterate from endpoint to root_port refining the position */
+	for (iter = port; iter; iter = next_port(iter)) {
+		if (is_cxl_root(iter))
+			break;
+
+		rc = find_pos_and_ways(iter, range, &parent_pos, &parent_ways);
+		if (rc)
+			return rc;
+
+		pos = pos * parent_ways + parent_pos;
+	}
+
+	dev_dbg(&cxlmd->dev,
+		"decoder:%s parent:%s port:%s range:%#llx-%#llx pos:%d\n",
+		dev_name(&cxled->cxld.dev), dev_name(cxlmd->dev.parent),
+		dev_name(&port->dev), range->start, range->end, pos);
+
+	return pos;
+}
+
 static void find_positions(const struct cxl_switch_decoder *cxlsd,
 			   const struct cxl_port *iter_a,
 			   const struct cxl_port *iter_b, int *a_pos,
@@ -1764,6 +1866,26 @@  static int cxl_region_attach(struct cxl_region *cxlr,
 		.end = p->res->end,
 	};
 
+	if (p->nr_targets != p->interleave_ways)
+		return 0;
+
+	/*
+	 * Test the auto-discovery position calculator function
+	 * against this successfully created user-defined region.
+	 * A fail message here means that this interleave config
+	 * will fail when presented as CXL_REGION_F_AUTO.
+	 */
+	for (int i = 0; i < p->nr_targets; i++) {
+		struct cxl_endpoint_decoder *cxled = p->targets[i];
+		int test_pos;
+
+		test_pos = cxl_calc_interleave_pos(cxled);
+		dev_dbg(&cxled->cxld.dev,
+			"Test cxl_calc_interleave_pos(): %s test_pos:%d cxled->pos:%d\n",
+			(test_pos == cxled->pos) ? "success" : "fail",
+			test_pos, cxled->pos);
+	}
+
 	return 0;
 
 err_decrement: