diff mbox series

[v1,15/29] cxl/region: Use an endpoint's SPA range to find a region

Message ID 20250107141015.3367194-16-rrichter@amd.com
State New
Headers show
Series cxl: Add address translation support and enable AMD Zen5 platforms | expand

Commit Message

Robert Richter Jan. 7, 2025, 2:10 p.m. UTC
To find the correct region and root port of an endpoint of a system
needing address translation, the endpoint's HPA range must be
translated to each of the parent port address ranges up to the root
decoder.

Calculate the SPA range using the newly introduced callback function
port->to_hpa() that translates the decoder's HPA range to its parent
port's HPA range of the next outer memory domain. Introduce the helper
function cxl_port_calc_hpa() for this to calculate address ranges
using the low-level port->to_hpa() callbacks. Determine the root port
SPA range by iterating all the ports up to the root. Store the
endpoint's SPA range and use it to find the endpoint's region.

Signed-off-by: Robert Richter <rrichter@amd.com>
---
 drivers/cxl/core/region.c | 85 ++++++++++++++++++++++++++++++++-------
 drivers/cxl/cxl.h         |  1 +
 2 files changed, 71 insertions(+), 15 deletions(-)

Comments

Gregory Price Jan. 7, 2025, 7:14 p.m. UTC | #1
On Tue, Jan 07, 2025 at 03:10:01PM +0100, Robert Richter wrote:
> To find the correct region and root port of an endpoint of a system
> needing address translation, the endpoint's HPA range must be
> translated to each of the parent port address ranges up to the root
> decoder.
> 
> Calculate the SPA range using the newly introduced callback function
> port->to_hpa() that translates the decoder's HPA range to its parent
> port's HPA range of the next outer memory domain. Introduce the helper
> function cxl_port_calc_hpa() for this to calculate address ranges
> using the low-level port->to_hpa() callbacks. Determine the root port
> SPA range by iterating all the ports up to the root. Store the
> endpoint's SPA range and use it to find the endpoint's region.
>

Some comments/questions inline.

> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
>  drivers/cxl/core/region.c | 85 ++++++++++++++++++++++++++++++++-------
>  drivers/cxl/cxl.h         |  1 +
>  2 files changed, 71 insertions(+), 15 deletions(-)
... snip ...
> +	while (1) {
> +		parent = parent_port_of(iter);
> +

I'm always suspicious of unbounded while loops.  Should this simply be

	while(iter) {
		...
	}

Instead?  Given the rest of the function, I don't think this matters,
but it's at least a bit clearer.

> +		if (is_cxl_endpoint(iter))
> +			cxld = &cxled->cxld;
> +		else if (!parent || parent->to_hpa)
> +			cxld = cxl_port_find_switch_decoder(iter, &hpa);
> +
> +		if (!cxld) {
> +			dev_err(cxlmd->dev.parent,
> +				"%s:%s no CXL window for range %#llx:%#llx\n",
> +				dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
> +				hpa.start, hpa.end);
> +			return -ENXIO;
> +		}

I also think we should be able to move this check out of the loop on
various break conditions (iter==NULL, cxld not found, etc).

> +
> +		/* No parent means the root port was found. */
> +		if (!parent)
> +			break;
> +
> +		/* Translate HPA to the next upper memory domain. */
> +		if (cxl_port_calc_hpa(parent, cxld, &hpa))
> +			return -ENXIO;
> +
> +		iter = parent;
>  	}
>
> +	dev_dbg(cxld->dev.parent,
> +		"%s:%s: range:%#llx-%#llx\n",
> +		dev_name(&cxled->cxld.dev), dev_name(&cxld->dev),
> +		hpa.start, hpa.end);
> +
>  	cxled->cxlrd = to_cxl_root_decoder(&cxld->dev);
> +	cxled->spa_range = hpa;
>  
>  	return 0;
>  }

Overall good, just think we might be able to improve the readability /
safety of this loop a bit.

Stupid question: I presume a look in this iteration is not (generally?)
possible, but if it were to happen this while loop as-is would go infinite.
Is that something we should worry about it?

~Gregory
Jonathan Cameron Jan. 14, 2025, 10:59 a.m. UTC | #2
On Tue, 7 Jan 2025 15:10:01 +0100
Robert Richter <rrichter@amd.com> wrote:

> To find the correct region and root port of an endpoint of a system
> needing address translation, the endpoint's HPA range must be
> translated to each of the parent port address ranges up to the root
> decoder.
> 
> Calculate the SPA range using the newly introduced callback function
> port->to_hpa() that translates the decoder's HPA range to its parent
> port's HPA range of the next outer memory domain. Introduce the helper
> function cxl_port_calc_hpa() for this to calculate address ranges
> using the low-level port->to_hpa() callbacks. Determine the root port
> SPA range by iterating all the ports up to the root. Store the
> endpoint's SPA range and use it to find the endpoint's region.
> 
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
>  drivers/cxl/core/region.c | 85 ++++++++++++++++++++++++++++++++-------
>  drivers/cxl/cxl.h         |  1 +
>  2 files changed, 71 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 09a68e266a79..007a2016760d 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -824,6 +824,41 @@ static int match_free_decoder(struct device *dev, void *data)
>  	return 1;
>  }
>  
> +static int cxl_port_calc_hpa(struct cxl_port *port, struct cxl_decoder *cxld,
> +			     struct range *hpa_range)
> +{
> +	struct range hpa = *hpa_range;
> +	u64 len = range_len(&hpa);
> +
> +	if (!port->to_hpa)
> +		return 0;
> +
> +	/* Translate HPA to the next upper domain. */
> +	hpa.start = port->to_hpa(cxld, hpa.start);
> +	hpa.end = port->to_hpa(cxld, hpa.end);
> +
> +	if (!hpa.start || !hpa.end ||

On general basis, why can't hpa.start be 0?
It is an unusual physical memory map, but technically possible on some
architectures.

> +	    hpa.start == ULLONG_MAX || hpa.end == ULLONG_MAX) {
> +		dev_warn(&port->dev,
> +			"CXL address translation: HPA range invalid: %#llx-%#llx:%#llx-%#llx(%s)\n",
> +			hpa.start, hpa.end, hpa_range->start,
> +			hpa_range->end, dev_name(&cxld->dev));
> +		return -ENXIO;
> +	}
> +
> +	if (range_len(&hpa) != len * cxld->interleave_ways) {
> +		dev_warn(&port->dev,
> +			"CXL address translation: HPA range not contiguous: %#llx-%#llx:%#llx-%#llx(%s)\n",
> +			hpa.start, hpa.end, hpa_range->start,
> +			hpa_range->end, dev_name(&cxld->dev));
> +		return -ENXIO;
> +	}
> +
> +	*hpa_range = hpa;
> +
> +	return 0;
> +}
> +
>  static int match_auto_decoder(struct device *dev, void *data)
>  {
>  	struct cxl_region_params *p = data;
> @@ -3214,26 +3249,47 @@ cxl_port_find_switch_decoder(struct cxl_port *port, struct range *hpa)
>  static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
>  {
>  	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> -	struct cxl_port *iter = cxled_to_port(cxled);
> -	struct range *hpa = &cxled->cxld.hpa_range;
> +	struct cxl_port *parent, *iter = cxled_to_port(cxled);

I'd prefer that spit into two lines. Mixing cases that allocate and ones
that don't isn't great for readability.  Would also reduce the diff a little
which is always nice!

> +	struct range hpa = cxled->cxld.hpa_range;
>  	struct cxl_decoder *cxld = &cxled->cxld;
>  

...
Ben Cheatham Jan. 17, 2025, 9:31 p.m. UTC | #3
On 1/7/25 8:10 AM, Robert Richter wrote:
> To find the correct region and root port of an endpoint of a system
> needing address translation, the endpoint's HPA range must be
> translated to each of the parent port address ranges up to the root
> decoder.
> 
> Calculate the SPA range using the newly introduced callback function
> port->to_hpa() that translates the decoder's HPA range to its parent
> port's HPA range of the next outer memory domain. Introduce the helper
> function cxl_port_calc_hpa() for this to calculate address ranges
> using the low-level port->to_hpa() callbacks. Determine the root port
> SPA range by iterating all the ports up to the root. Store the
> endpoint's SPA range and use it to find the endpoint's region.
> 
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
>  drivers/cxl/core/region.c | 85 ++++++++++++++++++++++++++++++++-------
>  drivers/cxl/cxl.h         |  1 +
>  2 files changed, 71 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 09a68e266a79..007a2016760d 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -824,6 +824,41 @@ static int match_free_decoder(struct device *dev, void *data)
>  	return 1;
>  }
>  
> +static int cxl_port_calc_hpa(struct cxl_port *port, struct cxl_decoder *cxld,
> +			     struct range *hpa_range)
> +{
> +	struct range hpa = *hpa_range;
> +	u64 len = range_len(&hpa);
> +
> +	if (!port->to_hpa)
> +		return 0;
> +
> +	/* Translate HPA to the next upper domain. */
> +	hpa.start = port->to_hpa(cxld, hpa.start);
> +	hpa.end = port->to_hpa(cxld, hpa.end);
> +
> +	if (!hpa.start || !hpa.end ||
> +	    hpa.start == ULLONG_MAX || hpa.end == ULLONG_MAX) {
> +		dev_warn(&port->dev,
> +			"CXL address translation: HPA range invalid: %#llx-%#llx:%#llx-%#llx(%s)\n",
> +			hpa.start, hpa.end, hpa_range->start,
> +			hpa_range->end, dev_name(&cxld->dev));
> +		return -ENXIO;
> +	}
> +
> +	if (range_len(&hpa) != len * cxld->interleave_ways) {
> +		dev_warn(&port->dev,
> +			"CXL address translation: HPA range not contiguous: %#llx-%#llx:%#llx-%#llx(%s)\n",
> +			hpa.start, hpa.end, hpa_range->start,
> +			hpa_range->end, dev_name(&cxld->dev));
> +		return -ENXIO;
> +	}
> +
> +	*hpa_range = hpa;
> +
> +	return 0;
> +}
> +
>  static int match_auto_decoder(struct device *dev, void *data)
>  {
>  	struct cxl_region_params *p = data;
> @@ -3214,26 +3249,47 @@ cxl_port_find_switch_decoder(struct cxl_port *port, struct range *hpa)
>  static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
>  {
>  	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> -	struct cxl_port *iter = cxled_to_port(cxled);
> -	struct range *hpa = &cxled->cxld.hpa_range;
> +	struct cxl_port *parent, *iter = cxled_to_port(cxled);
> +	struct range hpa = cxled->cxld.hpa_range;
>  	struct cxl_decoder *cxld = &cxled->cxld;

Can cut down on the dereferencing a bit here by doing:
	struct cxl_decoder *cxld = &cxled->cxld;
	struct range hpa = cxld->hpa_range;

instead.

>  
> -	while (iter && !is_cxl_root(iter))
> -		iter = to_cxl_port(iter->dev.parent);
> -
> -	if (!iter)
> +	if (!iter || is_cxl_root(iter))
>  		return -ENXIO;
>  
> -	cxld = cxl_port_find_switch_decoder(iter, hpa);
> -	if (!cxld) {
> -		dev_err(cxlmd->dev.parent,
> -			"%s:%s no CXL window for range %#llx:%#llx\n",
> -			dev_name(&cxlmd->dev), dev_name(&cxld->dev),
> -			cxld->hpa_range.start, cxld->hpa_range.end);
> -		return -ENXIO;
> +	while (1) {
> +		parent = parent_port_of(iter);
> +
> +		if (is_cxl_endpoint(iter))
> +			cxld = &cxled->cxld;
> +		else if (!parent || parent->to_hpa)
> +			cxld = cxl_port_find_switch_decoder(iter, &hpa);
> +
> +		if (!cxld) {
> +			dev_err(cxlmd->dev.parent,
> +				"%s:%s no CXL window for range %#llx:%#llx\n",
> +				dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
> +				hpa.start, hpa.end);
> +			return -ENXIO;
> +		}
> +
> +		/* No parent means the root port was found. */
> +		if (!parent)
> +			break;
> +
> +		/* Translate HPA to the next upper memory domain. */
> +		if (cxl_port_calc_hpa(parent, cxld, &hpa))
> +			return -ENXIO;
> +
> +		iter = parent;
>  	}
>  
> +	dev_dbg(cxld->dev.parent,
> +		"%s:%s: range:%#llx-%#llx\n",
> +		dev_name(&cxled->cxld.dev), dev_name(&cxld->dev),
> +		hpa.start, hpa.end);
> +
>  	cxled->cxlrd = to_cxl_root_decoder(&cxld->dev);
> +	cxled->spa_range = hpa;
>  
>  	return 0;
>  }
> @@ -3358,7 +3414,6 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>  
>  static int cxl_endpoint_add(struct cxl_endpoint_decoder *cxled)
>  {
> -	struct range *hpa = &cxled->cxld.hpa_range;
>  	struct cxl_root_decoder *cxlrd = cxled->cxlrd;
>  	struct cxl_region_params *p;
>  	struct cxl_region *cxlr;
> @@ -3370,7 +3425,7 @@ static int cxl_endpoint_add(struct cxl_endpoint_decoder *cxled)
>  	 * one does the construction and the others add to that.
>  	 */
>  	mutex_lock(&cxlrd->range_lock);
> -	cxlr = cxl_find_region_by_range(cxlrd, hpa);
> +	cxlr = cxl_find_region_by_range(cxlrd, &cxled->spa_range);
>  	if (!cxlr)
>  		cxlr = construct_region(cxlrd, cxled);
>  	mutex_unlock(&cxlrd->range_lock);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index c04f66fe2a93..4ccb2b3b31c9 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -419,6 +419,7 @@ struct cxl_endpoint_decoder {
>  	struct cxl_decoder cxld;
>  	struct cxl_root_decoder *cxlrd;
>  	struct resource *dpa_res;
> +	struct range spa_range;
>  	resource_size_t skip;
>  	enum cxl_decoder_mode mode;
>  	enum cxl_decoder_state state;
Robert Richter Jan. 31, 2025, 3:46 p.m. UTC | #4
On 14.01.25 10:59:27, Jonathan Cameron wrote:
> On Tue, 7 Jan 2025 15:10:01 +0100
> Robert Richter <rrichter@amd.com> wrote:

> > +static int cxl_port_calc_hpa(struct cxl_port *port, struct cxl_decoder *cxld,
> > +			     struct range *hpa_range)
> > +{
> > +	struct range hpa = *hpa_range;
> > +	u64 len = range_len(&hpa);
> > +
> > +	if (!port->to_hpa)
> > +		return 0;
> > +
> > +	/* Translate HPA to the next upper domain. */
> > +	hpa.start = port->to_hpa(cxld, hpa.start);
> > +	hpa.end = port->to_hpa(cxld, hpa.end);
> > +
> > +	if (!hpa.start || !hpa.end ||
> 
> On general basis, why can't hpa.start be 0?
> It is an unusual physical memory map, but technically possible on some
> architectures.
> 
> > +	    hpa.start == ULLONG_MAX || hpa.end == ULLONG_MAX) {
> > +		dev_warn(&port->dev,
> > +			"CXL address translation: HPA range invalid: %#llx-%#llx:%#llx-%#llx(%s)\n",
> > +			hpa.start, hpa.end, hpa_range->start,
> > +			hpa_range->end, dev_name(&cxld->dev));
> > +		return -ENXIO;
> > +	}

> > @@ -3214,26 +3249,47 @@ cxl_port_find_switch_decoder(struct cxl_port *port, struct range *hpa)
> >  static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
> >  {
> >  	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> > -	struct cxl_port *iter = cxled_to_port(cxled);
> > -	struct range *hpa = &cxled->cxld.hpa_range;
> > +	struct cxl_port *parent, *iter = cxled_to_port(cxled);
> 
> I'd prefer that spit into two lines. Mixing cases that allocate and ones
> that don't isn't great for readability.  Would also reduce the diff a little
> which is always nice!
> 
> > +	struct range hpa = cxled->cxld.hpa_range;
> >  	struct cxl_decoder *cxld = &cxled->cxld;
> >  

Changed both, thanks.

-Robert
Robert Richter Feb. 5, 2025, 8:48 a.m. UTC | #5
On 07.01.25 14:14:14, Gregory Price wrote:
> On Tue, Jan 07, 2025 at 03:10:01PM +0100, Robert Richter wrote:
> > To find the correct region and root port of an endpoint of a system
> > needing address translation, the endpoint's HPA range must be
> > translated to each of the parent port address ranges up to the root
> > decoder.
> > 
> > Calculate the SPA range using the newly introduced callback function
> > port->to_hpa() that translates the decoder's HPA range to its parent
> > port's HPA range of the next outer memory domain. Introduce the helper
> > function cxl_port_calc_hpa() for this to calculate address ranges
> > using the low-level port->to_hpa() callbacks. Determine the root port
> > SPA range by iterating all the ports up to the root. Store the
> > endpoint's SPA range and use it to find the endpoint's region.
> >
> 
> Some comments/questions inline.
> 
> > Signed-off-by: Robert Richter <rrichter@amd.com>
> > ---
> >  drivers/cxl/core/region.c | 85 ++++++++++++++++++++++++++++++++-------
> >  drivers/cxl/cxl.h         |  1 +
> >  2 files changed, 71 insertions(+), 15 deletions(-)
> ... snip ...
> > +	while (1) {
> > +		parent = parent_port_of(iter);
> > +
> 
> I'm always suspicious of unbounded while loops.  Should this simply be
> 
> 	while(iter) {
> 		...
> 	}
> 
> Instead?  Given the rest of the function, I don't think this matters,
> but it's at least a bit clearer.
> 
> > +		if (is_cxl_endpoint(iter))
> > +			cxld = &cxled->cxld;
> > +		else if (!parent || parent->to_hpa)
> > +			cxld = cxl_port_find_switch_decoder(iter, &hpa);
> > +
> > +		if (!cxld) {
> > +			dev_err(cxlmd->dev.parent,
> > +				"%s:%s no CXL window for range %#llx:%#llx\n",
> > +				dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
> > +				hpa.start, hpa.end);
> > +			return -ENXIO;
> > +		}
> 
> I also think we should be able to move this check out of the loop on
> various break conditions (iter==NULL, cxld not found, etc).
> 
> > +
> > +		/* No parent means the root port was found. */
> > +		if (!parent)
> > +			break;
> > +
> > +		/* Translate HPA to the next upper memory domain. */
> > +		if (cxl_port_calc_hpa(parent, cxld, &hpa))
> > +			return -ENXIO;

We must check for !parent before and it is clearer then to directly
leave the loop.

That is, using

 while(iter) {} 

does not improve the loop because that would introduce duplicate
checks. It is not possible otherwise to break at the beginning or end
of the loop, as there is code to run, either
cxl_port_find_switch_decoder() or cxl_port_calc_hpa().

> > +
> > +		iter = parent;

> >  	}
> >
> > +	dev_dbg(cxld->dev.parent,
> > +		"%s:%s: range:%#llx-%#llx\n",
> > +		dev_name(&cxled->cxld.dev), dev_name(&cxld->dev),
> > +		hpa.start, hpa.end);
> > +
> >  	cxled->cxlrd = to_cxl_root_decoder(&cxld->dev);
> > +	cxled->spa_range = hpa;
> >  
> >  	return 0;
> >  }
> 
> Overall good, just think we might be able to improve the readability /
> safety of this loop a bit.
> 
> Stupid question: I presume a look in this iteration is not (generally?)
> possible, but if it were to happen this while loop as-is would go infinite.
> Is that something we should worry about it?

It is safe, this kind of iterator is used elsewhere too.

-Robert
Robert Richter Feb. 5, 2025, 9 a.m. UTC | #6
On 17.01.25 15:31:43, Ben Cheatham wrote:
> On 1/7/25 8:10 AM, Robert Richter wrote:

> >  static int match_auto_decoder(struct device *dev, void *data)
> >  {
> >  	struct cxl_region_params *p = data;
> > @@ -3214,26 +3249,47 @@ cxl_port_find_switch_decoder(struct cxl_port *port, struct range *hpa)
> >  static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
> >  {
> >  	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> > -	struct cxl_port *iter = cxled_to_port(cxled);
> > -	struct range *hpa = &cxled->cxld.hpa_range;
> > +	struct cxl_port *parent, *iter = cxled_to_port(cxled);
> > +	struct range hpa = cxled->cxld.hpa_range;
> >  	struct cxl_decoder *cxld = &cxled->cxld;
> 
> Can cut down on the dereferencing a bit here by doing:
> 	struct cxl_decoder *cxld = &cxled->cxld;
> 	struct range hpa = cxld->hpa_range;
> 
> instead.

Changed that, thanks.

-Robert
diff mbox series

Patch

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 09a68e266a79..007a2016760d 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -824,6 +824,41 @@  static int match_free_decoder(struct device *dev, void *data)
 	return 1;
 }
 
+static int cxl_port_calc_hpa(struct cxl_port *port, struct cxl_decoder *cxld,
+			     struct range *hpa_range)
+{
+	struct range hpa = *hpa_range;
+	u64 len = range_len(&hpa);
+
+	if (!port->to_hpa)
+		return 0;
+
+	/* Translate HPA to the next upper domain. */
+	hpa.start = port->to_hpa(cxld, hpa.start);
+	hpa.end = port->to_hpa(cxld, hpa.end);
+
+	if (!hpa.start || !hpa.end ||
+	    hpa.start == ULLONG_MAX || hpa.end == ULLONG_MAX) {
+		dev_warn(&port->dev,
+			"CXL address translation: HPA range invalid: %#llx-%#llx:%#llx-%#llx(%s)\n",
+			hpa.start, hpa.end, hpa_range->start,
+			hpa_range->end, dev_name(&cxld->dev));
+		return -ENXIO;
+	}
+
+	if (range_len(&hpa) != len * cxld->interleave_ways) {
+		dev_warn(&port->dev,
+			"CXL address translation: HPA range not contiguous: %#llx-%#llx:%#llx-%#llx(%s)\n",
+			hpa.start, hpa.end, hpa_range->start,
+			hpa_range->end, dev_name(&cxld->dev));
+		return -ENXIO;
+	}
+
+	*hpa_range = hpa;
+
+	return 0;
+}
+
 static int match_auto_decoder(struct device *dev, void *data)
 {
 	struct cxl_region_params *p = data;
@@ -3214,26 +3249,47 @@  cxl_port_find_switch_decoder(struct cxl_port *port, struct range *hpa)
 static int cxl_endpoint_initialize(struct cxl_endpoint_decoder *cxled)
 {
 	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
-	struct cxl_port *iter = cxled_to_port(cxled);
-	struct range *hpa = &cxled->cxld.hpa_range;
+	struct cxl_port *parent, *iter = cxled_to_port(cxled);
+	struct range hpa = cxled->cxld.hpa_range;
 	struct cxl_decoder *cxld = &cxled->cxld;
 
-	while (iter && !is_cxl_root(iter))
-		iter = to_cxl_port(iter->dev.parent);
-
-	if (!iter)
+	if (!iter || is_cxl_root(iter))
 		return -ENXIO;
 
-	cxld = cxl_port_find_switch_decoder(iter, hpa);
-	if (!cxld) {
-		dev_err(cxlmd->dev.parent,
-			"%s:%s no CXL window for range %#llx:%#llx\n",
-			dev_name(&cxlmd->dev), dev_name(&cxld->dev),
-			cxld->hpa_range.start, cxld->hpa_range.end);
-		return -ENXIO;
+	while (1) {
+		parent = parent_port_of(iter);
+
+		if (is_cxl_endpoint(iter))
+			cxld = &cxled->cxld;
+		else if (!parent || parent->to_hpa)
+			cxld = cxl_port_find_switch_decoder(iter, &hpa);
+
+		if (!cxld) {
+			dev_err(cxlmd->dev.parent,
+				"%s:%s no CXL window for range %#llx:%#llx\n",
+				dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
+				hpa.start, hpa.end);
+			return -ENXIO;
+		}
+
+		/* No parent means the root port was found. */
+		if (!parent)
+			break;
+
+		/* Translate HPA to the next upper memory domain. */
+		if (cxl_port_calc_hpa(parent, cxld, &hpa))
+			return -ENXIO;
+
+		iter = parent;
 	}
 
+	dev_dbg(cxld->dev.parent,
+		"%s:%s: range:%#llx-%#llx\n",
+		dev_name(&cxled->cxld.dev), dev_name(&cxld->dev),
+		hpa.start, hpa.end);
+
 	cxled->cxlrd = to_cxl_root_decoder(&cxld->dev);
+	cxled->spa_range = hpa;
 
 	return 0;
 }
@@ -3358,7 +3414,6 @@  static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 
 static int cxl_endpoint_add(struct cxl_endpoint_decoder *cxled)
 {
-	struct range *hpa = &cxled->cxld.hpa_range;
 	struct cxl_root_decoder *cxlrd = cxled->cxlrd;
 	struct cxl_region_params *p;
 	struct cxl_region *cxlr;
@@ -3370,7 +3425,7 @@  static int cxl_endpoint_add(struct cxl_endpoint_decoder *cxled)
 	 * one does the construction and the others add to that.
 	 */
 	mutex_lock(&cxlrd->range_lock);
-	cxlr = cxl_find_region_by_range(cxlrd, hpa);
+	cxlr = cxl_find_region_by_range(cxlrd, &cxled->spa_range);
 	if (!cxlr)
 		cxlr = construct_region(cxlrd, cxled);
 	mutex_unlock(&cxlrd->range_lock);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index c04f66fe2a93..4ccb2b3b31c9 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -419,6 +419,7 @@  struct cxl_endpoint_decoder {
 	struct cxl_decoder cxld;
 	struct cxl_root_decoder *cxlrd;
 	struct resource *dpa_res;
+	struct range spa_range;
 	resource_size_t skip;
 	enum cxl_decoder_mode mode;
 	enum cxl_decoder_state state;