diff mbox series

[3/5] cxl/acpi: Minimize granularity for x1 interleaves

Message ID 165853776917.2430596.16823264262010844458.stgit@dwillia2-xfh.jf.intel.com
State Superseded
Headers show
Series CXL Region Provisioning Fixes | expand

Commit Message

Dan Williams July 23, 2022, 12:56 a.m. UTC
The kernel enforces that region granularity is >= to the top-level
interleave-granularity for the given CXL window. However, when the CXL
window interleave is x1, i.e. non-interleaved at the host bridge level,
then the specified granularity does not matter. Override the window
specified granularity to the CXL minimum so that any valid region
granularity is >= to the root granularity.

Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/acpi.c |    6 ++++++
 1 file changed, 6 insertions(+)

Comments

Alison Schofield Aug. 1, 2022, 7:35 p.m. UTC | #1
On Fri, Jul 22, 2022 at 05:56:09PM -0700, Dan Williams wrote:
> The kernel enforces that region granularity is >= to the top-level
> interleave-granularity for the given CXL window. However, when the CXL
> window interleave is x1, i.e. non-interleaved at the host bridge level,
> then the specified granularity does not matter. Override the window
> specified granularity to the CXL minimum so that any valid region
> granularity is >= to the root granularity.
> 
> Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Alison Schofield <alison.schofield@intel.com>

> ---
>  drivers/cxl/acpi.c |    6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index eb436268b92c..67137e17b8c9 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -140,6 +140,12 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
>  		.end = res->end,
>  	};
>  	cxld->interleave_ways = ways;
> +	/*
> +	 * Minimize the x1 granularity to advertise support for any
> +	 * valid region granularity
> +	 */
> +	if (ways == 1)
> +		ig = 256;
>  	cxld->interleave_granularity = ig;
>  
>  	rc = cxl_decoder_add(cxld, target_map);
>
Verma, Vishal L Aug. 1, 2022, 7:45 p.m. UTC | #2
On Fri, 2022-07-22 at 17:56 -0700, Dan Williams wrote:
> The kernel enforces that region granularity is >= to the top-level
> interleave-granularity for the given CXL window. However, when the CXL
> window interleave is x1, i.e. non-interleaved at the host bridge level,
> then the specified granularity does not matter. Override the window
> specified granularity to the CXL minimum so that any valid region
> granularity is >= to the root granularity.
> 
> Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/acpi.c |    6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index eb436268b92c..67137e17b8c9 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -140,6 +140,12 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
>                 .end = res->end,
>         };
>         cxld->interleave_ways = ways;
> +       /*
> +        * Minimize the x1 granularity to advertise support for any
> +        * valid region granularity
> +        */
> +       if (ways == 1)
> +               ig = 256;

Probably not super critical, but should this be a #define somewhere?

Regardless,

Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>

>         cxld->interleave_granularity = ig;
>  
>         rc = cxl_decoder_add(cxld, target_map);
>
Dan Williams Aug. 1, 2022, 9:34 p.m. UTC | #3
Verma, Vishal L wrote:
> On Fri, 2022-07-22 at 17:56 -0700, Dan Williams wrote:
> > The kernel enforces that region granularity is >= to the top-level
> > interleave-granularity for the given CXL window. However, when the CXL
> > window interleave is x1, i.e. non-interleaved at the host bridge level,
> > then the specified granularity does not matter. Override the window
> > specified granularity to the CXL minimum so that any valid region
> > granularity is >= to the root granularity.
> > 
> > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > ---
> >  drivers/cxl/acpi.c |    6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > index eb436268b92c..67137e17b8c9 100644
> > --- a/drivers/cxl/acpi.c
> > +++ b/drivers/cxl/acpi.c
> > @@ -140,6 +140,12 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
> >                 .end = res->end,
> >         };
> >         cxld->interleave_ways = ways;
> > +       /*
> > +        * Minimize the x1 granularity to advertise support for any
> > +        * valid region granularity
> > +        */
> > +       if (ways == 1)
> > +               ig = 256;
> 
> Probably not super critical, but should this be a #define somewhere?

Yeah, something like CXL_MIN_GRANULARITY, will add.

> 
> Regardless,
> 
> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
> 
> >         cxld->interleave_granularity = ig;
> >  
> >         rc = cxl_decoder_add(cxld, target_map);
> > 
>
Jonathan Cameron Aug. 2, 2022, 3:56 p.m. UTC | #4
On Fri, 22 Jul 2022 17:56:09 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> The kernel enforces that region granularity is >= to the top-level
> interleave-granularity for the given CXL window. However, when the CXL
> window interleave is x1, i.e. non-interleaved at the host bridge level,
> then the specified granularity does not matter. Override the window
> specified granularity to the CXL minimum so that any valid region
> granularity is >= to the root granularity.
> 
> Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Hi Dan,

Debugging exactly why this is failing (from cxl.git/preview) for my test setup...
(1 hb, 8 rp, 8 direct connected devices)

If I set the interleave granularity of a region to 256, I end
up with 256 for the CFMWS which is fine, then 512 for the HB which
is not - EP interleave granularity is expected 256.

https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/tree/drivers/cxl/core/region.c?h=preview#n1070

Calculates the eig as address_bit - eiw + 1

iw = 8
eiw = 3
peig = 0 (pig = 256)
peiw = 0 (piw = 1)
(all as expected I think...)

So address_bit = s max(peig + peiw, eiw + peig) = max(0, 3)
and eig = 3 - 3 + 1 = 1 (ig = 512) which is wrong.

I'm not 100% sure on the logic behind this maths, but would expect eig = 0 as the output for this
setup..

Even with this hacked, qemu address decode is landing at wrong address in the backing files (but it
is at least landing in the right file!)
Curiously interleave ways = 1 for the EPs which is obviously wrong. (I'm not convinced the
qemu address logic is right but it'll never work with that value).  I'm struggling to figure
out where we actually set the interleave ways for an EP.

Also I'm not having much luck requesting a larger interleave granularity for the region (desirable perhaps
because the devices give better performance with 1024 byte sequential reads)

Clearly going to be one of those bugs all the way down days. 

Thanks,

Jonathan


> ---
>  drivers/cxl/acpi.c |    6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index eb436268b92c..67137e17b8c9 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -140,6 +140,12 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
>  		.end = res->end,
>  	};
>  	cxld->interleave_ways = ways;
> +	/*
> +	 * Minimize the x1 granularity to advertise support for any
> +	 * valid region granularity
> +	 */
> +	if (ways == 1)
> +		ig = 256;
>  	cxld->interleave_granularity = ig;
>  
>  	rc = cxl_decoder_add(cxld, target_map);
> 
>
Jonathan Cameron Aug. 2, 2022, 4:52 p.m. UTC | #5
On Tue, 2 Aug 2022 16:56:27 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Fri, 22 Jul 2022 17:56:09 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > The kernel enforces that region granularity is >= to the top-level
> > interleave-granularity for the given CXL window. However, when the CXL
> > window interleave is x1, i.e. non-interleaved at the host bridge level,
> > then the specified granularity does not matter. Override the window
> > specified granularity to the CXL minimum so that any valid region
> > granularity is >= to the root granularity.
> > 
> > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>  
> 
> Hi Dan,
> 
> Debugging exactly why this is failing (from cxl.git/preview) for my test setup...
> (1 hb, 8 rp, 8 direct connected devices)
> 
> If I set the interleave granularity of a region to 256, I end
> up with 256 for the CFMWS which is fine, then 512 for the HB which
> is not - EP interleave granularity is expected 256.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/tree/drivers/cxl/core/region.c?h=preview#n1070
> 
> Calculates the eig as address_bit - eiw + 1
> 
> iw = 8
> eiw = 3
> peig = 0 (pig = 256)
> peiw = 0 (piw = 1)
> (all as expected I think...)
> 
> So address_bit = s max(peig + peiw, eiw + peig) = max(0, 3)
> and eig = 3 - 3 + 1 = 1 (ig = 512) which is wrong.
> 
> I'm not 100% sure on the logic behind this maths, but would expect eig = 0 as the output for this
> setup..
> 
> Even with this hacked, qemu address decode is landing at wrong address in the backing files (but it
> is at least landing in the right file!)
> Curiously interleave ways = 1 for the EPs which is obviously wrong. (I'm not convinced the
> qemu address logic is right but it'll never work with that value).  I'm struggling to figure
> out where we actually set the interleave ways for an EP.

FWIW I hacked qemu to default to EPs with eig = 3 (ig = 8) for the EPs and decode looks better.
There is a write to eiw for the EPs when commit is set, but seems to be just writing back value
cached when we originally read the setup back from the HDM decoders at probe.

Test code is that QEMU fix I sent a few weeks back + a hack of -= 1 of eig as above for the HB given
I haven't figured out what right fix for that is.

Jonathan



> 
> Also I'm not having much luck requesting a larger interleave granularity for the region (desirable perhaps
> because the devices give better performance with 1024 byte sequential reads)
> 
> Clearly going to be one of those bugs all the way down days. 
> 
> Thanks,
> 
> Jonathan
> 
> 
> > ---
> >  drivers/cxl/acpi.c |    6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > index eb436268b92c..67137e17b8c9 100644
> > --- a/drivers/cxl/acpi.c
> > +++ b/drivers/cxl/acpi.c
> > @@ -140,6 +140,12 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
> >  		.end = res->end,
> >  	};
> >  	cxld->interleave_ways = ways;
> > +	/*
> > +	 * Minimize the x1 granularity to advertise support for any
> > +	 * valid region granularity
> > +	 */
> > +	if (ways == 1)
> > +		ig = 256;
> >  	cxld->interleave_granularity = ig;
> >  
> >  	rc = cxl_decoder_add(cxld, target_map);
> > 
> >   
>
Dan Williams Aug. 2, 2022, 5:33 p.m. UTC | #6
Jonathan Cameron wrote:
> On Fri, 22 Jul 2022 17:56:09 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > The kernel enforces that region granularity is >= to the top-level
> > interleave-granularity for the given CXL window. However, when the CXL
> > window interleave is x1, i.e. non-interleaved at the host bridge level,
> > then the specified granularity does not matter. Override the window
> > specified granularity to the CXL minimum so that any valid region
> > granularity is >= to the root granularity.
> > 
> > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> 
> Hi Dan,
> 
> Debugging exactly why this is failing (from cxl.git/preview) for my test setup...
> (1 hb, 8 rp, 8 direct connected devices)
> 
> If I set the interleave granularity of a region to 256, I end
> up with 256 for the CFMWS which is fine, then 512 for the HB which
> is not - EP interleave granularity is expected 256.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/tree/drivers/cxl/core/region.c?h=preview#n1070
> 
> Calculates the eig as address_bit - eiw + 1
> 
> iw = 8
> eiw = 3
> peig = 0 (pig = 256)
> peiw = 0 (piw = 1)
> (all as expected I think...)
> 
> So address_bit = s max(peig + peiw, eiw + peig) = max(0, 3)
> and eig = 3 - 3 + 1 = 1 (ig = 512) which is wrong.
> 
> I'm not 100% sure on the logic behind this maths, but would expect eig = 0 as the output for this
> setup..

Yeah, the "+ 1" is not required when routing from the x1 HB level. I can
reproduce this config with cxl_test to validate.

> Even with this hacked, qemu address decode is landing at wrong address in the backing files (but it
> is at least landing in the right file!)
> Curiously interleave ways = 1 for the EPs which is obviously wrong. (I'm not convinced the
> qemu address logic is right but it'll never work with that value).  I'm struggling to figure
> out where we actually set the interleave ways for an EP.

Ugh, it is not set and I think I was blinded by some successful region
creation results and assumed they also ran cycles to validate the data
consistency. My expectation is that the EP interleave_ways is set here:

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index cf5d5811fe4c..dd4035d92041 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1285,6 +1285,8 @@ static int cxl_region_attach(struct cxl_region *cxlr,
 
        p->targets[pos] = cxled;
        cxled->pos = pos;
+       cxled->cxld.interleave_ways = p->interleave_ways;
+       cxled->cxld.interleave_granularity = p->interleave_granularity;
        p->nr_targets++;
 
        if (p->nr_targets == p->interleave_ways) {


> Also I'm not having much luck requesting a larger interleave granularity for the region (desirable perhaps
> because the devices give better performance with 1024 byte sequential reads)
> 
> Clearly going to be one of those bugs all the way down days. 
> 

Yes, the hunt continues, but I think the driver has some large ones the
squash first.
Jonathan Cameron Aug. 3, 2022, 4 p.m. UTC | #7
On Tue, 2 Aug 2022 10:33:19 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Jonathan Cameron wrote:
> > On Fri, 22 Jul 2022 17:56:09 -0700
> > Dan Williams <dan.j.williams@intel.com> wrote:
> >   
> > > The kernel enforces that region granularity is >= to the top-level
> > > interleave-granularity for the given CXL window. However, when the CXL
> > > window interleave is x1, i.e. non-interleaved at the host bridge level,
> > > then the specified granularity does not matter. Override the window
> > > specified granularity to the CXL minimum so that any valid region
> > > granularity is >= to the root granularity.
> > > 
> > > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>  
> > 
> > Hi Dan,
> > 
> > Debugging exactly why this is failing (from cxl.git/preview) for my test setup...
> > (1 hb, 8 rp, 8 direct connected devices)
> > 
> > If I set the interleave granularity of a region to 256, I end
> > up with 256 for the CFMWS which is fine, then 512 for the HB which
> > is not - EP interleave granularity is expected 256.
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/tree/drivers/cxl/core/region.c?h=preview#n1070
> > 
> > Calculates the eig as address_bit - eiw + 1
> > 
> > iw = 8
> > eiw = 3
> > peig = 0 (pig = 256)
> > peiw = 0 (piw = 1)
> > (all as expected I think...)
> > 
> > So address_bit = s max(peig + peiw, eiw + peig) = max(0, 3)
> > and eig = 3 - 3 + 1 = 1 (ig = 512) which is wrong.
> > 
> > I'm not 100% sure on the logic behind this maths, but would expect eig = 0 as the output for this
> > setup..  
> 
> Yeah, the "+ 1" is not required when routing from the x1 HB level. I can
> reproduce this config with cxl_test to validate.

Other than this off by one, with the other fixes you posted everything now works for me
with the particular test case above.

Thanks

Jonathan

> 
> > Even with this hacked, qemu address decode is landing at wrong address in the backing files (but it
> > is at least landing in the right file!)
> > Curiously interleave ways = 1 for the EPs which is obviously wrong. (I'm not convinced the
> > qemu address logic is right but it'll never work with that value).  I'm struggling to figure
> > out where we actually set the interleave ways for an EP.  
> 
> Ugh, it is not set and I think I was blinded by some successful region
> creation results and assumed they also ran cycles to validate the data
> consistency. My expectation is that the EP interleave_ways is set here:
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index cf5d5811fe4c..dd4035d92041 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1285,6 +1285,8 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>  
>         p->targets[pos] = cxled;
>         cxled->pos = pos;
> +       cxled->cxld.interleave_ways = p->interleave_ways;
> +       cxled->cxld.interleave_granularity = p->interleave_granularity;
>         p->nr_targets++;
>  
>         if (p->nr_targets == p->interleave_ways) {
> 
> 
> > Also I'm not having much luck requesting a larger interleave granularity for the region (desirable perhaps
> > because the devices give better performance with 1024 byte sequential reads)
> > 
> > Clearly going to be one of those bugs all the way down days. 
> >   
> 
> Yes, the hunt continues, but I think the driver has some large ones the
> squash first.
Dan Williams Aug. 3, 2022, 5:18 p.m. UTC | #8
Jonathan Cameron wrote:
> On Tue, 2 Aug 2022 10:33:19 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > Jonathan Cameron wrote:
> > > On Fri, 22 Jul 2022 17:56:09 -0700
> > > Dan Williams <dan.j.williams@intel.com> wrote:
> > >   
> > > > The kernel enforces that region granularity is >= to the top-level
> > > > interleave-granularity for the given CXL window. However, when the CXL
> > > > window interleave is x1, i.e. non-interleaved at the host bridge level,
> > > > then the specified granularity does not matter. Override the window
> > > > specified granularity to the CXL minimum so that any valid region
> > > > granularity is >= to the root granularity.
> > > > 
> > > > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>  
> > > 
> > > Hi Dan,
> > > 
> > > Debugging exactly why this is failing (from cxl.git/preview) for my test setup...
> > > (1 hb, 8 rp, 8 direct connected devices)
> > > 
> > > If I set the interleave granularity of a region to 256, I end
> > > up with 256 for the CFMWS which is fine, then 512 for the HB which
> > > is not - EP interleave granularity is expected 256.
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/tree/drivers/cxl/core/region.c?h=preview#n1070
> > > 
> > > Calculates the eig as address_bit - eiw + 1
> > > 
> > > iw = 8
> > > eiw = 3
> > > peig = 0 (pig = 256)
> > > peiw = 0 (piw = 1)
> > > (all as expected I think...)
> > > 
> > > So address_bit = s max(peig + peiw, eiw + peig) = max(0, 3)
> > > and eig = 3 - 3 + 1 = 1 (ig = 512) which is wrong.
> > > 
> > > I'm not 100% sure on the logic behind this maths, but would expect eig = 0 as the output for this
> > > setup..  
> > 
> > Yeah, the "+ 1" is not required when routing from the x1 HB level. I can
> > reproduce this config with cxl_test to validate.
> 
> Other than this off by one, with the other fixes you posted everything now works for me
> with the particular test case above.

Oh, you are hitting this path in your test? I misread the condition when
this triggers. It should be skipping the granularity increment when the
port's parent interleave is x1. I.e. in this case with devices directly
connected to a host-bridge where the host-bridge is not interleaved then
there is no need for an extra address bit to route the decode.
Jonathan Cameron Aug. 4, 2022, 9:32 a.m. UTC | #9
On Wed, 3 Aug 2022 10:18:27 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Jonathan Cameron wrote:
> > On Tue, 2 Aug 2022 10:33:19 -0700
> > Dan Williams <dan.j.williams@intel.com> wrote:
> >   
> > > Jonathan Cameron wrote:  
> > > > On Fri, 22 Jul 2022 17:56:09 -0700
> > > > Dan Williams <dan.j.williams@intel.com> wrote:
> > > >     
> > > > > The kernel enforces that region granularity is >= to the top-level
> > > > > interleave-granularity for the given CXL window. However, when the CXL
> > > > > window interleave is x1, i.e. non-interleaved at the host bridge level,
> > > > > then the specified granularity does not matter. Override the window
> > > > > specified granularity to the CXL minimum so that any valid region
> > > > > granularity is >= to the root granularity.
> > > > > 
> > > > > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>    
> > > > 
> > > > Hi Dan,
> > > > 
> > > > Debugging exactly why this is failing (from cxl.git/preview) for my test setup...
> > > > (1 hb, 8 rp, 8 direct connected devices)
> > > > 
> > > > If I set the interleave granularity of a region to 256, I end
> > > > up with 256 for the CFMWS which is fine, then 512 for the HB which
> > > > is not - EP interleave granularity is expected 256.
> > > > 
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/tree/drivers/cxl/core/region.c?h=preview#n1070
> > > > 
> > > > Calculates the eig as address_bit - eiw + 1
> > > > 
> > > > iw = 8
> > > > eiw = 3
> > > > peig = 0 (pig = 256)
> > > > peiw = 0 (piw = 1)
> > > > (all as expected I think...)
> > > > 
> > > > So address_bit = s max(peig + peiw, eiw + peig) = max(0, 3)
> > > > and eig = 3 - 3 + 1 = 1 (ig = 512) which is wrong.
> > > > 
> > > > I'm not 100% sure on the logic behind this maths, but would expect eig = 0 as the output for this
> > > > setup..    
> > > 
> > > Yeah, the "+ 1" is not required when routing from the x1 HB level. I can
> > > reproduce this config with cxl_test to validate.  
> > 
> > Other than this off by one, with the other fixes you posted everything now works for me
> > with the particular test case above.  
> 
> Oh, you are hitting this path in your test? I misread the condition when
> this triggers. It should be skipping the granularity increment when the
> port's parent interleave is x1. I.e. in this case with devices directly
> connected to a host-bridge where the host-bridge is not interleaved then
> there is no need for an extra address bit to route the decode.
> 

I haven't chased down why, but definitely hitting it with resulting routing
to wrong device. 
Current test when setting interleave gran for the decoder1.0 (the HB in this case)
is on the ways of decoder 1.0 which is 8.
Maybe the test should be on the parent_rr nr_targets? (or parent_iw should work I think)
That makes things 'work' for this particular test, but need to test on a wider range
of setups.

Jonathan
diff mbox series

Patch

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index eb436268b92c..67137e17b8c9 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -140,6 +140,12 @@  static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
 		.end = res->end,
 	};
 	cxld->interleave_ways = ways;
+	/*
+	 * Minimize the x1 granularity to advertise support for any
+	 * valid region granularity
+	 */
+	if (ways == 1)
+		ig = 256;
 	cxld->interleave_granularity = ig;
 
 	rc = cxl_decoder_add(cxld, target_map);