Message ID | 165853776917.2430596.16823264262010844458.stgit@dwillia2-xfh.jf.intel.com |
---|---|
State | Superseded |
Headers | show |
Series | CXL Region Provisioning Fixes | expand |
On Fri, Jul 22, 2022 at 05:56:09PM -0700, Dan Williams wrote: > The kernel enforces that region granularity is >= to the top-level > interleave-granularity for the given CXL window. However, when the CXL > window interleave is x1, i.e. non-interleaved at the host bridge level, > then the specified granularity does not matter. Override the window > specified granularity to the CXL minimum so that any valid region > granularity is >= to the root granularity. > > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> > --- > drivers/cxl/acpi.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c > index eb436268b92c..67137e17b8c9 100644 > --- a/drivers/cxl/acpi.c > +++ b/drivers/cxl/acpi.c > @@ -140,6 +140,12 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg, > .end = res->end, > }; > cxld->interleave_ways = ways; > + /* > + * Minimize the x1 granularity to advertise support for any > + * valid region granularity > + */ > + if (ways == 1) > + ig = 256; > cxld->interleave_granularity = ig; > > rc = cxl_decoder_add(cxld, target_map); >
On Fri, 2022-07-22 at 17:56 -0700, Dan Williams wrote: > The kernel enforces that region granularity is >= to the top-level > interleave-granularity for the given CXL window. However, when the CXL > window interleave is x1, i.e. non-interleaved at the host bridge level, > then the specified granularity does not matter. Override the window > specified granularity to the CXL minimum so that any valid region > granularity is >= to the root granularity. > > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > --- > drivers/cxl/acpi.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c > index eb436268b92c..67137e17b8c9 100644 > --- a/drivers/cxl/acpi.c > +++ b/drivers/cxl/acpi.c > @@ -140,6 +140,12 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg, > .end = res->end, > }; > cxld->interleave_ways = ways; > + /* > + * Minimize the x1 granularity to advertise support for any > + * valid region granularity > + */ > + if (ways == 1) > + ig = 256; Probably not super critical, but should this be a #define somewhere? Regardless, Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> > cxld->interleave_granularity = ig; > > rc = cxl_decoder_add(cxld, target_map); >
Verma, Vishal L wrote: > On Fri, 2022-07-22 at 17:56 -0700, Dan Williams wrote: > > The kernel enforces that region granularity is >= to the top-level > > interleave-granularity for the given CXL window. However, when the CXL > > window interleave is x1, i.e. non-interleaved at the host bridge level, > > then the specified granularity does not matter. Override the window > > specified granularity to the CXL minimum so that any valid region > > granularity is >= to the root granularity. > > > > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > > --- > > drivers/cxl/acpi.c | 6 ++++++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c > > index eb436268b92c..67137e17b8c9 100644 > > --- a/drivers/cxl/acpi.c > > +++ b/drivers/cxl/acpi.c > > @@ -140,6 +140,12 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg, > > .end = res->end, > > }; > > cxld->interleave_ways = ways; > > + /* > > + * Minimize the x1 granularity to advertise support for any > > + * valid region granularity > > + */ > > + if (ways == 1) > > + ig = 256; > > Probably not super critical, but should this be a #define somewhere? Yeah, something like CXL_MIN_GRANULARITY, will add. > > Regardless, > > Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> > > > cxld->interleave_granularity = ig; > > > > rc = cxl_decoder_add(cxld, target_map); > > >
On Fri, 22 Jul 2022 17:56:09 -0700 Dan Williams <dan.j.williams@intel.com> wrote: > The kernel enforces that region granularity is >= to the top-level > interleave-granularity for the given CXL window. However, when the CXL > window interleave is x1, i.e. non-interleaved at the host bridge level, > then the specified granularity does not matter. Override the window > specified granularity to the CXL minimum so that any valid region > granularity is >= to the root granularity. > > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > Signed-off-by: Dan Williams <dan.j.williams@intel.com> Hi Dan, Debugging exactly why this is failing (from cxl.git/preview) for my test setup... (1 hb, 8 rp, 8 direct connected devices) If I set the interleave granularity of a region to 256, I end up with 256 for the CFMWS which is fine, then 512 for the HB which is not - EP interleave granularity is expected 256. https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/tree/drivers/cxl/core/region.c?h=preview#n1070 Calculates the eig as address_bit - eiw + 1 iw = 8 eiw = 3 peig = 0 (pig = 256) peiw = 0 (piw = 1) (all as expected I think...) So address_bit = s max(peig + peiw, eiw + peig) = max(0, 3) and eig = 3 - 3 + 1 = 1 (ig = 512) which is wrong. I'm not 100% sure on the logic behind this maths, but would expect eig = 0 as the output for this setup.. Even with this hacked, qemu address decode is landing at wrong address in the backing files (but it is at least landing in the right file!) Curiously interleave ways = 1 for the EPs which is obviously wrong. (I'm not convinced the qemu address logic is right but it'll never work with that value). I'm struggling to figure out where we actually set the interleave ways for an EP. Also I'm not having much luck requesting a larger interleave granularity for the region (desirable perhaps because the devices give better performance with 1024 byte sequential reads) Clearly going to be one of those bugs all the way down days. Thanks, Jonathan > --- > drivers/cxl/acpi.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c > index eb436268b92c..67137e17b8c9 100644 > --- a/drivers/cxl/acpi.c > +++ b/drivers/cxl/acpi.c > @@ -140,6 +140,12 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg, > .end = res->end, > }; > cxld->interleave_ways = ways; > + /* > + * Minimize the x1 granularity to advertise support for any > + * valid region granularity > + */ > + if (ways == 1) > + ig = 256; > cxld->interleave_granularity = ig; > > rc = cxl_decoder_add(cxld, target_map); > >
On Tue, 2 Aug 2022 16:56:27 +0100 Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: > On Fri, 22 Jul 2022 17:56:09 -0700 > Dan Williams <dan.j.williams@intel.com> wrote: > > > The kernel enforces that region granularity is >= to the top-level > > interleave-granularity for the given CXL window. However, when the CXL > > window interleave is x1, i.e. non-interleaved at the host bridge level, > > then the specified granularity does not matter. Override the window > > specified granularity to the CXL minimum so that any valid region > > granularity is >= to the root granularity. > > > > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > > Hi Dan, > > Debugging exactly why this is failing (from cxl.git/preview) for my test setup... > (1 hb, 8 rp, 8 direct connected devices) > > If I set the interleave granularity of a region to 256, I end > up with 256 for the CFMWS which is fine, then 512 for the HB which > is not - EP interleave granularity is expected 256. > > https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/tree/drivers/cxl/core/region.c?h=preview#n1070 > > Calculates the eig as address_bit - eiw + 1 > > iw = 8 > eiw = 3 > peig = 0 (pig = 256) > peiw = 0 (piw = 1) > (all as expected I think...) > > So address_bit = s max(peig + peiw, eiw + peig) = max(0, 3) > and eig = 3 - 3 + 1 = 1 (ig = 512) which is wrong. > > I'm not 100% sure on the logic behind this maths, but would expect eig = 0 as the output for this > setup.. > > Even with this hacked, qemu address decode is landing at wrong address in the backing files (but it > is at least landing in the right file!) > Curiously interleave ways = 1 for the EPs which is obviously wrong. (I'm not convinced the > qemu address logic is right but it'll never work with that value). I'm struggling to figure > out where we actually set the interleave ways for an EP. FWIW I hacked qemu to default to EPs with eig = 3 (ig = 8) for the EPs and decode looks better. There is a write to eiw for the EPs when commit is set, but seems to be just writing back value cached when we originally read the setup back from the HDM decoders at probe. Test code is that QEMU fix I sent a few weeks back + a hack of -= 1 of eig as above for the HB given I haven't figured out what right fix for that is. Jonathan > > Also I'm not having much luck requesting a larger interleave granularity for the region (desirable perhaps > because the devices give better performance with 1024 byte sequential reads) > > Clearly going to be one of those bugs all the way down days. > > Thanks, > > Jonathan > > > > --- > > drivers/cxl/acpi.c | 6 ++++++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c > > index eb436268b92c..67137e17b8c9 100644 > > --- a/drivers/cxl/acpi.c > > +++ b/drivers/cxl/acpi.c > > @@ -140,6 +140,12 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg, > > .end = res->end, > > }; > > cxld->interleave_ways = ways; > > + /* > > + * Minimize the x1 granularity to advertise support for any > > + * valid region granularity > > + */ > > + if (ways == 1) > > + ig = 256; > > cxld->interleave_granularity = ig; > > > > rc = cxl_decoder_add(cxld, target_map); > > > > >
Jonathan Cameron wrote: > On Fri, 22 Jul 2022 17:56:09 -0700 > Dan Williams <dan.j.williams@intel.com> wrote: > > > The kernel enforces that region granularity is >= to the top-level > > interleave-granularity for the given CXL window. However, when the CXL > > window interleave is x1, i.e. non-interleaved at the host bridge level, > > then the specified granularity does not matter. Override the window > > specified granularity to the CXL minimum so that any valid region > > granularity is >= to the root granularity. > > > > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > > Hi Dan, > > Debugging exactly why this is failing (from cxl.git/preview) for my test setup... > (1 hb, 8 rp, 8 direct connected devices) > > If I set the interleave granularity of a region to 256, I end > up with 256 for the CFMWS which is fine, then 512 for the HB which > is not - EP interleave granularity is expected 256. > > https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/tree/drivers/cxl/core/region.c?h=preview#n1070 > > Calculates the eig as address_bit - eiw + 1 > > iw = 8 > eiw = 3 > peig = 0 (pig = 256) > peiw = 0 (piw = 1) > (all as expected I think...) > > So address_bit = s max(peig + peiw, eiw + peig) = max(0, 3) > and eig = 3 - 3 + 1 = 1 (ig = 512) which is wrong. > > I'm not 100% sure on the logic behind this maths, but would expect eig = 0 as the output for this > setup.. Yeah, the "+ 1" is not required when routing from the x1 HB level. I can reproduce this config with cxl_test to validate. > Even with this hacked, qemu address decode is landing at wrong address in the backing files (but it > is at least landing in the right file!) > Curiously interleave ways = 1 for the EPs which is obviously wrong. (I'm not convinced the > qemu address logic is right but it'll never work with that value). I'm struggling to figure > out where we actually set the interleave ways for an EP. Ugh, it is not set and I think I was blinded by some successful region creation results and assumed they also ran cycles to validate the data consistency. My expectation is that the EP interleave_ways is set here: diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index cf5d5811fe4c..dd4035d92041 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -1285,6 +1285,8 @@ static int cxl_region_attach(struct cxl_region *cxlr, p->targets[pos] = cxled; cxled->pos = pos; + cxled->cxld.interleave_ways = p->interleave_ways; + cxled->cxld.interleave_granularity = p->interleave_granularity; p->nr_targets++; if (p->nr_targets == p->interleave_ways) { > Also I'm not having much luck requesting a larger interleave granularity for the region (desirable perhaps > because the devices give better performance with 1024 byte sequential reads) > > Clearly going to be one of those bugs all the way down days. > Yes, the hunt continues, but I think the driver has some large ones the squash first.
On Tue, 2 Aug 2022 10:33:19 -0700 Dan Williams <dan.j.williams@intel.com> wrote: > Jonathan Cameron wrote: > > On Fri, 22 Jul 2022 17:56:09 -0700 > > Dan Williams <dan.j.williams@intel.com> wrote: > > > > > The kernel enforces that region granularity is >= to the top-level > > > interleave-granularity for the given CXL window. However, when the CXL > > > window interleave is x1, i.e. non-interleaved at the host bridge level, > > > then the specified granularity does not matter. Override the window > > > specified granularity to the CXL minimum so that any valid region > > > granularity is >= to the root granularity. > > > > > > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > > > > Hi Dan, > > > > Debugging exactly why this is failing (from cxl.git/preview) for my test setup... > > (1 hb, 8 rp, 8 direct connected devices) > > > > If I set the interleave granularity of a region to 256, I end > > up with 256 for the CFMWS which is fine, then 512 for the HB which > > is not - EP interleave granularity is expected 256. > > > > https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/tree/drivers/cxl/core/region.c?h=preview#n1070 > > > > Calculates the eig as address_bit - eiw + 1 > > > > iw = 8 > > eiw = 3 > > peig = 0 (pig = 256) > > peiw = 0 (piw = 1) > > (all as expected I think...) > > > > So address_bit = s max(peig + peiw, eiw + peig) = max(0, 3) > > and eig = 3 - 3 + 1 = 1 (ig = 512) which is wrong. > > > > I'm not 100% sure on the logic behind this maths, but would expect eig = 0 as the output for this > > setup.. > > Yeah, the "+ 1" is not required when routing from the x1 HB level. I can > reproduce this config with cxl_test to validate. Other than this off by one, with the other fixes you posted everything now works for me with the particular test case above. Thanks Jonathan > > > Even with this hacked, qemu address decode is landing at wrong address in the backing files (but it > > is at least landing in the right file!) > > Curiously interleave ways = 1 for the EPs which is obviously wrong. (I'm not convinced the > > qemu address logic is right but it'll never work with that value). I'm struggling to figure > > out where we actually set the interleave ways for an EP. > > Ugh, it is not set and I think I was blinded by some successful region > creation results and assumed they also ran cycles to validate the data > consistency. My expectation is that the EP interleave_ways is set here: > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > index cf5d5811fe4c..dd4035d92041 100644 > --- a/drivers/cxl/core/region.c > +++ b/drivers/cxl/core/region.c > @@ -1285,6 +1285,8 @@ static int cxl_region_attach(struct cxl_region *cxlr, > > p->targets[pos] = cxled; > cxled->pos = pos; > + cxled->cxld.interleave_ways = p->interleave_ways; > + cxled->cxld.interleave_granularity = p->interleave_granularity; > p->nr_targets++; > > if (p->nr_targets == p->interleave_ways) { > > > > Also I'm not having much luck requesting a larger interleave granularity for the region (desirable perhaps > > because the devices give better performance with 1024 byte sequential reads) > > > > Clearly going to be one of those bugs all the way down days. > > > > Yes, the hunt continues, but I think the driver has some large ones the > squash first.
Jonathan Cameron wrote: > On Tue, 2 Aug 2022 10:33:19 -0700 > Dan Williams <dan.j.williams@intel.com> wrote: > > > Jonathan Cameron wrote: > > > On Fri, 22 Jul 2022 17:56:09 -0700 > > > Dan Williams <dan.j.williams@intel.com> wrote: > > > > > > > The kernel enforces that region granularity is >= to the top-level > > > > interleave-granularity for the given CXL window. However, when the CXL > > > > window interleave is x1, i.e. non-interleaved at the host bridge level, > > > > then the specified granularity does not matter. Override the window > > > > specified granularity to the CXL minimum so that any valid region > > > > granularity is >= to the root granularity. > > > > > > > > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > > > > > > Hi Dan, > > > > > > Debugging exactly why this is failing (from cxl.git/preview) for my test setup... > > > (1 hb, 8 rp, 8 direct connected devices) > > > > > > If I set the interleave granularity of a region to 256, I end > > > up with 256 for the CFMWS which is fine, then 512 for the HB which > > > is not - EP interleave granularity is expected 256. > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/tree/drivers/cxl/core/region.c?h=preview#n1070 > > > > > > Calculates the eig as address_bit - eiw + 1 > > > > > > iw = 8 > > > eiw = 3 > > > peig = 0 (pig = 256) > > > peiw = 0 (piw = 1) > > > (all as expected I think...) > > > > > > So address_bit = s max(peig + peiw, eiw + peig) = max(0, 3) > > > and eig = 3 - 3 + 1 = 1 (ig = 512) which is wrong. > > > > > > I'm not 100% sure on the logic behind this maths, but would expect eig = 0 as the output for this > > > setup.. > > > > Yeah, the "+ 1" is not required when routing from the x1 HB level. I can > > reproduce this config with cxl_test to validate. > > Other than this off by one, with the other fixes you posted everything now works for me > with the particular test case above. Oh, you are hitting this path in your test? I misread the condition when this triggers. It should be skipping the granularity increment when the port's parent interleave is x1. I.e. in this case with devices directly connected to a host-bridge where the host-bridge is not interleaved then there is no need for an extra address bit to route the decode.
On Wed, 3 Aug 2022 10:18:27 -0700 Dan Williams <dan.j.williams@intel.com> wrote: > Jonathan Cameron wrote: > > On Tue, 2 Aug 2022 10:33:19 -0700 > > Dan Williams <dan.j.williams@intel.com> wrote: > > > > > Jonathan Cameron wrote: > > > > On Fri, 22 Jul 2022 17:56:09 -0700 > > > > Dan Williams <dan.j.williams@intel.com> wrote: > > > > > > > > > The kernel enforces that region granularity is >= to the top-level > > > > > interleave-granularity for the given CXL window. However, when the CXL > > > > > window interleave is x1, i.e. non-interleaved at the host bridge level, > > > > > then the specified granularity does not matter. Override the window > > > > > specified granularity to the CXL minimum so that any valid region > > > > > granularity is >= to the root granularity. > > > > > > > > > > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > > > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > > > > > > > > Hi Dan, > > > > > > > > Debugging exactly why this is failing (from cxl.git/preview) for my test setup... > > > > (1 hb, 8 rp, 8 direct connected devices) > > > > > > > > If I set the interleave granularity of a region to 256, I end > > > > up with 256 for the CFMWS which is fine, then 512 for the HB which > > > > is not - EP interleave granularity is expected 256. > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/tree/drivers/cxl/core/region.c?h=preview#n1070 > > > > > > > > Calculates the eig as address_bit - eiw + 1 > > > > > > > > iw = 8 > > > > eiw = 3 > > > > peig = 0 (pig = 256) > > > > peiw = 0 (piw = 1) > > > > (all as expected I think...) > > > > > > > > So address_bit = s max(peig + peiw, eiw + peig) = max(0, 3) > > > > and eig = 3 - 3 + 1 = 1 (ig = 512) which is wrong. > > > > > > > > I'm not 100% sure on the logic behind this maths, but would expect eig = 0 as the output for this > > > > setup.. > > > > > > Yeah, the "+ 1" is not required when routing from the x1 HB level. I can > > > reproduce this config with cxl_test to validate. > > > > Other than this off by one, with the other fixes you posted everything now works for me > > with the particular test case above. > > Oh, you are hitting this path in your test? I misread the condition when > this triggers. It should be skipping the granularity increment when the > port's parent interleave is x1. I.e. in this case with devices directly > connected to a host-bridge where the host-bridge is not interleaved then > there is no need for an extra address bit to route the decode. > I haven't chased down why, but definitely hitting it with resulting routing to wrong device. Current test when setting interleave gran for the decoder1.0 (the HB in this case) is on the ways of decoder 1.0 which is 8. Maybe the test should be on the parent_rr nr_targets? (or parent_iw should work I think) That makes things 'work' for this particular test, but need to test on a wider range of setups. Jonathan
diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c index eb436268b92c..67137e17b8c9 100644 --- a/drivers/cxl/acpi.c +++ b/drivers/cxl/acpi.c @@ -140,6 +140,12 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg, .end = res->end, }; cxld->interleave_ways = ways; + /* + * Minimize the x1 granularity to advertise support for any + * valid region granularity + */ + if (ways == 1) + ig = 256; cxld->interleave_granularity = ig; rc = cxl_decoder_add(cxld, target_map);
The kernel enforces that region granularity is >= to the top-level interleave-granularity for the given CXL window. However, when the CXL window interleave is x1, i.e. non-interleaved at the host bridge level, then the specified granularity does not matter. Override the window specified granularity to the CXL minimum so that any valid region granularity is >= to the root granularity. Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> --- drivers/cxl/acpi.c | 6 ++++++ 1 file changed, 6 insertions(+)