Message ID | 20240113050421.1622533-1-alison.schofield@intel.com |
---|---|
State | Superseded |
Headers | show |
Series | [RFC] cxl/region: Allow out of order assembly of autodiscovered regions | expand |
A small quibble with the Subject: line, this looks like an "RFT" (request-for-test) not an "RFC", because the code looks likely correct. Given that testers sometimes disappear it would be nice to have a unit test reproducer for this and not require a re-test on hardware. alison.schofield@ wrote: > From: Alison Schofield <alison.schofield@intel.com> > > Wonjae Lee, > > Here is the RFC Patch I mentioned in this thread: > https://lore.kernel.org/linux-cxl/20240112092201epcms2p577b3c979bdc694a370e5952edc091f68@epcms2p5/ > > This passes the cxl-test suite, so hoping no regression, but it needs > to be tested with the required config: 2 memdevs connected to the same > port, each memdev belongs to a different auto-region. Repeated attempts > should hit the test case and emit this debug message upon success: > dev_dbg(&cxlr_a->dev, "allow out of order region ref alloc\n"); So one of the largest roadblocks for creating arbitrary region assembly scenarios with cxl_test is the inability to save and restore decoder settings. The patch below adds support for recording decoder settings and skipping the reset of those values when unloading the driver. Then, when the driver is reloaded, it simulates the case of BIOS created CXL regions prior to OS boot. We can go after even finer grained cases once the uunit effort settles, but in the meantime cxl_test can add auto-assembly regression tests with the current topology. With the below you can simply trigger "cxl {en,dis}able-memdev" in the proper order to cause the violation. -- >8 -- From 5839392a3dbb64dbbac0630d930b1f48a8ef7ea6 Mon Sep 17 00:00:00 2001 From: Dan Williams <dan.j.williams@intel.com> Date: Wed, 17 Jan 2024 20:56:20 -0800 Subject: [PATCH] tools/testing/cxl: Add decoder save/restore support Record decoder values at init and mock_decoder_commit() time, and restore them at the next invocation of mock_init_hdm_decoder(). Add 2 attributes to the cxl_test "cxl_acpi" device to optionally flush the cache of topology decoder values, or disable updating the decoder at mock_decoder_reset() time. This enables replaying a saved decoder configuration when re-triggering a topology scan by re-binding the cxl_acpi driver to "cxl_acpi.0" (the cxl_test emulation of an ACPI0017 instance). # modprobe cxl_test # cxl list -RB -b cxl_test -u { "bus":"root3", "provider":"cxl_test", "regions:root3":[ { "region":"region5", "resource":"0xf010000000", "size":"512.00 MiB (536.87 MB)", "type":"ram", "interleave_ways":2, "interleave_granularity":4096, "decode_state":"commit" } ] } # echo 1 > /sys/bus/platform/devices/cxl_acpi.0/decoder_registry_reset_disable # echo cxl_acpi.0 > /sys/bus/platform/drivers/cxl_acpi/unbind # cxl list -RB -b cxl_test -u # echo cxl_acpi.0 > /sys/bus/platform/drivers/cxl_acpi/bind # cxl list -RB -b cxl_test -u { "bus":"root3", "provider":"cxl_test", "regions:root3":[ { "region":"region5", "resource":"0xf010000000", "size":"512.00 MiB (536.87 MB)", "type":"ram", "interleave_ways":2, "interleave_granularity":4096, "decode_state":"commit" } ] } Signed-off-by: Dan Williams <dan.j.williams@intel.com> --- tools/testing/cxl/test/cxl.c | 268 +++++++++++++++++++++++++++++++++++ 1 file changed, 268 insertions(+) diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c index f4e517a0c774..3f333d6a57be 100644 --- a/tools/testing/cxl/test/cxl.c +++ b/tools/testing/cxl/test/cxl.c @@ -44,6 +44,9 @@ struct platform_device *cxl_mem_single[NR_MEM_SINGLE]; static struct platform_device *cxl_rch[NR_CXL_RCH]; static struct platform_device *cxl_rcd[NR_CXL_RCH]; +static DEFINE_XARRAY(decoder_registry); +static bool decoder_registry_reset_disable; + static inline bool is_multi_bridge(struct device *dev) { int i; @@ -660,6 +663,153 @@ static int map_targets(struct device *dev, void *data) return 0; } +static unsigned long cxld_registry_index(struct cxl_decoder *cxld) +{ + struct cxl_port *port = to_cxl_port(cxld->dev.parent); + + /* + * Upper nibble of a kernel pointer is 0xff, chop that to make + * space for a cxl_decoder id which should be less than 128 + * given decoder count is a 4-bit field. + * + * While @port is reallocated each enumeration, @port->uport_dev + * is stable. + */ + dev_WARN_ONCE(&port->dev, cxld->id >= 128, + "decoder id:%d out of range\n", cxld->id); + return (((unsigned long) port->uport_dev) << 4) | cxld->id; +} + +struct cxl_test_decoder { + union { + struct cxl_switch_decoder cxlsd; + struct cxl_endpoint_decoder cxled; + }; + union { + struct cxl_dport *targets[CXL_DECODER_MAX_INTERLEAVE]; + struct range dpa_range; + }; +}; + +static struct cxl_test_decoder *cxld_registry_find(struct cxl_decoder *cxld) +{ + return xa_load(&decoder_registry, cxld_registry_index(cxld)); +} + +#define dbg_cxld(port, msg, cxld) \ + do { \ + struct cxl_decoder *___d = (cxld); \ + dev_dbg((port)->uport_dev, \ + "decoder%d: %s range: %#llx-%#llx iw: %d ig: %d flags: %#lx\n", \ + ___d->id, msg, ___d->hpa_range.start, \ + ___d->hpa_range.end + 1, ___d->interleave_ways, \ + ___d->interleave_granularity, ___d->flags); \ + } while (0) + +static int mock_decoder_commit(struct cxl_decoder *cxld); +static int mock_decoder_reset(struct cxl_decoder *cxld); + +static void cxld_copy(struct cxl_decoder *a, struct cxl_decoder *b) +{ + a->id = b->id; + a->hpa_range = b->hpa_range; + a->interleave_ways = b->interleave_ways; + a->interleave_granularity = b->interleave_granularity; + a->target_type = b->target_type; + a->flags = b->flags; + a->commit = mock_decoder_commit; + a->reset = mock_decoder_reset; +} + +static void cxld_registry_restore(struct cxl_decoder *cxld, struct cxl_test_decoder *td) +{ + struct cxl_port *port = to_cxl_port(cxld->dev.parent); + + if (is_switch_decoder(&cxld->dev)) { + struct cxl_switch_decoder *cxlsd = to_cxl_switch_decoder(&cxld->dev); + + dbg_cxld(port, "restore", &td->cxlsd.cxld); + cxld_copy(cxld, &td->cxlsd.cxld); + WARN_ON(cxlsd->nr_targets != td->cxlsd.nr_targets); + + /* convert saved dport devs to dports */ + for (int i = 0; i < cxlsd->nr_targets; i++) { + struct cxl_dport *dport; + + if (!td->cxlsd.target[i]) + continue; + dport = cxl_find_dport_by_dev( + port, (struct device *)td->cxlsd.target[i]); + WARN_ON(!dport); + cxlsd->target[i] = dport; + } + } else { + struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(&cxld->dev); + + dbg_cxld(port, "restore", &td->cxled.cxld); + cxld_copy(cxld, &td->cxled.cxld); + cxled->state = td->cxled.state; + cxled->skip = td->cxled.skip; + if (range_len(&td->dpa_range)) + devm_cxl_dpa_reserve(cxled, td->dpa_range.start, + range_len(&td->dpa_range), + td->cxled.skip); + if (cxld->flags & CXL_DECODER_F_ENABLE) + port->commit_end = cxld->id; + } +} + +static void __cxld_registry_save(struct cxl_test_decoder *td, + struct cxl_decoder *cxld) +{ + if (is_switch_decoder(&cxld->dev)) { + struct cxl_switch_decoder *cxlsd = to_cxl_switch_decoder(&cxld->dev); + + cxld_copy(&td->cxlsd.cxld, cxld); + td->cxlsd.nr_targets = cxlsd->nr_targets; + + /* save dport devs as a stable placeholder for dports */ + for (int i = 0; i < cxlsd->nr_targets; i++) { + if (!cxlsd->target[i]) + continue; + td->cxlsd.target[i] = + (struct cxl_dport *)cxlsd->target[i]->dport_dev; + } + } else { + struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(&cxld->dev); + + cxld_copy(&td->cxled.cxld, cxld); + td->cxled.state = cxled->state; + td->cxled.skip = cxled->skip; + if (cxled->dpa_res) { + td->dpa_range.start = cxled->dpa_res->start; + td->dpa_range.end = cxled->dpa_res->end; + } else { + td->dpa_range.start = 0; + td->dpa_range.end = -1; + } + } +} + +static void cxld_registry_save(struct cxl_test_decoder *td, struct cxl_decoder *cxld) +{ + struct cxl_port *port = to_cxl_port(cxld->dev.parent); + + dbg_cxld(port, "save", cxld); + __cxld_registry_save(td, cxld); +} + +static void cxld_registry_update(struct cxl_decoder *cxld) +{ + struct cxl_port *port = to_cxl_port(cxld->dev.parent); + struct cxl_test_decoder *td = cxld_registry_find(cxld); + + dev_WARN_ONCE(port->uport_dev, !td, "%s failed\n", __func__); + + dbg_cxld(port, "update", cxld); + __cxld_registry_save(td, cxld); +} + static int mock_decoder_commit(struct cxl_decoder *cxld) { struct cxl_port *port = to_cxl_port(cxld->dev.parent); @@ -679,6 +829,7 @@ static int mock_decoder_commit(struct cxl_decoder *cxld) port->commit_end++; cxld->flags |= CXL_DECODER_F_ENABLE; + cxld_registry_update(cxld); return 0; } @@ -701,10 +852,43 @@ static int mock_decoder_reset(struct cxl_decoder *cxld) port->commit_end--; cxld->flags &= ~CXL_DECODER_F_ENABLE; + if (decoder_registry_reset_disable) + dev_dbg(port->uport_dev, "decoder%d: skip registry update\n", + cxld->id); + else + cxld_registry_update(cxld); return 0; } +static void cxld_registry_invalidate(void) +{ + unsigned long index; + void *entry; + + xa_for_each(&decoder_registry, index, entry) { + xa_erase(&decoder_registry, index); + kfree(entry); + } +} + +static struct cxl_test_decoder *cxld_registry_new(struct cxl_decoder *cxld) +{ + struct cxl_test_decoder *td __free(kfree) = kzalloc(sizeof(*td), GFP_KERNEL); + + if (!td) + return NULL; + + if (xa_insert(&decoder_registry, cxld_registry_index(cxld), td, + GFP_KERNEL)) { + WARN_ON(1); + return NULL; + } + + cxld_registry_save(td, cxld); + return no_free_ptr(td); +} + static void default_mock_decoder(struct cxl_decoder *cxld) { cxld->hpa_range = (struct range){ @@ -717,6 +901,9 @@ static void default_mock_decoder(struct cxl_decoder *cxld) cxld->target_type = CXL_DECODER_HOSTONLYMEM; cxld->commit = mock_decoder_commit; cxld->reset = mock_decoder_reset; + + if (!cxld_registry_new(cxld)) + dev_dbg(&cxld->dev, "failed to add to registry\n"); } static int first_decoder(struct device *dev, void *data) @@ -738,6 +925,7 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld) struct cxl_endpoint_decoder *cxled; struct cxl_switch_decoder *cxlsd; struct cxl_port *port, *iter; + struct cxl_test_decoder *td; const int size = SZ_512M; struct cxl_memdev *cxlmd; struct cxl_dport *dport; @@ -767,6 +955,12 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld) port = cxled_to_port(cxled); } + td = cxld_registry_find(cxld); + if (td) { + cxld_registry_restore(cxld, td); + return; + } + /* * The first decoder on the first 2 devices on the first switch * attached to host-bridge0 mock a fake / static RAM region. All @@ -795,6 +989,8 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld) devm_cxl_dpa_reserve(cxled, 0, size / cxld->interleave_ways, 0); cxld->commit = mock_decoder_commit; cxld->reset = mock_decoder_reset; + if (!cxld_registry_new(cxld)) + dev_dbg(&cxld->dev, "failed to add to registry\n"); /* * Now that endpoint decoder is set up, walk up the hierarchy @@ -837,6 +1033,7 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld) .start = base, .end = base + size - 1, }; + cxld_registry_update(cxld); put_device(dev); } } @@ -1233,6 +1430,73 @@ static void cxl_single_exit(void) } } +static ssize_t decoder_registry_invalidate_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + unsigned long index; + bool empty = true; + void *entry; + + xa_for_each(&decoder_registry, index, entry) { + empty = false; + break; + } + + return sysfs_emit(buf, "%d\n", !empty); +} + +static ssize_t decoder_registry_invalidate_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + bool invalidate; + int rc; + + rc = kstrtobool(buf, &invalidate); + if (rc) + return rc; + + guard(device)(dev); + + if (dev->driver) + return -EBUSY; + + cxld_registry_invalidate(); + return count; +} + +static DEVICE_ATTR_RW(decoder_registry_invalidate); + +static ssize_t +decoder_registry_reset_disable_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sysfs_emit(buf, "%d\n", decoder_registry_reset_disable); +} + +static ssize_t +decoder_registry_reset_disable_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + int rc; + + rc = kstrtobool(buf, &decoder_registry_reset_disable); + if (rc) + return rc; + return count; +} + +static DEVICE_ATTR_RW(decoder_registry_reset_disable); + +static struct attribute *cxl_acpi_attrs[] = { + &dev_attr_decoder_registry_invalidate.attr, + &dev_attr_decoder_registry_reset_disable.attr, + NULL +}; +ATTRIBUTE_GROUPS(cxl_acpi); + static __init int cxl_test_init(void) { int rc, i; @@ -1377,6 +1641,7 @@ static __init int cxl_test_init(void) mock_companion(&acpi0017_mock, &cxl_acpi->dev); acpi0017_mock.dev.bus = &platform_bus_type; + cxl_acpi->dev.groups = cxl_acpi_groups; rc = platform_device_add(cxl_acpi); if (rc) @@ -1446,6 +1711,9 @@ static __exit void cxl_test_exit(void) depopulate_all_mock_resources(); gen_pool_destroy(cxl_mock_pool); unregister_cxl_mock_ops(&cxl_mock_ops); + + cxld_registry_invalidate(); + xa_destroy(&decoder_registry); } module_param(interleave_arithmetic, int, 0444);
Thank you for your quick response and sending the patch! I'm very sorry for the delay in replying to your mail. It took some time to reproduce test and check the issue. On Fri, Jan 12, 2024 at 09:04:21PM -0800, alison.schofield@intel.com wrote: > From: Alison Schofield <alison.schofield@intel.com> > > Wonjae Lee, > > Here is the RFC Patch I mentioned in this thread: > https://lore.kernel.org/linux-cxl/20240112092201epcms2p577b3c979bdc694a370e5952edc091f68@epcms2p5/ > > This passes the cxl-test suite, so hoping no regression, but it needs > to be tested with the required config: 2 memdevs connected to the same > port, each memdev belongs to a different auto-region. Repeated attempts > should hit the test case and emit this debug message upon success: > dev_dbg(&cxlr_a->dev, "allow out of order region ref alloc\n"); > > Failure will be the HPA order violation message. > > All reviewers & testers welcomed. After applying your patch to base-commit, I ran the same test I mentioned in the previous thread, and during the boot process, a null pointer dereference occurs as shown below: thread: https://lore.kernel.org/linux-cxl/ZaFn0dfaJ8iOi1UW@aschofie-mobl2/T/#t base-commit: 3601311593eb44d34f142b993cb6f38f9a7863b3 dmesg log: [] cxl region0: region sort successful [] cxl region0: mem1:endpoint6 decoder6.0 add: mem1:decoder6.0 @ 0 next: none nr_eps: 1 nr_targets: 1 [] cxl region0: pci0000:3d:port1 decoder1.0 add: mem1:decoder6.0 @ 0 next: mem1 nr_eps: 1 nr_targets: 1 [] BUG: kernel NULL pointer dereference, address: 00000000000002e8 ... [] Call Trace: ... [] ? auto_order_ok+0x5f/0xb0 [cxl_core] [] cxl_region_attach_position+0x379/0x860 [cxl_core] [] cxl_region_attach+0x466/0x8b0 [cxl_core] To be specific, cxld_a in auto_order_ok() is NULL: cxld_a = cxl_region_find_decoder(port, cxlr_a); /* cxld_a == NULL */ I've noticed that if I pass cxled as a function argument appropriately, and get cxld_a as shown below, the region initialization succeeds with the log "cxl region0: allow out of order region ref alloc" as you mention. @@ -756,8 +756,8 @@ static struct cxl_decoder *cxl_region_find_decoder(struct cxl_port *port, return to_cxl_decoder(dev); } -bool auto_order_ok(struct cxl_port *port, struct cxl_region *cxlr_a, - struct cxl_region *cxlr_b) +bool auto_order_ok(struct cxl_port *port, struct cxl_endpoint_decoder *cxled, + struct cxl_region *cxlr_a, struct cxl_region *cxlr_b) { struct cxl_region_ref *cxl_rr; struct cxl_decoder *cxld_a, *cxld_b; @@ -771,7 +771,10 @@ bool auto_order_ok(struct cxl_port *port, struct cxl_region *cxlr_a, if (!test_bit(CXL_REGION_F_AUTO, &cxlr_a->flags)) return false; - cxld_a = cxl_region_find_decoder(port, cxlr_a); + if (port == cxled_to_port(cxled)) + cxld_a = &cxled->cxld; + else + cxld_a = cxl_region_find_decoder(port, cxlr_a); I hope this helps. Thanks, Wonjae > Begin commit log: > Autodiscovered regions can fail to assemble if they are not discovered > in HPA decode order. The user will see failure messages like: > > [] cxl region0: endpoint5: HPA order violation region1 > [] cxl region0: endpoint5: failed to allocate region reference > > The check that is causing the failure helps the CXL driver enforce > a CXL spec mandate that decoders be committed in HPA order. The > check is needless for autodiscovered regions since their decoders > are already programmed. Trying to enforce order in the assembly of > these regions is useless because they are assembled once all their > member endpoints arrive, and there is no guarantee on the order in > which endpoints are discovered during probe. > > Keep the existing check, but for autodiscovered regions, allow the > out of order assembly after a sanity check that the lowered numbered > decoder has the lower HPA starting address. > > Signed-off-by: Alison Schofield <alison.schofield@intel.com> > --- > drivers/cxl/core/region.c 34 +++++++++++++++++++++++++++++++++- > 1 file changed, 33 insertions(+), 1 deletion(-) > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > index 0f05692bfec3..8770ebcae05d 100644 > --- a/drivers/cxl/core/region.c > +++ b/drivers/cxl/core/region.c > @@ -753,6 +753,37 @@ static struct cxl_decoder *cxl_region_find_decoder(struct cxl_port *port, > return to_cxl_decoder(dev); > } > > +bool auto_order_ok(struct cxl_port *port, struct cxl_region *cxlr_a, > + struct cxl_region *cxlr_b) > +{ > + struct cxl_region_ref *cxl_rr; > + struct cxl_decoder *cxld_a, *cxld_b; > + > + /* > + * Allow the out of order assembly of auto-discovered regions as > + * long as correct decoder programming order can be verified. > + * > + * Per CXL Spec 3.1 8.2.4.20.12 Committing Decoder Programming, > + * software must commit decoders in HPA order. Therefore it is > + * sufficient to sanity check that the lowered number decoder > + * has the lower HPA starting address. > + */ > + if (!test_bit(CXL_REGION_F_AUTO, &cxlr_a->flags)) > + return false; > + > + cxld_a = cxl_region_find_decoder(port, cxlr_a); > + cxl_rr = cxl_rr_load(port, cxlr_b); > + cxld_b = cxl_rr->decoder; > + > + if (cxld_b->id > cxld_a->id) { > + dev_dbg(&cxlr_a->dev, > + "allow out of order region ref alloc\n"); > + return true; > + } > + > + return false; > +} > + > static struct cxl_region_ref *alloc_region_ref(struct cxl_port *port, > struct cxl_region *cxlr) > { > @@ -767,7 +798,8 @@ static struct cxl_region_ref *alloc_region_ref(struct cxl_port *port, > if (!ip->res) > continue; > > - if (ip->res->start > p->res->start) { > + if (ip->res->start > p->res->start && > + (!auto_order_ok(port, cxlr, iter->region))) { > dev_dbg(&cxlr->dev, > "%s: HPA order violation %s:%pr vs %pr\n", > dev_name(&port->dev), > > base-commit: 3601311593eb44d34f142b993cb6f38f9a7863b3 > -- > 2.37.3 >
On Fri, Jan 26, 2024 at 05:51:08PM +0900, Wonjae Lee wrote: > Thank you for your quick response and sending the patch! > > I'm very sorry for the delay in replying to your mail. It took some time to > reproduce test and check the issue. > > On Fri, Jan 12, 2024 at 09:04:21PM -0800, alison.schofield@intel.com wrote: > > From: Alison Schofield <alison.schofield@intel.com> > > > > Wonjae Lee, > > > > Here is the RFC Patch I mentioned in this thread: > > https://lore.kernel.org/linux-cxl/20240112092201epcms2p577b3c979bdc694a370e5952edc091f68@epcms2p5/ > > > > This passes the cxl-test suite, so hoping no regression, but it needs > > to be tested with the required config: 2 memdevs connected to the same > > port, each memdev belongs to a different auto-region. Repeated attempts > > should hit the test case and emit this debug message upon success: > > dev_dbg(&cxlr_a->dev, "allow out of order region ref alloc\n"); > > > > Failure will be the HPA order violation message. > > > > All reviewers & testers welcomed. > > After applying your patch to base-commit, I ran the same test I mentioned in > the previous thread, and during the boot process, a null pointer dereference > occurs as shown below: > > thread: https://lore.kernel.org/linux-cxl/ZaFn0dfaJ8iOi1UW@aschofie-mobl2/T/#t > base-commit: 3601311593eb44d34f142b993cb6f38f9a7863b3 > dmesg log: > [] cxl region0: region sort successful > [] cxl region0: mem1:endpoint6 decoder6.0 add: mem1:decoder6.0 @ 0 next: none nr_eps: 1 nr_targets: 1 > [] cxl region0: pci0000:3d:port1 decoder1.0 add: mem1:decoder6.0 @ 0 next: mem1 nr_eps: 1 nr_targets: 1 > [] BUG: kernel NULL pointer dereference, address: 00000000000002e8 > ... > [] Call Trace: > ... > [] ? auto_order_ok+0x5f/0xb0 [cxl_core] > [] cxl_region_attach_position+0x379/0x860 [cxl_core] > [] cxl_region_attach+0x466/0x8b0 [cxl_core] > > > To be specific, cxld_a in auto_order_ok() is NULL: > cxld_a = cxl_region_find_decoder(port, cxlr_a); /* cxld_a == NULL */ > > I've noticed that if I pass cxled as a function argument appropriately, and get > cxld_a as shown below, the region initialization succeeds with the log "cxl > region0: allow out of order region ref alloc" as you mention. > > @@ -756,8 +756,8 @@ static struct cxl_decoder *cxl_region_find_decoder(struct cxl_port *port, > return to_cxl_decoder(dev); > } > -bool auto_order_ok(struct cxl_port *port, struct cxl_region *cxlr_a, > - struct cxl_region *cxlr_b) > +bool auto_order_ok(struct cxl_port *port, struct cxl_endpoint_decoder *cxled, > + struct cxl_region *cxlr_a, struct cxl_region *cxlr_b) > { > struct cxl_region_ref *cxl_rr; > struct cxl_decoder *cxld_a, *cxld_b; > @@ -771,7 +771,10 @@ bool auto_order_ok(struct cxl_port *port, struct cxl_region *cxlr_a, > if (!test_bit(CXL_REGION_F_AUTO, &cxlr_a->flags)) > return false; > > - cxld_a = cxl_region_find_decoder(port, cxlr_a); > + if (port == cxled_to_port(cxled)) > + cxld_a = &cxled->cxld; > + else > + cxld_a = cxl_region_find_decoder(port, cxlr_a); > > I hope this helps. Yes, this is very helpful! Glad to have you trying it out. I think at this point we can call alloc_region_ref() with a known cxled of this port and simply pass it on. I'm going to skip this if...else pattern, and simply get cxld_a = &cxled->cxld; I think there may be opportunity for refinements in cxl_region_find_decoder() and it's descendants, but it's not directly tied to this patch. Thanks, Alison > > Thanks, > Wonjae > > > Begin commit log: > > Autodiscovered regions can fail to assemble if they are not discovered > > in HPA decode order. The user will see failure messages like: > > > > [] cxl region0: endpoint5: HPA order violation region1 > > [] cxl region0: endpoint5: failed to allocate region reference > > > > The check that is causing the failure helps the CXL driver enforce > > a CXL spec mandate that decoders be committed in HPA order. The > > check is needless for autodiscovered regions since their decoders > > are already programmed. Trying to enforce order in the assembly of > > these regions is useless because they are assembled once all their > > member endpoints arrive, and there is no guarantee on the order in > > which endpoints are discovered during probe. > > > > Keep the existing check, but for autodiscovered regions, allow the > > out of order assembly after a sanity check that the lowered numbered > > decoder has the lower HPA starting address. > > > > Signed-off-by: Alison Schofield <alison.schofield@intel.com> > > --- > > drivers/cxl/core/region.c 34 +++++++++++++++++++++++++++++++++- > > 1 file changed, 33 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > > index 0f05692bfec3..8770ebcae05d 100644 > > --- a/drivers/cxl/core/region.c > > +++ b/drivers/cxl/core/region.c > > @@ -753,6 +753,37 @@ static struct cxl_decoder *cxl_region_find_decoder(struct cxl_port *port, > > return to_cxl_decoder(dev); > > } > > > > +bool auto_order_ok(struct cxl_port *port, struct cxl_region *cxlr_a, > > + struct cxl_region *cxlr_b) > > +{ > > + struct cxl_region_ref *cxl_rr; > > + struct cxl_decoder *cxld_a, *cxld_b; > > + > > + /* > > + * Allow the out of order assembly of auto-discovered regions as > > + * long as correct decoder programming order can be verified. > > + * > > + * Per CXL Spec 3.1 8.2.4.20.12 Committing Decoder Programming, > > + * software must commit decoders in HPA order. Therefore it is > > + * sufficient to sanity check that the lowered number decoder > > + * has the lower HPA starting address. > > + */ > > + if (!test_bit(CXL_REGION_F_AUTO, &cxlr_a->flags)) > > + return false; > > + > > + cxld_a = cxl_region_find_decoder(port, cxlr_a); > > + cxl_rr = cxl_rr_load(port, cxlr_b); > > + cxld_b = cxl_rr->decoder; > > + > > + if (cxld_b->id > cxld_a->id) { > > + dev_dbg(&cxlr_a->dev, > > + "allow out of order region ref alloc\n"); > > + return true; > > + } > > + > > + return false; > > +} > > + > > static struct cxl_region_ref *alloc_region_ref(struct cxl_port *port, > > struct cxl_region *cxlr) > > { > > @@ -767,7 +798,8 @@ static struct cxl_region_ref *alloc_region_ref(struct cxl_port *port, > > if (!ip->res) > > continue; > > > > - if (ip->res->start > p->res->start) { > > + if (ip->res->start > p->res->start && > > + (!auto_order_ok(port, cxlr, iter->region))) { > > dev_dbg(&cxlr->dev, > > "%s: HPA order violation %s:%pr vs %pr\n", > > dev_name(&port->dev), > > > > base-commit: 3601311593eb44d34f142b993cb6f38f9a7863b3 > > -- > > 2.37.3 > >
On Mon, Jan 29, 2024 at 08:37:24PM -0800, Alison Schofield wrote: > On Fri, Jan 26, 2024 at 05:51:08PM +0900, Wonjae Lee wrote: > > Thank you for your quick response and sending the patch! > > > > I'm very sorry for the delay in replying to your mail. It took some time to > > reproduce test and check the issue. > > > > On Fri, Jan 12, 2024 at 09:04:21PM -0800, alison.schofield@intel.com wrote: > > > From: Alison Schofield <alison.schofield@intel.com> > > > > > > Wonjae Lee, > > > > > > Here is the RFC Patch I mentioned in this thread: > > > https://lore.kernel.org/linux-cxl/20240112092201epcms2p577b3c979bdc694a370e5952edc091f68@epcms2p5/ > > > > > > This passes the cxl-test suite, so hoping no regression, but it needs > > > to be tested with the required config: 2 memdevs connected to the same > > > port, each memdev belongs to a different auto-region. Repeated attempts > > > should hit the test case and emit this debug message upon success: > > > dev_dbg(&cxlr_a->dev, "allow out of order region ref alloc\n"); > > > > > > Failure will be the HPA order violation message. > > > > > > All reviewers & testers welcomed. > > > > After applying your patch to base-commit, I ran the same test I mentioned in > > the previous thread, and during the boot process, a null pointer dereference > > occurs as shown below: > > > > thread: https://lore.kernel.org/linux-cxl/ZaFn0dfaJ8iOi1UW@aschofie-mobl2/T/#t > > base-commit: 3601311593eb44d34f142b993cb6f38f9a7863b3 > > dmesg log: > > [] cxl region0: region sort successful > > [] cxl region0: mem1:endpoint6 decoder6.0 add: mem1:decoder6.0 @ 0 next: none nr_eps: 1 nr_targets: 1 > > [] cxl region0: pci0000:3d:port1 decoder1.0 add: mem1:decoder6.0 @ 0 next: mem1 nr_eps: 1 nr_targets: 1 > > [] BUG: kernel NULL pointer dereference, address: 00000000000002e8 > > ... > > [] Call Trace: > > ... > > [] ? auto_order_ok+0x5f/0xb0 [cxl_core] > > [] cxl_region_attach_position+0x379/0x860 [cxl_core] > > [] cxl_region_attach+0x466/0x8b0 [cxl_core] > > > > > > To be specific, cxld_a in auto_order_ok() is NULL: > > cxld_a = cxl_region_find_decoder(port, cxlr_a); /* cxld_a == NULL */ > > > > I've noticed that if I pass cxled as a function argument appropriately, and get > > cxld_a as shown below, the region initialization succeeds with the log "cxl > > region0: allow out of order region ref alloc" as you mention. > > > > @@ -756,8 +756,8 @@ static struct cxl_decoder *cxl_region_find_decoder(struct cxl_port *port, > > return to_cxl_decoder(dev); > > } > > -bool auto_order_ok(struct cxl_port *port, struct cxl_region *cxlr_a, > > - struct cxl_region *cxlr_b) > > +bool auto_order_ok(struct cxl_port *port, struct cxl_endpoint_decoder *cxled, > > + struct cxl_region *cxlr_a, struct cxl_region *cxlr_b) > > { > > struct cxl_region_ref *cxl_rr; > > struct cxl_decoder *cxld_a, *cxld_b; > > @@ -771,7 +771,10 @@ bool auto_order_ok(struct cxl_port *port, struct cxl_region *cxlr_a, > > if (!test_bit(CXL_REGION_F_AUTO, &cxlr_a->flags)) > > return false; > > > > - cxld_a = cxl_region_find_decoder(port, cxlr_a); > > + if (port == cxled_to_port(cxled)) > > + cxld_a = &cxled->cxld; > > + else > > + cxld_a = cxl_region_find_decoder(port, cxlr_a); > > > > I hope this helps. > > Yes, this is very helpful! Glad to have you trying it out. > > I think at this point we can call alloc_region_ref() with a known cxled > of this port and simply pass it on. I'm going to skip this if...else pattern, > and simply get cxld_a = &cxled->cxld; > > I think there may be opportunity for refinements in cxl_region_find_decoder() > and it's descendants, but it's not directly tied to this patch. > > Thanks, > Alison I'm glad it was useful. I agree to change it in the direction you said. Looking forward to the patch :) Thanks, Wonjae > > > > > Thanks, > > Wonjae > > > > > Begin commit log: > > > Autodiscovered regions can fail to assemble if they are not discovered > > > in HPA decode order. The user will see failure messages like: > > > > > > [] cxl region0: endpoint5: HPA order violation region1 > > > [] cxl region0: endpoint5: failed to allocate region reference > > > > > > The check that is causing the failure helps the CXL driver enforce > > > a CXL spec mandate that decoders be committed in HPA order. The > > > check is needless for autodiscovered regions since their decoders > > > are already programmed. Trying to enforce order in the assembly of > > > these regions is useless because they are assembled once all their > > > member endpoints arrive, and there is no guarantee on the order in > > > which endpoints are discovered during probe. > > > > > > Keep the existing check, but for autodiscovered regions, allow the > > > out of order assembly after a sanity check that the lowered numbered > > > decoder has the lower HPA starting address. > > > > > > Signed-off-by: Alison Schofield <alison.schofield@intel.com> > > > --- > > > drivers/cxl/core/region.c 34 +++++++++++++++++++++++++++++++++- > > > 1 file changed, 33 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > > > index 0f05692bfec3..8770ebcae05d 100644 > > > --- a/drivers/cxl/core/region.c > > > +++ b/drivers/cxl/core/region.c > > > @@ -753,6 +753,37 @@ static struct cxl_decoder *cxl_region_find_decoder(struct cxl_port *port, > > > return to_cxl_decoder(dev); > > > } > > > > > > +bool auto_order_ok(struct cxl_port *port, struct cxl_region *cxlr_a, > > > + struct cxl_region *cxlr_b) > > > +{ > > > + struct cxl_region_ref *cxl_rr; > > > + struct cxl_decoder *cxld_a, *cxld_b; > > > + > > > + /* > > > + * Allow the out of order assembly of auto-discovered regions as > > > + * long as correct decoder programming order can be verified. > > > + * > > > + * Per CXL Spec 3.1 8.2.4.20.12 Committing Decoder Programming, > > > + * software must commit decoders in HPA order. Therefore it is > > > + * sufficient to sanity check that the lowered number decoder > > > + * has the lower HPA starting address. > > > + */ > > > + if (!test_bit(CXL_REGION_F_AUTO, &cxlr_a->flags)) > > > + return false; > > > + > > > + cxld_a = cxl_region_find_decoder(port, cxlr_a); > > > + cxl_rr = cxl_rr_load(port, cxlr_b); > > > + cxld_b = cxl_rr->decoder; > > > + > > > + if (cxld_b->id > cxld_a->id) { > > > + dev_dbg(&cxlr_a->dev, > > > + "allow out of order region ref alloc\n"); > > > + return true; > > > + } > > > + > > > + return false; > > > +} > > > + > > > static struct cxl_region_ref *alloc_region_ref(struct cxl_port *port, > > > struct cxl_region *cxlr) > > > { > > > @@ -767,7 +798,8 @@ static struct cxl_region_ref *alloc_region_ref(struct cxl_port *port, > > > if (!ip->res) > > > continue; > > > > > > - if (ip->res->start > p->res->start) { > > > + if (ip->res->start > p->res->start && > > > + (!auto_order_ok(port, cxlr, iter->region))) { > > > dev_dbg(&cxlr->dev, > > > "%s: HPA order violation %s:%pr vs %pr\n", > > > dev_name(&port->dev), > > > > > > base-commit: 3601311593eb44d34f142b993cb6f38f9a7863b3 > > > -- > > > 2.37.3 > > > >
On Thu, Jan 18, 2024 at 11:46:44AM -0800, Dan Williams wrote: - snip to decoder replay patch - > > So one of the largest roadblocks for creating arbitrary region assembly > scenarios with cxl_test is the inability to save and restore decoder > settings. > > The patch below adds support for recording decoder settings and skipping > the reset of those values when unloading the driver. Then, when the > driver is reloaded, it simulates the case of BIOS created CXL regions > prior to OS boot. > > We can go after even finer grained cases once the uunit effort settles, > but in the meantime cxl_test can add auto-assembly regression tests with > the current topology. > > With the below you can simply trigger "cxl {en,dis}able-memdev" in the > proper order to cause the violation. > > -- >8 -- > From 5839392a3dbb64dbbac0630d930b1f48a8ef7ea6 Mon Sep 17 00:00:00 2001 > From: Dan Williams <dan.j.williams@intel.com> > Date: Wed, 17 Jan 2024 20:56:20 -0800 > Subject: [PATCH] tools/testing/cxl: Add decoder save/restore support > > Record decoder values at init and mock_decoder_commit() time, and > restore them at the next invocation of mock_init_hdm_decoder(). Add 2 > attributes to the cxl_test "cxl_acpi" device to optionally flush the > cache of topology decoder values, or disable updating the decoder at > mock_decoder_reset() time. > > This enables replaying a saved decoder configuration when re-triggering > a topology scan by re-binding the cxl_acpi driver to "cxl_acpi.0" (the > cxl_test emulation of an ACPI0017 instance). Hi Dan, Sorry it's taken a while to come back on this. I did find it useful in testing the auto-assembly order issue as you suggested. I didn't use this one: &dev_attr_decoder_registry_invalidate.attr, I just reloaded the cxl-test module to do same. This I used: &dev_attr_decoder_registry_reset_disable.attr, with your decoders state fixup to set CXL_DECODER_STATE_AUTO, and a work-around to avoid pmem_probe failures on pmem region auto create. More generally, I'm wondering about the implementation of the 'registry_save'. Here it continuously updates during all cxl-test usage. Did you consider only creating the registry upon user request and then at the next mock_init_hmd_decoder() look for and use that registry if it exists. Usage would be: # echo 1 > /sys/bus/platform/devices/cxl_acpi.0/decoder_registry_create # echo cxl_acpi.0 > /sys/bus/platform/drivers/cxl_acpi/unbind # echo cxl_acpi.0 > /sys/bus/platform/drivers/cxl_acpi/bind And then maybe, for folks who like to acpi/unbind,bind, rather than reload module, we could offer a decoder_registry_remove attr. Am I missing something regarding the need to keep it updated on the fly? Alison > > # modprobe cxl_test > # cxl list -RB -b cxl_test -u > { > "bus":"root3", > "provider":"cxl_test", > "regions:root3":[ > { > "region":"region5", > "resource":"0xf010000000", > "size":"512.00 MiB (536.87 MB)", > "type":"ram", > "interleave_ways":2, > "interleave_granularity":4096, > "decode_state":"commit" > } > ] > } > # echo 1 > /sys/bus/platform/devices/cxl_acpi.0/decoder_registry_reset_disable > # echo cxl_acpi.0 > /sys/bus/platform/drivers/cxl_acpi/unbind > # cxl list -RB -b cxl_test -u > # echo cxl_acpi.0 > /sys/bus/platform/drivers/cxl_acpi/bind > # cxl list -RB -b cxl_test -u > { > "bus":"root3", > "provider":"cxl_test", > "regions:root3":[ > { > "region":"region5", > "resource":"0xf010000000", > "size":"512.00 MiB (536.87 MB)", > "type":"ram", > "interleave_ways":2, > "interleave_granularity":4096, > "decode_state":"commit" > } > ] > } > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > --- > tools/testing/cxl/test/cxl.c | 268 +++++++++++++++++++++++++++++++++++ > 1 file changed, 268 insertions(+) > > diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c > index f4e517a0c774..3f333d6a57be 100644 > --- a/tools/testing/cxl/test/cxl.c > +++ b/tools/testing/cxl/test/cxl.c > @@ -44,6 +44,9 @@ struct platform_device *cxl_mem_single[NR_MEM_SINGLE]; > static struct platform_device *cxl_rch[NR_CXL_RCH]; > static struct platform_device *cxl_rcd[NR_CXL_RCH]; > > +static DEFINE_XARRAY(decoder_registry); > +static bool decoder_registry_reset_disable; > + > static inline bool is_multi_bridge(struct device *dev) > { > int i; > @@ -660,6 +663,153 @@ static int map_targets(struct device *dev, void *data) > return 0; > } > > +static unsigned long cxld_registry_index(struct cxl_decoder *cxld) > +{ > + struct cxl_port *port = to_cxl_port(cxld->dev.parent); > + > + /* > + * Upper nibble of a kernel pointer is 0xff, chop that to make > + * space for a cxl_decoder id which should be less than 128 > + * given decoder count is a 4-bit field. > + * > + * While @port is reallocated each enumeration, @port->uport_dev > + * is stable. > + */ > + dev_WARN_ONCE(&port->dev, cxld->id >= 128, > + "decoder id:%d out of range\n", cxld->id); > + return (((unsigned long) port->uport_dev) << 4) | cxld->id; > +} > + > +struct cxl_test_decoder { > + union { > + struct cxl_switch_decoder cxlsd; > + struct cxl_endpoint_decoder cxled; > + }; > + union { > + struct cxl_dport *targets[CXL_DECODER_MAX_INTERLEAVE]; > + struct range dpa_range; > + }; > +}; > + > +static struct cxl_test_decoder *cxld_registry_find(struct cxl_decoder *cxld) > +{ > + return xa_load(&decoder_registry, cxld_registry_index(cxld)); > +} > + > +#define dbg_cxld(port, msg, cxld) \ > + do { \ > + struct cxl_decoder *___d = (cxld); \ > + dev_dbg((port)->uport_dev, \ > + "decoder%d: %s range: %#llx-%#llx iw: %d ig: %d flags: %#lx\n", \ > + ___d->id, msg, ___d->hpa_range.start, \ > + ___d->hpa_range.end + 1, ___d->interleave_ways, \ > + ___d->interleave_granularity, ___d->flags); \ > + } while (0) > + > +static int mock_decoder_commit(struct cxl_decoder *cxld); > +static int mock_decoder_reset(struct cxl_decoder *cxld); > + > +static void cxld_copy(struct cxl_decoder *a, struct cxl_decoder *b) > +{ > + a->id = b->id; > + a->hpa_range = b->hpa_range; > + a->interleave_ways = b->interleave_ways; > + a->interleave_granularity = b->interleave_granularity; > + a->target_type = b->target_type; > + a->flags = b->flags; > + a->commit = mock_decoder_commit; > + a->reset = mock_decoder_reset; > +} > + > +static void cxld_registry_restore(struct cxl_decoder *cxld, struct cxl_test_decoder *td) > +{ > + struct cxl_port *port = to_cxl_port(cxld->dev.parent); > + > + if (is_switch_decoder(&cxld->dev)) { > + struct cxl_switch_decoder *cxlsd = to_cxl_switch_decoder(&cxld->dev); > + > + dbg_cxld(port, "restore", &td->cxlsd.cxld); > + cxld_copy(cxld, &td->cxlsd.cxld); > + WARN_ON(cxlsd->nr_targets != td->cxlsd.nr_targets); > + > + /* convert saved dport devs to dports */ > + for (int i = 0; i < cxlsd->nr_targets; i++) { > + struct cxl_dport *dport; > + > + if (!td->cxlsd.target[i]) > + continue; > + dport = cxl_find_dport_by_dev( > + port, (struct device *)td->cxlsd.target[i]); > + WARN_ON(!dport); > + cxlsd->target[i] = dport; > + } > + } else { > + struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(&cxld->dev); > + > + dbg_cxld(port, "restore", &td->cxled.cxld); > + cxld_copy(cxld, &td->cxled.cxld); > + cxled->state = td->cxled.state; > + cxled->skip = td->cxled.skip; > + if (range_len(&td->dpa_range)) > + devm_cxl_dpa_reserve(cxled, td->dpa_range.start, > + range_len(&td->dpa_range), > + td->cxled.skip); > + if (cxld->flags & CXL_DECODER_F_ENABLE) > + port->commit_end = cxld->id; > + } > +} > + > +static void __cxld_registry_save(struct cxl_test_decoder *td, > + struct cxl_decoder *cxld) > +{ > + if (is_switch_decoder(&cxld->dev)) { > + struct cxl_switch_decoder *cxlsd = to_cxl_switch_decoder(&cxld->dev); > + > + cxld_copy(&td->cxlsd.cxld, cxld); > + td->cxlsd.nr_targets = cxlsd->nr_targets; > + > + /* save dport devs as a stable placeholder for dports */ > + for (int i = 0; i < cxlsd->nr_targets; i++) { > + if (!cxlsd->target[i]) > + continue; > + td->cxlsd.target[i] = > + (struct cxl_dport *)cxlsd->target[i]->dport_dev; > + } > + } else { > + struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(&cxld->dev); > + > + cxld_copy(&td->cxled.cxld, cxld); > + td->cxled.state = cxled->state; > + td->cxled.skip = cxled->skip; > + if (cxled->dpa_res) { > + td->dpa_range.start = cxled->dpa_res->start; > + td->dpa_range.end = cxled->dpa_res->end; > + } else { > + td->dpa_range.start = 0; > + td->dpa_range.end = -1; > + } > + } > +} > + > +static void cxld_registry_save(struct cxl_test_decoder *td, struct cxl_decoder *cxld) > +{ > + struct cxl_port *port = to_cxl_port(cxld->dev.parent); > + > + dbg_cxld(port, "save", cxld); > + __cxld_registry_save(td, cxld); > +} > + > +static void cxld_registry_update(struct cxl_decoder *cxld) > +{ > + struct cxl_port *port = to_cxl_port(cxld->dev.parent); > + struct cxl_test_decoder *td = cxld_registry_find(cxld); > + > + dev_WARN_ONCE(port->uport_dev, !td, "%s failed\n", __func__); > + > + dbg_cxld(port, "update", cxld); > + __cxld_registry_save(td, cxld); > +} > + > static int mock_decoder_commit(struct cxl_decoder *cxld) > { > struct cxl_port *port = to_cxl_port(cxld->dev.parent); > @@ -679,6 +829,7 @@ static int mock_decoder_commit(struct cxl_decoder *cxld) > > port->commit_end++; > cxld->flags |= CXL_DECODER_F_ENABLE; > + cxld_registry_update(cxld); > > return 0; > } > @@ -701,10 +852,43 @@ static int mock_decoder_reset(struct cxl_decoder *cxld) > > port->commit_end--; > cxld->flags &= ~CXL_DECODER_F_ENABLE; > + if (decoder_registry_reset_disable) > + dev_dbg(port->uport_dev, "decoder%d: skip registry update\n", > + cxld->id); > + else > + cxld_registry_update(cxld); > > return 0; > } > > +static void cxld_registry_invalidate(void) > +{ > + unsigned long index; > + void *entry; > + > + xa_for_each(&decoder_registry, index, entry) { > + xa_erase(&decoder_registry, index); > + kfree(entry); > + } > +} > + > +static struct cxl_test_decoder *cxld_registry_new(struct cxl_decoder *cxld) > +{ > + struct cxl_test_decoder *td __free(kfree) = kzalloc(sizeof(*td), GFP_KERNEL); > + > + if (!td) > + return NULL; > + > + if (xa_insert(&decoder_registry, cxld_registry_index(cxld), td, > + GFP_KERNEL)) { > + WARN_ON(1); > + return NULL; > + } > + > + cxld_registry_save(td, cxld); > + return no_free_ptr(td); > +} > + > static void default_mock_decoder(struct cxl_decoder *cxld) > { > cxld->hpa_range = (struct range){ > @@ -717,6 +901,9 @@ static void default_mock_decoder(struct cxl_decoder *cxld) > cxld->target_type = CXL_DECODER_HOSTONLYMEM; > cxld->commit = mock_decoder_commit; > cxld->reset = mock_decoder_reset; > + > + if (!cxld_registry_new(cxld)) > + dev_dbg(&cxld->dev, "failed to add to registry\n"); > } > > static int first_decoder(struct device *dev, void *data) > @@ -738,6 +925,7 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld) > struct cxl_endpoint_decoder *cxled; > struct cxl_switch_decoder *cxlsd; > struct cxl_port *port, *iter; > + struct cxl_test_decoder *td; > const int size = SZ_512M; > struct cxl_memdev *cxlmd; > struct cxl_dport *dport; > @@ -767,6 +955,12 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld) > port = cxled_to_port(cxled); > } > > + td = cxld_registry_find(cxld); > + if (td) { > + cxld_registry_restore(cxld, td); > + return; > + } > + > /* > * The first decoder on the first 2 devices on the first switch > * attached to host-bridge0 mock a fake / static RAM region. All > @@ -795,6 +989,8 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld) > devm_cxl_dpa_reserve(cxled, 0, size / cxld->interleave_ways, 0); > cxld->commit = mock_decoder_commit; > cxld->reset = mock_decoder_reset; > + if (!cxld_registry_new(cxld)) > + dev_dbg(&cxld->dev, "failed to add to registry\n"); > > /* > * Now that endpoint decoder is set up, walk up the hierarchy > @@ -837,6 +1033,7 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld) > .start = base, > .end = base + size - 1, > }; > + cxld_registry_update(cxld); > put_device(dev); > } > } > @@ -1233,6 +1430,73 @@ static void cxl_single_exit(void) > } > } > > +static ssize_t decoder_registry_invalidate_show(struct device *dev, > + struct device_attribute *attr, > + char *buf) > +{ > + unsigned long index; > + bool empty = true; > + void *entry; > + > + xa_for_each(&decoder_registry, index, entry) { > + empty = false; > + break; > + } > + > + return sysfs_emit(buf, "%d\n", !empty); > +} > + > +static ssize_t decoder_registry_invalidate_store(struct device *dev, > + struct device_attribute *attr, > + const char *buf, size_t count) > +{ > + bool invalidate; > + int rc; > + > + rc = kstrtobool(buf, &invalidate); > + if (rc) > + return rc; > + > + guard(device)(dev); > + > + if (dev->driver) > + return -EBUSY; > + > + cxld_registry_invalidate(); > + return count; > +} > + > +static DEVICE_ATTR_RW(decoder_registry_invalidate); > + > +static ssize_t > +decoder_registry_reset_disable_show(struct device *dev, > + struct device_attribute *attr, char *buf) > +{ > + return sysfs_emit(buf, "%d\n", decoder_registry_reset_disable); > +} > + > +static ssize_t > +decoder_registry_reset_disable_store(struct device *dev, > + struct device_attribute *attr, > + const char *buf, size_t count) > +{ > + int rc; > + > + rc = kstrtobool(buf, &decoder_registry_reset_disable); > + if (rc) > + return rc; > + return count; > +} > + > +static DEVICE_ATTR_RW(decoder_registry_reset_disable); > + > +static struct attribute *cxl_acpi_attrs[] = { > + &dev_attr_decoder_registry_invalidate.attr, > + &dev_attr_decoder_registry_reset_disable.attr, > + NULL > +}; > +ATTRIBUTE_GROUPS(cxl_acpi); > + > static __init int cxl_test_init(void) > { > int rc, i; > @@ -1377,6 +1641,7 @@ static __init int cxl_test_init(void) > > mock_companion(&acpi0017_mock, &cxl_acpi->dev); > acpi0017_mock.dev.bus = &platform_bus_type; > + cxl_acpi->dev.groups = cxl_acpi_groups; > > rc = platform_device_add(cxl_acpi); > if (rc) > @@ -1446,6 +1711,9 @@ static __exit void cxl_test_exit(void) > depopulate_all_mock_resources(); > gen_pool_destroy(cxl_mock_pool); > unregister_cxl_mock_ops(&cxl_mock_ops); > + > + cxld_registry_invalidate(); > + xa_destroy(&decoder_registry); > } > > module_param(interleave_arithmetic, int, 0444); > -- > 2.43.0
Alison Schofield wrote: > > On Thu, Jan 18, 2024 at 11:46:44AM -0800, Dan Williams wrote: > > - snip to decoder replay patch - > > > > > So one of the largest roadblocks for creating arbitrary region assembly > > scenarios with cxl_test is the inability to save and restore decoder > > settings. > > > > The patch below adds support for recording decoder settings and skipping > > the reset of those values when unloading the driver. Then, when the > > driver is reloaded, it simulates the case of BIOS created CXL regions > > prior to OS boot. > > > > We can go after even finer grained cases once the uunit effort settles, > > but in the meantime cxl_test can add auto-assembly regression tests with > > the current topology. > > > > With the below you can simply trigger "cxl {en,dis}able-memdev" in the > > proper order to cause the violation. > > > > -- >8 -- > > From 5839392a3dbb64dbbac0630d930b1f48a8ef7ea6 Mon Sep 17 00:00:00 2001 > > From: Dan Williams <dan.j.williams@intel.com> > > Date: Wed, 17 Jan 2024 20:56:20 -0800 > > Subject: [PATCH] tools/testing/cxl: Add decoder save/restore support > > > > Record decoder values at init and mock_decoder_commit() time, and > > restore them at the next invocation of mock_init_hdm_decoder(). Add 2 > > attributes to the cxl_test "cxl_acpi" device to optionally flush the > > cache of topology decoder values, or disable updating the decoder at > > mock_decoder_reset() time. > > > > This enables replaying a saved decoder configuration when re-triggering > > a topology scan by re-binding the cxl_acpi driver to "cxl_acpi.0" (the > > cxl_test emulation of an ACPI0017 instance). > > Hi Dan, > > Sorry it's taken a while to come back on this. I did find it useful > in testing the auto-assembly order issue as you suggested. > > I didn't use this one: &dev_attr_decoder_registry_invalidate.attr, > I just reloaded the cxl-test module to do same. Makes sense, that can go. > This I used: &dev_attr_decoder_registry_reset_disable.attr, > with your decoders state fixup to set CXL_DECODER_STATE_AUTO, > and a work-around to avoid pmem_probe failures on pmem region > auto create. > > More generally, I'm wondering about the implementation of the > 'registry_save'. Here it continuously updates during all > cxl-test usage. Did you consider only creating the registry upon > user request and then at the next mock_init_hmd_decoder() look > for and use that registry if it exists. > > Usage would be: > # echo 1 > /sys/bus/platform/devices/cxl_acpi.0/decoder_registry_create You mean wait to snapshot decoder state when triggered? > # echo cxl_acpi.0 > /sys/bus/platform/drivers/cxl_acpi/unbind > # echo cxl_acpi.0 > /sys/bus/platform/drivers/cxl_acpi/bind > > And then maybe, for folks who like to acpi/unbind,bind, rather > than reload module, we could offer a decoder_registry_remove attr. > > Am I missing something regarding the need to keep it updated > on the fly? I can see it being an alternate way, but not sure how to weigh one approach vs the other. Is the dynamic update getting in the way of a test case you are thinking about? Otherwise it seemed easy to reason that the registry is always on, but only takes effect when registry_reset_disable is set, and cxl_acpi rebinds the test device.
On Thu, Feb 08, 2024 at 02:57:44PM -0800, Dan Williams wrote: > Alison Schofield wrote: > > > > On Thu, Jan 18, 2024 at 11:46:44AM -0800, Dan Williams wrote: > > > > - snip to decoder replay patch - > > > > > > > > So one of the largest roadblocks for creating arbitrary region assembly > > > scenarios with cxl_test is the inability to save and restore decoder > > > settings. > > > > > > The patch below adds support for recording decoder settings and skipping > > > the reset of those values when unloading the driver. Then, when the > > > driver is reloaded, it simulates the case of BIOS created CXL regions > > > prior to OS boot. > > > > > > We can go after even finer grained cases once the uunit effort settles, > > > but in the meantime cxl_test can add auto-assembly regression tests with > > > the current topology. > > > > > > With the below you can simply trigger "cxl {en,dis}able-memdev" in the > > > proper order to cause the violation. > > > > > > -- >8 -- > > > From 5839392a3dbb64dbbac0630d930b1f48a8ef7ea6 Mon Sep 17 00:00:00 2001 > > > From: Dan Williams <dan.j.williams@intel.com> > > > Date: Wed, 17 Jan 2024 20:56:20 -0800 > > > Subject: [PATCH] tools/testing/cxl: Add decoder save/restore support > > > > > > Record decoder values at init and mock_decoder_commit() time, and > > > restore them at the next invocation of mock_init_hdm_decoder(). Add 2 > > > attributes to the cxl_test "cxl_acpi" device to optionally flush the > > > cache of topology decoder values, or disable updating the decoder at > > > mock_decoder_reset() time. > > > > > > This enables replaying a saved decoder configuration when re-triggering > > > a topology scan by re-binding the cxl_acpi driver to "cxl_acpi.0" (the > > > cxl_test emulation of an ACPI0017 instance). > > > > Hi Dan, > > > > Sorry it's taken a while to come back on this. I did find it useful > > in testing the auto-assembly order issue as you suggested. > > > > I didn't use this one: &dev_attr_decoder_registry_invalidate.attr, > > I just reloaded the cxl-test module to do same. > > Makes sense, that can go. > > > This I used: &dev_attr_decoder_registry_reset_disable.attr, > > with your decoders state fixup to set CXL_DECODER_STATE_AUTO, > > and a work-around to avoid pmem_probe failures on pmem region > > auto create. > > > > More generally, I'm wondering about the implementation of the > > 'registry_save'. Here it continuously updates during all > > cxl-test usage. Did you consider only creating the registry upon > > user request and then at the next mock_init_hmd_decoder() look > > for and use that registry if it exists. > > > > Usage would be: > > # echo 1 > /sys/bus/platform/devices/cxl_acpi.0/decoder_registry_create > > You mean wait to snapshot decoder state when triggered? Yes. > > > # echo cxl_acpi.0 > /sys/bus/platform/drivers/cxl_acpi/unbind > > # echo cxl_acpi.0 > /sys/bus/platform/drivers/cxl_acpi/bind > > > > And then maybe, for folks who like to acpi/unbind,bind, rather > > than reload module, we could offer a decoder_registry_remove attr. > > > > Am I missing something regarding the need to keep it updated > > on the fly? > > I can see it being an alternate way, but not sure how to weigh one > approach vs the other. Is the dynamic update getting in the way of a > test case you are thinking about? Otherwise it seemed easy to reason > that the registry is always on, but only takes effect when > registry_reset_disable is set, and cxl_acpi rebinds the test device. The constant work of updating the registry caught my attention, but no impact that I know of. The dynamic approach is more intrusive and impacts the normal path needlessly. A snapshot approach limits much of the impact to users of the new feature.
Alison Schofield wrote: > > I can see it being an alternate way, but not sure how to weigh one > > approach vs the other. Is the dynamic update getting in the way of a > > test case you are thinking about? Otherwise it seemed easy to reason > > that the registry is always on, but only takes effect when > > registry_reset_disable is set, and cxl_acpi rebinds the test device. > > The constant work of updating the registry caught my attention, but > no impact that I know of. The dynamic approach is more intrusive and > impacts the normal path needlessly. A snapshot approach limits much > of the impact to users of the new feature. > It does mean more infrastructure to walk the entire topology and save off all of the decoders. However, this is straightforward because decoders are devices on the CXL bus. I am not worried about the intrusiveness as much as how easy it is to conceptualize new tests. I do think it is easier to conceptualize that the flow is "get the decoders the way you want them, snapshot them, ..., rebind", ...rather than the current: "get the decoders the way you want them, disable-reset, rebind" Notice the "..." in the first flow where it is a bit more forgiving if you do some decoder operations between the snapshot step and the rebind step. Where the latter flow requires that you know the side effects of how rebind tries to reset the decoder state. It still ends up with the all same complexity on the read side, but should be easier to craft new static configurations without needing to worry about the write side of the registry.
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 0f05692bfec3..8770ebcae05d 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -753,6 +753,37 @@ static struct cxl_decoder *cxl_region_find_decoder(struct cxl_port *port, return to_cxl_decoder(dev); } +bool auto_order_ok(struct cxl_port *port, struct cxl_region *cxlr_a, + struct cxl_region *cxlr_b) +{ + struct cxl_region_ref *cxl_rr; + struct cxl_decoder *cxld_a, *cxld_b; + + /* + * Allow the out of order assembly of auto-discovered regions as + * long as correct decoder programming order can be verified. + * + * Per CXL Spec 3.1 8.2.4.20.12 Committing Decoder Programming, + * software must commit decoders in HPA order. Therefore it is + * sufficient to sanity check that the lowered number decoder + * has the lower HPA starting address. + */ + if (!test_bit(CXL_REGION_F_AUTO, &cxlr_a->flags)) + return false; + + cxld_a = cxl_region_find_decoder(port, cxlr_a); + cxl_rr = cxl_rr_load(port, cxlr_b); + cxld_b = cxl_rr->decoder; + + if (cxld_b->id > cxld_a->id) { + dev_dbg(&cxlr_a->dev, + "allow out of order region ref alloc\n"); + return true; + } + + return false; +} + static struct cxl_region_ref *alloc_region_ref(struct cxl_port *port, struct cxl_region *cxlr) { @@ -767,7 +798,8 @@ static struct cxl_region_ref *alloc_region_ref(struct cxl_port *port, if (!ip->res) continue; - if (ip->res->start > p->res->start) { + if (ip->res->start > p->res->start && + (!auto_order_ok(port, cxlr, iter->region))) { dev_dbg(&cxlr->dev, "%s: HPA order violation %s:%pr vs %pr\n", dev_name(&port->dev),