Message ID | 172964779333.81806.8852577918216421011.stgit@dwillia2-xfh.jf.intel.com |
---|---|
Headers | show |
Series | cxl: Initialization and shutdown fixes | expand |
On 22.10.24 18:43:15, Dan Williams wrote: > Changes since v1 [1]: > - Fix some misspellings missed by checkpatch in changelogs (Jonathan) > - Add comments explaining the order of objects in drivers/cxl/Makefile > (Jonathan) > - Rename attach_device => cxl_rescan_attach (Jonathan) > - Fixup Zijun's email (Zijun) > > [1]: http://lore.kernel.org/172862483180.2150669.5564474284074502692.stgit@dwillia2-xfh.jf.intel.com > > --- > > Original cover: > > Gregory's modest proposal to fix CXL cxl_mem_probe() failures due to > delayed arrival of the CXL "root" infrastructure [1] prompted questions > of how the existing mechanism for retrying cxl_mem_probe() could be > failing. I found a similar issue with the region creation. A region is created with the first endpoint found and immediately added as device which triggers cxl_region_probe(). Now, in interleaving setups the region state comes into commit state only after the last endpoint was probed. So the probe must be repeated until all endpoints were enumerated. I ended up with this change: diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index a07b62254596..c78704e435e5 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -3775,8 +3775,8 @@ static int cxl_region_probe(struct device *dev) } if (p->state < CXL_CONFIG_COMMIT) { - dev_dbg(&cxlr->dev, "config state: %d\n", p->state); - rc = -ENXIO; + rc = dev_err_probe(&cxlr->dev, -EPROBE_DEFER, + "region config state: %d\n", p->state); goto out; }
On Wed, Oct 23, 2024 at 03:12:07PM +0200, Robert Richter wrote: > On 22.10.24 18:43:15, Dan Williams wrote: > > Changes since v1 [1]: > > - Fix some misspellings missed by checkpatch in changelogs (Jonathan) > > - Add comments explaining the order of objects in drivers/cxl/Makefile > > (Jonathan) > > - Rename attach_device => cxl_rescan_attach (Jonathan) > > - Fixup Zijun's email (Zijun) > > > > [1]: http://lore.kernel.org/172862483180.2150669.5564474284074502692.stgit@dwillia2-xfh.jf.intel.com > > > > --- > > > > Original cover: > > > > Gregory's modest proposal to fix CXL cxl_mem_probe() failures due to > > delayed arrival of the CXL "root" infrastructure [1] prompted questions > > of how the existing mechanism for retrying cxl_mem_probe() could be > > failing. > > I found a similar issue with the region creation. > > A region is created with the first endpoint found and immediately > added as device which triggers cxl_region_probe(). Now, in > interleaving setups the region state comes into commit state only > after the last endpoint was probed. So the probe must be repeated > until all endpoints were enumerated. I ended up with this change: > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > index a07b62254596..c78704e435e5 100644 > --- a/drivers/cxl/core/region.c > +++ b/drivers/cxl/core/region.c > @@ -3775,8 +3775,8 @@ static int cxl_region_probe(struct device *dev) > } > > if (p->state < CXL_CONFIG_COMMIT) { > - dev_dbg(&cxlr->dev, "config state: %d\n", p->state); > - rc = -ENXIO; > + rc = dev_err_probe(&cxlr->dev, -EPROBE_DEFER, > + "region config state: %d\n", p->state); > goto out; > } > I have not experienced any out of order operations since applying Dan's v1 of this patch set, do you still see this after applying the existing set? Probably this is indicative of needing another SOFTDEP / ordering issue. ~Gregory > -- > 2.39.5 > > I don't see an init order issue here as the mem module is always up > before the regions are probed. > > -Robert > > > > > The critical missing piece in the debug was that Gregory's setup had > > almost all CXL modules built-in to the kernel. > > > > On the way to that discovery several other bugs and init-order corner > > cases were discovered. > > > > The main fix is to make sure the drivers/cxl/Makefile object order > > supports root CXL ports being fully initialized upon cxl_acpi_probe() > > exit. The modular case has some similar potential holes that are fixed > > with MODULE_SOFTDEP() and other fix ups. Finally, an attempt to update > > cxl_test to reproduce the original report resulted in the discovery of a > > separate long standing use after free bug in cxl_region_detach(). > > > > [2]: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net > > > > --- > > > > Dan Williams (6): > > cxl/port: Fix CXL port initialization order when the subsystem is built-in > > cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices() > > cxl/acpi: Ensure ports ready at cxl_acpi_probe() return > > cxl/port: Fix use-after-free, permit out-of-order decoder shutdown > > cxl/port: Prevent out-of-order decoder allocation > > cxl/test: Improve init-order fidelity relative to real-world systems > > > > > > drivers/base/core.c | 35 +++++++ > > drivers/cxl/Kconfig | 1 > > drivers/cxl/Makefile | 20 +++- > > drivers/cxl/acpi.c | 7 + > > drivers/cxl/core/hdm.c | 50 +++++++++-- > > drivers/cxl/core/port.c | 13 ++- > > drivers/cxl/core/region.c | 91 ++++++++++--------- > > drivers/cxl/cxl.h | 3 - > > include/linux/device.h | 3 + > > tools/testing/cxl/test/cxl.c | 200 +++++++++++++++++++++++------------------- > > tools/testing/cxl/test/mem.c | 1 > > 11 files changed, 269 insertions(+), 155 deletions(-) > > > > base-commit: 8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b
Robert Richter wrote: > On 22.10.24 18:43:15, Dan Williams wrote: > > Changes since v1 [1]: > > - Fix some misspellings missed by checkpatch in changelogs (Jonathan) > > - Add comments explaining the order of objects in drivers/cxl/Makefile > > (Jonathan) > > - Rename attach_device => cxl_rescan_attach (Jonathan) > > - Fixup Zijun's email (Zijun) > > > > [1]: http://lore.kernel.org/172862483180.2150669.5564474284074502692.stgit@dwillia2-xfh.jf.intel.com > > > > --- > > > > Original cover: > > > > Gregory's modest proposal to fix CXL cxl_mem_probe() failures due to > > delayed arrival of the CXL "root" infrastructure [1] prompted questions > > of how the existing mechanism for retrying cxl_mem_probe() could be > > failing. > > I found a similar issue with the region creation. > > A region is created with the first endpoint found and immediately > added as device which triggers cxl_region_probe(). Now, in > interleaving setups the region state comes into commit state only > after the last endpoint was probed. So the probe must be repeated > until all endpoints were enumerated. I ended up with this change: > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > index a07b62254596..c78704e435e5 100644 > --- a/drivers/cxl/core/region.c > +++ b/drivers/cxl/core/region.c > @@ -3775,8 +3775,8 @@ static int cxl_region_probe(struct device *dev) > } > > if (p->state < CXL_CONFIG_COMMIT) { > - dev_dbg(&cxlr->dev, "config state: %d\n", p->state); > - rc = -ENXIO; > + rc = dev_err_probe(&cxlr->dev, -EPROBE_DEFER, > + "region config state: %d\n", p->state); I would argue EPROBE_DEFER is not appropriate because there is no guarantee that the other members of the region show up, and if they do they will re-trigger probe. So "probe must be repeated until all endpoints were enumerated" is the case either way. I.e. either more endpoint arrival triggers re-probe or EPROBE_DEFER triggers extra redundant probing *and* still results in a probe attempts as endpoints arrive. So a dev_dbg() plus -ENXIO return on uncommited region state is expected. > goto out; > } > > -- > 2.39.5 > > I don't see an init order issue here as the mem module is always up > before the regions are probed. Right, cxl_endpoint_port_probe() triggers region discovery and cxl_endpoint_port_probe() currently only triggers after cxl_mem has registered an endpoint port. The failure this set is address is unwanted cxl_mem_probe() failures.
On 23.10.24 13:34:36, Dan Williams wrote: > Robert Richter wrote: > > if (p->state < CXL_CONFIG_COMMIT) { > > - dev_dbg(&cxlr->dev, "config state: %d\n", p->state); > > - rc = -ENXIO; > > + rc = dev_err_probe(&cxlr->dev, -EPROBE_DEFER, > > + "region config state: %d\n", p->state); > > I would argue EPROBE_DEFER is not appropriate because there is no > guarantee that the other members of the region show up, and if they do > they will re-trigger probe. So "probe must be repeated until all > endpoints were enumerated" is the case either way. I.e. either more > endpoint arrival triggers re-probe or EPROBE_DEFER triggers extra > redundant probing *and* still results in a probe attempts as endpoints > arrive. > > So a dev_dbg() plus -ENXIO return on uncommited region state is > expected. So, the region device keeps failing a probe until all endpoints are collected. This triggered by cxl_add_to_region() after the region went into CXL_CONFIG_COMMIT state. Looks reasonable. The setup I was using showed various probe failures so I 'fixed' this issue without noticing the region device was reprobed later successfully. Thanks for explaining. -Robert