Message ID | 168213190748.708404.16215095414060364800.stgit@dwillia2-xfh.jf.intel.com |
---|---|
State | Accepted |
Commit | 267214a2319b5692bbc9b128a6514960291dcca8 |
Headers | show |
Series | cxl/port: Fix port to pci device assumptions in read_cdat_data() | expand |
On Fri, Apr 21, 2023 at 07:51:47PM -0700, Dan Williams wrote: > Not all CXL ports are associated with PCI devices. Host-bridge and > cxl_test ports are hosted by platform devices. Teach read_cdat_data() to > be careful about non-pci hosted cxl_memdev instances. Otherwise, > cxl_test crashes with this signature: > > RIP: 0010:xas_start+0x6d/0x290 > [..] > Call Trace: > <TASK> > xas_load+0xa/0x50 > xas_find+0x25b/0x2f0 > xa_find+0x118/0x1d0 > pci_find_doe_mailbox+0x51/0xc0 > read_cdat_data+0x45/0x190 [cxl_core] > cxl_port_probe+0x10a/0x1e0 [cxl_port] > cxl_bus_probe+0x17/0x50 [cxl_core] > > Some other cleanups are included like removing the single-use @uport > variable, and removing the indirection through 'struct cxl_dev_state' to > lookup the device that registered the memdev and may be a pci device. > > Fixes: af0a6c3587dc ("cxl/pci: Use CDAT DOE mailbox created by PCI core") > Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Take my Reviewed-by with a grain of salt as I'm absolutely not an expert on the cxl struct hierarchy, nevertheless this looks sane to me. I note however that before af0a6c3587dc, xa_for_each() was run on an xarray which was not initialized with xa_init() on non-pci cxl ports. (xa_init() was run from cxl_pci_probe() -> devm_cxl_pci_create_doe() but xa_for_each() was run from read_cdat_data() -> find_cdat_doe() for non-pci cxl ports as well.) Hence can't this crash prior to af0a6c3587dc as well? If it can, the Fixes tag would rather have to point to c97006046c79 ("cxl/port: Read CDAT table"), though this patch wouldn't apply cleanly to pre-6.4 kernels. c97006046c79 went into v6.0 and there's one stable kernel between it and v6.4 (for which af0a6c3587dc is queued). So if the missing xa_init() can indeed cause crashes, you may want to base your fix on v6.3 and fold it in at the front of cxl/next, then rebase af0a6c3587dc on top of it. Let me know if you need help or want me to look into it. Thanks for fixing this and sorry for not spotting it myself! Lukas > --- > drivers/cxl/core/pci.c | 13 +++++++------ > 1 file changed, 7 insertions(+), 6 deletions(-) > > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c > index 25b7e8125d5d..bdbd907884ce 100644 > --- a/drivers/cxl/core/pci.c > +++ b/drivers/cxl/core/pci.c > @@ -536,17 +536,18 @@ static int cxl_cdat_read_table(struct device *dev, > */ > void read_cdat_data(struct cxl_port *port) > { > - struct pci_doe_mb *cdat_doe; > + struct cxl_memdev *cxlmd = to_cxl_memdev(port->uport); > + struct device *host = cxlmd->dev.parent; > struct device *dev = &port->dev; > - struct device *uport = port->uport; > - struct cxl_memdev *cxlmd = to_cxl_memdev(uport); > - struct cxl_dev_state *cxlds = cxlmd->cxlds; > - struct pci_dev *pdev = to_pci_dev(cxlds->dev); > + struct pci_doe_mb *cdat_doe; > size_t cdat_length; > void *cdat_table; > int rc; > > - cdat_doe = pci_find_doe_mailbox(pdev, PCI_DVSEC_VENDOR_ID_CXL, > + if (!dev_is_pci(host)) > + return; > + cdat_doe = pci_find_doe_mailbox(to_pci_dev(host), > + PCI_DVSEC_VENDOR_ID_CXL, > CXL_DOE_PROTOCOL_TABLE_ACCESS); > if (!cdat_doe) { > dev_dbg(dev, "No CDAT mailbox\n"); >
On Sat, Apr 22, 2023 at 10:35:02AM +0200, Lukas Wunner wrote: > I note however that before af0a6c3587dc, xa_for_each() was run on an > xarray which was not initialized with xa_init() on non-pci cxl ports. > (xa_init() was run from cxl_pci_probe() -> devm_cxl_pci_create_doe() > but xa_for_each() was run from read_cdat_data() -> find_cdat_doe() > for non-pci cxl ports as well.) > > Hence can't this crash prior to af0a6c3587dc as well? After taking another look with a fresh pair of eyeballs I think you may see splats with CONFIG_DEBUG_LOCK_ALLOC=y (even prior to af0a6c3587dc on non-pci cxl devices) due to the missing xa_init(). struct cxl_dev_state is kzalloc'ed and with CONFIG_DEBUG_LOCK_ALLOC=n, the members of struct xarray all seem to be just zeroed by xa_init(). So no harm no foul. But that's not the case with CONFIG_DEBUG_LOCK_ALLOC=y because spinlock_t contains non-zero data. Thanks, Lukas
Lukas Wunner wrote: > On Sat, Apr 22, 2023 at 10:35:02AM +0200, Lukas Wunner wrote: > > I note however that before af0a6c3587dc, xa_for_each() was run on an > > xarray which was not initialized with xa_init() on non-pci cxl ports. > > (xa_init() was run from cxl_pci_probe() -> devm_cxl_pci_create_doe() > > but xa_for_each() was run from read_cdat_data() -> find_cdat_doe() > > for non-pci cxl ports as well.) > > > > Hence can't this crash prior to af0a6c3587dc as well? > > After taking another look with a fresh pair of eyeballs I think > you may see splats with CONFIG_DEBUG_LOCK_ALLOC=y (even prior to > af0a6c3587dc on non-pci cxl devices) due to the missing xa_init(). The xa_init() was included before in the removed call to devm_cxl_pci_create_doe(). --- -static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds) -{ - struct device *dev = cxlds->dev; - struct pci_dev *pdev = to_pci_dev(dev); - u16 off = 0; - - xa_init(&cxlds->doe_mbs); ---
Lukas Wunner wrote: > On Fri, Apr 21, 2023 at 07:51:47PM -0700, Dan Williams wrote: > > Not all CXL ports are associated with PCI devices. Host-bridge and > > cxl_test ports are hosted by platform devices. Teach read_cdat_data() to > > be careful about non-pci hosted cxl_memdev instances. Otherwise, > > cxl_test crashes with this signature: > > > > RIP: 0010:xas_start+0x6d/0x290 > > [..] > > Call Trace: > > <TASK> > > xas_load+0xa/0x50 > > xas_find+0x25b/0x2f0 > > xa_find+0x118/0x1d0 > > pci_find_doe_mailbox+0x51/0xc0 > > read_cdat_data+0x45/0x190 [cxl_core] > > cxl_port_probe+0x10a/0x1e0 [cxl_port] > > cxl_bus_probe+0x17/0x50 [cxl_core] > > > > Some other cleanups are included like removing the single-use @uport > > variable, and removing the indirection through 'struct cxl_dev_state' to > > lookup the device that registered the memdev and may be a pci device. > > > > Fixes: af0a6c3587dc ("cxl/pci: Use CDAT DOE mailbox created by PCI core") > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > > Reviewed-by: Lukas Wunner <lukas@wunner.de> > > Take my Reviewed-by with a grain of salt as I'm absolutely not an > expert on the cxl struct hierarchy, nevertheless this looks sane to me. Yeah, I think we need a data structure relationship diagram to explain endpoint devices, memory devices, endpoint ports, and switch ports all interconnect. > I note however that before af0a6c3587dc, xa_for_each() was run on an > xarray which was not initialized with xa_init() on non-pci cxl ports. > (xa_init() was run from cxl_pci_probe() -> devm_cxl_pci_create_doe() > but xa_for_each() was run from read_cdat_data() -> find_cdat_doe() > for non-pci cxl ports as well.) > > Hence can't this crash prior to af0a6c3587dc as well? If it can, > the Fixes tag would rather have to point to c97006046c79 ("cxl/port: > Read CDAT table"), though this patch wouldn't apply cleanly to > pre-6.4 kernels. c97006046c79 went into v6.0 and there's one stable > kernel between it and v6.4 (for which af0a6c3587dc is queued). > So if the missing xa_init() can indeed cause crashes, you may want to > base your fix on v6.3 and fold it in at the front of cxl/next, > then rebase af0a6c3587dc on top of it. Let me know if you need help > or want me to look into it. As I replied on the other note, I think older kernels are ok. > Thanks for fixing this and sorry for not spotting it myself! No worries, I had missed that the unit test were not run yet.
On Sat, Apr 22, 2023 at 01:54:00PM -0700, Dan Williams wrote: > Lukas Wunner wrote: > > On Sat, Apr 22, 2023 at 10:35:02AM +0200, Lukas Wunner wrote: > > > I note however that before af0a6c3587dc, xa_for_each() was run on an > > > xarray which was not initialized with xa_init() on non-pci cxl ports. > > > (xa_init() was run from cxl_pci_probe() -> devm_cxl_pci_create_doe() > > > but xa_for_each() was run from read_cdat_data() -> find_cdat_doe() > > > for non-pci cxl ports as well.) > > > > > > Hence can't this crash prior to af0a6c3587dc as well? > > > > After taking another look with a fresh pair of eyeballs I think > > you may see splats with CONFIG_DEBUG_LOCK_ALLOC=y (even prior to > > af0a6c3587dc on non-pci cxl devices) due to the missing xa_init(). > > The xa_init() was included before in the removed call to > devm_cxl_pci_create_doe(). ... which was called from cxl_pci_probe(), so only if a pci_dev exists? I guess I must be missing something... Thanks, Lukas
Lukas Wunner wrote: > On Sat, Apr 22, 2023 at 01:54:00PM -0700, Dan Williams wrote: > > Lukas Wunner wrote: > > > On Sat, Apr 22, 2023 at 10:35:02AM +0200, Lukas Wunner wrote: > > > > I note however that before af0a6c3587dc, xa_for_each() was run on an > > > > xarray which was not initialized with xa_init() on non-pci cxl ports. > > > > (xa_init() was run from cxl_pci_probe() -> devm_cxl_pci_create_doe() > > > > but xa_for_each() was run from read_cdat_data() -> find_cdat_doe() > > > > for non-pci cxl ports as well.) > > > > > > > > Hence can't this crash prior to af0a6c3587dc as well? > > > > > > After taking another look with a fresh pair of eyeballs I think > > > you may see splats with CONFIG_DEBUG_LOCK_ALLOC=y (even prior to > > > af0a6c3587dc on non-pci cxl devices) due to the missing xa_init(). > > > > The xa_init() was included before in the removed call to > > devm_cxl_pci_create_doe(). > > ... which was called from cxl_pci_probe(), so only if a pci_dev exists? > > I guess I must be missing something... Oh, no you're right in the sense that the code will use an only-zero-initialized xarray before the move to the PCI core, but a zero initialized xarray is sufficient. Recall that xarray walks are only RCU protected, so the xarray spinlock is not used for xa_for_each(). The change that makes this a failure is now the doe_mbs xarray is not even allocated in the !pci_dev case.
On Sat, Apr 22, 2023 at 04:22:18PM -0700, Dan Williams wrote: > Lukas Wunner wrote: > > On Sat, Apr 22, 2023 at 01:54:00PM -0700, Dan Williams wrote: > > > Lukas Wunner wrote: > > > > On Sat, Apr 22, 2023 at 10:35:02AM +0200, Lukas Wunner wrote: > > > > > I note however that before af0a6c3587dc, xa_for_each() was run on an > > > > > xarray which was not initialized with xa_init() on non-pci cxl ports. > > > > > (xa_init() was run from cxl_pci_probe() -> devm_cxl_pci_create_doe() > > > > > but xa_for_each() was run from read_cdat_data() -> find_cdat_doe() > > > > > for non-pci cxl ports as well.) > > > > > > > > > > Hence can't this crash prior to af0a6c3587dc as well? > > > > > > > > After taking another look with a fresh pair of eyeballs I think > > > > you may see splats with CONFIG_DEBUG_LOCK_ALLOC=y (even prior to > > > > af0a6c3587dc on non-pci cxl devices) due to the missing xa_init(). > > > > > > The xa_init() was included before in the removed call to > > > devm_cxl_pci_create_doe(). > > > > ... which was called from cxl_pci_probe(), so only if a pci_dev exists? > > > > I guess I must be missing something... > > Oh, no you're right in the sense that the code will use an > only-zero-initialized xarray before the move to the PCI core, but a zero > initialized xarray is sufficient. Recall that xarray walks are only RCU > protected, so the xarray spinlock is not used for xa_for_each(). > > The change that makes this a failure is now the doe_mbs xarray is not > even allocated in the !pci_dev case. Makes sense, thanks for the explanation.
On Sat, 22 Apr 2023 13:56:20 -0700 Dan Williams <dan.j.williams@intel.com> wrote: > Lukas Wunner wrote: > > On Fri, Apr 21, 2023 at 07:51:47PM -0700, Dan Williams wrote: > > > Not all CXL ports are associated with PCI devices. Host-bridge and > > > cxl_test ports are hosted by platform devices. Teach read_cdat_data() to > > > be careful about non-pci hosted cxl_memdev instances. Otherwise, > > > cxl_test crashes with this signature: > > > > > > RIP: 0010:xas_start+0x6d/0x290 > > > [..] > > > Call Trace: > > > <TASK> > > > xas_load+0xa/0x50 > > > xas_find+0x25b/0x2f0 > > > xa_find+0x118/0x1d0 > > > pci_find_doe_mailbox+0x51/0xc0 > > > read_cdat_data+0x45/0x190 [cxl_core] > > > cxl_port_probe+0x10a/0x1e0 [cxl_port] > > > cxl_bus_probe+0x17/0x50 [cxl_core] > > > > > > Some other cleanups are included like removing the single-use @uport > > > variable, and removing the indirection through 'struct cxl_dev_state' to > > > lookup the device that registered the memdev and may be a pci device. > > > > > > Fixes: af0a6c3587dc ("cxl/pci: Use CDAT DOE mailbox created by PCI core") > > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > > > > Reviewed-by: Lukas Wunner <lukas@wunner.de> > > > > Take my Reviewed-by with a grain of salt as I'm absolutely not an > > expert on the cxl struct hierarchy, nevertheless this looks sane to me. > > Yeah, I think we need a data structure relationship diagram to explain > endpoint devices, memory devices, endpoint ports, and switch ports all > interconnect. Agreed. I keep forgetting how this all works as well and end up scattering prints everywhere to find out. A diagram to refer to would be very useful. Jonathan > > > I note however that before af0a6c3587dc, xa_for_each() was run on an > > xarray which was not initialized with xa_init() on non-pci cxl ports. > > (xa_init() was run from cxl_pci_probe() -> devm_cxl_pci_create_doe() > > but xa_for_each() was run from read_cdat_data() -> find_cdat_doe() > > for non-pci cxl ports as well.) > > > > Hence can't this crash prior to af0a6c3587dc as well? If it can, > > the Fixes tag would rather have to point to c97006046c79 ("cxl/port: > > Read CDAT table"), though this patch wouldn't apply cleanly to > > pre-6.4 kernels. c97006046c79 went into v6.0 and there's one stable > > kernel between it and v6.4 (for which af0a6c3587dc is queued). > > So if the missing xa_init() can indeed cause crashes, you may want to > > base your fix on v6.3 and fold it in at the front of cxl/next, > > then rebase af0a6c3587dc on top of it. Let me know if you need help > > or want me to look into it. > > As I replied on the other note, I think older kernels are ok. > > > Thanks for fixing this and sorry for not spotting it myself! > > No worries, I had missed that the unit test were not run yet.
On Fri, 21 Apr 2023 19:51:47 -0700 Dan Williams <dan.j.williams@intel.com> wrote: > Not all CXL ports are associated with PCI devices. Host-bridge and > cxl_test ports are hosted by platform devices. Teach read_cdat_data() to > be careful about non-pci hosted cxl_memdev instances. Otherwise, > cxl_test crashes with this signature: This is crossing two paths. One memdevs that aren't hosted by PCI devices, two the other port cases. Right now this code assumes a memdev, so calling out the other case is problematic. I'd just drop the mention of host bridges or mention that we only call this for memdev ports today. > > RIP: 0010:xas_start+0x6d/0x290 > [..] > Call Trace: > <TASK> > xas_load+0xa/0x50 > xas_find+0x25b/0x2f0 > xa_find+0x118/0x1d0 > pci_find_doe_mailbox+0x51/0xc0 > read_cdat_data+0x45/0x190 [cxl_core] > cxl_port_probe+0x10a/0x1e0 [cxl_port] > cxl_bus_probe+0x17/0x50 [cxl_core] > > Some other cleanups are included like removing the single-use @uport > variable, and removing the indirection through 'struct cxl_dev_state' to > lookup the device that registered the memdev and may be a pci device. > > Fixes: af0a6c3587dc ("cxl/pci: Use CDAT DOE mailbox created by PCI core") > Signed-off-by: Dan Williams <dan.j.williams@intel.com> Makes sense. If it's not a PCI device no config space so no DOE instance (+ the wrong device types etc). Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > --- > drivers/cxl/core/pci.c | 13 +++++++------ > 1 file changed, 7 insertions(+), 6 deletions(-) > > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c > index 25b7e8125d5d..bdbd907884ce 100644 > --- a/drivers/cxl/core/pci.c > +++ b/drivers/cxl/core/pci.c > @@ -536,17 +536,18 @@ static int cxl_cdat_read_table(struct device *dev, > */ > void read_cdat_data(struct cxl_port *port) > { > - struct pci_doe_mb *cdat_doe; > + struct cxl_memdev *cxlmd = to_cxl_memdev(port->uport); Given you talk about host bridges above, this would not be safe I don't think we call this on host bridges though so perhaps the patch description needs to not mention them to avoid confusion. > + struct device *host = cxlmd->dev.parent; > struct device *dev = &port->dev; > - struct device *uport = port->uport; > - struct cxl_memdev *cxlmd = to_cxl_memdev(uport); > - struct cxl_dev_state *cxlds = cxlmd->cxlds; > - struct pci_dev *pdev = to_pci_dev(cxlds->dev); > + struct pci_doe_mb *cdat_doe; > size_t cdat_length; > void *cdat_table; > int rc; > > - cdat_doe = pci_find_doe_mailbox(pdev, PCI_DVSEC_VENDOR_ID_CXL, > + if (!dev_is_pci(host)) > + return; > + cdat_doe = pci_find_doe_mailbox(to_pci_dev(host), > + PCI_DVSEC_VENDOR_ID_CXL, > CXL_DOE_PROTOCOL_TABLE_ACCESS); > if (!cdat_doe) { > dev_dbg(dev, "No CDAT mailbox\n"); >
Jonathan Cameron wrote: > On Fri, 21 Apr 2023 19:51:47 -0700 > Dan Williams <dan.j.williams@intel.com> wrote: > > > Not all CXL ports are associated with PCI devices. Host-bridge and > > cxl_test ports are hosted by platform devices. Teach read_cdat_data() to > > be careful about non-pci hosted cxl_memdev instances. Otherwise, > > cxl_test crashes with this signature: > > This is crossing two paths. One memdevs that aren't hosted by > PCI devices, two the other port cases. Right now this code assumes > a memdev, so calling out the other case is problematic. > > I'd just drop the mention of host bridges or mention that we only > call this for memdev ports today. Yeah, the reason for mentioning host-bridges was more for educational purposes of highlighting that not all 'struct cxl_port' instances are associated with PCI device objects. I'll drop it so folks do not come away with the implication that CDAT data is retrieved from host-bridge ports.
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c index 25b7e8125d5d..bdbd907884ce 100644 --- a/drivers/cxl/core/pci.c +++ b/drivers/cxl/core/pci.c @@ -536,17 +536,18 @@ static int cxl_cdat_read_table(struct device *dev, */ void read_cdat_data(struct cxl_port *port) { - struct pci_doe_mb *cdat_doe; + struct cxl_memdev *cxlmd = to_cxl_memdev(port->uport); + struct device *host = cxlmd->dev.parent; struct device *dev = &port->dev; - struct device *uport = port->uport; - struct cxl_memdev *cxlmd = to_cxl_memdev(uport); - struct cxl_dev_state *cxlds = cxlmd->cxlds; - struct pci_dev *pdev = to_pci_dev(cxlds->dev); + struct pci_doe_mb *cdat_doe; size_t cdat_length; void *cdat_table; int rc; - cdat_doe = pci_find_doe_mailbox(pdev, PCI_DVSEC_VENDOR_ID_CXL, + if (!dev_is_pci(host)) + return; + cdat_doe = pci_find_doe_mailbox(to_pci_dev(host), + PCI_DVSEC_VENDOR_ID_CXL, CXL_DOE_PROTOCOL_TABLE_ACCESS); if (!cdat_doe) { dev_dbg(dev, "No CDAT mailbox\n");
Not all CXL ports are associated with PCI devices. Host-bridge and cxl_test ports are hosted by platform devices. Teach read_cdat_data() to be careful about non-pci hosted cxl_memdev instances. Otherwise, cxl_test crashes with this signature: RIP: 0010:xas_start+0x6d/0x290 [..] Call Trace: <TASK> xas_load+0xa/0x50 xas_find+0x25b/0x2f0 xa_find+0x118/0x1d0 pci_find_doe_mailbox+0x51/0xc0 read_cdat_data+0x45/0x190 [cxl_core] cxl_port_probe+0x10a/0x1e0 [cxl_port] cxl_bus_probe+0x17/0x50 [cxl_core] Some other cleanups are included like removing the single-use @uport variable, and removing the indirection through 'struct cxl_dev_state' to lookup the device that registered the memdev and may be a pci device. Fixes: af0a6c3587dc ("cxl/pci: Use CDAT DOE mailbox created by PCI core") Signed-off-by: Dan Williams <dan.j.williams@intel.com> --- drivers/cxl/core/pci.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)