Message ID | 165939482134.252363.1915691883146696327.stgit@dwillia2-xfh.jf.intel.com |
---|---|
State | Accepted |
Commit | e29a8995d63f6f861b2cc446c58cef430885f469 |
Headers | show |
Series | cxl/region: Fix region reference target accounting | expand |
On Mon, Aug 01, 2022 at 04:00:21PM -0700, Dan Williams wrote: > Dan reports: > > The error handling in cxl_port_attach_region() looks like it might > have a similar bug. The cxl_rr->nr_targets++; might want a --. > > That function is more complicated. > > Indeed cxl_rr->nr_targets leaks when cxl_rr_ep_add() fails, but that > flow is not clear. Fix the bug and the clarity by separating the 'new' ^^^^^^^ clarify? > region-reference case from the 'extend' region-reference case. This also > moves the host-physical-address (HPA) validation, that the HPA of a new > region being accounted to the port is greater than the HPA of all other > regions associated with the port, to alloc_region_ref(). Technically this means the kdoc 'steps list' is out of order a bit. :-( But I like that this is only done on creation. And I'm not sure it matter much because 'steps' is more of a list of things this function does. > > Introduce @nr_targets_inc to track when the error exit path needs to > clean up cxl_rr->nr_targets. > > Fixes: 384e624bb211 ("cxl/region: Attach endpoint decoders") > Reported-by: Dan Carpenter <dan.carpenter@oracle.com> > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > --- > drivers/cxl/core/region.c | 71 +++++++++++++++++++++++++++------------------ > 1 file changed, 43 insertions(+), 28 deletions(-) > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > index aff4f736b63c..cf5d5811fe4c 100644 > --- a/drivers/cxl/core/region.c > +++ b/drivers/cxl/core/region.c > @@ -709,12 +709,26 @@ static struct cxl_decoder *cxl_region_find_decoder(struct cxl_port *port, > static struct cxl_region_ref *alloc_region_ref(struct cxl_port *port, > struct cxl_region *cxlr) > { > - struct cxl_region_ref *cxl_rr; > + struct cxl_region_params *p = &cxlr->params; > + struct cxl_region_ref *cxl_rr, *iter; > + unsigned long index; > int rc; > > + xa_for_each(&port->regions, index, iter) { > + struct cxl_region_params *ip = &iter->region->params; > + > + if (ip->res->start > p->res->start) { > + dev_dbg(&cxlr->dev, > + "%s: HPA order violation %s:%pr vs %pr\n", > + dev_name(&port->dev), > + dev_name(&iter->region->dev), ip->res, p->res); > + return ERR_PTR(-EBUSY); > + } > + } > + > cxl_rr = kzalloc(sizeof(*cxl_rr), GFP_KERNEL); > if (!cxl_rr) > - return NULL; > + return ERR_PTR(-ENOMEM); > cxl_rr->port = port; > cxl_rr->region = cxlr; > cxl_rr->nr_targets = 1; > @@ -726,7 +740,7 @@ static struct cxl_region_ref *alloc_region_ref(struct cxl_port *port, > "%s: failed to track region reference: %d\n", > dev_name(&port->dev), rc); > kfree(cxl_rr); > - return NULL; > + return ERR_PTR(rc); > } > > return cxl_rr; > @@ -801,33 +815,25 @@ static int cxl_port_attach_region(struct cxl_port *port, > { > struct cxl_memdev *cxlmd = cxled_to_memdev(cxled); > struct cxl_ep *ep = cxl_ep_load(port, cxlmd); > - struct cxl_region_ref *cxl_rr = NULL, *iter; > - struct cxl_region_params *p = &cxlr->params; > - struct cxl_decoder *cxld = NULL; > + struct cxl_region_ref *cxl_rr; > + bool nr_targets_inc = false; > + struct cxl_decoder *cxld; > unsigned long index; > int rc = -EBUSY; > > lockdep_assert_held_write(&cxl_region_rwsem); > > - xa_for_each(&port->regions, index, iter) { > - struct cxl_region_params *ip = &iter->region->params; > - > - if (iter->region == cxlr) > - cxl_rr = iter; > - if (ip->res->start > p->res->start) { > - dev_dbg(&cxlr->dev, > - "%s: HPA order violation %s:%pr vs %pr\n", > - dev_name(&port->dev), > - dev_name(&iter->region->dev), ip->res, p->res); > - return -EBUSY; > - } > - } > - > + cxl_rr = cxl_rr_load(port, cxlr); > if (cxl_rr) { > struct cxl_ep *ep_iter; > int found = 0; > > - cxld = cxl_rr->decoder; > + /* > + * Walk the existing endpoints that have been attached to > + * @cxlr at @port and see if they share the same 'next' port > + * in the downstream direction. I.e. endpoints that share common > + * upstream switch. > + */ > xa_for_each(&cxl_rr->endpoints, index, ep_iter) { > if (ep_iter == ep) > continue; > @@ -838,22 +844,29 @@ static int cxl_port_attach_region(struct cxl_port *port, > } > > /* > - * If this is a new target or if this port is direct connected > - * to this endpoint then add to the target count. > + * New target port, or @port is an endpoint port that always > + * accounts its own local decode as a target. > */ > - if (!found || !ep->next) > + if (!found || !ep->next) { > cxl_rr->nr_targets++; > + nr_targets_inc = true; > + } > + > + /* > + * The decoder for @cxlr was allocated when the region was first > + * attached to @port. > + */ > + cxld = cxl_rr->decoder; > } else { > cxl_rr = alloc_region_ref(port, cxlr); > - if (!cxl_rr) { > + if (IS_ERR(cxl_rr)) { > dev_dbg(&cxlr->dev, > "%s: failed to allocate region reference\n", > dev_name(&port->dev)); > - return -ENOMEM; > + return PTR_ERR(cxl_rr); > } > - } > + nr_targets_inc = true; I don't think this is technically needed but I would leave it if I were you. I think the leak only occurs on the extend case but I think it is safe to do this and does make the code cleaner. Regardless of the kdoc subtlety and with the spelling fix. Reviewed-by: Ira Weiny <ira.weiny@intel.com> > > - if (!cxld) { > if (port == cxled_to_port(cxled)) > cxld = &cxled->cxld; > else > @@ -896,6 +909,8 @@ static int cxl_port_attach_region(struct cxl_port *port, > > return 0; > out_erase: > + if (nr_targets_inc) > + cxl_rr->nr_targets--; > if (cxl_rr->nr_eps == 0) > free_region_ref(cxl_rr); > return rc; >
Ira Weiny wrote: > On Mon, Aug 01, 2022 at 04:00:21PM -0700, Dan Williams wrote: > > Dan reports: > > > > The error handling in cxl_port_attach_region() looks like it might > > have a similar bug. The cxl_rr->nr_targets++; might want a --. > > > > That function is more complicated. > > > > Indeed cxl_rr->nr_targets leaks when cxl_rr_ep_add() fails, but that > > flow is not clear. Fix the bug and the clarity by separating the 'new' > ^^^^^^^ > clarify? It's a code 'clarity' fix. > > > region-reference case from the 'extend' region-reference case. This also > > moves the host-physical-address (HPA) validation, that the HPA of a new > > region being accounted to the port is greater than the HPA of all other > > regions associated with the port, to alloc_region_ref(). > > Technically this means the kdoc 'steps list' is out of order a bit. :-( > > But I like that this is only done on creation. And I'm not sure it matter much > because 'steps' is more of a list of things this function does. Yeah, the steps are not compiled. > > > > > Introduce @nr_targets_inc to track when the error exit path needs to > > clean up cxl_rr->nr_targets. > > > > Fixes: 384e624bb211 ("cxl/region: Attach endpoint decoders") > > Reported-by: Dan Carpenter <dan.carpenter@oracle.com> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > > --- > > drivers/cxl/core/region.c | 71 +++++++++++++++++++++++++++------------------ > > 1 file changed, 43 insertions(+), 28 deletions(-) > > > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > > index aff4f736b63c..cf5d5811fe4c 100644 > > --- a/drivers/cxl/core/region.c > > +++ b/drivers/cxl/core/region.c > > @@ -709,12 +709,26 @@ static struct cxl_decoder *cxl_region_find_decoder(struct cxl_port *port, > > static struct cxl_region_ref *alloc_region_ref(struct cxl_port *port, > > struct cxl_region *cxlr) > > { > > - struct cxl_region_ref *cxl_rr; > > + struct cxl_region_params *p = &cxlr->params; > > + struct cxl_region_ref *cxl_rr, *iter; > > + unsigned long index; > > int rc; > > > > + xa_for_each(&port->regions, index, iter) { > > + struct cxl_region_params *ip = &iter->region->params; > > + > > + if (ip->res->start > p->res->start) { > > + dev_dbg(&cxlr->dev, > > + "%s: HPA order violation %s:%pr vs %pr\n", > > + dev_name(&port->dev), > > + dev_name(&iter->region->dev), ip->res, p->res); > > + return ERR_PTR(-EBUSY); > > + } > > + } > > + > > cxl_rr = kzalloc(sizeof(*cxl_rr), GFP_KERNEL); > > if (!cxl_rr) > > - return NULL; > > + return ERR_PTR(-ENOMEM); > > cxl_rr->port = port; > > cxl_rr->region = cxlr; > > cxl_rr->nr_targets = 1; > > @@ -726,7 +740,7 @@ static struct cxl_region_ref *alloc_region_ref(struct cxl_port *port, > > "%s: failed to track region reference: %d\n", > > dev_name(&port->dev), rc); > > kfree(cxl_rr); > > - return NULL; > > + return ERR_PTR(rc); > > } > > > > return cxl_rr; > > @@ -801,33 +815,25 @@ static int cxl_port_attach_region(struct cxl_port *port, > > { > > struct cxl_memdev *cxlmd = cxled_to_memdev(cxled); > > struct cxl_ep *ep = cxl_ep_load(port, cxlmd); > > - struct cxl_region_ref *cxl_rr = NULL, *iter; > > - struct cxl_region_params *p = &cxlr->params; > > - struct cxl_decoder *cxld = NULL; > > + struct cxl_region_ref *cxl_rr; > > + bool nr_targets_inc = false; > > + struct cxl_decoder *cxld; > > unsigned long index; > > int rc = -EBUSY; > > > > lockdep_assert_held_write(&cxl_region_rwsem); > > > > - xa_for_each(&port->regions, index, iter) { > > - struct cxl_region_params *ip = &iter->region->params; > > - > > - if (iter->region == cxlr) > > - cxl_rr = iter; > > - if (ip->res->start > p->res->start) { > > - dev_dbg(&cxlr->dev, > > - "%s: HPA order violation %s:%pr vs %pr\n", > > - dev_name(&port->dev), > > - dev_name(&iter->region->dev), ip->res, p->res); > > - return -EBUSY; > > - } > > - } > > - > > + cxl_rr = cxl_rr_load(port, cxlr); > > if (cxl_rr) { > > struct cxl_ep *ep_iter; > > int found = 0; > > > > - cxld = cxl_rr->decoder; > > + /* > > + * Walk the existing endpoints that have been attached to > > + * @cxlr at @port and see if they share the same 'next' port > > + * in the downstream direction. I.e. endpoints that share common > > + * upstream switch. > > + */ > > xa_for_each(&cxl_rr->endpoints, index, ep_iter) { > > if (ep_iter == ep) > > continue; > > @@ -838,22 +844,29 @@ static int cxl_port_attach_region(struct cxl_port *port, > > } > > > > /* > > - * If this is a new target or if this port is direct connected > > - * to this endpoint then add to the target count. > > + * New target port, or @port is an endpoint port that always > > + * accounts its own local decode as a target. > > */ > > - if (!found || !ep->next) > > + if (!found || !ep->next) { > > cxl_rr->nr_targets++; > > + nr_targets_inc = true; > > + } > > + > > + /* > > + * The decoder for @cxlr was allocated when the region was first > > + * attached to @port. > > + */ > > + cxld = cxl_rr->decoder; > > } else { > > cxl_rr = alloc_region_ref(port, cxlr); > > - if (!cxl_rr) { > > + if (IS_ERR(cxl_rr)) { > > dev_dbg(&cxlr->dev, > > "%s: failed to allocate region reference\n", > > dev_name(&port->dev)); > > - return -ENOMEM; > > + return PTR_ERR(cxl_rr); > > } > > - } > > + nr_targets_inc = true; > > I don't think this is technically needed but I would leave it if I were you. > > I think the leak only occurs on the extend case but I think it is safe to do > this and does make the code cleaner. Appreciate the detailed look. You're right that it's ultimately pointless to do the tracking in this case because the cxl_rr instance will be freed shortly. For the casual reader though it saves them a trip to go read the details to convice themselves there's no leak. > Regardless of the kdoc subtlety and with the spelling fix. > > Reviewed-by: Ira Weiny <ira.weiny@intel.com> Thanks!
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index aff4f736b63c..cf5d5811fe4c 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -709,12 +709,26 @@ static struct cxl_decoder *cxl_region_find_decoder(struct cxl_port *port, static struct cxl_region_ref *alloc_region_ref(struct cxl_port *port, struct cxl_region *cxlr) { - struct cxl_region_ref *cxl_rr; + struct cxl_region_params *p = &cxlr->params; + struct cxl_region_ref *cxl_rr, *iter; + unsigned long index; int rc; + xa_for_each(&port->regions, index, iter) { + struct cxl_region_params *ip = &iter->region->params; + + if (ip->res->start > p->res->start) { + dev_dbg(&cxlr->dev, + "%s: HPA order violation %s:%pr vs %pr\n", + dev_name(&port->dev), + dev_name(&iter->region->dev), ip->res, p->res); + return ERR_PTR(-EBUSY); + } + } + cxl_rr = kzalloc(sizeof(*cxl_rr), GFP_KERNEL); if (!cxl_rr) - return NULL; + return ERR_PTR(-ENOMEM); cxl_rr->port = port; cxl_rr->region = cxlr; cxl_rr->nr_targets = 1; @@ -726,7 +740,7 @@ static struct cxl_region_ref *alloc_region_ref(struct cxl_port *port, "%s: failed to track region reference: %d\n", dev_name(&port->dev), rc); kfree(cxl_rr); - return NULL; + return ERR_PTR(rc); } return cxl_rr; @@ -801,33 +815,25 @@ static int cxl_port_attach_region(struct cxl_port *port, { struct cxl_memdev *cxlmd = cxled_to_memdev(cxled); struct cxl_ep *ep = cxl_ep_load(port, cxlmd); - struct cxl_region_ref *cxl_rr = NULL, *iter; - struct cxl_region_params *p = &cxlr->params; - struct cxl_decoder *cxld = NULL; + struct cxl_region_ref *cxl_rr; + bool nr_targets_inc = false; + struct cxl_decoder *cxld; unsigned long index; int rc = -EBUSY; lockdep_assert_held_write(&cxl_region_rwsem); - xa_for_each(&port->regions, index, iter) { - struct cxl_region_params *ip = &iter->region->params; - - if (iter->region == cxlr) - cxl_rr = iter; - if (ip->res->start > p->res->start) { - dev_dbg(&cxlr->dev, - "%s: HPA order violation %s:%pr vs %pr\n", - dev_name(&port->dev), - dev_name(&iter->region->dev), ip->res, p->res); - return -EBUSY; - } - } - + cxl_rr = cxl_rr_load(port, cxlr); if (cxl_rr) { struct cxl_ep *ep_iter; int found = 0; - cxld = cxl_rr->decoder; + /* + * Walk the existing endpoints that have been attached to + * @cxlr at @port and see if they share the same 'next' port + * in the downstream direction. I.e. endpoints that share common + * upstream switch. + */ xa_for_each(&cxl_rr->endpoints, index, ep_iter) { if (ep_iter == ep) continue; @@ -838,22 +844,29 @@ static int cxl_port_attach_region(struct cxl_port *port, } /* - * If this is a new target or if this port is direct connected - * to this endpoint then add to the target count. + * New target port, or @port is an endpoint port that always + * accounts its own local decode as a target. */ - if (!found || !ep->next) + if (!found || !ep->next) { cxl_rr->nr_targets++; + nr_targets_inc = true; + } + + /* + * The decoder for @cxlr was allocated when the region was first + * attached to @port. + */ + cxld = cxl_rr->decoder; } else { cxl_rr = alloc_region_ref(port, cxlr); - if (!cxl_rr) { + if (IS_ERR(cxl_rr)) { dev_dbg(&cxlr->dev, "%s: failed to allocate region reference\n", dev_name(&port->dev)); - return -ENOMEM; + return PTR_ERR(cxl_rr); } - } + nr_targets_inc = true; - if (!cxld) { if (port == cxled_to_port(cxled)) cxld = &cxled->cxld; else @@ -896,6 +909,8 @@ static int cxl_port_attach_region(struct cxl_port *port, return 0; out_erase: + if (nr_targets_inc) + cxl_rr->nr_targets--; if (cxl_rr->nr_eps == 0) free_region_ref(cxl_rr); return rc;
Dan reports: The error handling in cxl_port_attach_region() looks like it might have a similar bug. The cxl_rr->nr_targets++; might want a --. That function is more complicated. Indeed cxl_rr->nr_targets leaks when cxl_rr_ep_add() fails, but that flow is not clear. Fix the bug and the clarity by separating the 'new' region-reference case from the 'extend' region-reference case. This also moves the host-physical-address (HPA) validation, that the HPA of a new region being accounted to the port is greater than the HPA of all other regions associated with the port, to alloc_region_ref(). Introduce @nr_targets_inc to track when the error exit path needs to clean up cxl_rr->nr_targets. Fixes: 384e624bb211 ("cxl/region: Attach endpoint decoders") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> --- drivers/cxl/core/region.c | 71 +++++++++++++++++++++++++++------------------ 1 file changed, 43 insertions(+), 28 deletions(-)