Message ID | 169703589120.1202031.14696100866518083806.stgit@bgt-140510-bm03.eng.stellus.in |
---|---|
State | Accepted |
Commit | 0718588c7aaa7a1510b4de972370535b61dddd0d |
Series | [v2] cxl/region: don't try to cleanup after cxl_region_setup_targets() fails |
Acked-by: Dan Carpenter <dan.carpenter@linaro.org>
regards,
dan carpenter
On Wed, 11 Oct 2023 14:51:31 +0000
Jim Harris <jim.harris@samsung.com> wrote:

> Patch 5e42bcbc ("cxl/region: decrement ->nr_targets on error in
> cxl_region_attach()") tried to avoid 'eiw' initialization errors when
> ->nr_targets exceeded 16, by just decrementing ->nr_targets when
> cxl_region_setup_targets() failed. Patch 86987c76 ("cxl/region: Cleanup
> target list on attach error") extended that cleanup to also clear
> cxled->pos and p->targets[pos].
>
> The initialization error was incidentally fixed separately by patch
> 8d4285425 ("cxl/region: Fix port setup uninitialized variable warnings")
> which was merged a few days after 5e42bcbc.
>
> But now the original cleanup when cxl_region_setup_targets() fails
> prevents endpoint and switch decoder resources from being reused:
>
> 1) the cleanup does not set the decoder's region to NULL, which results
>    in future dpa_size_store() calls returning -EBUSY
> 2) the decoder is not properly freed, which results in future commit
>    errors associated with the upstream switch
>
> Now that the initialization errors were fixed separately, the proper
> cleanup for this case is to just return immediately. Then the resources
> associated with this target get cleaned up as normal when the failed
> region is deleted.
>
> The ->nr_targets decrement in the error case also helped prevent
> a p->targets[] array overflow, so add a new check to prevent against
> that overflow.
>
> Tested by trying to create an invalid region for a 2 switch * 2 endpoint
> topology, and then following up with creating a valid region.
>
> Signed-off-by: Jim Harris <jim.harris@samsung.com>

I agree with your analysis and that this seems to fix the cases
you've called out.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  drivers/cxl/core/region.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 6d63b8798c29..2b3b3c62d0a7 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1658,6 +1658,12 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>  		return -ENXIO;
>  	}
>
> +	if (p->nr_targets >= p->interleave_ways) {
> +		dev_dbg(&cxlr->dev, "region already has %d endpoints\n",
> +			p->nr_targets);
> +		return -EINVAL;
> +	}
> +
>  	ep_port = cxled_to_port(cxled);
>  	root_port = cxlrd_to_port(cxlrd);
>  	dport = cxl_find_dport_by_dev(root_port, ep_port->host_bridge);
> @@ -1750,7 +1756,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>  	if (p->nr_targets == p->interleave_ways) {
>  		rc = cxl_region_setup_targets(cxlr);
>  		if (rc)
> -			goto err_decrement;
> +			return rc;
>  		p->state = CXL_CONFIG_ACTIVE;
>  	}
>
> @@ -1762,12 +1768,6 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>  	};
>
>  	return 0;
> -
> -err_decrement:
> -	p->nr_targets--;
> -	cxled->pos = -1;
> -	p->targets[pos] = NULL;
> -	return rc;
>  }
>
>  static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
>
On 10/11/23 07:51, Jim Harris wrote:
> Patch 5e42bcbc ("cxl/region: decrement ->nr_targets on error in
> cxl_region_attach()") tried to avoid 'eiw' initialization errors when
> ->nr_targets exceeded 16, by just decrementing ->nr_targets when
> cxl_region_setup_targets() failed. Patch 86987c76 ("cxl/region: Cleanup
> target list on attach error") extended that cleanup to also clear
> cxled->pos and p->targets[pos].
>
> The initialization error was incidentally fixed separately by patch
> 8d4285425 ("cxl/region: Fix port setup uninitialized variable warnings")
> which was merged a few days after 5e42bcbc.
>
> But now the original cleanup when cxl_region_setup_targets() fails
> prevents endpoint and switch decoder resources from being reused:
>
> 1) the cleanup does not set the decoder's region to NULL, which results
>    in future dpa_size_store() calls returning -EBUSY
> 2) the decoder is not properly freed, which results in future commit
>    errors associated with the upstream switch
>
> Now that the initialization errors were fixed separately, the proper
> cleanup for this case is to just return immediately. Then the resources
> associated with this target get cleaned up as normal when the failed
> region is deleted.
>
> The ->nr_targets decrement in the error case also helped prevent
> a p->targets[] array overflow, so add a new check to prevent against
> that overflow.
>
> Tested by trying to create an invalid region for a 2 switch * 2 endpoint
> topology, and then following up with creating a valid region.
>
> Signed-off-by: Jim Harris <jim.harris@samsung.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>

> ---
>  drivers/cxl/core/region.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 6d63b8798c29..2b3b3c62d0a7 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1658,6 +1658,12 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>  		return -ENXIO;
>  	}
>
> +	if (p->nr_targets >= p->interleave_ways) {
> +		dev_dbg(&cxlr->dev, "region already has %d endpoints\n",
> +			p->nr_targets);
> +		return -EINVAL;
> +	}
> +
>  	ep_port = cxled_to_port(cxled);
>  	root_port = cxlrd_to_port(cxlrd);
>  	dport = cxl_find_dport_by_dev(root_port, ep_port->host_bridge);
> @@ -1750,7 +1756,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>  	if (p->nr_targets == p->interleave_ways) {
>  		rc = cxl_region_setup_targets(cxlr);
>  		if (rc)
> -			goto err_decrement;
> +			return rc;
>  		p->state = CXL_CONFIG_ACTIVE;
>  	}
>
> @@ -1762,12 +1768,6 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>  	};
>
>  	return 0;
> -
> -err_decrement:
> -	p->nr_targets--;
> -	cxled->pos = -1;
> -	p->targets[pos] = NULL;
> -	return rc;
>  }
>
>  static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
>
Jim Harris wrote:
> Patch 5e42bcbc ("cxl/region: decrement ->nr_targets on error in
> cxl_region_attach()") tried to avoid 'eiw' initialization errors when
> ->nr_targets exceeded 16, by just decrementing ->nr_targets when
> cxl_region_setup_targets() failed. Patch 86987c76 ("cxl/region: Cleanup
> target list on attach error") extended that cleanup to also clear
> cxled->pos and p->targets[pos].
>
> The initialization error was incidentally fixed separately by patch
> 8d4285425 ("cxl/region: Fix port setup uninitialized variable warnings")
> which was merged a few days after 5e42bcbc.

Patch looks good, but I did reflow the above paragraphs to have commit
references per checkpatch expectations. I believe it did not flag them
for you as it did not recognize "Patch <SHA>" as referring to a commit:

    Commit 5e42bcbc3fef ("cxl/region: decrement ->nr_targets on error in
    cxl_region_attach()") tried to avoid 'eiw' initialization errors when
    ->nr_targets exceeded 16, by just decrementing ->nr_targets when
    cxl_region_setup_targets() failed.

    Commit 86987c766276 ("cxl/region: Cleanup target list on attach error")
    extended that cleanup to also clear cxled->pos and p->targets[pos]. The
    initialization error was incidentally fixed separately by:
    Commit 8d4285425714 ("cxl/region: Fix port setup uninitialized variable
    warnings") which was merged a few days after 5e42bcbc3fef.

I also went ahead and added:

Fixes: 5e42bcbc3fef ("cxl/region: decrement ->nr_targets on error in cxl_region_attach()")
Cc: <stable@vger.kernel.org>

Otherwise, good find, thanks Jim!
On Tue, Oct 24, 2023 at 04:01:19PM -0700, Dan Williams wrote:
>
> Patch looks good, but I did reflow the above paragraphs to have commit
> references per checkpatch expectations. I believe it did not flag them
> for you as it did not recognize "Patch <SHA>" as referring to a commit:
>
>     Commit 5e42bcbc3fef ("cxl/region: decrement ->nr_targets on error in
>     cxl_region_attach()") tried to avoid 'eiw' initialization errors when
>     ->nr_targets exceeded 16, by just decrementing ->nr_targets when
>     cxl_region_setup_targets() failed.
>
>     Commit 86987c766276 ("cxl/region: Cleanup target list on attach error")
>     extended that cleanup to also clear cxled->pos and p->targets[pos]. The
>     initialization error was incidentally fixed separately by:
>     Commit 8d4285425714 ("cxl/region: Fix port setup uninitialized variable
>     warnings") which was merged a few days after 5e42bcbc3fef.
>
> I also went ahead and added:
>
> Fixes: 5e42bcbc3fef ("cxl/region: decrement ->nr_targets on error in cxl_region_attach()")
> Cc: <stable@vger.kernel.org>
>

Thanks Dan, I'll keep an eye out for these in the future.
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 6d63b8798c29..2b3b3c62d0a7 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1658,6 +1658,12 @@ static int cxl_region_attach(struct cxl_region *cxlr,
 		return -ENXIO;
 	}
 
+	if (p->nr_targets >= p->interleave_ways) {
+		dev_dbg(&cxlr->dev, "region already has %d endpoints\n",
+			p->nr_targets);
+		return -EINVAL;
+	}
+
 	ep_port = cxled_to_port(cxled);
 	root_port = cxlrd_to_port(cxlrd);
 	dport = cxl_find_dport_by_dev(root_port, ep_port->host_bridge);
@@ -1750,7 +1756,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
 	if (p->nr_targets == p->interleave_ways) {
 		rc = cxl_region_setup_targets(cxlr);
 		if (rc)
-			goto err_decrement;
+			return rc;
 		p->state = CXL_CONFIG_ACTIVE;
 	}
 
@@ -1762,12 +1768,6 @@ static int cxl_region_attach(struct cxl_region *cxlr,
 	};
 
 	return 0;
-
-err_decrement:
-	p->nr_targets--;
-	cxled->pos = -1;
-	p->targets[pos] = NULL;
-	return rc;
 }
 
 static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
Patch 5e42bcbc ("cxl/region: decrement ->nr_targets on error in
cxl_region_attach()") tried to avoid 'eiw' initialization errors when
->nr_targets exceeded 16, by just decrementing ->nr_targets when
cxl_region_setup_targets() failed. Patch 86987c76 ("cxl/region: Cleanup
target list on attach error") extended that cleanup to also clear
cxled->pos and p->targets[pos].

The initialization error was incidentally fixed separately by patch
8d4285425 ("cxl/region: Fix port setup uninitialized variable warnings")
which was merged a few days after 5e42bcbc.

But now the original cleanup when cxl_region_setup_targets() fails
prevents endpoint and switch decoder resources from being reused:

1) the cleanup does not set the decoder's region to NULL, which results
   in future dpa_size_store() calls returning -EBUSY
2) the decoder is not properly freed, which results in future commit
   errors associated with the upstream switch

Now that the initialization errors were fixed separately, the proper
cleanup for this case is to just return immediately. Then the resources
associated with this target get cleaned up as normal when the failed
region is deleted.

The ->nr_targets decrement in the error case also helped prevent
a p->targets[] array overflow, so add a new check to prevent against
that overflow.

Tested by trying to create an invalid region for a 2 switch * 2 endpoint
topology, and then following up with creating a valid region.

Signed-off-by: Jim Harris <jim.harris@samsung.com>
---
 drivers/cxl/core/region.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)
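
For readers who want to see why the new bounds check is needed once the
error-path decrement is gone, here is a minimal userspace sketch of the
pattern. It is not the kernel code: struct endpoint, struct region_params,
MAX_TARGETS, setup_targets() and the force_setup_fail flag are hypothetical
stand-ins invented for illustration, and the real cxl_region_attach() does
considerably more work. The sketch also assumes interleave_ways has already
been validated to be no larger than MAX_TARGETS.

```c
#include <errno.h>

#define MAX_TARGETS 16	/* assumed array size, for illustration only */

/* Hypothetical stand-in for an endpoint decoder. */
struct endpoint {
	int pos;
};

/* Hypothetical stand-in for the region parameters. */
struct region_params {
	int interleave_ways;	/* assumed <= MAX_TARGETS */
	int nr_targets;
	struct endpoint *targets[MAX_TARGETS];
};

/* Stand-in for the setup step; fails on request so the error path runs. */
static int setup_targets(int force_fail)
{
	return force_fail ? -ENXIO : 0;
}

static int region_attach(struct region_params *p, struct endpoint *ep,
			 int force_setup_fail)
{
	int rc;

	/*
	 * Without this check, a failed attach that leaves nr_targets at
	 * interleave_ways would let the next attach write past the
	 * populated slots of targets[].
	 */
	if (p->nr_targets >= p->interleave_ways)
		return -EINVAL;

	ep->pos = p->nr_targets;
	p->targets[p->nr_targets++] = ep;

	if (p->nr_targets == p->interleave_ways) {
		rc = setup_targets(force_setup_fail);
		if (rc)
			return rc;	/* no local unwind; state stays as-is */
	}

	return 0;
}
```

In this model, a failed setup leaves nr_targets equal to interleave_ways, so
any further attach attempt is rejected with -EINVAL until the region itself is
torn down, which mirrors the behaviour the commit message describes.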