Message ID | 20240604003609.202682-1-alison.schofield@intel.com |
---|---|
State | Accepted |
Commit | 285f2a08841432fc3e498b1cd00cce5216cdf189 |
Headers | show |
Series | cxl/region: Avoid null pointer dereference in region lookup | expand |
On Mon, 3 Jun 2024 17:36:09 -0700 alison.schofield@intel.com wrote: > From: Alison Schofield <alison.schofield@intel.com> > > cxl_dpa_to_region() looks up a region based on a memdev and DPA. > It wrongly assumes an endpoint found mapping the DPA is also of > a fully assembled region. When not true it leads to a null pointer > dereference looking up the region name. > > This appears during testing of region lookup after a failure to > assemble a BIOS defined region or if the lookup raced with the > assembly of the BIOS defined region. > > Failure to clean up BIOS defined regions that fail assembly is an > issue in itself and a fix to that problem will alleviate some of > the impact. It will not alleviate the race condition so let's harden > this path. > > The behavior change is that the kernel oops due to a null pointer > dereference is replaced with a dev_dbg() message noting that an > endpoint was mapped. > > Additional comments are added so that future users of this function > can more clearly understand what it provides. > > Fixes: 0a105ab28a4d ("cxl/memdev: Warn of poison inject or clear to a mapped region") > Signed-off-by: Alison Schofield <alison.schofield@intel.com> Nasty corner case. Fix looks correct to me. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > --- > drivers/cxl/core/region.c | 19 +++++++++++++++---- > 1 file changed, 15 insertions(+), 4 deletions(-) > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > index 3c2b6144be23..88051bb673bf 100644 > --- a/drivers/cxl/core/region.c > +++ b/drivers/cxl/core/region.c > @@ -2688,22 +2688,33 @@ static int __cxl_dpa_to_region(struct device *dev, void *arg) > { > struct cxl_dpa_to_region_context *ctx = arg; > struct cxl_endpoint_decoder *cxled; > + struct cxl_region *cxlr; > u64 dpa = ctx->dpa; > > if (!is_endpoint_decoder(dev)) > return 0; > > cxled = to_cxl_endpoint_decoder(dev); > - if (!cxled->dpa_res || !resource_size(cxled->dpa_res)) > + if (!cxled || !cxled->dpa_res || !resource_size(cxled->dpa_res)) > return 0; > > if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start) > return 0; > > - dev_dbg(dev, "dpa:0x%llx mapped in region:%s\n", dpa, > - dev_name(&cxled->cxld.region->dev)); > + /* > + * Stop the region search (return 1) when an endpoint mapping is > + * found. The region may not be fully constructed so offering > + * the cxlr in the context structure is not guaranteed. > + */ > + cxlr = cxled->cxld.region; > + if (cxlr) > + dev_dbg(dev, "dpa:0x%llx mapped in region:%s\n", dpa, > + dev_name(&cxlr->dev)); > + else > + dev_dbg(dev, "dpa:0x%llx mapped in endpoint:%s\n", dpa, > + dev_name(dev)); > > - ctx->cxlr = cxled->cxld.region; > + ctx->cxlr = cxlr; > > return 1; > } > > base-commit: 49ba7b515c4c0719b866d16f068e62d16a8a3dd1
Perhaps change the subject with: s/Avoid/Fix/? alison.schofield@ wrote: > From: Alison Schofield <alison.schofield@intel.com> > > cxl_dpa_to_region() looks up a region based on a memdev and DPA. > It wrongly assumes an endpoint found mapping the DPA is also of > a fully assembled region. When not true it leads to a null pointer > dereference looking up the region name. Can you place the a snippet of the Call Trace here, that helps support that the writeup came to the right conclusion, and it helps folks hitting this error to find this patch in a web search. > This appears during testing of region lookup after a failure to > assemble a BIOS defined region or if the lookup raced with the > assembly of the BIOS defined region. > > Failure to clean up BIOS defined regions that fail assembly is an > issue in itself and a fix to that problem will alleviate some of > the impact. It will not alleviate the race condition so let's harden > this path. > > The behavior change is that the kernel oops due to a null pointer > dereference is replaced with a dev_dbg() message noting that an > endpoint was mapped. > > Additional comments are added so that future users of this function > can more clearly understand what it provides. > > Fixes: 0a105ab28a4d ("cxl/memdev: Warn of poison inject or clear to a mapped region") > Signed-off-by: Alison Schofield <alison.schofield@intel.com> > --- > drivers/cxl/core/region.c | 19 +++++++++++++++---- > 1 file changed, 15 insertions(+), 4 deletions(-) > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > index 3c2b6144be23..88051bb673bf 100644 > --- a/drivers/cxl/core/region.c > +++ b/drivers/cxl/core/region.c > @@ -2688,22 +2688,33 @@ static int __cxl_dpa_to_region(struct device *dev, void *arg) > { > struct cxl_dpa_to_region_context *ctx = arg; > struct cxl_endpoint_decoder *cxled; > + struct cxl_region *cxlr; > u64 dpa = ctx->dpa; > > if (!is_endpoint_decoder(dev)) > return 0; > > cxled = to_cxl_endpoint_decoder(dev); > - if (!cxled->dpa_res || !resource_size(cxled->dpa_res)) > + if (!cxled || !cxled->dpa_res || !resource_size(cxled->dpa_res)) Why add the !cxled check? to_cxl_endpoint_decoder() only returns NULL if someone messes up and sends a !is_endpoint_decoder(@dev) to this conversion routine and that check was already performed above. > return 0; > > if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start) > return 0; > > - dev_dbg(dev, "dpa:0x%llx mapped in region:%s\n", dpa, > - dev_name(&cxled->cxld.region->dev)); > + /* > + * Stop the region search (return 1) when an endpoint mapping is > + * found. The region may not be fully constructed so offering > + * the cxlr in the context structure is not guaranteed. > + */ Comment looks good... > + cxlr = cxled->cxld.region; > + if (cxlr) > + dev_dbg(dev, "dpa:0x%llx mapped in region:%s\n", dpa, > + dev_name(&cxlr->dev)); > + else > + dev_dbg(dev, "dpa:0x%llx mapped in endpoint:%s\n", dpa, > + dev_name(dev)); ...but I am not sure gymanistics to improve a debug message are worth it, especially because all of the callers either emit the found region in a tracepoint or print a warning, so I would just be in favor of deleting the debug message altogether, or at most just trimming to the "mapped in endpoint" one.
On Tue, Jun 04, 2024 at 10:47:35AM -0700, Dan Williams wrote: > Perhaps change the subject with: s/Avoid/Fix/? OK > > alison.schofield@ wrote: > > From: Alison Schofield <alison.schofield@intel.com> > > > > cxl_dpa_to_region() looks up a region based on a memdev and DPA. > > It wrongly assumes an endpoint found mapping the DPA is also of > > a fully assembled region. When not true it leads to a null pointer > > dereference looking up the region name. > > Can you place the a snippet of the Call Trace here, that helps support > that the writeup came to the right conclusion, and it helps folks > hitting this error to find this patch in a web search. Will add in v2 snip... > > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > > index 3c2b6144be23..88051bb673bf 100644 > > --- a/drivers/cxl/core/region.c > > +++ b/drivers/cxl/core/region.c > > @@ -2688,22 +2688,33 @@ static int __cxl_dpa_to_region(struct device *dev, void *arg) > > { > > struct cxl_dpa_to_region_context *ctx = arg; > > struct cxl_endpoint_decoder *cxled; > > + struct cxl_region *cxlr; > > u64 dpa = ctx->dpa; > > > > if (!is_endpoint_decoder(dev)) > > return 0; > > > > cxled = to_cxl_endpoint_decoder(dev); > > - if (!cxled->dpa_res || !resource_size(cxled->dpa_res)) > > + if (!cxled || !cxled->dpa_res || !resource_size(cxled->dpa_res)) > > Why add the !cxled check? to_cxl_endpoint_decoder() only returns NULL if > someone messes up and sends a !is_endpoint_decoder(@dev) to this > conversion routine and that check was already performed above. > Abundance of caution I suppose. I'll remove. > > return 0; > > > > if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start) > > return 0; > > > > - dev_dbg(dev, "dpa:0x%llx mapped in region:%s\n", dpa, > > - dev_name(&cxled->cxld.region->dev)); > > + /* > > + * Stop the region search (return 1) when an endpoint mapping is > > + * found. The region may not be fully constructed so offering > > + * the cxlr in the context structure is not guaranteed. > > + */ > > Comment looks good... > > > + cxlr = cxled->cxld.region; > > + if (cxlr) > > + dev_dbg(dev, "dpa:0x%llx mapped in region:%s\n", dpa, > > + dev_name(&cxlr->dev)); > > + else > > + dev_dbg(dev, "dpa:0x%llx mapped in endpoint:%s\n", dpa, > > + dev_name(dev)); > > ...but I am not sure gymanistics to improve a debug message are worth > it, especially because all of the callers either emit the found region > in a tracepoint or print a warning, so I would just be in favor of > deleting the debug message altogether, or at most just trimming to the > "mapped in endpoint" one. Yeah, probably went overboard here. Previously was offering the name of the region so wanted to keep same - but seeing as this is a debug only message, I think the endpoint alone, as you suggest is good. Thanks for the review, Alison
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 3c2b6144be23..88051bb673bf 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -2688,22 +2688,33 @@ static int __cxl_dpa_to_region(struct device *dev, void *arg) { struct cxl_dpa_to_region_context *ctx = arg; struct cxl_endpoint_decoder *cxled; + struct cxl_region *cxlr; u64 dpa = ctx->dpa; if (!is_endpoint_decoder(dev)) return 0; cxled = to_cxl_endpoint_decoder(dev); - if (!cxled->dpa_res || !resource_size(cxled->dpa_res)) + if (!cxled || !cxled->dpa_res || !resource_size(cxled->dpa_res)) return 0; if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start) return 0; - dev_dbg(dev, "dpa:0x%llx mapped in region:%s\n", dpa, - dev_name(&cxled->cxld.region->dev)); + /* + * Stop the region search (return 1) when an endpoint mapping is + * found. The region may not be fully constructed so offering + * the cxlr in the context structure is not guaranteed. + */ + cxlr = cxled->cxld.region; + if (cxlr) + dev_dbg(dev, "dpa:0x%llx mapped in region:%s\n", dpa, + dev_name(&cxlr->dev)); + else + dev_dbg(dev, "dpa:0x%llx mapped in endpoint:%s\n", dpa, + dev_name(dev)); - ctx->cxlr = cxled->cxld.region; + ctx->cxlr = cxlr; return 1; }