diff mbox series

cxl: Fix cleanup of port devices on failure to probe driver.

Message ID 20220609134519.11668-1-Jonathan.Cameron@huawei.com
State Accepted
Commit db9a3a35d31ea337331f0e6e07e04bcd52642894
Headers show
Series cxl: Fix cleanup of port devices on failure to probe driver. | expand

Commit Message

Jonathan Cameron June 9, 2022, 1:45 p.m. UTC
The device is created, and then there is a check if a driver succesfully
bound to it. In event of failing the bind (e.g. failure in cxl_port_probe())
the device is left registered. When a bus rescan later occurs, fresh
devices are created leading to a multiple device representing the same
underlying hardware. Bad things may follow and at very least we have far too many
devices.

Fix by ensuring autoremove is registered if the device create succeeds,
but doesn't depend on sucessful binding to a driver.

Bug was observed as side effect of incorrect ownership in
[PATCH v9 6/9] cxl/port: Read CDAT table
but will result from any failure to in cxl_port_probe().

Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 drivers/cxl/mem.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Ira Weiny June 10, 2022, 12:31 a.m. UTC | #1
On Thu, Jun 09, 2022 at 02:45:19PM +0100, Jonathan Cameron wrote:
> The device is created, and then there is a check if a driver succesfully
> bound to it. In event of failing the bind (e.g. failure in cxl_port_probe())
> the device is left registered. When a bus rescan later occurs, fresh
> devices are created leading to a multiple device representing the same
> underlying hardware. Bad things may follow and at very least we have far too many
> devices.
> 
> Fix by ensuring autoremove is registered if the device create succeeds,
> but doesn't depend on sucessful binding to a driver.
> 
> Bug was observed as side effect of incorrect ownership in
> [PATCH v9 6/9] cxl/port: Read CDAT table
> but will result from any failure to in cxl_port_probe().
> 
> Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver")
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  drivers/cxl/mem.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index c310f1fd3db0..a979d0b484d5 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -29,6 +29,7 @@ static int create_endpoint(struct cxl_memdev *cxlmd,
>  {
>  	struct cxl_dev_state *cxlds = cxlmd->cxlds;
>  	struct cxl_port *endpoint;
> +	int rc;
>  
>  	endpoint = devm_cxl_add_port(&parent_port->dev, &cxlmd->dev,
>  				     cxlds->component_reg_phys, parent_port);
> @@ -37,13 +38,17 @@ static int create_endpoint(struct cxl_memdev *cxlmd,
>  
>  	dev_dbg(&cxlmd->dev, "add: %s\n", dev_name(&endpoint->dev));
>  
> +	rc = cxl_endpoint_autoremove(cxlmd, endpoint);
> +	if (rc)
> +		return rc;
> +
>  	if (!endpoint->dev.driver) {
>  		dev_err(&cxlmd->dev, "%s failed probe\n",
>  			dev_name(&endpoint->dev));
>  		return -ENXIO;
>  	}

I wonder if this code is really required?  I think that if the suspend code in
cxl_mem_probe was reworked a bit I think this could be removed.

But for now:

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

>  
> -	return cxl_endpoint_autoremove(cxlmd, endpoint);
> +	return 0;
>  }
>  
>  static void enable_suspend(void *data)
> -- 
> 2.32.0
>
diff mbox series

Patch

diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index c310f1fd3db0..a979d0b484d5 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -29,6 +29,7 @@  static int create_endpoint(struct cxl_memdev *cxlmd,
 {
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
 	struct cxl_port *endpoint;
+	int rc;
 
 	endpoint = devm_cxl_add_port(&parent_port->dev, &cxlmd->dev,
 				     cxlds->component_reg_phys, parent_port);
@@ -37,13 +38,17 @@  static int create_endpoint(struct cxl_memdev *cxlmd,
 
 	dev_dbg(&cxlmd->dev, "add: %s\n", dev_name(&endpoint->dev));
 
+	rc = cxl_endpoint_autoremove(cxlmd, endpoint);
+	if (rc)
+		return rc;
+
 	if (!endpoint->dev.driver) {
 		dev_err(&cxlmd->dev, "%s failed probe\n",
 			dev_name(&endpoint->dev));
 		return -ENXIO;
 	}
 
-	return cxl_endpoint_autoremove(cxlmd, endpoint);
+	return 0;
 }
 
 static void enable_suspend(void *data)