diff mbox series

[17/26] dax/region: Create extent resources on DAX region driver load

Message ID 20240324-dcd-type2-upstream-v1-17-b7b00d623625@intel.com (mailing list archive)
State New, archived
Headers show
Series DCD: Add support for Dynamic Capacity Devices (DCD) | expand

Commit Message

Ira Weiny March 24, 2024, 11:18 p.m. UTC
From: Navneet Singh <navneet.singh@intel.com>

DAX regions mapping dynamic capacity partitions introduce a requirement
for the memory backing the region to come and go as required.  This
results in a DAX region with sparse areas of memory backing.  To track
the sparseness of the region, DAX extent objects need to track
sub-resource information as a new layer between the DAX region resource
and DAX device range resources.

Recall that DCD extents may be accepted when a region is first created.
Extend this support on region driver load.  Scan existing extents and
create DAX extent resources as a first step to DAX extent realization.

The lifetime of a DAX extent is tricky to manage because the extent life
may end in one of two ways.  First, the device may request the extent be
released.  Second, the region may release the extent when it is
destroyed without hardware involvement.  Support extent release without
hardware involvement first.  Subsequent patches will provide for
hardware to request extent removal.

Signed-off-by: Navneet Singh <navneet.singh@intel.com>
Co-developed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes for v1
[iweiny: remove xarrays]
[iweiny: remove as much of extra reference stuff as possible]
[iweiny: Move extent resource handling to core DAX code]
---
 drivers/dax/bus.c         | 55 +++++++++++++++++++++++++++++++++++++++++++++++
 drivers/dax/cxl.c         | 43 ++++++++++++++++++++++++++++++++++--
 drivers/dax/dax-private.h | 12 +++++++++++
 3 files changed, 108 insertions(+), 2 deletions(-)

Comments

Jonathan Cameron April 4, 2024, 4:36 p.m. UTC | #1
On Sun, 24 Mar 2024 16:18:20 -0700
ira.weiny@intel.com wrote:

> From: Navneet Singh <navneet.singh@intel.com>
> 
> DAX regions mapping dynamic capacity partitions introduce a requirement
> for the memory backing the region to come and go as required.  This
> results in a DAX region with sparse areas of memory backing.  To track
> the sparseness of the region, DAX extent objects need to track
> sub-resource information as a new layer between the DAX region resource
> and DAX device range resources.
> 
> Recall that DCD extents may be accepted when a region is first created.
> Extend this support on region driver load.  Scan existing extents and
> create DAX extent resources as a first step to DAX extent realization.
> 
> The lifetime of a DAX extent is tricky to manage because the extent life
> may end in one of two ways.  First, the device may request the extent be
> released.  Second, the region may release the extent when it is
> destroyed without hardware involvement.  Support extent release without
> hardware involvement first.  Subsequent patches will provide for
> hardware to request extent removal.
> 
> Signed-off-by: Navneet Singh <navneet.singh@intel.com>
> Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
LGTM though I'm far from an expert on DAX..

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> 
> ---
> Changes for v1
> [iweiny: remove xarrays]
> [iweiny: remove as much of extra reference stuff as possible]
> [iweiny: Move extent resource handling to core DAX code]
> ---
>  drivers/dax/bus.c         | 55 +++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/dax/cxl.c         | 43 ++++++++++++++++++++++++++++++++++--
>  drivers/dax/dax-private.h | 12 +++++++++++
>  3 files changed, 108 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
> index 903566aff5eb..4d5ed7ab6537 100644
> --- a/drivers/dax/bus.c
> +++ b/drivers/dax/bus.c
> @@ -186,6 +186,61 @@ static bool is_sparse(struct dax_region *dax_region)
>  	return (dax_region->res.flags & IORESOURCE_DAX_SPARSE_CAP) != 0;
>  }
>  
> +static int dax_region_add_resource(struct dax_region *dax_region,
> +				   struct dax_extent *dax_ext,
> +				   resource_size_t start,
> +				   resource_size_t length)
> +{
> +	struct resource *ext_res;
> +
> +	dev_dbg(dax_region->dev, "DAX region resource %pr\n", &dax_region->res);
> +	ext_res = __request_region(&dax_region->res, start, length, "extent", 0);
> +	if (!ext_res) {
> +		dev_err(dax_region->dev, "Failed to add region s:%pa l:%pa\n",
> +			&start, &length);
> +		return -ENOSPC;
> +	}
> +
> +	dax_ext->region = dax_region;
> +	dax_ext->res = ext_res;
> +	dev_dbg(dax_region->dev, "Extent add resource %pr\n", ext_res);
> +
> +	return 0;
> +}
Fan Ni April 9, 2024, 4:22 p.m. UTC | #2
On Sun, Mar 24, 2024 at 04:18:20PM -0700, ira.weiny@intel.com wrote:
> From: Navneet Singh <navneet.singh@intel.com>
> 
> DAX regions mapping dynamic capacity partitions introduce a requirement
> for the memory backing the region to come and go as required.  This
> results in a DAX region with sparse areas of memory backing.  To track
> the sparseness of the region, DAX extent objects need to track
> sub-resource information as a new layer between the DAX region resource
> and DAX device range resources.
> 
> Recall that DCD extents may be accepted when a region is first created.
> Extend this support on region driver load.  Scan existing extents and
> create DAX extent resources as a first step to DAX extent realization.
> 
> The lifetime of a DAX extent is tricky to manage because the extent life
> may end in one of two ways.  First, the device may request the extent be
> released.  Second, the region may release the extent when it is
> destroyed without hardware involvement.  Support extent release without
> hardware involvement first.  Subsequent patches will provide for
> hardware to request extent removal.
> 
> Signed-off-by: Navneet Singh <navneet.singh@intel.com>
> Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
Trivial comments inline.
> ---
> Changes for v1
> [iweiny: remove xarrays]
> [iweiny: remove as much of extra reference stuff as possible]
> [iweiny: Move extent resource handling to core DAX code]
> ---
>  drivers/dax/bus.c         | 55 +++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/dax/cxl.c         | 43 ++++++++++++++++++++++++++++++++++--
>  drivers/dax/dax-private.h | 12 +++++++++++
>  3 files changed, 108 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
> index 903566aff5eb..4d5ed7ab6537 100644
> --- a/drivers/dax/bus.c
> +++ b/drivers/dax/bus.c
> @@ -186,6 +186,61 @@ static bool is_sparse(struct dax_region *dax_region)
>  	return (dax_region->res.flags & IORESOURCE_DAX_SPARSE_CAP) != 0;
>  }
>  
> +static int dax_region_add_resource(struct dax_region *dax_region,
> +				   struct dax_extent *dax_ext,
> +				   resource_size_t start,
> +				   resource_size_t length)
> +{
> +	struct resource *ext_res;
> +
> +	dev_dbg(dax_region->dev, "DAX region resource %pr\n", &dax_region->res);
> +	ext_res = __request_region(&dax_region->res, start, length, "extent", 0);
> +	if (!ext_res) {
> +		dev_err(dax_region->dev, "Failed to add region s:%pa l:%pa\n",
> +			&start, &length);
> +		return -ENOSPC;
> +	}
> +
> +	dax_ext->region = dax_region;
> +	dax_ext->res = ext_res;
> +	dev_dbg(dax_region->dev, "Extent add resource %pr\n", ext_res);
> +
> +	return 0;
> +}
> +
> +static void dax_region_release_extent(void *ext)
> +{
> +	struct dax_extent *dax_ext = ext;
> +	struct dax_region *dax_region = dax_ext->region;
> +
> +	dev_dbg(dax_region->dev, "Extent release resource %pr\n", dax_ext->res);
> +	if (dax_ext->res)
> +		__release_region(&dax_region->res, dax_ext->res->start,
> +				 resource_size(dax_ext->res));
> +
> +	kfree(dax_ext);
> +}
> +
> +int dax_region_add_extent(struct dax_region *dax_region, struct device *ext_dev,
> +			  resource_size_t start, resource_size_t length)
> +{
> +	int rc;
> +
> +	struct dax_extent *dax_ext __free(kfree) = kzalloc(sizeof(*dax_ext),
> +							   GFP_KERNEL);
> +	if (!dax_ext)
> +		return -ENOMEM;
> +
> +	guard(rwsem_write)(&dax_region_rwsem);
> +	rc = dax_region_add_resource(dax_region, dax_ext, start, length);
> +	if (rc)
> +		return rc;
> +
> +	return devm_add_action_or_reset(ext_dev, dax_region_release_extent,
> +					no_free_ptr(dax_ext));
> +}
> +EXPORT_SYMBOL_GPL(dax_region_add_extent);
> +
>  bool static_dev_dax(struct dev_dax *dev_dax)
>  {
>  	return is_static(dev_dax->region);
> diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c
> index 415d03fbf9b6..70bdc7a878ab 100644
> --- a/drivers/dax/cxl.c
> +++ b/drivers/dax/cxl.c
> @@ -5,6 +5,42 @@
>  
>  #include "../cxl/cxl.h"
>  #include "bus.h"
> +#include "dax-private.h"
> +
> +static int __cxl_dax_region_add_extent(struct dax_region *dax_region,
> +				       struct region_extent *reg_ext)
> +{
> +	struct device *ext_dev = &reg_ext->dev;
> +	resource_size_t start, length;
> +
> +	dev_dbg(dax_region->dev, "Adding extent HPA %#llx - %#llx\n",
> +		reg_ext->hpa_range.start, reg_ext->hpa_range.end);
> +
> +	start = dax_region->res.start + reg_ext->hpa_range.start;
> +	length = reg_ext->hpa_range.end - reg_ext->hpa_range.start + 1;
use range_len() instead?

Fan

> +
> +	return dax_region_add_extent(dax_region, ext_dev, start, length);
> +}
> +
> +static int cxl_dax_region_add_extent(struct device *dev, void *data)
> +{
> +	struct dax_region *dax_region = data;
> +	struct region_extent *reg_ext;
> +
> +	if (!is_region_extent(dev))
> +		return 0;
> +
> +	reg_ext = to_region_extent(dev);
> +
> +	return __cxl_dax_region_add_extent(dax_region, reg_ext);
> +}
> +
> +static void cxl_dax_region_add_extents(struct cxl_dax_region *cxlr_dax,
> +				      struct dax_region *dax_region)
> +{
> +	dev_dbg(&cxlr_dax->dev, "Adding extents\n");
> +	device_for_each_child(&cxlr_dax->dev, dax_region, cxl_dax_region_add_extent);
> +}
>  
>  static int cxl_dax_region_probe(struct device *dev)
>  {
> @@ -29,9 +65,12 @@ static int cxl_dax_region_probe(struct device *dev)
>  		return -ENOMEM;
>  
>  	dev_size = range_len(&cxlr_dax->hpa_range);
> -	/* Add empty seed dax device */
> -	if (cxlr->mode == CXL_REGION_DC)
> +	if (cxlr->mode == CXL_REGION_DC) {
> +		/* NOTE: Depends on dax_region being set in driver data */
> +		cxl_dax_region_add_extents(cxlr_dax, dax_region);
> +		/* Add empty seed dax device */
>  		dev_size = 0;
> +	}
>  
>  	data = (struct dev_dax_data) {
>  		.dax_region = dax_region,
> diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
> index 446617b73aea..c6319c6567fb 100644
> --- a/drivers/dax/dax-private.h
> +++ b/drivers/dax/dax-private.h
> @@ -16,6 +16,18 @@ struct inode *dax_inode(struct dax_device *dax_dev);
>  int dax_bus_init(void);
>  void dax_bus_exit(void);
>  
> +/**
> + * struct dax_extent - For sparse regions; an active extent
> + * @region: dax_region this resources is in
> + * @res: resource this extent covers
> + */
> +struct dax_extent {
> +	struct dax_region *region;
> +	struct resource *res;
> +};
> +int dax_region_add_extent(struct dax_region *dax_region, struct device *ext_dev,
> +			  resource_size_t start, resource_size_t length);
> +
>  /**
>   * struct dax_region - mapping infrastructure for dax devices
>   * @id: kernel-wide unique region for a memory range
> 
> -- 
> 2.44.0
>
Dan Williams May 7, 2024, 2:31 a.m. UTC | #3
ira.weiny@ wrote:
> From: Navneet Singh <navneet.singh@intel.com>
> 
> DAX regions mapping dynamic capacity partitions introduce a requirement
> for the memory backing the region to come and go as required.  This
> results in a DAX region with sparse areas of memory backing.  To track
> the sparseness of the region, DAX extent objects need to track
> sub-resource information as a new layer between the DAX region resource
> and DAX device range resources.
> 
> Recall that DCD extents may be accepted when a region is first created.
> Extend this support on region driver load.  Scan existing extents and
> create DAX extent resources as a first step to DAX extent realization.
> 
> The lifetime of a DAX extent is tricky to manage because the extent life
> may end in one of two ways.  First, the device may request the extent be
> released.  Second, the region may release the extent when it is
> destroyed without hardware involvement.  Support extent release without
> hardware involvement first.  Subsequent patches will provide for
> hardware to request extent removal.
> 
> Signed-off-by: Navneet Singh <navneet.singh@intel.com>
> Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ---
> Changes for v1
> [iweiny: remove xarrays]
> [iweiny: remove as much of extra reference stuff as possible]
> [iweiny: Move extent resource handling to core DAX code]
> ---
>  drivers/dax/bus.c         | 55 +++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/dax/cxl.c         | 43 ++++++++++++++++++++++++++++++++++--
>  drivers/dax/dax-private.h | 12 +++++++++++
>  3 files changed, 108 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
> index 903566aff5eb..4d5ed7ab6537 100644
> --- a/drivers/dax/bus.c
> +++ b/drivers/dax/bus.c
> @@ -186,6 +186,61 @@ static bool is_sparse(struct dax_region *dax_region)
>  	return (dax_region->res.flags & IORESOURCE_DAX_SPARSE_CAP) != 0;
>  }
>  
> +static int dax_region_add_resource(struct dax_region *dax_region,
> +				   struct dax_extent *dax_ext,
> +				   resource_size_t start,
> +				   resource_size_t length)
> +{
> +	struct resource *ext_res;
> +
> +	dev_dbg(dax_region->dev, "DAX region resource %pr\n", &dax_region->res);
> +	ext_res = __request_region(&dax_region->res, start, length, "extent", 0);
> +	if (!ext_res) {
> +		dev_err(dax_region->dev, "Failed to add region s:%pa l:%pa\n",
> +			&start, &length);
> +		return -ENOSPC;
> +	}
> +
> +	dax_ext->region = dax_region;
> +	dax_ext->res = ext_res;
> +	dev_dbg(dax_region->dev, "Extent add resource %pr\n", ext_res);

dax_ext is never used, it feels like this these helpers are in the wrong
patch. Like consumer side of dax_ext infrastructure lands *before* the
producer side.

Because the justification for this producer-side patch is after the case
for 'struct dax_extent' has been made.

> +int dax_region_add_extent(struct dax_region *dax_region, struct device *ext_dev,
> +			  resource_size_t start, resource_size_t length)
> +{
> +	int rc;
> +
> +	struct dax_extent *dax_ext __free(kfree) = kzalloc(sizeof(*dax_ext),
> +							   GFP_KERNEL);
> +	if (!dax_ext)
> +		return -ENOMEM;
> +
> +	guard(rwsem_write)(&dax_region_rwsem);
> +	rc = dax_region_add_resource(dax_region, dax_ext, start, length);
> +	if (rc)
> +		return rc;
> +
> +	return devm_add_action_or_reset(ext_dev, dax_region_release_extent,
> +					no_free_ptr(dax_ext));

This looks like an awkward rewrite of __devm_request_region(), but
likely that is because dax_ext is vestigial in this patch.

> +static void cxl_dax_region_add_extents(struct cxl_dax_region *cxlr_dax,
> +				      struct dax_region *dax_region)
> +{
> +	dev_dbg(&cxlr_dax->dev, "Adding extents\n");
> +	device_for_each_child(&cxlr_dax->dev, dax_region, cxl_dax_region_add_extent);

Per the comment on the last patch to move extent device creation to
cxl_dax_region_probe() that can get rid of looping over those devices
another time.
diff mbox series

Patch

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 903566aff5eb..4d5ed7ab6537 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -186,6 +186,61 @@  static bool is_sparse(struct dax_region *dax_region)
 	return (dax_region->res.flags & IORESOURCE_DAX_SPARSE_CAP) != 0;
 }
 
+static int dax_region_add_resource(struct dax_region *dax_region,
+				   struct dax_extent *dax_ext,
+				   resource_size_t start,
+				   resource_size_t length)
+{
+	struct resource *ext_res;
+
+	dev_dbg(dax_region->dev, "DAX region resource %pr\n", &dax_region->res);
+	ext_res = __request_region(&dax_region->res, start, length, "extent", 0);
+	if (!ext_res) {
+		dev_err(dax_region->dev, "Failed to add region s:%pa l:%pa\n",
+			&start, &length);
+		return -ENOSPC;
+	}
+
+	dax_ext->region = dax_region;
+	dax_ext->res = ext_res;
+	dev_dbg(dax_region->dev, "Extent add resource %pr\n", ext_res);
+
+	return 0;
+}
+
+static void dax_region_release_extent(void *ext)
+{
+	struct dax_extent *dax_ext = ext;
+	struct dax_region *dax_region = dax_ext->region;
+
+	dev_dbg(dax_region->dev, "Extent release resource %pr\n", dax_ext->res);
+	if (dax_ext->res)
+		__release_region(&dax_region->res, dax_ext->res->start,
+				 resource_size(dax_ext->res));
+
+	kfree(dax_ext);
+}
+
+int dax_region_add_extent(struct dax_region *dax_region, struct device *ext_dev,
+			  resource_size_t start, resource_size_t length)
+{
+	int rc;
+
+	struct dax_extent *dax_ext __free(kfree) = kzalloc(sizeof(*dax_ext),
+							   GFP_KERNEL);
+	if (!dax_ext)
+		return -ENOMEM;
+
+	guard(rwsem_write)(&dax_region_rwsem);
+	rc = dax_region_add_resource(dax_region, dax_ext, start, length);
+	if (rc)
+		return rc;
+
+	return devm_add_action_or_reset(ext_dev, dax_region_release_extent,
+					no_free_ptr(dax_ext));
+}
+EXPORT_SYMBOL_GPL(dax_region_add_extent);
+
 bool static_dev_dax(struct dev_dax *dev_dax)
 {
 	return is_static(dev_dax->region);
diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c
index 415d03fbf9b6..70bdc7a878ab 100644
--- a/drivers/dax/cxl.c
+++ b/drivers/dax/cxl.c
@@ -5,6 +5,42 @@ 
 
 #include "../cxl/cxl.h"
 #include "bus.h"
+#include "dax-private.h"
+
+static int __cxl_dax_region_add_extent(struct dax_region *dax_region,
+				       struct region_extent *reg_ext)
+{
+	struct device *ext_dev = &reg_ext->dev;
+	resource_size_t start, length;
+
+	dev_dbg(dax_region->dev, "Adding extent HPA %#llx - %#llx\n",
+		reg_ext->hpa_range.start, reg_ext->hpa_range.end);
+
+	start = dax_region->res.start + reg_ext->hpa_range.start;
+	length = reg_ext->hpa_range.end - reg_ext->hpa_range.start + 1;
+
+	return dax_region_add_extent(dax_region, ext_dev, start, length);
+}
+
+static int cxl_dax_region_add_extent(struct device *dev, void *data)
+{
+	struct dax_region *dax_region = data;
+	struct region_extent *reg_ext;
+
+	if (!is_region_extent(dev))
+		return 0;
+
+	reg_ext = to_region_extent(dev);
+
+	return __cxl_dax_region_add_extent(dax_region, reg_ext);
+}
+
+static void cxl_dax_region_add_extents(struct cxl_dax_region *cxlr_dax,
+				      struct dax_region *dax_region)
+{
+	dev_dbg(&cxlr_dax->dev, "Adding extents\n");
+	device_for_each_child(&cxlr_dax->dev, dax_region, cxl_dax_region_add_extent);
+}
 
 static int cxl_dax_region_probe(struct device *dev)
 {
@@ -29,9 +65,12 @@  static int cxl_dax_region_probe(struct device *dev)
 		return -ENOMEM;
 
 	dev_size = range_len(&cxlr_dax->hpa_range);
-	/* Add empty seed dax device */
-	if (cxlr->mode == CXL_REGION_DC)
+	if (cxlr->mode == CXL_REGION_DC) {
+		/* NOTE: Depends on dax_region being set in driver data */
+		cxl_dax_region_add_extents(cxlr_dax, dax_region);
+		/* Add empty seed dax device */
 		dev_size = 0;
+	}
 
 	data = (struct dev_dax_data) {
 		.dax_region = dax_region,
diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
index 446617b73aea..c6319c6567fb 100644
--- a/drivers/dax/dax-private.h
+++ b/drivers/dax/dax-private.h
@@ -16,6 +16,18 @@  struct inode *dax_inode(struct dax_device *dax_dev);
 int dax_bus_init(void);
 void dax_bus_exit(void);
 
+/**
+ * struct dax_extent - For sparse regions; an active extent
+ * @region: dax_region this resources is in
+ * @res: resource this extent covers
+ */
+struct dax_extent {
+	struct dax_region *region;
+	struct resource *res;
+};
+int dax_region_add_extent(struct dax_region *dax_region, struct device *ext_dev,
+			  resource_size_t start, resource_size_t length);
+
 /**
  * struct dax_region - mapping infrastructure for dax devices
  * @id: kernel-wide unique region for a memory range