[v10,09/26] cxl: support device identification without mailbox

Message ID 20250205151950.25268-10-alucerop@amd.com (mailing list archive)
State Not Applicable
Series cxl: add type2 device basic support

Checks

Context Check Description
netdev/tree_selection success Guessing tree name failed - patch did not apply

Commit Message

Alejandro Lucero Palau Feb. 5, 2025, 3:19 p.m. UTC
From: Alejandro Lucero <alucerop@amd.com>

Type3 relies on mailbox CXL_MBOX_OP_IDENTIFY command for initializing
memdev state params.

Allow a Type2 driver to initialize the same params using an info struct,
and assume partition alignment is not required for now.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/memdev.c | 12 ++++++++++++
 include/cxl/cxl.h         | 11 +++++++++++
 2 files changed, 23 insertions(+)
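
A minimal userspace sketch of how an accel driver might call the new helper
is shown below. It is not kernel code: the structs are stand-ins that model
only the fields this patch touches, and accel_setup_capacity() and the
256 MiB capacity are hypothetical. The body of cxl_dev_state_setup() mirrors
the one in the patch, which makes one property visible: the helper returns
void, so a caller cannot tell from the call itself that the copy was skipped
because media_ready was not set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t u64;

/* Userspace stand-ins: only the fields touched by the patch are modelled. */
struct cxl_dev_state {
	bool media_ready;
};

struct cxl_memdev_state {
	struct cxl_dev_state cxlds;
	u64 total_bytes;
	u64 volatile_only_bytes;
	u64 persistent_only_bytes;
	u64 partition_align_bytes;
};

struct mds_info {
	u64 total_bytes;
	u64 volatile_only_bytes;
	u64 persistent_only_bytes;
};

/* Same logic as the helper added by the patch. */
void cxl_dev_state_setup(struct cxl_memdev_state *mds, struct mds_info *info)
{
	if (!mds->cxlds.media_ready)
		return;

	mds->total_bytes = info->total_bytes;
	mds->volatile_only_bytes = info->volatile_only_bytes;
	mds->persistent_only_bytes = info->persistent_only_bytes;
	mds->partition_align_bytes = 0;
}

/*
 * Hypothetical accel driver step: a device without a mailbox that exposes
 * 256 MiB of volatile-only capacity (the size is an assumption).
 * Returns 0 if the values were applied, -1 if the helper skipped them.
 */
int accel_setup_capacity(struct cxl_memdev_state *mds)
{
	struct mds_info info = {
		.total_bytes = 256ULL << 20,
		.volatile_only_bytes = 256ULL << 20,
		.persistent_only_bytes = 0,
	};

	/* Sanity check on the hard-coded values before handing them over. */
	assert(info.volatile_only_bytes + info.persistent_only_bytes <=
	       info.total_bytes);

	cxl_dev_state_setup(mds, &info);

	/* The helper returns void, so detect the skipped case by inspection. */
	return mds->total_bytes == info.total_bytes ? 0 : -1;
}
```

In this sketch the caller must mark the media ready before calling the
helper; otherwise the memdev state is left untouched.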

Comments

Ira Weiny Feb. 5, 2025, 9:45 p.m. UTC | #1
alucerop@ wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> Type3 relies on mailbox CXL_MBOX_OP_IDENTIFY command for initializing
> memdev state params.
> 
> Allow a Type2 driver to initialize same params using an info struct and
> assume partition alignment not required by now.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>

This is exactly the type of thing I was hoping to avoid by removing these
members from the mds.  There is no reason you should have to fake these
values within an mds just to create partitions in the device state.

Still wrapping my head around the entire series though...

Ira

> ---
>  drivers/cxl/core/memdev.c | 12 ++++++++++++
>  include/cxl/cxl.h         | 11 +++++++++++
>  2 files changed, 23 insertions(+)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 456d505f1bc8..7113a51b3a93 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -655,6 +655,18 @@ struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev, u64 serial,
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_memdev_state_create, "CXL");
>  
> +void cxl_dev_state_setup(struct cxl_memdev_state *mds, struct mds_info *info)
> +{
> +	if (!mds->cxlds.media_ready)
> +		return;
> +
> +	mds->total_bytes = info->total_bytes;
> +	mds->volatile_only_bytes = info->volatile_only_bytes;
> +	mds->persistent_only_bytes = info->persistent_only_bytes;
> +	mds->partition_align_bytes = 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_dev_state_setup, "CXL");
> +
>  static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
>  					   const struct file_operations *fops)
>  {
> diff --git a/include/cxl/cxl.h b/include/cxl/cxl.h
> index 955e58103df6..1b2224ee1d5b 100644
> --- a/include/cxl/cxl.h
> +++ b/include/cxl/cxl.h
> @@ -39,6 +39,16 @@ enum cxl_devtype {
>  	CXL_DEVTYPE_CLASSMEM,
>  };
>  
> +/*
> + * struct for an accel driver giving partition data when Type2 device without a
> + * mailbox.
> + */
> +struct mds_info {
> +	u64 total_bytes;
> +	u64 volatile_only_bytes;
> +	u64 persistent_only_bytes;
> +};
> +
>  struct device;
>  struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev, u64 serial,
>  					   u16 dvsec, enum cxl_devtype type);
> @@ -48,4 +58,5 @@ int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_memdev_state *cxlm
>  			     unsigned long *caps);
>  int cxl_await_media_ready(struct cxl_memdev_state *mds);
>  void cxl_set_media_ready(struct cxl_memdev_state *mds);
> +void cxl_dev_state_setup(struct cxl_memdev_state *mds, struct mds_info *info);
>  #endif
> -- 
> 2.17.1
> 
>
Alejandro Lucero Palau Feb. 6, 2025, 6:10 p.m. UTC | #2
On 2/5/25 21:45, Ira Weiny wrote:
> alucerop@ wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Type3 relies on mailbox CXL_MBOX_OP_IDENTIFY command for initializing
>> memdev state params.
>>
>> Allow a Type2 driver to initialize same params using an info struct and
>> assume partition alignment not required by now.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> This is exactly the type of thing I was hoping to avoid by removing these
> members from the mds.  There is no reason you should have to fake these
> values within an mds just to create partitions in the device state.


Let's be practical here.


A type2 without a mailbox needs to give that information for building up 
the DPA partitions. Before it was about dealing with DPA resources from 
the accel driver, but I do not think an accel driver should handle any 
partition setup at all. Mainly because there is code now doing that in 
the cxl core which can be used for accel drivers without requiring too 
much effort. You can see what the sfc driver does now, and it is 
equivalent to the current pci driver. An accel driver with a device 
supporting a mailbox will do exactly the same as the pci driver.


The burden of avoiding the mds fields should not fall on the accel 
driver. This patch adds a way of giving the required (and little) info 
to the core for building the partitions. So if you or Dan suggest this 
is wrong and the accel driver should deal with the intrinsics of DPA 
partitions, I will fight against it :-)


I'm quite happy with the DPA partition work, with the result of current 
v10 being simpler and cleaner. But it is time to get the patchsets 
depending on that cleaning work going forward.
Ira Weiny Feb. 6, 2025, 7:23 p.m. UTC | #3
Alejandro Lucero Palau wrote:
> 
> On 2/5/25 21:45, Ira Weiny wrote:
> > alucerop@ wrote:
> >> From: Alejandro Lucero <alucerop@amd.com>
> >>
> >> Type3 relies on mailbox CXL_MBOX_OP_IDENTIFY command for initializing
> >> memdev state params.
> >>
> >> Allow a Type2 driver to initialize same params using an info struct and
> >> assume partition alignment not required by now.
> >>
> >> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> > This is exactly the type of thing I was hoping to avoid by removing these
> > members from the mds.  There is no reason you should have to fake these
> > values within an mds just to create partitions in the device state.
> 
> 
> Let's be practical here.
> 
> 
> A type2 without a mailbox needs to give that information for building up 
> the DPA partitions. Before it was about dealing with DPA resources from 
> the accel driver, but I do not think an accel driver should handle any 
> partition setup at all.

I 100% totally agree!  However, the dev state is where those partitions are
managed.  Not the memdev state.

> Mainly because there is code now doing that in 
> the cxl core which can be used for accel drivers without requiring too 
> much effort. You can see what the sfc driver does now, and it is 
> equivalent to the current pci driver. An accel driver with a device 
> supporting a mailbox will do exactly the same than the pci driver.
> 

I agree that the effort you made in these patches was not huge.  Changing the
types around and defining mds_info is not hard.  But the final result is odd
and does not fix a couple of the issues Dan had with the core architecture.
First of which is the carrying of initialization values in the memdev
state:[1]

[1]

	> @@ -473,7 +488,9 @@ static inline struct cxl_dev_state *mbox_to_cxlds(struct cxl_mailbox *cxl_mbox)
	>   * @dcd_cmds: List of DCD commands implemented by memory device
	>   * @enabled_cmds: Hardware commands found enabled in CEL.
	>   * @exclusive_cmds: Commands that are kernel-internal only
	> - * @total_bytes: sum of all possible capacities
	> + * @total_bytes: length of all possible capacities
	> + * @static_bytes: length of possible static RAM and PMEM partitions
	> + * @dynamic_bytes: length of possible DC partitions (DC Regions)
	>   * @volatile_only_bytes: hard volatile capacity
	>   * @persistent_only_bytes: hard persistent capacity
	
	I have regrets that cxl_memdev_state permanently carries runtime
	storage for init time variables, lets not continue down that path
	with DCD enabling.

	-- https://lore.kernel.org/all/67871f05cd767_20f32947f@dwillia2-xfh.jf.intel.com.notmuch/

> For avoiding the mds fields the weight should not be on the accel 
> driver.

I agree.  So why would you want to use the mds fields at all?

I proposed a helper function to create cxl_dpa_info [cxl_add_partition] and Dan
proposed a function to create the partitions from cxl_dpa_info
[cxl_dpa_setup].[2]

[2]

   void cxl_add_partition(struct cxl_dpa_info *info, u64 start, u64 size, enum cxl_partition_mode mode)
   int cxl_dpa_setup(struct cxl_dev_state *cxlds, const struct cxl_dpa_info *info)

	-- https://lore.kernel.org/all/20250128-rfc-rearch-mem-res-v1-2-26d1ca151376@intel.com/

What more do you need?

> This patch adds a way for giving the required (and little) info 
> to the core for building the partitions.

The second issue with your patch set is in the addition of struct mds_info.
This has the same issue which Dan objected to about creating a temporary
variable[3] but this is worse than my proposal in that your set continues to
carry the initialization state around in the memdev forever.

[3]

	The crux of the concern for me is less about the role of
	cxl_mem_get_partition_info() and more about the introduction of a new
	'struct cxl_mem_dev_info' in/out parameter which is similar in function
	to 'struct cxl_dpa_info'. If you can find a way to avoid another level
	of indirection or otherwise consolidate all these steps into a straight
	line routine that does "all the DPA enumeration" things.

	-- https://lore.kernel.org/all/67a28921ca0b5_2d2c29434@dwillia2-xfh.jf.intel.com.notmuch/


Note to Dan.  I think doing 'all the DPA enumeration' things is the issue
here.  DCD further complicates this because it adds an additional DPA
discovery mechanism.  In summary we have:

	1) Identify Memory Device (existing)
	2) Hard coded values (Alejandro's type 2 set)
	3) Get dynamic capacity configuration (DCD set)

It is conceivable that a device might want to do some random combination of
those.  But the combinations we have in front of us are:

	A) 1 only
	B) 2 only
	C) 1 & 3

I'm not sure it is worth having a single call which attempts to enumerate the
dpa info.  I'll explore having a call which does A & C for mailbox supported
devices.  But B was specifically in my mind when I came up with the
cxl_add_partition() call.  And I felt using it in A and C would work just
fine.

> So if you or Dan suggest this 
> is wrong and the accel driver should deal with the intrinsics of DPA 
> partitions, I will fight against it :-)

I don't want an accel driver to deal with the intrinsics of the DPA
partitions at all!  But it should be able to specify the size parameters
separate from creating dummy memdev state objects with values it does not
care about.

> 
> I'm quite happy with the DPA partition work,

As am I.  I'm just trying to go a step further so it fits a bit cleaner
when DCD comes along.  I do apologize for the delay and churn in your set.
That was not my intention.  But I thought the alterations of the memdev
state were a good clean up.

> with the result of current 
> v10 being simpler and cleaner. But it is time to get the patchsets 
> depending on that cleaning work going forward.

Agreed.

If Dan likes what you have here I will adjust the DCD work.

Ira
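
The alternative flow discussed in comment #3 (case B, hard-coded values fed
straight into the device state) can be sketched in userspace as follows.
Only the two function signatures come from the thread. The layout of
struct cxl_dpa_info, the MOCK_MAX_PARTITIONS limit, the enum values, the
ordering check and accel_setup_dpa() are assumptions made for illustration
and may not match the linked RFC.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

typedef uint64_t u64;

#define MOCK_MAX_PARTITIONS 4	/* assumption, not a kernel constant */

/* Assumed mode names for illustration. */
enum cxl_partition_mode {
	CXL_PARTMODE_RAM,
	CXL_PARTMODE_PMEM,
};

struct cxl_dpa_part {
	u64 start;
	u64 size;
	enum cxl_partition_mode mode;
};

/* Temporary, init-time description of the DPA layout (assumed layout). */
struct cxl_dpa_info {
	u64 size;
	int nr_partitions;
	struct cxl_dpa_part part[MOCK_MAX_PARTITIONS];
};

/* Stand-in device state: the partitions live here, not in the memdev state. */
struct cxl_dev_state {
	int nr_partitions;
	struct cxl_dpa_part part[MOCK_MAX_PARTITIONS];
};

/* Append one partition to the temporary info; empty partitions are skipped. */
void cxl_add_partition(struct cxl_dpa_info *info, u64 start, u64 size,
		       enum cxl_partition_mode mode)
{
	if (!size || info->nr_partitions >= MOCK_MAX_PARTITIONS)
		return;

	info->part[info->nr_partitions++] =
		(struct cxl_dpa_part){ start, size, mode };
	if (start + size > info->size)
		info->size = start + size;
}

/* Validate ordering, then copy the layout into the device state. */
int cxl_dpa_setup(struct cxl_dev_state *cxlds, const struct cxl_dpa_info *info)
{
	u64 next = 0;

	for (int i = 0; i < info->nr_partitions; i++) {
		if (info->part[i].start < next)
			return -EINVAL;	/* overlapping or out of order */
		next = info->part[i].start + info->part[i].size;
	}

	cxlds->nr_partitions = info->nr_partitions;
	for (int i = 0; i < info->nr_partitions; i++)
		cxlds->part[i] = info->part[i];
	return 0;
}

/* Hypothetical Type2 driver step: one volatile partition, no mailbox. */
int accel_setup_dpa(struct cxl_dev_state *cxlds, u64 ram_bytes)
{
	struct cxl_dpa_info info = { 0 };	/* lives only on the stack */

	cxl_add_partition(&info, 0, ram_bytes, CXL_PARTMODE_RAM);
	return cxl_dpa_setup(cxlds, &info);
}
```

The point of the sketch is that the size parameters exist only in the
on-stack cxl_dpa_info during setup, so nothing init-time has to be carried
in the memdev state afterwards.
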
Alejandro Lucero Palau Feb. 17, 2025, 1:41 p.m. UTC | #4
On 2/6/25 19:23, Ira Weiny wrote:
> Alejandro Lucero Palau wrote:
>> On 2/5/25 21:45, Ira Weiny wrote:
>>> alucerop@ wrote:
>>>> From: Alejandro Lucero <alucerop@amd.com>
>>>>
>>>> Type3 relies on mailbox CXL_MBOX_OP_IDENTIFY command for initializing
>>>> memdev state params.
>>>>
>>>> Allow a Type2 driver to initialize same params using an info struct and
>>>> assume partition alignment not required by now.
>>>>
>>>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>>> This is exactly the type of thing I was hoping to avoid by removing these
>>> members from the mds.  There is no reason you should have to fake these
>>> values within an mds just to create partitions in the device state.
>>
>> Let's be practical here.
>>
>>
>> A type2 without a mailbox needs to give that information for building up
>> the DPA partitions. Before it was about dealing with DPA resources from
>> the accel driver, but I do not think an accel driver should handle any
>> partition setup at all.
> I 100% totally agree!  However, the dev state is where those partitions are
> managed.  Not the memdev state.


But as I said in other previous patches, this patchset version does use 
cxl_memdev_state as the opaque struct to be used by the accel driver.


>> Mainly because there is code now doing that in
>> the cxl core which can be used for accel drivers without requiring too
>> much effort. You can see what the sfc driver does now, and it is
>> equivalent to the current pci driver. An accel driver with a device
>> supporting a mailbox will do exactly the same than the pci driver.
>>
> I agree that the effort you made in these patches was not huge.  Changing the
> types around and defining mds_info is not hard.  But the final result is odd
> and does not fix a couple of the issues Dan had with the core architecture.
> First of which is the carrying of initialization values in the memdev
> state:[1]
>
> [1]
>
> 	> @@ -473,7 +488,9 @@ static inline struct cxl_dev_state *mbox_to_cxlds(struct cxl_mailbox *cxl_mbox)
> 	>   * @dcd_cmds: List of DCD commands implemented by memory device
> 	>   * @enabled_cmds: Hardware commands found enabled in CEL.
> 	>   * @exclusive_cmds: Commands that are kernel-internal only
> 	> - * @total_bytes: sum of all possible capacities
> 	> + * @total_bytes: length of all possible capacities
> 	> + * @static_bytes: length of possible static RAM and PMEM partitions
> 	> + * @dynamic_bytes: length of possible DC partitions (DC Regions)
> 	>   * @volatile_only_bytes: hard volatile capacity
> 	>   * @persistent_only_bytes: hard persistent capacity
> 	
> 	I have regrets that cxl_memdev_state permanently carries runtime
> 	storage for init time variables, lets not continue down that path
> 	with DCD enabling.
>
> 	-- https://lore.kernel.org/all/67871f05cd767_20f32947f@dwillia2-xfh.jf.intel.com.notmuch/
>
>> For avoiding the mds fields the weight should not be on the accel
>> driver.
> I agree.  So why would you want to use the mds fields at all?


I just wanted to have the Type2 support patchset working with the new DPA 
work. I was hoping those concerns would not be addressed with yet another 
patch or patches that the Type2 work would then have to adapt to.


>
> I proposed a helper function to create cxl_dpa_info [cxl_add_partition] and Dan
> proposed a function to create the partitions from cxl_dpa_info
> [cxl_dpa_setup].[2]
>
> [2]
>
>     void cxl_add_partition(struct cxl_dpa_info *info, u64 start, u64 size, enum cxl_partition_mode mode)
>     int cxl_dpa_setup(struct cxl_dev_state *cxlds, const struct cxl_dpa_info *info)
>
> 	-- https://lore.kernel.org/all/20250128-rfc-rearch-mem-res-v1-2-26d1ca151376@intel.com/
>
> What more do you need?


I need a stable API to work with, one that is not going to change so 
quickly after work like the DPA changes.


>> This patch adds a way for giving the required (and little) info
>> to the core for building the partitions.
> The second issue with your patch set is in the addition of struct mds_info.
> This has the same issue which Dan objected to about creating a temporary
> variable[3] but this is worse than my proposal in that your set continues to
> carry the initialization state around in the memdev forever.
>
> [3]
>
> 	The crux of the concern for me is less about the role of
> 	cxl_mem_get_partition_info() and more about the introduction of a new
> 	'struct cxl_mem_dev_info' in/out parameter which is similar in function
> 	to 'struct cxl_dpa_info'. If you can find a way to avoid another level
> 	of indirection or otherwise consolidate all these steps into a straight
> 	line routine that does "all the DPA enumeration" things.
>
> 	-- https://lore.kernel.org/all/67a28921ca0b5_2d2c29434@dwillia2-xfh.jf.intel.com.notmuch/
>
>
> Note to Dan.  I think doing 'all the DPA enumeration' things is the issue
> here.  DCD further complicates this because it adds an additional DPA
> discovery mechanism.  In summary we have:
>
> 	1) Identify Memory Device (existing)
> 	2) Hard coded values (Alejandro's type 2 set)
> 	3) Get dynamic capacity configuration (DCD set)
>
> It is conceivable that a device might want to do some random combination of
> those.  But the combinations we have in front of us are:
>
> 	A) 1 only
> 	B) 2 only
> 	C) 1 & 3
>
> I'm not sure it is worth having a single call which attempts to enumerate the
> dpa info.  I'll explore having a call which does A & C for mailbox supported
> devices.  But B was specifically in my mind when I came up with the
> cxl_add_partition() call.  And I felt using it in A and C would work just
> fine.
>
>> So if you or Dan suggest this
>> is wrong and the accel driver should deal with the intrinsics of DPA
>> partitions, I will fight against it :-)
> I don't want an accel driver to deal with the intrinsics of the DPA
> partitions at all!  But it should be able to specify the size parameters
> separate from creating dummy memdev state objects with values it does not
> care about.


Yes, but those objects have been there for a long time ...

I bet we can optimize other aspects of those structs as well, but this 
is being done in the middle of patches like Type2 and DCD relying on them.


>> I'm quite happy with the DPA partition work,
> As am I.  I'm just trying to go a step further so it fits a bit cleaner
> when DCD comes along.  I do apologize for the delay and churn in your set.
> That was not my intention.  But I thought the alterations of the memdev
> state were a good clean up.
>
>> with the result of current
>> v10 being simpler and cleaner. But it is time to get the patchsets
>> depending on that cleaning work going forward.
> Agreed.
>
> If Dan likes what you have here I will adjust the DCD work.
>
> Ira
Patch

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 456d505f1bc8..7113a51b3a93 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -655,6 +655,18 @@  struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev, u64 serial,
 }
 EXPORT_SYMBOL_NS_GPL(cxl_memdev_state_create, "CXL");
 
+void cxl_dev_state_setup(struct cxl_memdev_state *mds, struct mds_info *info)
+{
+	if (!mds->cxlds.media_ready)
+		return;
+
+	mds->total_bytes = info->total_bytes;
+	mds->volatile_only_bytes = info->volatile_only_bytes;
+	mds->persistent_only_bytes = info->persistent_only_bytes;
+	mds->partition_align_bytes = 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_dev_state_setup, "CXL");
+
 static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
 					   const struct file_operations *fops)
 {
diff --git a/include/cxl/cxl.h b/include/cxl/cxl.h
index 955e58103df6..1b2224ee1d5b 100644
--- a/include/cxl/cxl.h
+++ b/include/cxl/cxl.h
@@ -39,6 +39,16 @@  enum cxl_devtype {
 	CXL_DEVTYPE_CLASSMEM,
 };
 
+/*
+ * struct for an accel driver giving partition data when Type2 device without a
+ * mailbox.
+ */
+struct mds_info {
+	u64 total_bytes;
+	u64 volatile_only_bytes;
+	u64 persistent_only_bytes;
+};
+
 struct device;
 struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev, u64 serial,
 					   u16 dvsec, enum cxl_devtype type);
@@ -48,4 +58,5 @@  int cxl_pci_accel_setup_regs(struct pci_dev *pdev, struct cxl_memdev_state *cxlm
 			     unsigned long *caps);
 int cxl_await_media_ready(struct cxl_memdev_state *mds);
 void cxl_set_media_ready(struct cxl_memdev_state *mds);
+void cxl_dev_state_setup(struct cxl_memdev_state *mds, struct mds_info *info);
 #endif