diff mbox series

[v4,24/26] cxl: preclude device memory to be used for dax

Message ID 20241017165225.21206-25-alejandro.lucero-palau@amd.com (mailing list archive)
State Not Applicable
Series cxl: add Type2 device support

Checks

Context Check Description
netdev/tree_selection success Guessing tree name failed - patch did not apply

Commit Message

Lucero Palau, Alejandro Oct. 17, 2024, 4:52 p.m. UTC
From: Alejandro Lucero <alucerop@amd.com>

By definition a type2 cxl device will use the host managed memory for
specific functionality, therefore it should not be available to other
uses.

Signed-off-by: Alejandro Lucero <alucerop@amd.com>
---
 drivers/cxl/core/region.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Ben Cheatham Oct. 17, 2024, 9:50 p.m. UTC | #1
On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> By definition a type2 cxl device will use the host managed memory for
> specific functionality, therefore it should not be available to other
> uses.
> 

I disagree that this is a valid assumption. I don't think that the device memory
should be added as system ram, but I do think there is value in having the
option to have the memory be available as a device-dax region. My reasoning here is:

1) I can think of a possible use case where the memory benefits from being user space
accessible (CXL memory GPU buffers).
2) I think it's really early to say this is the only way we expect these devices to
be used. The flip side of this is that it is early, so we can always change it later
when we start seeing real devices, but I would vote to keep a more flexible structure
early and if no one uses it oh well.

My idea here is that whoever writes the driver indicates whether they want to make
the device memory device-dax mappable, or do it all manually like you are now. I've
been working on an RFC based on v3 of this series that has this (as well as the
"better" solution mentioned in patch 22/26) that I was planning on sending out in
the next week or two, but if the consensus here is that this is not the direction
we want to go I'll probably drop that portion.

Thanks,
Ben

> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/core/region.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 04c270a29e96..7c84d8f89af6 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3703,6 +3703,9 @@ static int cxl_region_probe(struct device *dev)
>  	case CXL_DECODER_PMEM:
>  		return devm_cxl_add_pmem_region(cxlr);
>  	case CXL_DECODER_RAM:
> +		if (cxlr->type != CXL_DECODER_HOSTONLYMEM)
> +			return 0;
> +
>  		/*
>  		 * The region can not be manged by CXL if any portion of
>  		 * it is already online as 'System RAM'
Alejandro Lucero Palau Oct. 18, 2024, 8:10 a.m. UTC | #2
On 10/17/24 22:50, Ben Cheatham wrote:
> On 10/17/24 11:52 AM, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> By definition a type2 cxl device will use the host managed memory for
>> specific functionality, therefore it should not be available to other
>> uses.
>>
> I disagree that this is a valid assumption. I don't think that the device memory
> should be added as system ram, but I do think there is value in having the
> option to have the memory be available as a device-dax region. My reasoning here is:
>
> 1) I can think of a possible use case where the memory benefits from being user space
> accessible (CXL memory GPU buffers).
> 2) I think it's really early to say this is the only way we expect these devices to
> be used. The flip side of this is that it is early, so we can always change it later
> when we start seeing real devices, but I would vote to keep a more flexible structure
> early and if no one uses it oh well.
>
> My idea here is that whoever writes the driver indicates whether they want to make
> the device memory device-dax mappable, or do it all manually like you are now. I've
> been working on an RFC based on v3 of this series that has this (as well as the
> "better" solution mentioned in patch 22/26) that I was planning on sending out in
> the next week or two, but if the consensus here is that this is not the direction
> we want to go I'll probably drop that portion.


I understand your point, and I agree dax creation could be required or
interesting for some accelerators.

My experience when testing without this patch is that the system uses the
dax region even without any specific user space app, so the system was
crashing because the CXL memory backend was not doing the expected
thing. That is exactly the case for our device, where memory should
not be used except with the right format when writing. So the trivial
patch was to preclude dax creation for an accel/Type2.

I'm not against adding that flexibility now with a flag set by the
driver at region creation time, so I'll add it for v5 if no one is against it.
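
To make the two options concrete, here is a minimal sketch of the decision being discussed. It is a userspace model and not kernel code. The enum names mirror those in the patch, but `struct cxl_region_model`, the `no_dax` flag, `enum probe_action` and both `probe_*` helpers are hypothetical names invented for illustration. The real RAM case in `cxl_region_probe()` also checks whether the region is already online as System RAM, which this model omits.

```c
#include <stdbool.h>

/* Userspace model only: these types mimic, but are not, the kernel's. */
enum cxl_decoder_mode { CXL_DECODER_RAM, CXL_DECODER_PMEM };
enum cxl_decoder_type { CXL_DECODER_DEVMEM, CXL_DECODER_HOSTONLYMEM };

struct cxl_region_model {
	enum cxl_decoder_mode mode;
	enum cxl_decoder_type type;
	/* Hypothetical flag the driver would set at region creation time. */
	bool no_dax;
};

/* What the modelled probe would do with the region. */
enum probe_action { PROBE_SKIP_DAX, PROBE_ADD_DAX, PROBE_ADD_PMEM };

/* v4 behaviour: a RAM region that is not host-only never gets a dax region. */
static enum probe_action probe_v4(const struct cxl_region_model *r)
{
	if (r->mode == CXL_DECODER_PMEM)
		return PROBE_ADD_PMEM;
	if (r->type != CXL_DECODER_HOSTONLYMEM)
		return PROBE_SKIP_DAX;
	return PROBE_ADD_DAX;
}

/* Flag-based alternative: the driver opts out of dax per region. */
static enum probe_action probe_with_flag(const struct cxl_region_model *r)
{
	if (r->mode == CXL_DECODER_PMEM)
		return PROBE_ADD_PMEM;
	if (r->no_dax)
		return PROBE_SKIP_DAX;
	return PROBE_ADD_DAX;
}
```

With the flag, a Type2 driver whose memory must only be written in a device-specific format would set `no_dax`, while an accelerator that benefits from user space access could leave it clear and still get a device-dax region.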


Thanks!


> Thanks,
> Ben
>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> ---
>>   drivers/cxl/core/region.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index 04c270a29e96..7c84d8f89af6 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -3703,6 +3703,9 @@ static int cxl_region_probe(struct device *dev)
>>   	case CXL_DECODER_PMEM:
>>   		return devm_cxl_add_pmem_region(cxlr);
>>   	case CXL_DECODER_RAM:
>> +		if (cxlr->type != CXL_DECODER_HOSTONLYMEM)
>> +			return 0;
>> +
>>   		/*
>>   		 * The region can not be manged by CXL if any portion of
>>   		 * it is already online as 'System RAM'

Patch

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 04c270a29e96..7c84d8f89af6 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3703,6 +3703,9 @@  static int cxl_region_probe(struct device *dev)
 	case CXL_DECODER_PMEM:
 		return devm_cxl_add_pmem_region(cxlr);
 	case CXL_DECODER_RAM:
+		if (cxlr->type != CXL_DECODER_HOSTONLYMEM)
+			return 0;
+
 		/*
 		 * The region can not be manged by CXL if any portion of
 		 * it is already online as 'System RAM'