[v1,1/1] cxl/mem: Do not return error if CONFIG_CXL_MCE unset

Message ID 20250227101848.388595-1-ming.li@zohomail.com
State Accepted
Commit 7d3d6e187ae9f319c56db9abf54ed2a5a8de55c3
Series [v1,1/1] cxl/mem: Do not return error if CONFIG_CXL_MCE unset

Commit Message

Li Ming Feb. 27, 2025, 10:18 a.m. UTC
CONFIG_CXL_MCE depends on CONFIG_MEMORY_FAILURE. If CONFIG_CXL_MCE is
not set, devm_cxl_register_mce_notifier() returns -EOPNOTSUPP, which
makes cxl_memdev_state_create() fail and, in turn, makes CXL PCI device
probing fail. A missing MCE notifier should not break CXL PCI device
probing.

Add a check in cxl_memdev_state_create() for the return value of
devm_cxl_register_mce_notifier(): if it is -EOPNOTSUPP, just log a
warning and do not let cxl_memdev_state_create() return an error.

Signed-off-by: Li Ming <ming.li@zohomail.com>
---
I hit this issue in my cxl_test environment with the latest cxl-next. If
CONFIG_MEMORY_FAILURE is unset, all CXL PCI devices fail to probe.

...
[    6.337952] cxl_mock_mem cxl_mem.6: probe with driver cxl_mock_mem failed with error -95
[    6.338880] cxl_mock_mem cxl_mem.4: probe with driver cxl_mock_mem failed with error -95
[    6.339593] cxl_mock_mem cxl_mem.9: probe with driver cxl_mock_mem failed with error -95
[    6.340588] cxl_mock_mem cxl_mem.2: probe with driver cxl_mock_mem failed with error -95
[    6.340914] cxl_mock_mem cxl_mem.0: probe with driver cxl_mock_mem failed with error -95
[    6.345762] cxl_mock_mem cxl_rcd.10: probe with driver cxl_mock_mem failed with error -95
[    6.345793] cxl_mock_mem cxl_mem.7: probe with driver cxl_mock_mem failed with error -95
...
[    6.519824] cxl_pci 0000:c4:00.0: probe with driver cxl_pci failed with error -95
[    6.520178] cxl_pci 0000:38:00.0: probe with driver cxl_pci failed with error -95
...
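
The -95 in the log above is a negated errno value. A minimal userspace sketch for decoding it follows; `errno_name()` and `describe_probe_error()` are hypothetical helpers written for illustration, not kernel functions. It assumes the generic Linux errno numbering (as used on x86), where 95 is EOPNOTSUPP.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: symbolic name for the errno values relevant here. */
static const char *errno_name(int err)
{
	switch (err) {
	case EOPNOTSUPP:
		return "EOPNOTSUPP";
	case ENOMEM:
		return "ENOMEM";
	default:
		return "(other)";
	}
}

/*
 * Hypothetical helper: format a negative kernel-style return code,
 * such as the -95 printed by the driver core, into buf.
 */
static int describe_probe_error(int rc, char *buf, size_t len)
{
	return snprintf(buf, len, "%d -> %s (%s)",
			rc, errno_name(-rc), strerror(-rc));
}
```

Calling `describe_probe_error(-95, buf, sizeof(buf))` on such a system yields a string that names EOPNOTSUPP, which matches the return value described in the commit message.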

base-commit: 22eea823f69ae39dc060c4027e8d1470803d2e49 cxl/next
---
 drivers/cxl/core/mbox.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
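
For context, the -EOPNOTSUPP described in the commit message typically comes from a compile-time stub. The sketch below models that common kernel pattern in userspace. It is an assumption about how the stub is arranged, not a copy of the CXL header, and the `struct device` and `struct notifier_block` declarations are opaque placeholders.

```c
#include <errno.h>
#include <stddef.h>

/* Opaque placeholders so the sketch compiles outside the kernel. */
struct device;
struct notifier_block;

#ifdef CONFIG_CXL_MCE
/* Option enabled: pretend the notifier was registered successfully. */
static inline int devm_cxl_register_mce_notifier(struct device *dev,
						 struct notifier_block *nb)
{
	(void)dev;
	(void)nb;
	return 0;
}
#else
/* Option disabled: the stub reports that the feature is unsupported. */
static inline int devm_cxl_register_mce_notifier(struct device *dev,
						 struct notifier_block *nb)
{
	(void)dev;
	(void)nb;
	return -EOPNOTSUPP;
}
#endif
```

Built without `-DCONFIG_CXL_MCE`, every call returns -EOPNOTSUPP, so a caller that treats any nonzero return as fatal fails unconditionally.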

Comments

Dave Jiang Feb. 27, 2025, 3:16 p.m. UTC | #1
On 2/27/25 3:18 AM, Li Ming wrote:
> CONFIG_CXL_MCE depends on CONFIG_MEMORY_FAILURE. If CONFIG_CXL_MCE is
> not set, devm_cxl_register_mce_notifier() returns -EOPNOTSUPP, which
> makes cxl_memdev_state_create() fail and, in turn, makes CXL PCI device
> probing fail. A missing MCE notifier should not break CXL PCI device
> probing.
> 
> Add a check in cxl_memdev_state_create() for the return value of
> devm_cxl_register_mce_notifier(): if it is -EOPNOTSUPP, just log a
> warning and do not let cxl_memdev_state_create() return an error.
> 
> Signed-off-by: Li Ming <ming.li@zohomail.com>

Thanks!

Reviewed-by: Dave Jiang <dave.jiang@intel.com>

Ira Weiny Feb. 27, 2025, 3:57 p.m. UTC | #2
Li Ming wrote:
> CONFIG_CXL_MCE depends on CONFIG_MEMORY_FAILURE. If CONFIG_CXL_MCE is
> not set, devm_cxl_register_mce_notifier() returns -EOPNOTSUPP, which
> makes cxl_memdev_state_create() fail and, in turn, makes CXL PCI device
> probing fail. A missing MCE notifier should not break CXL PCI device
> probing.
> 
> Add a check in cxl_memdev_state_create() for the return value of
> devm_cxl_register_mce_notifier(): if it is -EOPNOTSUPP, just log a
> warning and do not let cxl_memdev_state_create() return an error.
> 
> Signed-off-by: Li Ming <ming.li@zohomail.com>

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

[snip]
Dave Jiang Feb. 27, 2025, 4:21 p.m. UTC | #3
On 2/27/25 3:18 AM, Li Ming wrote:
> CONFIG_CXL_MCE depends on CONFIG_MEMORY_FAILURE. If CONFIG_CXL_MCE is
> not set, devm_cxl_register_mce_notifier() returns -EOPNOTSUPP, which
> makes cxl_memdev_state_create() fail and, in turn, makes CXL PCI device
> probing fail. A missing MCE notifier should not break CXL PCI device
> probing.
> 
> Add a check in cxl_memdev_state_create() for the return value of
> devm_cxl_register_mce_notifier(): if it is -EOPNOTSUPP, just log a
> warning and do not let cxl_memdev_state_create() return an error.
> 
> Signed-off-by: Li Ming <ming.li@zohomail.com>

Applied to cxl/next

thanks!
Patch

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 13cac98846bc..d72764056ce6 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1503,7 +1503,9 @@  struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev)
 	mds->cxlds.type = CXL_DEVTYPE_CLASSMEM;
 
 	rc = devm_cxl_register_mce_notifier(dev, &mds->mce_notifier);
-	if (rc)
+	if (rc == -EOPNOTSUPP)
+		dev_warn(dev, "CXL MCE unsupported\n");
+	else if (rc)
 		return ERR_PTR(rc);
 
 	return mds;
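
The control flow introduced by the hunk above can be modelled in userspace as follows. This is a sketch under stated assumptions: `struct memdev_state`, `register_mce_notifier()` and `memdev_state_create()` are hypothetical stand-ins, the `ERR_PTR` helpers are simplified re-implementations of the kernel macros, and the explicit `free()` replaces the devm-managed cleanup the kernel would presumably rely on.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver state; not the real layout. */
struct memdev_state {
	int mce_registered;
};

/* Simplified userspace versions of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Stand-in for the notifier registration: returns the simulated code. */
static int register_mce_notifier(struct memdev_state *mds, int simulated_rc)
{
	if (simulated_rc == 0)
		mds->mce_registered = 1;
	return simulated_rc;
}

/* Mirrors the patched logic: -EOPNOTSUPP only warns, other errors fail. */
static struct memdev_state *memdev_state_create(int simulated_rc)
{
	struct memdev_state *mds = calloc(1, sizeof(*mds));
	int rc;

	if (!mds)
		return ERR_PTR(-ENOMEM);

	rc = register_mce_notifier(mds, simulated_rc);
	if (rc == -EOPNOTSUPP) {
		fprintf(stderr, "CXL MCE unsupported\n");
	} else if (rc) {
		free(mds);	/* userspace cleanup, assumed devm-managed in the kernel */
		return ERR_PTR(rc);
	}

	return mds;
}
```

With a simulated return of -EOPNOTSUPP the function still hands back a valid state object, only without MCE handling, while any other nonzero code is propagated as an error pointer exactly as before the patch.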