diff mbox series

cxl/acpi: Warn on unsupported platform config detection

Message ID 20240619125949.167936-1-fabio.m.de.francesco@linux.intel.com
State Superseded
Headers show
Series cxl/acpi: Warn on unsupported platform config detection | expand

Commit Message

Fabio M. De Francesco June 19, 2024, 12:59 p.m. UTC
Each Host Bridge instance has a corresponding CXL Host Bridge Structure
(CHBS) ACPI table that identifies its capabilities. CHBS tables can be
two types: RCRB and CHBCR.

If a Host Bridge is attached to a device that is operating in Restricted
CXL Device Mode (RCD), BIOS publishes an RCRB with the base address of
registers that describe its capabilities.

However, the new (CXL 2.0+) Component registers (e.g., Extended Security
Capability), can only be accessed by means of a base address published
with a CHBCR.

An algorithm to locate a CHBCR associated with an RCRB would be too
invasive to land without some concrete motivation.

Therefore, just print a message to inform of unsupported config.

Count how many different CHBS "Version" types are detected by
cxl_get_chbs_iter(). Then make cxl_get_chbs() print a warning if that sum
is greater than 1.

Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
 drivers/cxl/acpi.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

Comments

Alison Schofield June 21, 2024, 8:57 p.m. UTC | #1
On Wed, Jun 19, 2024 at 02:59:41PM +0200, Fabio M. De Francesco wrote:

Hi Fabio,
You've written such a detailed commit msg, that it pulls me in,
and now I want to understand more....


> Each Host Bridge instance has a corresponding CXL Host Bridge Structure
> (CHBS) ACPI table that identifies its capabilities. CHBS tables can be
> two types: RCRB and CHBCR.

Is there a spec reference for this?
While you're spelling things out, please expand RCRB and CHBCR

> 
> If a Host Bridge is attached to a device that is operating in Restricted
> CXL Device Mode (RCD), BIOS publishes an RCRB with the base address of
> registers that describe its capabilities.
> 
> However, the new (CXL 2.0+) Component registers (e.g., Extended Security
> Capability), can only be accessed by means of a base address published
> with a CHBCR.
> 
> An algorithm to locate a CHBCR associated with an RCRB would be too
> invasive to land without some concrete motivation.
> 
> Therefore, just print a message to inform of unsupported config.
> 


Were users seeing this and confused by this silent failure?
What did it look like before?


> Count how many different CHBS "Version" types are detected by
> cxl_get_chbs_iter(). Then make cxl_get_chbs() print a warning if that sum
> is greater than 1.
> 
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
>  drivers/cxl/acpi.c | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 571069863c62..9e226a65a5ea 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -482,6 +482,7 @@ struct cxl_chbs_context {
>  	unsigned long long uid;
>  	resource_size_t base;
>  	u32 cxl_version;
> +	int count;

Maybe s/count/nr_versions to be more explicit of what it counts.

-- Alison
>  };
>  
snip
>
Dan Williams June 25, 2024, 11:14 p.m. UTC | #2
Fabio M. De Francesco wrote:
> Each Host Bridge instance has a corresponding CXL Host Bridge Structure
> (CHBS) ACPI table that identifies its capabilities. CHBS tables can be
> two types: RCRB and CHBCR.
> 
> If a Host Bridge is attached to a device that is operating in Restricted
> CXL Device Mode (RCD), BIOS publishes an RCRB with the base address of
> registers that describe its capabilities.
> 
> However, the new (CXL 2.0+) Component registers (e.g., Extended Security
> Capability), can only be accessed by means of a base address published
> with a CHBCR.
> 
> An algorithm to locate a CHBCR associated with an RCRB would be too
> invasive to land without some concrete motivation.
> 
> Therefore, just print a message to inform of unsupported config.
> 
> Count how many different CHBS "Version" types are detected by
> cxl_get_chbs_iter(). Then make cxl_get_chbs() print a warning if that sum
> is greater than 1.
> 
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
>  drivers/cxl/acpi.c | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 571069863c62..9e226a65a5ea 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -482,6 +482,7 @@ struct cxl_chbs_context {
>  	unsigned long long uid;
>  	resource_size_t base;
>  	u32 cxl_version;
> +	int count;
>  };
>  
>  static int cxl_get_chbs_iter(union acpi_subtable_headers *header, void *arg,
> @@ -490,10 +491,17 @@ static int cxl_get_chbs_iter(union acpi_subtable_headers *header, void *arg,
>  	struct cxl_chbs_context *ctx = arg;
>  	struct acpi_cedt_chbs *chbs;
>  
> -	if (ctx->base != CXL_RESOURCE_NONE)
> +	chbs = (struct acpi_cedt_chbs *) header;
> +
> +	if (chbs->cxl_version == ACPI_CEDT_CHBS_VERSION_CXL11 &&
> +	    chbs->length != CXL_RCRB_SIZE)
>  		return 0;
>  
> -	chbs = (struct acpi_cedt_chbs *) header;
> +	if (ctx->cxl_version != chbs->cxl_version)
> +		ctx->count++;
> +
> +	if (ctx->base != CXL_RESOURCE_NONE)
> +		return 0;
>  
>  	if (ctx->uid != chbs->uid)
>  		return 0;
> @@ -502,10 +510,6 @@ static int cxl_get_chbs_iter(union acpi_subtable_headers *header, void *arg,
>  	if (!chbs->base)
>  		return 0;
>  
> -	if (chbs->cxl_version == ACPI_CEDT_CHBS_VERSION_CXL11 &&
> -	    chbs->length != CXL_RCRB_SIZE)
> -		return 0;
> -
>  	ctx->base = chbs->base;
>  
>  	return 0;
> @@ -533,6 +537,10 @@ static int cxl_get_chbs(struct device *dev, struct acpi_device *hb,
>  
>  	acpi_table_parse_cedt(ACPI_CEDT_TYPE_CHBS, cxl_get_chbs_iter, ctx);
>  
> +	if (ctx->count > 1)
> +		/* Disclaim eRCD support given some component register may only be found via CHBCR */
> +		dev_info(dev, "Unsupported platform config, mixed Virtual Host and Restricted CXL Host hierarchy.");

I believe this is already queued, but my personal preference is that
multiline statements include brackets, or move the comment above the "if
()", so either:

	/* Disclaim eRCD support given some component register may only be found via CHBCR */
	if (ctx->count > 1)
		dev_info(dev, "Unsupported platform config, mixed Virtual Host and Restricted CXL Host hierarchy.");

...or:

	if (ctx->count > 1) {
		/* Disclaim eRCD support given some component register may only be found via CHBCR */
		dev_info(dev, "Unsupported platform config, mixed Virtual Host and Restricted CXL Host hierarchy.");
	}

...but don't spin the patch just for that fixup.
Dan Williams June 25, 2024, 11:41 p.m. UTC | #3
Alison Schofield wrote:
> On Wed, Jun 19, 2024 at 02:59:41PM +0200, Fabio M. De Francesco wrote:
> 
> Hi Fabio,
> You've written such a detailed commit msg, that it pulls me in,
> and now I want to understand more....
> 
> 
> > Each Host Bridge instance has a corresponding CXL Host Bridge Structure
> > (CHBS) ACPI table that identifies its capabilities. CHBS tables can be
> > two types: RCRB and CHBCR.
> 
> Is there a spec reference for this?
> While you're spelling things out, please expand RCRB and CHBCR
> 
> > 
> > If a Host Bridge is attached to a device that is operating in Restricted
> > CXL Device Mode (RCD), BIOS publishes an RCRB with the base address of
> > registers that describe its capabilities.
> > 
> > However, the new (CXL 2.0+) Component registers (e.g., Extended Security
> > Capability), can only be accessed by means of a base address published
> > with a CHBCR.
> > 
> > An algorithm to locate a CHBCR associated with an RCRB would be too
> > invasive to land without some concrete motivation.
> > 
> > Therefore, just print a message to inform of unsupported config.
> > 
> 
> 
> Were users seeing this and confused by this silent failure?
> What did it look like before?

This is a "by inspection" realization that the CXL specification
currently does not define a mechanism for finding CXL-2.0-only root-port
component regsiters like HDM decoders and Extended Security capability.
Rather than go through a fire drill of clarifying how that is supposed
to work, just the known conflict for now.

When / if someone actually shows up with an eRCD (a device that forces
the host-bridge into CXL 1.1 Restricted CXL Host mode) then we can
consider if Linux needs to do the work to support that. To my knowledge
all of the commercial devices on the market are so-called "CXL 1.1+"
which support training in VH (CXL Virtual Host) mode when attached to a
CXL 2.0 host. So, Linux can likely live with disclaiming "eRCD attached
to a VH host support" for the foreseeable future.
Alison Schofield June 26, 2024, 12:10 a.m. UTC | #4
On Wed, Jun 19, 2024 at 02:59:41PM +0200, Fabio M. De Francesco wrote:

Fabio,

cxl/acpi does a lot of platform config work. "...unsupported platform
config detection" gives no hint that this is about CHBS's or an eRCD.
Please offer something more specific. Thanks.

--Alison

snip...
Dan Williams June 26, 2024, 12:45 a.m. UTC | #5
Alison Schofield wrote:
> On Wed, Jun 19, 2024 at 02:59:41PM +0200, Fabio M. De Francesco wrote:
> 
> Fabio,
> 
> cxl/acpi does a lot of platform config work. "...unsupported platform
> config detection" gives no hint that this is about CHBS's or an eRCD.
> Please offer something more specific. Thanks.

The message specifies "mixed Virtual Host and Restricted CXL Host
hierarchy" as the conflict. The relationship between RCH and eRCDs is an
exercise for the reader, and CHBS is an ACPI detail that really should
not be emitted in an error message. So I am struggling to imagine what a
more specific error message would be without paragraphs of backstory.

All that is needed here is just enough words for when someone posts a
problem to the list that someone savvy can go "ah, you fell into this
specification hole where CXL 2.0 root port registers are difficult to
associate with an RCH config, thanks for the report now we know that
Linux needs to worry about this case".
Fabio M. De Francesco June 26, 2024, 10 a.m. UTC | #6
On Friday, June 21, 2024 10:57:32 PM GMT+2 Alison Schofield wrote:
> On Wed, Jun 19, 2024 at 02:59:41PM +0200, Fabio M. De Francesco wrote:
> 
Hi Alison,
>
> Hi Fabio,
> You've written such a detailed commit msg, that it pulls me in,
> and now I want to understand more....
> 
>
> > Each Host Bridge instance has a corresponding CXL Host Bridge Structure
> > (CHBS) ACPI table that identifies its capabilities. CHBS tables can be
> > two types: RCRB and CHBCR.
> 
> Is there a spec reference for this?
>
I'll add a spec reference with v2.

> While you're spelling things out, please expand RCRB and CHBCR

I'll do it.
>
> > 
> > If a Host Bridge is attached to a device that is operating in Restricted
> > CXL Device Mode (RCD), BIOS publishes an RCRB with the base address of
> > registers that describe its capabilities.
> > 
> > However, the new (CXL 2.0+) Component registers (e.g., Extended Security
> > Capability), can only be accessed by means of a base address published
> > with a CHBCR.
> > 
> > An algorithm to locate a CHBCR associated with an RCRB would be too
> > invasive to land without some concrete motivation.
> > 
> > Therefore, just print a message to inform of unsupported config.
> > 
> 
> Were users seeing this and confused by this silent failure?
> What did it look like before?
> 
Please read Dan on this topic.
I think I'll add a few lines to the commit message to summarize Dan's comment.
>
> > Count how many different CHBS "Version" types are detected by
> > cxl_get_chbs_iter(). Then make cxl_get_chbs() print a warning if that sum
> > is greater than 1.
> > 
> > Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> > ---
> >  drivers/cxl/acpi.c | 20 ++++++++++++++------
> >  1 file changed, 14 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > index 571069863c62..9e226a65a5ea 100644
> > --- a/drivers/cxl/acpi.c
> > +++ b/drivers/cxl/acpi.c
> > @@ -482,6 +482,7 @@ struct cxl_chbs_context {
> >  	unsigned long long uid;
> >  	resource_size_t base;
> >  	u32 cxl_version;
> > +	int count;
> 
> Maybe s/count/nr_versions to be more explicit of what it counts.
>
Yes, "nr_versions" is better.

Thank you.

Fabio
>
> -- Alison
> >  };
> >  
> snip
> > 
>
Fabio M. De Francesco June 26, 2024, 10:28 a.m. UTC | #7
On Wednesday, June 26, 2024 1:14:18 AM GMT+2 Dan Williams wrote:

Hi Dan,

> Fabio M. De Francesco wrote:
> > Each Host Bridge instance has a corresponding CXL Host Bridge Structure
> > (CHBS) ACPI table that identifies its capabilities. CHBS tables can be
> > two types: RCRB and CHBCR.
> > 
> > If a Host Bridge is attached to a device that is operating in Restricted
> > CXL Device Mode (RCD), BIOS publishes an RCRB with the base address of
> > registers that describe its capabilities.
> > 
> > However, the new (CXL 2.0+) Component registers (e.g., Extended Security
> > Capability), can only be accessed by means of a base address published
> > with a CHBCR.
> > 
> > An algorithm to locate a CHBCR associated with an RCRB would be too
> > invasive to land without some concrete motivation.
> > 
> > Therefore, just print a message to inform of unsupported config.
> > 
> > Count how many different CHBS "Version" types are detected by
> > cxl_get_chbs_iter(). Then make cxl_get_chbs() print a warning if that sum
> > is greater than 1.
> > 
> > Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> > ---
> >  drivers/cxl/acpi.c | 20 ++++++++++++++------
> >  1 file changed, 14 insertions(+), 6 deletions(-)
> > 

snip

> > @@ -533,6 +537,10 @@ static int cxl_get_chbs(struct device *dev, struct 
acpi_device *hb,
> >  
> >  	acpi_table_parse_cedt(ACPI_CEDT_TYPE_CHBS, cxl_get_chbs_iter, 
ctx);
> >  
> > +	if (ctx->count > 1)
> > +		/* Disclaim eRCD support given some component register 
may only be found via CHBCR */
> > +		dev_info(dev, "Unsupported platform config, mixed 
Virtual Host and Restricted CXL Host hierarchy.");
> 
> I believe this is already queued, but my personal preference is that
> multiline statements include brackets, or move the comment above the "if
> ()", so either:
> 
> 	/* Disclaim eRCD support given some component register may only be 
found via CHBCR */
> 	if (ctx->count > 1)
> 		dev_info(dev, "Unsupported platform config, mixed Virtual 
Host and Restricted CXL Host hierarchy.");
> 
> ...or:
> 
> 	if (ctx->count > 1) {
> 		/* Disclaim eRCD support given some component register 
may only be found via CHBCR */
> 		dev_info(dev, "Unsupported platform config, mixed Virtual 
Host and Restricted CXL Host hierarchy.");
> 	}
> 
> ...but don't spin the patch just for that fixup.
> 
I'll send v2 mainly for Alison's comments and so I will also add brackets here 
(the second solution meets my preferences).

Thank you,

Fabio
Fabio M. De Francesco June 26, 2024, 10:31 a.m. UTC | #8
On Wednesday, June 26, 2024 2:45:14 AM GMT+2 Dan Williams wrote:
> Alison Schofield wrote:
> > On Wed, Jun 19, 2024 at 02:59:41PM +0200, Fabio M. De Francesco wrote:
> > 
> > Fabio,
> > 
> > cxl/acpi does a lot of platform config work. "...unsupported platform
> > config detection" gives no hint that this is about CHBS's or an eRCD.
> > Please offer something more specific. Thanks.
> 
> The message specifies "mixed Virtual Host and Restricted CXL Host
> hierarchy" as the conflict. The relationship between RCH and eRCDs is an
> exercise for the reader, and CHBS is an ACPI detail that really should
> not be emitted in an error message. So I am struggling to imagine what a
> more specific error message would be without paragraphs of backstory.
> 
> All that is needed here is just enough words for when someone posts a
> problem to the list that someone savvy can go "ah, you fell into this
> specification hole where CXL 2.0 root port registers are difficult to
> associate with an RCH config, thanks for the report now we know that
> Linux needs to worry about this case".
> 
I think that Alison raised a good point.

If you have nothing against it, I'll change the subject line to: "cxl/acpi: 
Warn on mixed VH and eRCH hierarchies"

How about that change?

Fabio
Dan Williams June 27, 2024, 1:09 a.m. UTC | #9
Fabio M. De Francesco wrote:
> On Wednesday, June 26, 2024 2:45:14 AM GMT+2 Dan Williams wrote:
> > Alison Schofield wrote:
> > > On Wed, Jun 19, 2024 at 02:59:41PM +0200, Fabio M. De Francesco wrote:
> > > 
> > > Fabio,
> > > 
> > > cxl/acpi does a lot of platform config work. "...unsupported platform
> > > config detection" gives no hint that this is about CHBS's or an eRCD.
> > > Please offer something more specific. Thanks.
> > 
> > The message specifies "mixed Virtual Host and Restricted CXL Host
> > hierarchy" as the conflict. The relationship between RCH and eRCDs is an
> > exercise for the reader, and CHBS is an ACPI detail that really should
> > not be emitted in an error message. So I am struggling to imagine what a
> > more specific error message would be without paragraphs of backstory.
> > 
> > All that is needed here is just enough words for when someone posts a
> > problem to the list that someone savvy can go "ah, you fell into this
> > specification hole where CXL 2.0 root port registers are difficult to
> > associate with an RCH config, thanks for the report now we know that
> > Linux needs to worry about this case".
> > 
> I think that Alison raised a good point.
> 
> If you have nothing against it, I'll change the subject line to: "cxl/acpi: 
> Warn on mixed VH and eRCH hierarchies"
> 
> How about that change?

Just, s/eRCH/RCH/, but other than that looks fine to me.

However, I thought this comment was about the error message not the patch subject.
diff mbox series

Patch

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 571069863c62..9e226a65a5ea 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -482,6 +482,7 @@  struct cxl_chbs_context {
 	unsigned long long uid;
 	resource_size_t base;
 	u32 cxl_version;
+	int count;
 };
 
 static int cxl_get_chbs_iter(union acpi_subtable_headers *header, void *arg,
@@ -490,10 +491,17 @@  static int cxl_get_chbs_iter(union acpi_subtable_headers *header, void *arg,
 	struct cxl_chbs_context *ctx = arg;
 	struct acpi_cedt_chbs *chbs;
 
-	if (ctx->base != CXL_RESOURCE_NONE)
+	chbs = (struct acpi_cedt_chbs *) header;
+
+	if (chbs->cxl_version == ACPI_CEDT_CHBS_VERSION_CXL11 &&
+	    chbs->length != CXL_RCRB_SIZE)
 		return 0;
 
-	chbs = (struct acpi_cedt_chbs *) header;
+	if (ctx->cxl_version != chbs->cxl_version)
+		ctx->count++;
+
+	if (ctx->base != CXL_RESOURCE_NONE)
+		return 0;
 
 	if (ctx->uid != chbs->uid)
 		return 0;
@@ -502,10 +510,6 @@  static int cxl_get_chbs_iter(union acpi_subtable_headers *header, void *arg,
 	if (!chbs->base)
 		return 0;
 
-	if (chbs->cxl_version == ACPI_CEDT_CHBS_VERSION_CXL11 &&
-	    chbs->length != CXL_RCRB_SIZE)
-		return 0;
-
 	ctx->base = chbs->base;
 
 	return 0;
@@ -533,6 +537,10 @@  static int cxl_get_chbs(struct device *dev, struct acpi_device *hb,
 
 	acpi_table_parse_cedt(ACPI_CEDT_TYPE_CHBS, cxl_get_chbs_iter, ctx);
 
+	if (ctx->count > 1)
+		/* Disclaim eRCD support given some component register may only be found via CHBCR */
+		dev_info(dev, "Unsupported platform config, mixed Virtual Host and Restricted CXL Host hierarchy.");
+
 	return 0;
 }