diff mbox series

[v2,1/6] cxl/port: Fix CXL port initialization order when the subsystem is built-in

Message ID 172964780249.81806.11601867702278939388.stgit@dwillia2-xfh.jf.intel.com
State Accepted
Commit d05c56d26cfe149c56724555cf1b28bd89042289
Headers show
Series cxl: Initialization and shutdown fixes | expand

Commit Message

Dan Williams Oct. 23, 2024, 1:43 a.m. UTC
When the CXL subsystem is built-in the module init order is determined
by Makefile order. That order violates expectations. The expectation is
that cxl_acpi and cxl_mem can race to attach and that if cxl_acpi wins
the race cxl_mem will find the enabled CXL root ports it needs and if
cxl_acpi loses the race it will retrigger cxl_mem to attach via
cxl_bus_rescan(). That only works if cxl_acpi can assume ports are
enabled immediately upon cxl_acpi_probe() return. That in turn can only
happen in the CONFIG_CXL_ACPI=y case if the cxl_port object appears
before the cxl_acpi object in the Makefile.

Fix up the order to prevent initialization failures, and make sure that
cxl_port is built-in if cxl_acpi is also built-in.

As for what contributed to this not being found earlier, the CXL
regression environment, cxl_test, builds all CXL functionality as a
module to allow to symbol mocking and other dynamic reload tests.  As a
result there is no regression coverage for the built-in case.

Reported-by: Gregory Price <gourry@gourry.net>
Closes: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net
Tested-by: Gregory Price <gourry@gourry.net>
Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver")
Cc: <stable@vger.kernel.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/Kconfig  |    1 +
 drivers/cxl/Makefile |   20 ++++++++++++++------
 2 files changed, 15 insertions(+), 6 deletions(-)

Comments

Jonathan Cameron Oct. 24, 2024, 9:42 a.m. UTC | #1
On Tue, 22 Oct 2024 18:43:24 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> When the CXL subsystem is built-in the module init order is determined
> by Makefile order. That order violates expectations. The expectation is
> that cxl_acpi and cxl_mem can race to attach and that if cxl_acpi wins
> the race cxl_mem will find the enabled CXL root ports it needs and if
> cxl_acpi loses the race it will retrigger cxl_mem to attach via
> cxl_bus_rescan(). That only works if cxl_acpi can assume ports are
> enabled immediately upon cxl_acpi_probe() return. That in turn can only
> happen in the CONFIG_CXL_ACPI=y case if the cxl_port object appears
> before the cxl_acpi object in the Makefile.
> 
> Fix up the order to prevent initialization failures, and make sure that
> cxl_port is built-in if cxl_acpi is also built-in.
> 
> As for what contributed to this not being found earlier, the CXL
> regression environment, cxl_test, builds all CXL functionality as a
> module to allow to symbol mocking and other dynamic reload tests.  As a
> result there is no regression coverage for the built-in case.
> 
> Reported-by: Gregory Price <gourry@gourry.net>
> Closes: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net
> Tested-by: Gregory Price <gourry@gourry.net>
> Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver")

I don't like this due to likely long term fragility, but any other
solution is probably more painful.  Long term we should really get
a regression test for these ordering issues in place in one of
the CIs.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> Cc: <stable@vger.kernel.org>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Vishal Verma <vishal.l.verma@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/Kconfig  |    1 +
>  drivers/cxl/Makefile |   20 ++++++++++++++------
>  2 files changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 29c192f20082..876469e23f7a 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -60,6 +60,7 @@ config CXL_ACPI
>  	default CXL_BUS
>  	select ACPI_TABLE_LIB
>  	select ACPI_HMAT
> +	select CXL_PORT
>  	help
>  	  Enable support for host managed device memory (HDM) resources
>  	  published by a platform's ACPI CXL memory layout description.  See
> diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> index db321f48ba52..2caa90fa4bf2 100644
> --- a/drivers/cxl/Makefile
> +++ b/drivers/cxl/Makefile
> @@ -1,13 +1,21 @@
>  # SPDX-License-Identifier: GPL-2.0
> +
> +# Order is important here for the built-in case:
> +# - 'core' first for fundamental init
> +# - 'port' before platform root drivers like 'acpi' so that CXL-root ports
> +#   are immediately enabled
> +# - 'mem' and 'pmem' before endpoint drivers so that memdevs are
> +#   immediately enabled
> +# - 'pci' last, also mirrors the hardware enumeration hierarchy
>  obj-y += core/
> -obj-$(CONFIG_CXL_PCI) += cxl_pci.o
> -obj-$(CONFIG_CXL_MEM) += cxl_mem.o
> +obj-$(CONFIG_CXL_PORT) += cxl_port.o
>  obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
>  obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
> -obj-$(CONFIG_CXL_PORT) += cxl_port.o
> +obj-$(CONFIG_CXL_MEM) += cxl_mem.o
> +obj-$(CONFIG_CXL_PCI) += cxl_pci.o
>  
> -cxl_mem-y := mem.o
> -cxl_pci-y := pci.o
> +cxl_port-y := port.o
>  cxl_acpi-y := acpi.o
>  cxl_pmem-y := pmem.o security.o
> -cxl_port-y := port.o
> +cxl_mem-y := mem.o
> +cxl_pci-y := pci.o
> 
>
Alejandro Lucero Palau Oct. 24, 2024, 10:36 a.m. UTC | #2
On 10/23/24 02:43, Dan Williams wrote:
> When the CXL subsystem is built-in the module init order is determined
> by Makefile order. That order violates expectations. The expectation is
> that cxl_acpi and cxl_mem can race to attach and that if cxl_acpi wins
> the race cxl_mem will find the enabled CXL root ports it needs and if
> cxl_acpi loses the race it will retrigger cxl_mem to attach via
> cxl_bus_rescan(). That only works if cxl_acpi can assume ports are
> enabled immediately upon cxl_acpi_probe() return. That in turn can only
> happen in the CONFIG_CXL_ACPI=y case if the cxl_port object appears
> before the cxl_acpi object in the Makefile.


I'm having problems with understanding this. The acpi module is 
initialised following the initcall levels, as defined by the code with 
the subsys_initcall(cxl_acpi_init), and the cxl_mem module is not, so 
AFAIK, there should not be any race there with the acpi module always 
being initialised first. It I'm right, the problem should be another one 
we do not know yet ...

> Fix up the order to prevent initialization failures, and make sure that
> cxl_port is built-in if cxl_acpi is also built-in.


... or forcing cxl_port to be built-in is enough. I wonder how, without 
it, the cxl root ports can be there for cxl_mem ...


>
> As for what contributed to this not being found earlier, the CXL
> regression environment, cxl_test, builds all CXL functionality as a
> module to allow to symbol mocking and other dynamic reload tests.  As a
> result there is no regression coverage for the built-in case.
>
> Reported-by: Gregory Price <gourry@gourry.net>
> Closes: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net
> Tested-by: Gregory Price <gourry@gourry.net>
> Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver")
> Cc: <stable@vger.kernel.org>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Vishal Verma <vishal.l.verma@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>   drivers/cxl/Kconfig  |    1 +
>   drivers/cxl/Makefile |   20 ++++++++++++++------
>   2 files changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 29c192f20082..876469e23f7a 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -60,6 +60,7 @@ config CXL_ACPI
>   	default CXL_BUS
>   	select ACPI_TABLE_LIB
>   	select ACPI_HMAT
> +	select CXL_PORT
>   	help
>   	  Enable support for host managed device memory (HDM) resources
>   	  published by a platform's ACPI CXL memory layout description.  See
> diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> index db321f48ba52..2caa90fa4bf2 100644
> --- a/drivers/cxl/Makefile
> +++ b/drivers/cxl/Makefile
> @@ -1,13 +1,21 @@
>   # SPDX-License-Identifier: GPL-2.0
> +
> +# Order is important here for the built-in case:
> +# - 'core' first for fundamental init
> +# - 'port' before platform root drivers like 'acpi' so that CXL-root ports
> +#   are immediately enabled
> +# - 'mem' and 'pmem' before endpoint drivers so that memdevs are
> +#   immediately enabled
> +# - 'pci' last, also mirrors the hardware enumeration hierarchy
>   obj-y += core/
> -obj-$(CONFIG_CXL_PCI) += cxl_pci.o
> -obj-$(CONFIG_CXL_MEM) += cxl_mem.o
> +obj-$(CONFIG_CXL_PORT) += cxl_port.o
>   obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
>   obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
> -obj-$(CONFIG_CXL_PORT) += cxl_port.o
> +obj-$(CONFIG_CXL_MEM) += cxl_mem.o
> +obj-$(CONFIG_CXL_PCI) += cxl_pci.o
>   
> -cxl_mem-y := mem.o
> -cxl_pci-y := pci.o
> +cxl_port-y := port.o
>   cxl_acpi-y := acpi.o
>   cxl_pmem-y := pmem.o security.o
> -cxl_port-y := port.o
> +cxl_mem-y := mem.o
> +cxl_pci-y := pci.o
>
>
Ira Weiny Oct. 24, 2024, 2:14 p.m. UTC | #3
Dan Williams wrote:
> When the CXL subsystem is built-in the module init order is determined
> by Makefile order. That order violates expectations. The expectation is
> that cxl_acpi and cxl_mem can race to attach and that if cxl_acpi wins
> the race cxl_mem will find the enabled CXL root ports it needs and if
> cxl_acpi loses the race it will retrigger cxl_mem to attach via
> cxl_bus_rescan(). That only works if cxl_acpi can assume ports are
> enabled immediately upon cxl_acpi_probe() return. That in turn can only
> happen in the CONFIG_CXL_ACPI=y case if the cxl_port object appears
> before the cxl_acpi object in the Makefile.
> 
> Fix up the order to prevent initialization failures, and make sure that
> cxl_port is built-in if cxl_acpi is also built-in.
> 
> As for what contributed to this not being found earlier, the CXL
> regression environment, cxl_test, builds all CXL functionality as a
> module to allow to symbol mocking and other dynamic reload tests.  As a
> result there is no regression coverage for the built-in case.
> 
> Reported-by: Gregory Price <gourry@gourry.net>
> Closes: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net
> Tested-by: Gregory Price <gourry@gourry.net>
> Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver")
> Cc: <stable@vger.kernel.org>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Vishal Verma <vishal.l.verma@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

[snip]
Dan Williams Oct. 24, 2024, 4:19 p.m. UTC | #4
Jonathan Cameron wrote:
> On Tue, 22 Oct 2024 18:43:24 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > When the CXL subsystem is built-in the module init order is determined
> > by Makefile order. That order violates expectations. The expectation is
> > that cxl_acpi and cxl_mem can race to attach and that if cxl_acpi wins
> > the race cxl_mem will find the enabled CXL root ports it needs and if
> > cxl_acpi loses the race it will retrigger cxl_mem to attach via
> > cxl_bus_rescan(). That only works if cxl_acpi can assume ports are
> > enabled immediately upon cxl_acpi_probe() return. That in turn can only
> > happen in the CONFIG_CXL_ACPI=y case if the cxl_port object appears
> > before the cxl_acpi object in the Makefile.
> > 
> > Fix up the order to prevent initialization failures, and make sure that
> > cxl_port is built-in if cxl_acpi is also built-in.
> > 
> > As for what contributed to this not being found earlier, the CXL
> > regression environment, cxl_test, builds all CXL functionality as a
> > module to allow to symbol mocking and other dynamic reload tests.  As a
> > result there is no regression coverage for the built-in case.
> > 
> > Reported-by: Gregory Price <gourry@gourry.net>
> > Closes: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net
> > Tested-by: Gregory Price <gourry@gourry.net>
> > Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver")
> 
> I don't like this due to likely long term fragility, but any other

Please be specific about the fragility and how is this different than
any other Makefile order fragility, like the many cases in
drivers/Makefile/, or patches fixing up initcall order?

Now, an argument can be made that there are too many CXL sub-objects and
more can be merged into a monolithic cxl_core object. The flipside of
that is reduced testability, at least via symbol mocking techniques.
Just look at the recent case where the fact that
drivers/cxl/core/region.c is built into cxl_core.o rather than its own
cxl_region.o object results in an in-line code change to support
cxl_test [1]. There are tradeoffs.

> solution is probably more painful.  Long term we should really get
> a regression test for these ordering issues in place in one of
> the CIs.

The final patch in this series does improve cxl_test's ability to catch
regressions in module init order, and that ordering change did uncover a
bug. The system works! 
Dan Williams Oct. 24, 2024, 4:32 p.m. UTC | #5
Alejandro Lucero Palau wrote:
> 
> On 10/23/24 02:43, Dan Williams wrote:
> > When the CXL subsystem is built-in the module init order is determined
> > by Makefile order. That order violates expectations. The expectation is
> > that cxl_acpi and cxl_mem can race to attach and that if cxl_acpi wins
> > the race cxl_mem will find the enabled CXL root ports it needs and if
> > cxl_acpi loses the race it will retrigger cxl_mem to attach via
> > cxl_bus_rescan(). That only works if cxl_acpi can assume ports are
> > enabled immediately upon cxl_acpi_probe() return. That in turn can only
> > happen in the CONFIG_CXL_ACPI=y case if the cxl_port object appears
> > before the cxl_acpi object in the Makefile.
> 
> 
> I'm having problems with understanding this. The acpi module is 
> initialised following the initcall levels, as defined by the code with 
> the subsys_initcall(cxl_acpi_init), and the cxl_mem module is not, so 
> AFAIK, there should not be any race there with the acpi module always 
> being initialised first. It I'm right, the problem should be another one 
> we do not know yet ...

This is a valid point, and I do think that cxl_port should also move to
subsys_initcall() for completeness.

However, the reason this Makefile change works, even though cxl_acpi
finishes init before cxl_port when both are built-in, is due to device
discovery order.

With the old Makefile order it is possible for cxl_mem to race
cxl_acpi_probe() in a way that defeats the cxl_bus_rescan() that is
there to resolve device discovery races.

> > Fix up the order to prevent initialization failures, and make sure that
> > cxl_port is built-in if cxl_acpi is also built-in.
> 
> ... or forcing cxl_port to be built-in is enough. I wonder how, without 
> it, the cxl root ports can be there for cxl_mem ...

It does not need to be there for cxl_mem. It is ok for cxl_mem to load
and complete enumeration well before cxl_acpi ever arrives. As long as
cxl_bus_rescan() enables those devices after the fact then everything is
ok.

The problematic case being fixed is the opposite, i.e. that
cxl_bus_rescan() completes and never triggers again after cxl_mem has
failed to find the root ports.
Jonathan Cameron Oct. 24, 2024, 4:39 p.m. UTC | #6
On Thu, 24 Oct 2024 09:19:17 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Jonathan Cameron wrote:
> > On Tue, 22 Oct 2024 18:43:24 -0700
> > Dan Williams <dan.j.williams@intel.com> wrote:
> >   
> > > When the CXL subsystem is built-in the module init order is determined
> > > by Makefile order. That order violates expectations. The expectation is
> > > that cxl_acpi and cxl_mem can race to attach and that if cxl_acpi wins
> > > the race cxl_mem will find the enabled CXL root ports it needs and if
> > > cxl_acpi loses the race it will retrigger cxl_mem to attach via
> > > cxl_bus_rescan(). That only works if cxl_acpi can assume ports are
> > > enabled immediately upon cxl_acpi_probe() return. That in turn can only
> > > happen in the CONFIG_CXL_ACPI=y case if the cxl_port object appears
> > > before the cxl_acpi object in the Makefile.
> > > 
> > > Fix up the order to prevent initialization failures, and make sure that
> > > cxl_port is built-in if cxl_acpi is also built-in.
> > > 
> > > As for what contributed to this not being found earlier, the CXL
> > > regression environment, cxl_test, builds all CXL functionality as a
> > > module to allow to symbol mocking and other dynamic reload tests.  As a
> > > result there is no regression coverage for the built-in case.
> > > 
> > > Reported-by: Gregory Price <gourry@gourry.net>
> > > Closes: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net
> > > Tested-by: Gregory Price <gourry@gourry.net>
> > > Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver")  
> > 
> > I don't like this due to likely long term fragility, but any other  
> 
> Please be specific about the fragility and how is this different than
> any other Makefile order fragility, like the many cases in
> drivers/Makefile/, or patches fixing up initcall order?

Sure, I don't like any of them ;) Mostly was having a grumpy day
rather than suggesting a change in this.

> 
> Now, an argument can be made that there are too many CXL sub-objects and
> more can be merged into a monolithic cxl_core object. The flipside of
> that is reduced testability, at least via symbol mocking techniques.
> Just look at the recent case where the fact that
> drivers/cxl/core/region.c is built into cxl_core.o rather than its own
> cxl_region.o object results in an in-line code change to support
> cxl_test [1]. There are tradeoffs.
Absolutely agree. 
> 
> > solution is probably more painful.  Long term we should really get
> > a regression test for these ordering issues in place in one of
> > the CIs.  
> 
> The final patch in this series does improve cxl_test's ability to catch
> regressions in module init order, and that ordering change did uncover a
> bug. The system works! 
Alejandro Lucero Palau Oct. 25, 2024, 8:43 a.m. UTC | #7
On 10/24/24 17:32, Dan Williams wrote:
> Alejandro Lucero Palau wrote:
>> On 10/23/24 02:43, Dan Williams wrote:
>>> When the CXL subsystem is built-in the module init order is determined
>>> by Makefile order. That order violates expectations. The expectation is
>>> that cxl_acpi and cxl_mem can race to attach and that if cxl_acpi wins
>>> the race cxl_mem will find the enabled CXL root ports it needs and if
>>> cxl_acpi loses the race it will retrigger cxl_mem to attach via
>>> cxl_bus_rescan(). That only works if cxl_acpi can assume ports are
>>> enabled immediately upon cxl_acpi_probe() return. That in turn can only
>>> happen in the CONFIG_CXL_ACPI=y case if the cxl_port object appears
>>> before the cxl_acpi object in the Makefile.
>>
>> I'm having problems with understanding this. The acpi module is
>> initialised following the initcall levels, as defined by the code with
>> the subsys_initcall(cxl_acpi_init), and the cxl_mem module is not, so
>> AFAIK, there should not be any race there with the acpi module always
>> being initialised first. It I'm right, the problem should be another one
>> we do not know yet ...
> This is a valid point, and I do think that cxl_port should also move to
> subsys_initcall() for completeness.
>
> However, the reason this Makefile change works, even though cxl_acpi
> finishes init before cxl_port when both are built-in, is due to device
> discovery order.
>
> With the old Makefile order it is possible for cxl_mem to race
> cxl_acpi_probe() in a way that defeats the cxl_bus_rescan() that is
> there to resolve device discovery races.


OK. Then rephrasing the commit would help.


Apart from that:


Tested-by: Alejandro Lucero <alucerop@amd.com>

Reviewed-by: Alejandro Lucero <alucerop@amd.com>


>>> Fix up the order to prevent initialization failures, and make sure that
>>> cxl_port is built-in if cxl_acpi is also built-in.
>> ... or forcing cxl_port to be built-in is enough. I wonder how, without
>> it, the cxl root ports can be there for cxl_mem ...
> It does not need to be there for cxl_mem. It is ok for cxl_mem to load
> and complete enumeration well before cxl_acpi ever arrives. As long as
> cxl_bus_rescan() enables those devices after the fact then everything is
> ok.
>
> The problematic case being fixed is the opposite, i.e. that
> cxl_bus_rescan() completes and never triggers again after cxl_mem has
> failed to find the root ports.
Dan Williams Oct. 25, 2024, 3:19 p.m. UTC | #8
Alejandro Lucero Palau wrote:
> 
> On 10/24/24 17:32, Dan Williams wrote:
> > Alejandro Lucero Palau wrote:
> >> On 10/23/24 02:43, Dan Williams wrote:
> >>> When the CXL subsystem is built-in the module init order is determined
> >>> by Makefile order. That order violates expectations. The expectation is
> >>> that cxl_acpi and cxl_mem can race to attach and that if cxl_acpi wins
> >>> the race cxl_mem will find the enabled CXL root ports it needs and if
> >>> cxl_acpi loses the race it will retrigger cxl_mem to attach via
> >>> cxl_bus_rescan(). That only works if cxl_acpi can assume ports are
> >>> enabled immediately upon cxl_acpi_probe() return. That in turn can only
> >>> happen in the CONFIG_CXL_ACPI=y case if the cxl_port object appears
> >>> before the cxl_acpi object in the Makefile.
> >>
> >> I'm having problems with understanding this. The acpi module is
> >> initialised following the initcall levels, as defined by the code with
> >> the subsys_initcall(cxl_acpi_init), and the cxl_mem module is not, so
> >> AFAIK, there should not be any race there with the acpi module always
> >> being initialised first. It I'm right, the problem should be another one
> >> we do not know yet ...
> > This is a valid point, and I do think that cxl_port should also move to
> > subsys_initcall() for completeness.
> >
> > However, the reason this Makefile change works, even though cxl_acpi
> > finishes init before cxl_port when both are built-in, is due to device
> > discovery order.
> >
> > With the old Makefile order it is possible for cxl_mem to race
> > cxl_acpi_probe() in a way that defeats the cxl_bus_rescan() that is
> > there to resolve device discovery races.
> 
> 
> OK. Then rephrasing the commit would help.

That and moving cxl_port to subsys_initcall(). Will respin this one.

> Apart from that:
> 
> Tested-by: Alejandro Lucero <alucerop@amd.com>
> 
> Reviewed-by: Alejandro Lucero <alucerop@amd.com>

Thanks!
diff mbox series

Patch

diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 29c192f20082..876469e23f7a 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -60,6 +60,7 @@  config CXL_ACPI
 	default CXL_BUS
 	select ACPI_TABLE_LIB
 	select ACPI_HMAT
+	select CXL_PORT
 	help
 	  Enable support for host managed device memory (HDM) resources
 	  published by a platform's ACPI CXL memory layout description.  See
diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
index db321f48ba52..2caa90fa4bf2 100644
--- a/drivers/cxl/Makefile
+++ b/drivers/cxl/Makefile
@@ -1,13 +1,21 @@ 
 # SPDX-License-Identifier: GPL-2.0
+
+# Order is important here for the built-in case:
+# - 'core' first for fundamental init
+# - 'port' before platform root drivers like 'acpi' so that CXL-root ports
+#   are immediately enabled
+# - 'mem' and 'pmem' before endpoint drivers so that memdevs are
+#   immediately enabled
+# - 'pci' last, also mirrors the hardware enumeration hierarchy
 obj-y += core/
-obj-$(CONFIG_CXL_PCI) += cxl_pci.o
-obj-$(CONFIG_CXL_MEM) += cxl_mem.o
+obj-$(CONFIG_CXL_PORT) += cxl_port.o
 obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
 obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
-obj-$(CONFIG_CXL_PORT) += cxl_port.o
+obj-$(CONFIG_CXL_MEM) += cxl_mem.o
+obj-$(CONFIG_CXL_PCI) += cxl_pci.o
 
-cxl_mem-y := mem.o
-cxl_pci-y := pci.o
+cxl_port-y := port.o
 cxl_acpi-y := acpi.o
 cxl_pmem-y := pmem.o security.o
-cxl_port-y := port.o
+cxl_mem-y := mem.o
+cxl_pci-y := pci.o