[6/7] cxl/region: Fix 'distance' calculation with passthrough ports

Message ID 166752185440.947915.6617495912508299445.stgit@dwillia2-xfh.jf.intel.com
State Accepted
Commit 5e96dd75dce43134a66480648efa5e4a47d22b3b
Series CXL region creation fixes for 6.1

Commit Message

Dan Williams Nov. 4, 2022, 12:30 a.m. UTC
When programming port decode targets, the algorithm wants to ensure that
two devices are compatible to be programmed as peers beneath a given
port. A compatible peer is a target that shares the same dport, and
where that target's interleave position also routes it to the same
dport. Compatibility is determined by the device's interleave position
being >= to distance. For example, if a given dport can only map every
Nth position then positions less than N away from the last target
programmed are incompatible.

The @distance for the host-bridge-cxl_port a simple dual-ported
host-bridge configuration with 2 direct-attached devices is 1. An x2
region divded by 2 dports to reach 2 region targets.

An x4 region under an x2 host-bridge would need 2 intervening switches
where the @distance at the host bridge level is 2 (x4 region divided by
2 switches to reach 4 devices).

However, the distance between peers underneath a single ported
host-bridge is always zero because there is no limit to the number of
devices that can be mapped. In other words, there are no decoders to
program in a passthrough, all descendants are mapped and distance only
starts matters for the intervening descendant ports of the passthrough
port.

Add tracking for the number of dports mapped to a port, and use that to
detect the passthrough case for calculating @distance.

Cc: <stable@vger.kernel.org>
Reported-by: Bobo WL <lmw.bobo@gmail.com>
Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: http://lore.kernel.org/r/20221010172057.00001559@huawei.com
Fixes: 27b3f8d13830 ("cxl/region: Program target lists")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/port.c   |   11 +++++++++--
 drivers/cxl/core/region.c |    9 ++++++++-
 drivers/cxl/cxl.h         |    2 ++
 3 files changed, 19 insertions(+), 3 deletions(-)
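
The peer rule described in the commit message can be modeled in a few lines
of userspace C. This is only a sketch under stated assumptions: the
dport_of[] table and the peer_compatible() name are hypothetical and do not
exist in drivers/cxl. It shows the rule that a target sharing a dport must
sit @distance positions after its peer, and why a distance of zero never
rejects a peer.

```c
#include <stdbool.h>

/*
 * Userspace model only. dport_of[pos] is a hypothetical table giving the id
 * of the dport (beneath the port being programmed) that the device at region
 * position pos sits behind. None of these names exist in drivers/cxl.
 */
bool peer_compatible(const int *dport_of, int pos, int distance)
{
	/* A position earlier than @distance has no prior peer to pair with */
	if (pos < distance)
		return false;

	/* The target @distance positions back must route via the same dport */
	return dport_of[pos] == dport_of[pos - distance];
}
```

For the x4 example with two switches, a layout of {0, 1, 0, 1} with distance
2 pairs position 2 with position 0 on dport 0. For a passthrough layout of
{0, 0}, a distance of 2 (x2 region divided by 1 target) rejects position 1,
while a distance of 0 compares position 1 against itself and accepts it.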

Comments

Verma, Vishal L Nov. 4, 2022, 6:42 a.m. UTC | #1
On Thu, 2022-11-03 at 17:30 -0700, Dan Williams wrote:
> When programming port decode targets, the algorithm wants to ensure that
> two devices are compatible to be programmed as peers beneath a given
> port. A compatible peer is a target that shares the same dport, and
> where that target's interleave position also routes it to the same
> dport. Compatibility is determined by the device's interleave position
> being >= to distance. For example, if a given dport can only map every
> Nth position then positions less than N away from the last target
> programmed are incompatible.
> 
> The @distance for the host-bridge-cxl_port a simple dual-ported
                                   ^
Is this meant to be "the distance for the host-bridge to a cxl_port"?

Also missing '/in/ a simple dual-ported..'?

> host-bridge configuration with 2 direct-attached devices is 1. An x2
> region divded by 2 dports to reach 2 region targets.

The second sentence seems slightly incomprehensible too.
> 
> An x4 region under an x2 host-bridge would need 2 intervening switches
> where the @distance at the host bridge level is 2 (x4 region divided by
> 2 switches to reach 4 devices).
> 
> However, the distance between peers underneath a single ported
> host-bridge is always zero because there is no limit to the number of
> devices that can be mapped. In other words, there are no decoders to
> program in a passthrough, all descendants are mapped and distance only
> starts matters for the intervening descendant ports of the passthrough

starts to matter?

> port.
> 
> Add tracking for the number of dports mapped to a port, and use that to
> detect the passthrough case for calculating @distance.
> 
> Cc: <stable@vger.kernel.org>
> Reported-by: Bobo WL <lmw.bobo@gmail.com>
> Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Link: http://lore.kernel.org/r/20221010172057.00001559@huawei.com
> Fixes: 27b3f8d13830 ("cxl/region: Program target lists")
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/port.c   |   11 +++++++++--
>  drivers/cxl/core/region.c |    9 ++++++++-
>  drivers/cxl/cxl.h         |    2 ++
>  3 files changed, 19 insertions(+), 3 deletions(-)

Other than the above, looks good,

Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>

> 
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index bffde862de0b..e7556864ea80 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -811,6 +811,7 @@ static struct cxl_dport *find_dport(struct cxl_port *port, int id)
>  static int add_dport(struct cxl_port *port, struct cxl_dport *new)
>  {
>         struct cxl_dport *dup;
> +       int rc;
>  
>         device_lock_assert(&port->dev);
>         dup = find_dport(port, new->port_id);
> @@ -821,8 +822,14 @@ static int add_dport(struct cxl_port *port, struct cxl_dport *new)
>                         dev_name(dup->dport));
>                 return -EBUSY;
>         }
> -       return xa_insert(&port->dports, (unsigned long)new->dport, new,
> -                        GFP_KERNEL);
> +
> +       rc = xa_insert(&port->dports, (unsigned long)new->dport, new,
> +                      GFP_KERNEL);
> +       if (rc)
> +               return rc;
> +
> +       port->nr_dports++;
> +       return 0;
>  }
>  
>  /*
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index c52465e09f26..c0253de74945 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -990,7 +990,14 @@ static int cxl_port_setup_targets(struct cxl_port *port,
>         if (cxl_rr->nr_targets_set) {
>                 int i, distance;
>  
> -               distance = p->nr_targets / cxl_rr->nr_targets;
> +               /*
> +                * Passthrough ports impose no distance requirements between
> +                * peers
> +                */
> +               if (port->nr_dports == 1)
> +                       distance = 0;
> +               else
> +                       distance = p->nr_targets / cxl_rr->nr_targets;
>                 for (i = 0; i < cxl_rr->nr_targets_set; i++)
>                         if (ep->dport == cxlsd->target[i]) {
>                                 rc = check_last_peer(cxled, ep, cxl_rr,
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 1164ad49f3d3..ac75554b5d76 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -457,6 +457,7 @@ struct cxl_pmem_region {
>   * @regions: cxl_region_ref instances, regions mapped by this port
>   * @parent_dport: dport that points to this port in the parent
>   * @decoder_ida: allocator for decoder ids
> + * @nr_dports: number of entries in @dports
>   * @hdm_end: track last allocated HDM decoder instance for allocation ordering
>   * @commit_end: cursor to track highest committed decoder for commit ordering
>   * @component_reg_phys: component register capability base address (optional)
> @@ -475,6 +476,7 @@ struct cxl_port {
>         struct xarray regions;
>         struct cxl_dport *parent_dport;
>         struct ida decoder_ida;
> +       int nr_dports;
>         int hdm_end;
>         int commit_end;
>         resource_size_t component_reg_phys;
>
Dan Williams Nov. 4, 2022, 6:59 p.m. UTC | #2
Verma, Vishal L wrote:
> On Thu, 2022-11-03 at 17:30 -0700, Dan Williams wrote:
> > When programming port decode targets, the algorithm wants to ensure that
> > two devices are compatible to be programmed as peers beneath a given
> > port. A compatible peer is a target that shares the same dport, and
> > where that target's interleave position also routes it to the same
> > dport. Compatibility is determined by the device's interleave position
> > being >= to distance. For example, if a given dport can only map every
> > Nth position then positions less than N away from the last target
> > programmed are incompatible.
> > 
> > The @distance for the host-bridge-cxl_port a simple dual-ported
>                                    ^
> Is this meant to be "the distance for the host-bridge to a cxl_port"?

No, but I will preface this explanation by admitting "distance" may not
be the best term for what this value is describing. CXL decode routes to
targets in a round robin fashion per-port. Take the diagram below:

 ┌───────────────────────────────────┬──┐
 │WINDOW0                            │x2│
 └─────────┬─────────────────┬───────┴──┘
           │                 │
 ┌─────────▼────┬──┐  ┌──────▼───────┬──┐
 │HB0           │x2│  │HB1           │x2│
 └──┬────────┬──┴──┘  └─┬─────────┬──┴──┘
    │        │          │         │
 ┌──▼─┬──┐ ┌─▼──┬──┐  ┌─▼──┬──┐ ┌─▼──┬──┐
 │DEV0│x4│ │DEV1│x4│  │DEV2│x4│ │DEV3│x4│
 └────┴──┘ └────┴──┘  └────┴──┘ └────┴──┘
    0         2          1         3

...where an x4 region is being established, and all the xN values are
the interleave-ways settings for those ports/devices. The @distance
value for that "HB0" port is 2. I.e. in order for 2 devices in that
region to be mapped under HB0 they must be at position X and X+2 in the
region.  The algorithm needs to be flexible to also allow this
configuration:

 ┌───────────────────────────────────┬──┐
 │WINDOW0                            │x2│
 └─────────┬─────────────────┬───────┴──┘
           │                 │
 ┌─────────▼────┬──┐  ┌──────▼───────┬──┐
 │HB0           │x2│  │HB1           │x2│
 └──┬────────┬──┴──┘  └─┬─────────┬──┴──┘
    │        │          │         │
 ┌──▼─┬──┐ ┌─▼──┬──┐  ┌─▼──┬──┐ ┌─▼──┬──┐
 │DEV3│x4│ │DEV2│x4│  │DEV1│x4│ │DEV0│x4│
 └────┴──┘ └────┴──┘  └────┴──┘ └────┴──┘
    3         1          2         0

...and the algorithm can not know that a device is in the wrong position
until trying to map the peer (like DEV0 and DEV1 are peers) into the
decode. So "The @distance for the host-bridge-cxl_port" is referring to
the value "2" for host-bridge-cxl_port:HB0 and host-bridge-cxl_port:HB1
in the diagram.
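
As a rough illustration of where the value "2" comes from, the arithmetic
can be sketched as a standalone C function. This is a userspace model under
stated assumptions, not the kernel code: peer_distance() and its parameter
names are hypothetical stand-ins for the region interleave ways, the number
of dports the region decodes to at this port, and the total number of dports
on the port.

```c
/*
 * Userspace model only; the parameter names are illustrative and are not the
 * kernel's fields.
 *   region_ways:  interleave ways of the region (4 for the x4 region above)
 *   port_targets: dports of this port that the region decodes to
 *   port_dports:  dports registered on this port in total
 */
int peer_distance(int region_ways, int port_targets, int port_dports)
{
	/* A single-dport (passthrough) port places no constraint on peers */
	if (port_dports == 1)
		return 0;

	return region_ways / port_targets;
}
```

With the diagram's values, peer_distance(4, 2, 2) gives 2 for HB0 and HB1.
The dual-ported x2 case from the commit message, peer_distance(2, 2, 2),
gives 1, and a single-ported host-bridge yields 0 regardless of region size.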

> Also missing '/in/ a simple dual-ported..'?

Yes to this fixup though.

> 
> > host-bridge configuration with 2 direct-attached devices is 1. An x2
> > region divded by 2 dports to reach 2 region targets.
> 
> The second sentence seems slightly incomprehensible too.

Oh, I think I meant that to be s/. An/, i.e. an/:

"...host-bridge configuration with 2 direct-attached devices is 1, i.e.
an x2 region divded by 2 dports to reach 2 region positions"


> > 
> > An x4 region under an x2 host-bridge would need 2 intervening switches
> > where the @distance at the host bridge level is 2 (x4 region divided by
> > 2 switches to reach 4 devices).
> > 
> > However, the distance between peers underneath a single ported
> > host-bridge is always zero because there is no limit to the number of
> > devices that can be mapped. In other words, there are no decoders to
> > program in a passthrough, all descendants are mapped and distance only
> > starts matters for the intervening descendant ports of the passthrough
> 
> starts to matter?

s/starts matters/matters/

> 
> > port.
> > 
> > Add tracking for the number of dports mapped to a port, and use that to
> > detect the passthrough case for calculating @distance.
> > 
> > Cc: <stable@vger.kernel.org>
> > Reported-by: Bobo WL <lmw.bobo@gmail.com>
> > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Link: http://lore.kernel.org/r/20221010172057.00001559@huawei.com
> > Fixes: 27b3f8d13830 ("cxl/region: Program target lists")
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > ---
> >  drivers/cxl/core/port.c   |   11 +++++++++--
> >  drivers/cxl/core/region.c |    9 ++++++++-
> >  drivers/cxl/cxl.h         |    2 ++
> >  3 files changed, 19 insertions(+), 3 deletions(-)
> 
> Other than the above, looks good,
> 
> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
> 
Verma, Vishal L Nov. 4, 2022, 7:31 p.m. UTC | #3
On Fri, 2022-11-04 at 11:59 -0700, Dan Williams wrote:
> Verma, Vishal L wrote:
> > On Thu, 2022-11-03 at 17:30 -0700, Dan Williams wrote:
> > > When programming port decode targets, the algorithm wants to ensure that
> > > two devices are compatible to be programmed as peers beneath a given
> > > port. A compatible peer is a target that shares the same dport, and
> > > where that target's interleave position also routes it to the same
> > > dport. Compatibility is determined by the device's interleave position
> > > being >= to distance. For example, if a given dport can only map every
> > > Nth position then positions less than N away from the last target
> > > programmed are incompatible.
> > > 
> > > The @distance for the host-bridge-cxl_port a simple dual-ported
> >                                    ^
> > Is this meant to be "the distance for the host-bridge to a cxl_port"?
> 
> No, but I will preface this explanation by admitting "distance" may not
> be the best term for what this value is describing. CXL decode routes to
> targets in a round robin fashion per-port. Take the diagram below:
> 
>  ┌───────────────────────────────────┬──┐
>  │WINDOW0                            │x2│
>  └─────────┬─────────────────┬───────┴──┘
>            │                 │
>  ┌─────────▼────┬──┐  ┌──────▼───────┬──┐
>  │HB0           │x2│  │HB1           │x2│
>  └──┬────────┬──┴──┘  └─┬─────────┬──┴──┘
>     │        │          │         │
>  ┌──▼─┬──┐ ┌─▼──┬──┐  ┌─▼──┬──┐ ┌─▼──┬──┐
>  │DEV0│x4│ │DEV1│x4│  │DEV2│x4│ │DEV3│x4│
>  └────┴──┘ └────┴──┘  └────┴──┘ └────┴──┘
>     0         2          1         3
> 
> ...where an x4 region is being established, and all the xN values are
> the interleave-ways settings for those ports/devices. The @distance
> value for that "HB0" port is 2. I.e. in order for 2 devices in that
> region to be mapped under HB0 they must be at position X and X+2 in the
> region.  The algorithm needs to be flexible to also allow this
> configuration:
> 
>  ┌───────────────────────────────────┬──┐
>  │WINDOW0                            │x2│
>  └─────────┬─────────────────┬───────┴──┘
>            │                 │
>  ┌─────────▼────┬──┐  ┌──────▼───────┬──┐
>  │HB0           │x2│  │HB1           │x2│
>  └──┬────────┬──┴──┘  └─┬─────────┬──┴──┘
>     │        │          │         │
>  ┌──▼─┬──┐ ┌─▼──┬──┐  ┌─▼──┬──┐ ┌─▼──┬──┐
>  │DEV3│x4│ │DEV2│x4│  │DEV1│x4│ │DEV0│x4│
>  └────┴──┘ └────┴──┘  └────┴──┘ └────┴──┘
>     3         1          2         0
> 
> ...and the algorithm can not know that a device is in the wrong position
> until trying to map the peer (like DEV0 and DEV1 are peers) into the
> decode. So "The @distance for the host-bridge-cxl_port" is referring to
> the value "2" for host-bridge-cxl_port:HB0 and host-bridge-cxl_port:HB1
> in the diagram.

Ah I understand now. The host-bridge-cxl_port term confused me slightly
- I wonder if it is better to say "the host-bridge's cxl_port" or
something? Regardless, thanks for the explanation!

> 
> > Also missing '/in/ a simple dual-ported..'?
> 
> Yes to this fixup though.
> 
> > 
> > > host-bridge configuration with 2 direct-attached devices is 1. An x2
> > > region divded by 2 dports to reach 2 region targets.
> > 
> > The second sentence seems slightly incomprehensible too.
> 
> Oh, I think I meant that to be s/. An/, i.e. an/:
> 
> "...host-bridge configuration with 2 direct-attached devices is 1, i.e.
> an x2 region divded by 2 dports to reach 2 region positions"

Ah ok that makes more sense!

> 
> > > 
> > > An x4 region under an x2 host-bridge would need 2 intervening switches
> > > where the @distance at the host bridge level is 2 (x4 region divided by
> > > 2 switches to reach 4 devices).
> > > 
> > > However, the distance between peers underneath a single ported
> > > host-bridge is always zero because there is no limit to the number of
> > > devices that can be mapped. In other words, there are no decoders to
> > > program in a passthrough, all descendants are mapped and distance only
> > > starts matters for the intervening descendant ports of the passthrough
> > 
> > starts to matter?
> 
> s/starts matters/matters/
> 
> > 
> > > port.
> > > 
> > > Add tracking for the number of dports mapped to a port, and use that to
> > > detect the passthrough case for calculating @distance.
> > > 
> > > Cc: <stable@vger.kernel.org>
> > > Reported-by: Bobo WL <lmw.bobo@gmail.com>
> > > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > Link: http://lore.kernel.org/r/20221010172057.00001559@huawei.com
> > > Fixes: 27b3f8d13830 ("cxl/region: Program target lists")
> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > > ---
> > >  drivers/cxl/core/port.c   |   11 +++++++++--
> > >  drivers/cxl/core/region.c |    9 ++++++++-
> > >  drivers/cxl/cxl.h         |    2 ++
> > >  3 files changed, 19 insertions(+), 3 deletions(-)
> > 
> > Other than the above, looks good,
> > 
> > Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
> > 
Dave Jiang Nov. 4, 2022, 9:58 p.m. UTC | #4
On 11/4/2022 11:59 AM, Dan Williams wrote:
> Verma, Vishal L wrote:
>> On Thu, 2022-11-03 at 17:30 -0700, Dan Williams wrote:
>>> When programming port decode targets, the algorithm wants to ensure that
>>> two devices are compatible to be programmed as peers beneath a given
>>> port. A compatible peer is a target that shares the same dport, and
>>> where that target's interleave position also routes it to the same
>>> dport. Compatibility is determined by the device's interleave position
>>> being >= to distance. For example, if a given dport can only map every
>>> Nth position then positions less than N away from the last target
>>> programmed are incompatible.
>>>
>>> The @distance for the host-bridge-cxl_port a simple dual-ported
>>                                     ^
>> Is this meant to be "the distance for the host-bridge to a cxl_port"?
> 
> No, but I will preface this explanation by admitting "distance" may not
> be the best term for what this value is describing. CXL decode routes to
> targets in a round robin fashion per-port. Take the diagram below:
> 
>   ┌───────────────────────────────────┬──┐
>   │WINDOW0                            │x2│
>   └─────────┬─────────────────┬───────┴──┘
>             │                 │
>   ┌─────────▼────┬──┐  ┌──────▼───────┬──┐
>   │HB0           │x2│  │HB1           │x2│
>   └──┬────────┬──┴──┘  └─┬─────────┬──┴──┘
>      │        │          │         │
>   ┌──▼─┬──┐ ┌─▼──┬──┐  ┌─▼──┬──┐ ┌─▼──┬──┐
>   │DEV0│x4│ │DEV1│x4│  │DEV2│x4│ │DEV3│x4│
>   └────┴──┘ └────┴──┘  └────┴──┘ └────┴──┘
>      0         2          1         3
> 
> ...where an x4 region is being established, and all the xN values are
> the interleave-ways settings for those ports/devices. The @distance
> value for that "HB0" port is 2. I.e. in order for 2 devices in that
> region to be mapped under HB0 they must be at position X and X+2 in the
> region.  The algorithm needs to be flexible to also allow this
> configuration:
> 
>   ┌───────────────────────────────────┬──┐
>   │WINDOW0                            │x2│
>   └─────────┬─────────────────┬───────┴──┘
>             │                 │
>   ┌─────────▼────┬──┐  ┌──────▼───────┬──┐
>   │HB0           │x2│  │HB1           │x2│
>   └──┬────────┬──┴──┘  └─┬─────────┬──┴──┘
>      │        │          │         │
>   ┌──▼─┬──┐ ┌─▼──┬──┐  ┌─▼──┬──┐ ┌─▼──┬──┐
>   │DEV3│x4│ │DEV2│x4│  │DEV1│x4│ │DEV0│x4│
>   └────┴──┘ └────┴──┘  └────┴──┘ └────┴──┘
>      3         1          2         0
> 
> ...and the algorithm can not know that a device is in the wrong position
> until trying to map the peer (like DEV0 and DEV1 are peers) into the
> decode. So "The @distance for the host-bridge-cxl_port" is referring to
> the value "2" for host-bridge-cxl_port:HB0 and host-bridge-cxl_port:HB1
> in the diagram.
> 
>> Also missing '/in/ a simple dual-ported..'?
> 
> Yes to this fixup though.
> 
>>
>>> host-bridge configuration with 2 direct-attached devices is 1. An x2
>>> region divded by 2 dports to reach 2 region targets.
>>
>> The second sentence seems slightly incomprehensible too.
> 
> Oh, I think I meant that to be s/. An/, i.e. an/:
> 
> "...host-bridge configuration with 2 direct-attached devices is 1, i.e.
> an x2 region divded by 2 dports to reach 2 region positions"

s/divded/divided/ ?

> 
> 
>>>
>>> An x4 region under an x2 host-bridge would need 2 intervening switches
>>> where the @distance at the host bridge level is 2 (x4 region divided by
>>> 2 switches to reach 4 devices).
>>>
>>> However, the distance between peers underneath a single ported
>>> host-bridge is always zero because there is no limit to the number of
>>> devices that can be mapped. In other words, there are no decoders to
>>> program in a passthrough, all descendants are mapped and distance only
>>> starts matters for the intervening descendant ports of the passthrough
>>
>> starts to matter?
> 
> s/starts matters/matters/
> 
>>
>>> port.
>>>
>>> Add tracking for the number of dports mapped to a port, and use that to
>>> detect the passthrough case for calculating @distance.
>>>
>>> Cc: <stable@vger.kernel.org>
>>> Reported-by: Bobo WL <lmw.bobo@gmail.com>
>>> Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>>> Link: http://lore.kernel.org/r/20221010172057.00001559@huawei.com
>>> Fixes: 27b3f8d13830 ("cxl/region: Program target lists")
>>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>>> ---
>>>   drivers/cxl/core/port.c   |   11 +++++++++--
>>>   drivers/cxl/core/region.c |    9 ++++++++-
>>>   drivers/cxl/cxl.h         |    2 ++
>>>   3 files changed, 19 insertions(+), 3 deletions(-)
>>
>> Other than the above, looks good,
>>
>> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
>>
Dan Williams Nov. 4, 2022, 10:51 p.m. UTC | #5
Dave Jiang wrote:
> 
> 
> On 11/4/2022 11:59 AM, Dan Williams wrote:
> > Verma, Vishal L wrote:
> >> On Thu, 2022-11-03 at 17:30 -0700, Dan Williams wrote:
> >>> When programming port decode targets, the algorithm wants to ensure that
> >>> two devices are compatible to be programmed as peers beneath a given
> >>> port. A compatible peer is a target that shares the same dport, and
> >>> where that target's interleave position also routes it to the same
> >>> dport. Compatibility is determined by the device's interleave position
> >>> being >= to distance. For example, if a given dport can only map every
> >>> Nth position then positions less than N away from the last target
> >>> programmed are incompatible.
> >>>
> >>> The @distance for the host-bridge-cxl_port a simple dual-ported
> >>                                     ^
> >> Is this meant to be "the distance for the host-bridge to a cxl_port"?
> > 
> > No, but I will preface this explanation by admitting "distance" may not
> > be the best term for what this value is describing. CXL decode routes to
> > targets in a round robin fashion per-port. Take the diagram below:
> > 
> >   ┌───────────────────────────────────┬──┐
> >   │WINDOW0                            │x2│
> >   └─────────┬─────────────────┬───────┴──┘
> >             │                 │
> >   ┌─────────▼────┬──┐  ┌──────▼───────┬──┐
> >   │HB0           │x2│  │HB1           │x2│
> >   └──┬────────┬──┴──┘  └─┬─────────┬──┴──┘
> >      │        │          │         │
> >   ┌──▼─┬──┐ ┌─▼──┬──┐  ┌─▼──┬──┐ ┌─▼──┬──┐
> >   │DEV0│x4│ │DEV1│x4│  │DEV2│x4│ │DEV3│x4│
> >   └────┴──┘ └────┴──┘  └────┴──┘ └────┴──┘
> >      0         2          1         3
> > 
> > ...where an x4 region is being established, and all the xN values are
> > the interleave-ways settings for those ports/devices. The @distance
> > value for that "HB0" port is 2. I.e. in order for 2 devices in that
> > region to be mapped under HB0 they must be at position X and X+2 in the
> > region.  The algorithm needs to be flexible to also allow this
> > configuration:
> > 
> >   ┌───────────────────────────────────┬──┐
> >   │WINDOW0                            │x2│
> >   └─────────┬─────────────────┬───────┴──┘
> >             │                 │
> >   ┌─────────▼────┬──┐  ┌──────▼───────┬──┐
> >   │HB0           │x2│  │HB1           │x2│
> >   └──┬────────┬──┴──┘  └─┬─────────┬──┴──┘
> >      │        │          │         │
> >   ┌──▼─┬──┐ ┌─▼──┬──┐  ┌─▼──┬──┐ ┌─▼──┬──┐
> >   │DEV3│x4│ │DEV2│x4│  │DEV1│x4│ │DEV0│x4│
> >   └────┴──┘ └────┴──┘  └────┴──┘ └────┴──┘
> >      3         1          2         0
> > 
> > ...and the algorithm can not know that a device is in the wrong position
> > until trying to map the peer (like DEV0 and DEV1 are peers) into the
> > decode. So "The @distance for the host-bridge-cxl_port" is referring to
> > the value "2" for host-bridge-cxl_port:HB0 and host-bridge-cxl_port:HB1
> > in the diagram.
> > 
> >> Also missing '/in/ a simple dual-ported..'?
> > 
> > Yes to this fixup though.
> > 
> >>
> >>> host-bridge configuration with 2 direct-attached devices is 1. An x2
> >>> region divded by 2 dports to reach 2 region targets.
> >>
> >> The second sentence seems slightly incomprehensible too.
> > 
> > Oh, I think I meant that to be s/. An/, i.e. an/:
> > 
> > "...host-bridge configuration with 2 direct-attached devices is 1, i.e.
> > an x2 region divded by 2 dports to reach 2 region positions"
> 
> s/divded/divided/ ?

Yup, thanks.
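
For readers following the @distance discussion above, here is a minimal userspace sketch of the calculation, not the kernel code itself. The helper names (peer_distance, peer_distance_old, distance_examples) are hypothetical. The parameters only model the fields used in the patch: region_targets stands in for p->nr_targets, port_targets for cxl_rr->nr_targets, and nr_dports for port->nr_dports. The passthrough example assumes the region has exactly one target at the single-dport port.

```c
#include <assert.h>

/*
 * Illustrative model only; these helpers are not kernel APIs.
 *   nr_dports      - models port->nr_dports
 *   region_targets - models p->nr_targets (region interleave ways)
 *   port_targets   - models cxl_rr->nr_targets (targets at this port)
 */
static int peer_distance(int nr_dports, int region_targets, int port_targets)
{
	/* A passthrough (single-dport) port imposes no spacing between peers */
	if (nr_dports == 1)
		return 0;
	return region_targets / port_targets;
}

/* The calculation before the fix, kept for comparison */
static int peer_distance_old(int region_targets, int port_targets)
{
	return region_targets / port_targets;
}

/* Walk through the configurations described in the thread */
static void distance_examples(void)
{
	/* x2 region, dual-ported host bridge, 2 direct-attached devices */
	assert(peer_distance(2, 2, 2) == 1);

	/* x4 region, HB0 from the diagram with 2 targets beneath it */
	assert(peer_distance(2, 4, 2) == 2);

	/*
	 * x4 region below a single-ported host bridge (assumed to have one
	 * target at this port): the old formula demands a spacing of 4,
	 * the fixed version demands none.
	 */
	assert(peer_distance_old(4, 1) == 4);
	assert(peer_distance(1, 4, 1) == 0);
}
```

Calling distance_examples() from a test program exercises each case. The point of the last pair is that the old formula yields a non-zero spacing for a passthrough port, which is the case the patch special-cases to zero.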

Patch

diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index bffde862de0b..e7556864ea80 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -811,6 +811,7 @@  static struct cxl_dport *find_dport(struct cxl_port *port, int id)
 static int add_dport(struct cxl_port *port, struct cxl_dport *new)
 {
 	struct cxl_dport *dup;
+	int rc;
 
 	device_lock_assert(&port->dev);
 	dup = find_dport(port, new->port_id);
@@ -821,8 +822,14 @@  static int add_dport(struct cxl_port *port, struct cxl_dport *new)
 			dev_name(dup->dport));
 		return -EBUSY;
 	}
-	return xa_insert(&port->dports, (unsigned long)new->dport, new,
-			 GFP_KERNEL);
+
+	rc = xa_insert(&port->dports, (unsigned long)new->dport, new,
+		       GFP_KERNEL);
+	if (rc)
+		return rc;
+
+	port->nr_dports++;
+	return 0;
 }
 
 /*
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index c52465e09f26..c0253de74945 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -990,7 +990,14 @@  static int cxl_port_setup_targets(struct cxl_port *port,
 	if (cxl_rr->nr_targets_set) {
 		int i, distance;
 
-		distance = p->nr_targets / cxl_rr->nr_targets;
+		/*
+		 * Passthrough ports impose no distance requirements between
+		 * peers
+		 */
+		if (port->nr_dports == 1)
+			distance = 0;
+		else
+			distance = p->nr_targets / cxl_rr->nr_targets;
 		for (i = 0; i < cxl_rr->nr_targets_set; i++)
 			if (ep->dport == cxlsd->target[i]) {
 				rc = check_last_peer(cxled, ep, cxl_rr,
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 1164ad49f3d3..ac75554b5d76 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -457,6 +457,7 @@  struct cxl_pmem_region {
  * @regions: cxl_region_ref instances, regions mapped by this port
  * @parent_dport: dport that points to this port in the parent
  * @decoder_ida: allocator for decoder ids
+ * @nr_dports: number of entries in @dports
  * @hdm_end: track last allocated HDM decoder instance for allocation ordering
  * @commit_end: cursor to track highest committed decoder for commit ordering
  * @component_reg_phys: component register capability base address (optional)
@@ -475,6 +476,7 @@  struct cxl_port {
 	struct xarray regions;
 	struct cxl_dport *parent_dport;
 	struct ida decoder_ida;
+	int nr_dports;
 	int hdm_end;
 	int commit_end;
 	resource_size_t component_reg_phys;