Message ID | 166662616015.232090.4970569004666131514.stgit@djiang5-desk3.ch.intel.com |
---|---|
State | New, archived |
Headers | show |
Series | [v3] cxl: check decoder count for end device | expand |
On Mon, 24 Oct 2022 08:43:30 -0700 Dave Jiang <dave.jiang@intel.com> wrote: > CXL spec rev3.0 8.2.4.19.1 added definition for up to 32 decoders. It also > indicates that for devices, only 10 decoders should be advertised. Add > check on number of decoders greater than 10 for devices and emit warning. > > Signed-off-by: Dave Jiang <dave.jiang@intel.com> I wonder... Should warning say CXL r3.0 spec violation? Seems possible this might grow in future versions and so it might be nice to give a hint that it 'might' be valid in a higher spec version? I don't care strongly either way though. FWIW Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > --- > > v3: > - Fix commit header and output message to reflect code change from v2. > > v2: > - Remove decoder count reassignment from violation (Dan) > > drivers/cxl/core/hdm.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c > index d1d2caea5c62..c3b756f93261 100644 > --- a/drivers/cxl/core/hdm.c > +++ b/drivers/cxl/core/hdm.c > @@ -71,9 +71,19 @@ EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, CXL); > static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm) > { > u32 hdm_cap; > + struct device *dev = &cxlhdm->port->dev; > > hdm_cap = readl(cxlhdm->regs.hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET); > cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap); > + /* > + * CXL spec rev3.0 8.2.4.19.1 indicates CXL devices shall not advertise > + * more than 10 decoders. Switches and Host Bridges may advertise up to > + * 32 decoders. Set the decoders to 10 for devices if more than 10 are > + * found. > + */ > + if (is_cxl_endpoint(cxlhdm->port) && cxlhdm->decoder_count > 10) > + dev_warn(dev, "Endpoint decoder count (%d) > 10, spec violation!\n", > + cxlhdm->decoder_count); > cxlhdm->target_count = > FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap); > if (FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap)) > >
Jonathan Cameron wrote: > On Mon, 24 Oct 2022 08:43:30 -0700 > Dave Jiang <dave.jiang@intel.com> wrote: > > > CXL spec rev3.0 8.2.4.19.1 added definition for up to 32 decoders. It also > > indicates that for devices, only 10 decoders should be advertised. Add > > check on number of decoders greater than 10 for devices and emit warning. > > > > Signed-off-by: Dave Jiang <dave.jiang@intel.com> > I wonder... Should warning say CXL r3.0 spec violation? > > Seems possible this might grow in future versions and so it > might be nice to give a hint that it 'might' be valid in a higher spec version? > > I don't care strongly either way though. > > FWIW > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > > --- > > > > v3: > > - Fix commit header and output message to reflect code change from v2. > > > > v2: > > - Remove decoder count reassignment from violation (Dan) > > > > drivers/cxl/core/hdm.c | 10 ++++++++++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c > > index d1d2caea5c62..c3b756f93261 100644 > > --- a/drivers/cxl/core/hdm.c > > +++ b/drivers/cxl/core/hdm.c > > @@ -71,9 +71,19 @@ EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, CXL); > > static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm) > > { > > u32 hdm_cap; > > + struct device *dev = &cxlhdm->port->dev; > > > > hdm_cap = readl(cxlhdm->regs.hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET); > > cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap); > > + /* > > + * CXL spec rev3.0 8.2.4.19.1 indicates CXL devices shall not advertise > > + * more than 10 decoders. Switches and Host Bridges may advertise up to > > + * 32 decoders. Set the decoders to 10 for devices if more than 10 are > > + * found. Stale comment, the "Set the decoders to 10" is no longer done. > > + */ > > + if (is_cxl_endpoint(cxlhdm->port) && cxlhdm->decoder_count > 10) > > + dev_warn(dev, "Endpoint decoder count (%d) > 10, spec violation!\n", > > + cxlhdm->decoder_count); This is still too scary for how the kernel is going to handle it. I would expect: dev_dbg(dev, "Endpoint decoder count %d, expected max 10\n") ...what downside are you seeing that's prompting this patch? Sometimes Linux might want to constrain what hardware can do to avoid code maintenance burden, but I do not see that harm here. Otherwise, there are users that treat any log message above KERN_NOTICE priority as a sign of a broken system. Is a platform with an endpoint with 16 decoders broken? Also, if we do this bounds check for endpoints why are we not doing it for bridges for the > 32 case?
On 10/25/2022 10:38 AM, Dan Williams wrote: > Jonathan Cameron wrote: >> On Mon, 24 Oct 2022 08:43:30 -0700 >> Dave Jiang <dave.jiang@intel.com> wrote: >> >>> CXL spec rev3.0 8.2.4.19.1 added definition for up to 32 decoders. It also >>> indicates that for devices, only 10 decoders should be advertised. Add >>> check on number of decoders greater than 10 for devices and emit warning. >>> >>> Signed-off-by: Dave Jiang <dave.jiang@intel.com> >> I wonder... Should warning say CXL r3.0 spec violation? >> >> Seems possible this might grow in future versions and so it >> might be nice to give a hint that it 'might' be valid in a higher spec version? >> >> I don't care strongly either way though. >> >> FWIW >> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> >> >>> --- >>> >>> v3: >>> - Fix commit header and output message to reflect code change from v2. >>> >>> v2: >>> - Remove decoder count reassignment from violation (Dan) >>> >>> drivers/cxl/core/hdm.c | 10 ++++++++++ >>> 1 file changed, 10 insertions(+) >>> >>> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c >>> index d1d2caea5c62..c3b756f93261 100644 >>> --- a/drivers/cxl/core/hdm.c >>> +++ b/drivers/cxl/core/hdm.c >>> @@ -71,9 +71,19 @@ EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, CXL); >>> static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm) >>> { >>> u32 hdm_cap; >>> + struct device *dev = &cxlhdm->port->dev; >>> >>> hdm_cap = readl(cxlhdm->regs.hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET); >>> cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap); >>> + /* >>> + * CXL spec rev3.0 8.2.4.19.1 indicates CXL devices shall not advertise >>> + * more than 10 decoders. Switches and Host Bridges may advertise up to >>> + * 32 decoders. Set the decoders to 10 for devices if more than 10 are >>> + * found. > Stale comment, the "Set the decoders to 10" is no longer done. > >>> + */ >>> + if (is_cxl_endpoint(cxlhdm->port) && cxlhdm->decoder_count > 10) >>> + dev_warn(dev, "Endpoint decoder count (%d) > 10, spec violation!\n", >>> + cxlhdm->decoder_count); > This is still too scary for how the kernel is going to handle it. I > would expect: > > dev_dbg(dev, "Endpoint decoder count %d, expected max 10\n") > > ...what downside are you seeing that's prompting this patch? Sometimes > Linux might want to constrain what hardware can do to avoid code > maintenance burden, but I do not see that harm here. Otherwise, there > are users that treat any log message above KERN_NOTICE priority as a > sign of a broken system. Is a platform with an endpoint with 16 decoders > broken? Also, if we do this bounds check for endpoints why are we not > doing it for bridges for the > 32 case? This is purely the spec changed for 3.0. Should I add check for bridges as well and add debug emit for both or just drop it entirely?
Dave Jiang wrote: > > On 10/25/2022 10:38 AM, Dan Williams wrote: > > Jonathan Cameron wrote: > >> On Mon, 24 Oct 2022 08:43:30 -0700 > >> Dave Jiang <dave.jiang@intel.com> wrote: > >> > >>> CXL spec rev3.0 8.2.4.19.1 added definition for up to 32 decoders. It also > >>> indicates that for devices, only 10 decoders should be advertised. Add > >>> check on number of decoders greater than 10 for devices and emit warning. > >>> > >>> Signed-off-by: Dave Jiang <dave.jiang@intel.com> > >> I wonder... Should warning say CXL r3.0 spec violation? > >> > >> Seems possible this might grow in future versions and so it > >> might be nice to give a hint that it 'might' be valid in a higher spec version? > >> > >> I don't care strongly either way though. > >> > >> FWIW > >> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > >> > >>> --- > >>> > >>> v3: > >>> - Fix commit header and output message to reflect code change from v2. > >>> > >>> v2: > >>> - Remove decoder count reassignment from violation (Dan) > >>> > >>> drivers/cxl/core/hdm.c | 10 ++++++++++ > >>> 1 file changed, 10 insertions(+) > >>> > >>> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c > >>> index d1d2caea5c62..c3b756f93261 100644 > >>> --- a/drivers/cxl/core/hdm.c > >>> +++ b/drivers/cxl/core/hdm.c > >>> @@ -71,9 +71,19 @@ EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, CXL); > >>> static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm) > >>> { > >>> u32 hdm_cap; > >>> + struct device *dev = &cxlhdm->port->dev; > >>> > >>> hdm_cap = readl(cxlhdm->regs.hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET); > >>> cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap); > >>> + /* > >>> + * CXL spec rev3.0 8.2.4.19.1 indicates CXL devices shall not advertise > >>> + * more than 10 decoders. Switches and Host Bridges may advertise up to > >>> + * 32 decoders. Set the decoders to 10 for devices if more than 10 are > >>> + * found. > > Stale comment, the "Set the decoders to 10" is no longer done. > > > >>> + */ > >>> + if (is_cxl_endpoint(cxlhdm->port) && cxlhdm->decoder_count > 10) > >>> + dev_warn(dev, "Endpoint decoder count (%d) > 10, spec violation!\n", > >>> + cxlhdm->decoder_count); > > This is still too scary for how the kernel is going to handle it. I > > would expect: > > > > dev_dbg(dev, "Endpoint decoder count %d, expected max 10\n") > > > > ...what downside are you seeing that's prompting this patch? Sometimes > > Linux might want to constrain what hardware can do to avoid code > > maintenance burden, but I do not see that harm here. Otherwise, there > > are users that treat any log message above KERN_NOTICE priority as a > > sign of a broken system. Is a platform with an endpoint with 16 decoders > > broken? Also, if we do this bounds check for endpoints why are we not > > doing it for bridges for the > 32 case? > > This is purely the spec changed for 3.0. Should I add check for bridges > as well and add debug emit for both or just drop it entirely? I see your point about the decoders > 32 case since those are not even defined in the 3.0 specification, however if the driver had been strict about the 2.0 decoder count limits then 3.0 devices would start tripping a dev_warn() unnecessarily. So I am skeptical of being noisy about spec compliance in places that have no material affect on the behavior of the driver. If it turned out that the decoders above 10 were broken we could fix that with a quirk, but those decoders are otherwise avoidable via region creation policy. So I think just drop this, it does not add anything and ends up asking more questions than it answers.
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c index d1d2caea5c62..c3b756f93261 100644 --- a/drivers/cxl/core/hdm.c +++ b/drivers/cxl/core/hdm.c @@ -71,9 +71,19 @@ EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, CXL); static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm) { u32 hdm_cap; + struct device *dev = &cxlhdm->port->dev; hdm_cap = readl(cxlhdm->regs.hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET); cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap); + /* + * CXL spec rev3.0 8.2.4.19.1 indicates CXL devices shall not advertise + * more than 10 decoders. Switches and Host Bridges may advertise up to + * 32 decoders. Set the decoders to 10 for devices if more than 10 are + * found. + */ + if (is_cxl_endpoint(cxlhdm->port) && cxlhdm->decoder_count > 10) + dev_warn(dev, "Endpoint decoder count (%d) > 10, spec violation!\n", + cxlhdm->decoder_count); cxlhdm->target_count = FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap); if (FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap))
CXL spec rev3.0 8.2.4.19.1 added definition for up to 32 decoders. It also indicates that for devices, only 10 decoders should be advertised. Add check on number of decoders greater than 10 for devices and emit warning. Signed-off-by: Dave Jiang <dave.jiang@intel.com> --- v3: - Fix commit header and output message to reflect code change from v2. v2: - Remove decoder count reassignment from violation (Dan) drivers/cxl/core/hdm.c | 10 ++++++++++ 1 file changed, 10 insertions(+)