Message ID | cover.1719980933.git.alison.schofield@intel.com |
---|---|
Series | XOR Math Fixups: translation & position |
alison.schofield@ wrote:
> From: Alison Schofield <alison.schofield@intel.com>
>
> Dropped tags on Patch 2 due to changes. Please Tag again.

Short of obvious cases where the approach is unrecognizable from the
previous version, I would say always include tags with a "please holler
if you disagree" explanation as to why someone might want to withdraw
their previous tag. Email communication is already lossy enough without
constant revalidation.

Otherwise, if you do ask for re-tag then please explain why you think
the old tag is invalidated. I.e. I could have acked your explanation and
trusted that description here rather than go re-review patch 2.

>
> Changes in v4:
> - Patch 1: Updated commit msg/log
>   The name tidy-ups eventually led to a 'fold' not a 'rename'
> - Patch 2: Rename the root decoder callback hpa_to_spa (Dan)
> - Patch 2: Remove hpa_to_spa as a param to cxl_root_decoder_alloc()
> - Patch 2: Add code comment that chunk check is modulo only (Fabio)
> - Patch 2: Add lore link to unit test in commit log (Fabio)
> - Cover Letter: Add an introduction (Dan)
>
> Link to v3:
> https://lore.kernel.org/cover.1719275633.git.alison.schofield@intel.com/
>
>
> Begin cover letter:
>
> XOR Math Fixups are presented for both translation and position.
>
> Translation:
> The CXL driver intends to report DPAs and their SPA translation in

s/intends/has a responsibility/, right?

Don't fix this up with a re-post, but for anyone new to this patchset
they should know that RAS is one of the main reasons to have OS
awareness of CXL at all. If address translation is broken a fundamental
reason for the driver to even exist is broken. So "intends" undersells
how core this functionality is to even having a CXL subsystem in the
kernel.

> the TRACE logs for CXL poison, general_media, and dram events. It
> is actually only logging the HPA, not the SPA. That works for CXL
> decodes using typical MODULO arithmetic where HPA==SPA, but not for
> XOR decodes. The driver needs to restore the XOR'd bits in order to
> get to the SPA and it doesn't. This means that address translations
> for root decoders using XOR maps are wrong.
>
> Specifically regions that interleave across 2,4,6,8,12, or 16 host
> bridges are affected. Interleaves using 1 or 3 host bridges, even if
> configured with XOR Arithmetic, do not use xormaps, and are safe.
>
> Aside from knowing that any address translation of a 1 or 3 way host
> bridge interleave is correct no matter the decode (XOR or MODULO),
> all others are suspect because the decode is actually transparent to
> users.
>
> Position:
> The position part of this patchset came from the discovery that
> the driver doesn't need to calculate a target's position in a region
> interleave set. The BIOS sets the target list and the driver can
> simply use that order.

Thanks for this writeup and the insight about XOR vs 1 or 3 way
interleaves. May I ask that you incorporate this useful bit of prose as
a kdoc for cxl_dpa_to_hpa()? This can be a follow-on patch, no need to
respin this set again. Otherwise this gets lost to the sands of time
where only someone savvy enough to do the lore archive search will see
it.

From: Alison Schofield <alison.schofield@intel.com>

Dropped tags on Patch 2 due to changes. Please Tag again.

Changes in v4:
- Patch 1: Updated commit msg/log
  The name tidy-ups eventually led to a 'fold' not a 'rename'
- Patch 2: Rename the root decoder callback hpa_to_spa (Dan)
- Patch 2: Remove hpa_to_spa as a param to cxl_root_decoder_alloc()
- Patch 2: Add code comment that chunk check is modulo only (Fabio)
- Patch 2: Add lore link to unit test in commit log (Fabio)
- Cover Letter: Add an introduction (Dan)

Link to v3:
https://lore.kernel.org/cover.1719275633.git.alison.schofield@intel.com/


Begin cover letter:

XOR Math Fixups are presented for both translation and position.

Translation:
The CXL driver intends to report DPAs and their SPA translation in
the TRACE logs for CXL poison, general_media, and dram events. It
is actually only logging the HPA, not the SPA. That works for CXL
decodes using typical MODULO arithmetic where HPA==SPA, but not for
XOR decodes. The driver needs to restore the XOR'd bits in order to
get to the SPA and it doesn't. This means that address translations
for root decoders using XOR maps are wrong.

Specifically regions that interleave across 2,4,6,8,12, or 16 host
bridges are affected. Interleaves using 1 or 3 host bridges, even if
configured with XOR Arithmetic, do not use xormaps, and are safe.

Aside from knowing that any address translation of a 1 or 3 way host
bridge interleave is correct no matter the decode (XOR or MODULO),
all others are suspect because the decode is actually transparent to
users.

Position:
The position part of this patchset came from the discovery that
the driver doesn't need to calculate a target's position in a region
interleave set. The BIOS sets the target list and the driver can
simply use that order.

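To make the translation problem concrete, here is a minimal user-space
sketch of restoring an XOR'd position bit. It is not the code from
Patch 2. The struct xor_maps type and the sketch_hpa_to_spa() name are
hypothetical, and the sketch assumes that the position bit of each map
is the map's lowest set bit and that its value is the XOR of all HPA
bits selected by the map. It relies on the GCC/Clang builtins
__builtin_popcountll() and __builtin_ctzll().

```c
#include <stdint.h>

/* Hypothetical stand-in for platform XOR map data; not a kernel struct. */
struct xor_maps {
	int nr_maps;
	uint64_t maps[4];
};

/* XOR every bit of v together: 1 for an odd number of set bits, else 0. */
static uint64_t xor_all_bits(uint64_t v)
{
	return (uint64_t)(__builtin_popcountll(v) & 1);
}

/*
 * Sketch only: rewrite each map's position bit with the XOR of the HPA
 * bits the map selects. Assumes the position bit is the lowest set bit
 * of the map. 1 and 3 way host bridge interleaves use no xormaps, so
 * the HPA is returned unchanged for them.
 */
static uint64_t sketch_hpa_to_spa(const struct xor_maps *xm, int hb_ways,
				  uint64_t hpa)
{
	if (hb_ways == 1 || hb_ways == 3)
		return hpa;

	for (int i = 0; i < xm->nr_maps; i++) {
		uint64_t map = xm->maps[i];
		uint64_t val;
		int pos;

		if (!map)
			continue;

		pos = __builtin_ctzll(map);      /* lowest set bit of the map */
		val = xor_all_bits(hpa & map);   /* parity of the selected bits */
		hpa = (hpa & ~(1ULL << pos)) | (val << pos);
	}

	return hpa;
}
```

With a single map covering bits 8 and 13, an HPA that has only bit 13
set comes back with bit 8 set as well, which is the kind of difference
between HPA and SPA that a MODULO decode never produces.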
Presentation is as follows:

Patch 1: Clean up - cxl_trace_hpa() -> cxl_dpa_to_hpa()

Patch 2: cxl: Restore XOR'd position bits during address translation
This completes the DPA->HPA->SPA translation, correcting the XOR
address translation problem described above.

Patch 3 & Patch 4 are paired. Patch 3 presents the new method for
verifying a target position in the list and Patch 4 removes the old
method.

FYI - the reason I don't present the code removal first is because
I think it is easier to read the diff if I leave in the old root
decoder call back setup for calc_hb, insert the new call back along
the same path, and then rip out the defunct calc_hb. That's the way
I created the patchset and it may be an easier way for reviewers to
follow along with the root decoder callback setup.

Alison Schofield (4):
  cxl/core: Fold cxl_trace_hpa() into cxl_dpa_to_hpa()
  cxl: Restore XOR'd position bits during address translation
  cxl/region: Verify target positions using the ordered target list
  cxl: Remove defunct code calculating host bridge target positions

 drivers/cxl/acpi.c        | 84 ++++++++++++++++-----------------------
 drivers/cxl/core/core.h   |  8 ++--
 drivers/cxl/core/mbox.c   |  2 +-
 drivers/cxl/core/port.c   | 20 +---------
 drivers/cxl/core/region.c | 61 ++++++++++++++--------------
 drivers/cxl/core/trace.h  |  4 +-
 drivers/cxl/cxl.h         | 11 ++---
 7 files changed, 77 insertions(+), 113 deletions(-)


base-commit: 22a40d14b572deb80c0648557f4bd502d7e83826
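The idea behind the Patch 3 / Patch 4 pairing, checking a position
against the ordered target list instead of calculating it, can be
sketched roughly as below. This is an illustration, not the patch
itself. The struct hb_target type and the
position_matches_target_list() name are hypothetical, and the sketch
assumes that the expected host bridge for region position pos is the
entry at pos modulo the number of targets in the BIOS-provided list.

```c
#include <stddef.h>

/* Hypothetical stand-in for a host bridge target; not a kernel type. */
struct hb_target {
	int id;
};

/*
 * Sketch only: return 1 when 'hb' is the entry that the ordered target
 * list holds for region position 'pos', else 0. Assumes the list order
 * set by the BIOS is authoritative and that positions wrap modulo the
 * number of targets.
 */
static int position_matches_target_list(const struct hb_target *const *targets,
					int nr_targets, int pos,
					const struct hb_target *hb)
{
	if (!targets || nr_targets <= 0 || pos < 0)
		return 0;

	return targets[pos % nr_targets] == hb;
}
```

Nothing is computed from the decode arithmetic here. The check is a
lookup in the list the platform already supplied, which is why the
position calculation callback becomes removable.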