Message ID | 2-v3-9fef8cdc2ff6+150d1-smmuv3_tidy_jgg@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Tidy some minor things in the stream table/cd table area | expand |
On Tue, Aug 06, 2024 at 08:31:16PM -0300, Jason Gunthorpe wrote: > Don't open code the calculations of the indexes for each level, provide > two functions to do that math and call them in all the places. Update all > the places computing indexes. > > Calculate the L1 table size directly based on the max required index from > the cap. Remove STRTAB_L1_SZ_SHIFT in favour of STRTAB_NUM_L2_STES. > > Use STRTAB_NUM_L2_STES to replace remaining open coded 1 << STRTAB_SPLIT. > > Tested-by: Nicolin Chen <nicolinc@nvidia.com> > Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> > --- > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 51 +++++++++------------ > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 14 +++++- > 2 files changed, 35 insertions(+), 30 deletions(-) > > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c > index b80f3359a8d12b..a695e5f8fc2880 100644 > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c > @@ -1670,20 +1670,17 @@ static void arm_smmu_init_initial_stes(struct arm_smmu_ste *strtab, > > static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid) > { > - size_t size; > - void *strtab; > dma_addr_t l2ptr_dma; > struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg; > - struct arm_smmu_strtab_l1_desc *desc = &cfg->l1_desc[sid >> STRTAB_SPLIT]; > + struct arm_smmu_strtab_l1_desc *desc = > + &cfg->l1_desc[arm_smmu_strtab_l1_idx(sid)]; > > if (desc->l2ptr) > return 0; > > - size = 1 << (STRTAB_SPLIT + ilog2(STRTAB_STE_DWORDS) + 3); > - strtab = &cfg->strtab[(sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS]; > - > - desc->l2ptr = dmam_alloc_coherent(smmu->dev, size, &l2ptr_dma, > - GFP_KERNEL); > + desc->l2ptr = dmam_alloc_coherent( > + smmu->dev, STRTAB_NUM_L2_STES * sizeof(struct arm_smmu_ste), > + &l2ptr_dma, GFP_KERNEL); Since this series is mainly about clean-up, please can you be consistent with the indentation style that the driver currently uses for multi-line function invocations? I applied the whole series, but I struggled looking at the resulting code (of course, there's no "right" way, but the driver is written with whatever I'm used to so I'd really like to keep the consistency). Will
On Fri, Sep 06, 2024 at 02:18:18PM +0100, Will Deacon wrote: > On Tue, Aug 06, 2024 at 08:31:16PM -0300, Jason Gunthorpe wrote: > > Don't open code the calculations of the indexes for each level, provide > > two functions to do that math and call them in all the places. Update all > > the places computing indexes. > > > > Calculate the L1 table size directly based on the max required index from > > the cap. Remove STRTAB_L1_SZ_SHIFT in favour of STRTAB_NUM_L2_STES. > > > > Use STRTAB_NUM_L2_STES to replace remaining open coded 1 << STRTAB_SPLIT. > > > > Tested-by: Nicolin Chen <nicolinc@nvidia.com> > > Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> > > --- > > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 51 +++++++++------------ > > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 14 +++++- > > 2 files changed, 35 insertions(+), 30 deletions(-) > > > > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c > > index b80f3359a8d12b..a695e5f8fc2880 100644 > > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c > > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c > > @@ -1670,20 +1670,17 @@ static void arm_smmu_init_initial_stes(struct arm_smmu_ste *strtab, > > > > static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid) > > { > > - size_t size; > > - void *strtab; > > dma_addr_t l2ptr_dma; > > struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg; > > - struct arm_smmu_strtab_l1_desc *desc = &cfg->l1_desc[sid >> STRTAB_SPLIT]; > > + struct arm_smmu_strtab_l1_desc *desc = > > + &cfg->l1_desc[arm_smmu_strtab_l1_idx(sid)]; > > > > if (desc->l2ptr) > > return 0; > > > > - size = 1 << (STRTAB_SPLIT + ilog2(STRTAB_STE_DWORDS) + 3); > > - strtab = &cfg->strtab[(sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS]; > > - > > - desc->l2ptr = dmam_alloc_coherent(smmu->dev, size, &l2ptr_dma, > > - GFP_KERNEL); > > + desc->l2ptr = dmam_alloc_coherent( > > + smmu->dev, STRTAB_NUM_L2_STES * sizeof(struct arm_smmu_ste), > > + &l2ptr_dma, GFP_KERNEL); > > Since this series is mainly about clean-up, please can you be consistent > with the indentation style that the driver currently uses for multi-line > function invocations? Honestly, I don't know what that style is.. I try to stick with a common style from clang-format using the checked-in specs. I touch so many drivers and so many subsystems it is a big ask to memorize everyone's little preferences and adjust things. I feel there is no way I could do that reliably.. Still, if you have some specific things that really bother you then please let me know them? I can make an effort, and/or you can adjust them when applying? For instance I have already been mindful to try to keep the vertical alignment in some places. I'm guessing you are looking at aligning the argument wrap to the ( ? To explain - clang-format has a heuristic for this. If the code formats nicely it will align to the (. For instance this proposed change to fit into 80 cols: - cd_table->cdtab = dmam_alloc_coherent(smmu->dev, l1size, &cd_table->cdtab_dma, - GFP_KERNEL); + cd_table->cdtab = dmam_alloc_coherent(smmu->dev, l1size, + &cd_table->cdtab_dma, GFP_KERNEL); However, the hunk you clipped above doesn't fit nicely because the argument 'STRTAB_NUM_L2_STES * sizeof(struct arm_smmu_ste)' would push past the 80 col limit if aligned to the (. Rather than break an argument in the middle of an expression, or go past 80 cols, it switches to one tab and keeps the argument expression together on one line. So, which alternative is your preference? > I applied the whole series, It seems not? Thanks, Jason
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index b80f3359a8d12b..a695e5f8fc2880 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1670,20 +1670,17 @@ static void arm_smmu_init_initial_stes(struct arm_smmu_ste *strtab, static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid) { - size_t size; - void *strtab; dma_addr_t l2ptr_dma; struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg; - struct arm_smmu_strtab_l1_desc *desc = &cfg->l1_desc[sid >> STRTAB_SPLIT]; + struct arm_smmu_strtab_l1_desc *desc = + &cfg->l1_desc[arm_smmu_strtab_l1_idx(sid)]; if (desc->l2ptr) return 0; - size = 1 << (STRTAB_SPLIT + ilog2(STRTAB_STE_DWORDS) + 3); - strtab = &cfg->strtab[(sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS]; - - desc->l2ptr = dmam_alloc_coherent(smmu->dev, size, &l2ptr_dma, - GFP_KERNEL); + desc->l2ptr = dmam_alloc_coherent( + smmu->dev, STRTAB_NUM_L2_STES * sizeof(struct arm_smmu_ste), + &l2ptr_dma, GFP_KERNEL); if (!desc->l2ptr) { dev_err(smmu->dev, "failed to allocate l2 stream table for SID %u\n", @@ -1691,8 +1688,9 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid) return -ENOMEM; } - arm_smmu_init_initial_stes(desc->l2ptr, 1 << STRTAB_SPLIT); - arm_smmu_write_strtab_l1_desc(strtab, l2ptr_dma); + arm_smmu_init_initial_stes(desc->l2ptr, STRTAB_NUM_L2_STES); + arm_smmu_write_strtab_l1_desc(&cfg->strtab[arm_smmu_strtab_l1_idx(sid)], + l2ptr_dma); return 0; } @@ -2449,12 +2447,9 @@ arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid) struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg; if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) { - unsigned int idx1, idx2; - /* Two-level walk */ - idx1 = (sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS; - idx2 = sid & ((1 << STRTAB_SPLIT) - 1); - return &cfg->l1_desc[idx1].l2ptr[idx2]; + return &cfg->l1_desc[arm_smmu_strtab_l1_idx(sid)] + .l2ptr[arm_smmu_strtab_l2_idx(sid)]; } else { /* Simple linear lookup */ return (struct arm_smmu_ste *)&cfg @@ -3158,12 +3153,10 @@ struct arm_smmu_device *arm_smmu_get_by_fwnode(struct fwnode_handle *fwnode) static bool arm_smmu_sid_in_range(struct arm_smmu_device *smmu, u32 sid) { - unsigned long limit = smmu->strtab_cfg.num_l1_ents; - if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) - limit *= 1UL << STRTAB_SPLIT; - - return sid < limit; + return arm_smmu_strtab_l1_idx(sid) < + smmu->strtab_cfg.num_l1_ents; + return sid < smmu->strtab_cfg.num_l1_ents; } static int arm_smmu_init_sid_strtab(struct arm_smmu_device *smmu, u32 sid) @@ -3602,19 +3595,18 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu) { void *strtab; u64 reg; - u32 size, l1size; + u32 l1size; struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg; + unsigned int last_sid_idx = + arm_smmu_strtab_l1_idx((1 << smmu->sid_bits) - 1); /* Calculate the L1 size, capped to the SIDSIZE. */ - size = STRTAB_L1_SZ_SHIFT - (ilog2(STRTAB_L1_DESC_DWORDS) + 3); - size = min(size, smmu->sid_bits - STRTAB_SPLIT); - cfg->num_l1_ents = 1 << size; - - size += STRTAB_SPLIT; - if (size < smmu->sid_bits) + cfg->num_l1_ents = min(last_sid_idx + 1, STRTAB_MAX_L1_ENTRIES); + if (cfg->num_l1_ents <= last_sid_idx) dev_warn(smmu->dev, "2-level strtab only covers %u/%u bits of SID\n", - size, smmu->sid_bits); + ilog2(cfg->num_l1_ents * STRTAB_NUM_L2_STES), + smmu->sid_bits); l1size = cfg->num_l1_ents * (STRTAB_L1_DESC_DWORDS << 3); strtab = dmam_alloc_coherent(smmu->dev, l1size, &cfg->strtab_dma, @@ -3629,7 +3621,8 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu) /* Configure strtab_base_cfg for 2 levels */ reg = FIELD_PREP(STRTAB_BASE_CFG_FMT, STRTAB_BASE_CFG_FMT_2LVL); - reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE, size); + reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE, + ilog2(cfg->num_l1_ents) + STRTAB_SPLIT); reg |= FIELD_PREP(STRTAB_BASE_CFG_SPLIT, STRTAB_SPLIT); cfg->strtab_base_cfg = reg; diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 14bca41a981b43..45ec88c9aa355f 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -202,7 +202,6 @@ * 2lvl: 128k L1 entries, * 256 lazy entries per table (each table covers a PCI bus) */ -#define STRTAB_L1_SZ_SHIFT 20 #define STRTAB_SPLIT 8 #define STRTAB_L1_DESC_DWORDS 1 @@ -215,6 +214,19 @@ struct arm_smmu_ste { __le64 data[STRTAB_STE_DWORDS]; }; +#define STRTAB_NUM_L2_STES (1 << STRTAB_SPLIT) +#define STRTAB_MAX_L1_ENTRIES (1 << 17) + +static inline u32 arm_smmu_strtab_l1_idx(u32 sid) +{ + return sid / STRTAB_NUM_L2_STES; +} + +static inline u32 arm_smmu_strtab_l2_idx(u32 sid) +{ + return sid % STRTAB_NUM_L2_STES; +} + #define STRTAB_STE_0_V (1UL << 0) #define STRTAB_STE_0_CFG GENMASK_ULL(3, 1) #define STRTAB_STE_0_CFG_ABORT 0