Message ID | 20241114160721.1527934-4-quic_bibekkum@quicinc.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | iommu/arm-smmu: introduction of ACTLR implementation for Qualcomm SoCs | expand |
On Thu, Nov 14, 2024 at 8:10 AM Bibek Kumar Patro <quic_bibekkum@quicinc.com> wrote: > > Add an adreno-smmu-priv interface for drm/msm to call > into arm-smmu-qcom and initiate the PRR bit setup or reset > sequence as per request. > > This will be used by GPU to setup the PRR bit and related > configuration registers through adreno-smmu private > interface instead of directly poking the smmu hardware. > > Suggested-by: Rob Clark <robdclark@gmail.com> > Signed-off-by: Bibek Kumar Patro <quic_bibekkum@quicinc.com> > --- > drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 37 ++++++++++++++++++++++ > drivers/iommu/arm/arm-smmu/arm-smmu.h | 2 ++ > include/linux/adreno-smmu-priv.h | 14 ++++++++ > 3 files changed, 53 insertions(+) > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c > index d26f5aea248e..0e4f3fbda961 100644 > --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c > @@ -16,6 +16,8 @@ > > #define QCOM_DUMMY_VAL -1 > > +#define GFX_ACTLR_PRR (1 << 5) > + > static struct qcom_smmu *to_qcom_smmu(struct arm_smmu_device *smmu) > { > return container_of(smmu, struct qcom_smmu, smmu); > @@ -99,6 +101,32 @@ static void qcom_adreno_smmu_resume_translation(const void *cookie, bool termina > arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_RESUME, reg); > } > > +static void qcom_adreno_smmu_set_prr_bit(const void *cookie, bool set) > +{ > + struct arm_smmu_domain *smmu_domain = (void *)cookie; > + struct arm_smmu_device *smmu = smmu_domain->smmu; > + struct arm_smmu_cfg *cfg = &smmu_domain->cfg; > + u32 reg = 0; > + > + reg = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR); > + reg &= ~GFX_ACTLR_PRR; > + if (set) > + reg |= FIELD_PREP(GFX_ACTLR_PRR, 1); > + arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR, reg); > +} > + > +static void qcom_adreno_smmu_set_prr_addr(const void *cookie, phys_addr_t page_addr) > +{ > + struct arm_smmu_domain *smmu_domain = (void *)cookie; > + struct arm_smmu_device *smmu = smmu_domain->smmu; > + > + writel_relaxed(lower_32_bits(page_addr), > + smmu->base + ARM_SMMU_GFX_PRR_CFG_LADDR); > + > + writel_relaxed(upper_32_bits(page_addr), > + smmu->base + ARM_SMMU_GFX_PRR_CFG_UADDR); > +} > + > #define QCOM_ADRENO_SMMU_GPU_SID 0 > > static bool qcom_adreno_smmu_is_gpu_device(struct device *dev) > @@ -210,6 +238,7 @@ static bool qcom_adreno_can_do_ttbr1(struct arm_smmu_device *smmu) > static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain, > struct io_pgtable_cfg *pgtbl_cfg, struct device *dev) > { > + const struct device_node *np = smmu_domain->smmu->dev->of_node; > struct adreno_smmu_priv *priv; > > smmu_domain->cfg.flush_walk_prefer_tlbiasid = true; > @@ -239,6 +268,14 @@ static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain, > priv->get_fault_info = qcom_adreno_smmu_get_fault_info; > priv->set_stall = qcom_adreno_smmu_set_stall; > priv->resume_translation = qcom_adreno_smmu_resume_translation; > + priv->set_prr_bit = NULL; > + priv->set_prr_addr = NULL; > + > + if (of_device_is_compatible(np, "qcom,smmu-500") && > + of_device_is_compatible(np, "qcom,adreno-smmu")) { fwiw, it seems like PRR actually works on sc7180, which is _not_ mmu-500, so I guess the support actually goes back further. I'm curious if we can just rely on this being supported by any hw that has a6xx or newer? BR, -R > + priv->set_prr_bit = qcom_adreno_smmu_set_prr_bit; > + priv->set_prr_addr = qcom_adreno_smmu_set_prr_addr; > + } > > return 0; > } > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h > index e2aeb511ae90..2dbf3243b5ad 100644 > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h > @@ -154,6 +154,8 @@ enum arm_smmu_cbar_type { > #define ARM_SMMU_SCTLR_M BIT(0) > > #define ARM_SMMU_CB_ACTLR 0x4 > +#define ARM_SMMU_GFX_PRR_CFG_LADDR 0x6008 > +#define ARM_SMMU_GFX_PRR_CFG_UADDR 0x600C > > #define ARM_SMMU_CB_RESUME 0x8 > #define ARM_SMMU_RESUME_TERMINATE BIT(0) > diff --git a/include/linux/adreno-smmu-priv.h b/include/linux/adreno-smmu-priv.h > index c637e0997f6d..614665153b3e 100644 > --- a/include/linux/adreno-smmu-priv.h > +++ b/include/linux/adreno-smmu-priv.h > @@ -50,6 +50,18 @@ struct adreno_smmu_fault_info { > * the GPU driver must call resume_translation() > * @resume_translation: Resume translation after a fault > * > + * *CAUTION* : PRR callbacks (set_prr_bit/set_prr_addr) are NULL terminated for > + * targets without PRR support. Exercise caution and verify target > + * capabilities before invoking these callbacks to prevent potential > + * runtime errors or unexpected behavior. > + * > + * @set_prr_bit: Extendible interface to be used by GPU to modify the > + * ACTLR register bits, currently used to configure > + * Partially-Resident-Region (PRR) bit for feature's > + * setup and reset sequence as requested. > + * @set_prr_addr: Configure the PRR_CFG_*ADDR register with the > + * physical address of PRR page passed from > + * GPU driver. > * > * The GPU driver (drm/msm) and adreno-smmu work together for controlling > * the GPU's SMMU instance. This is by necessity, as the GPU is directly > @@ -67,6 +79,8 @@ struct adreno_smmu_priv { > void (*get_fault_info)(const void *cookie, struct adreno_smmu_fault_info *info); > void (*set_stall)(const void *cookie, bool enabled); > void (*resume_translation)(const void *cookie, bool terminate); > + void (*set_prr_bit)(const void *cookie, bool set); > + void (*set_prr_addr)(const void *cookie, phys_addr_t page_addr); > }; > > #endif /* __ADRENO_SMMU_PRIV_H */ > -- > 2.34.1 >
On Wed, Nov 20, 2024 at 9:17 AM Rob Clark <robdclark@gmail.com> wrote: > > On Thu, Nov 14, 2024 at 8:10 AM Bibek Kumar Patro > <quic_bibekkum@quicinc.com> wrote: > > > > Add an adreno-smmu-priv interface for drm/msm to call > > into arm-smmu-qcom and initiate the PRR bit setup or reset > > sequence as per request. > > > > This will be used by GPU to setup the PRR bit and related > > configuration registers through adreno-smmu private > > interface instead of directly poking the smmu hardware. > > > > Suggested-by: Rob Clark <robdclark@gmail.com> > > Signed-off-by: Bibek Kumar Patro <quic_bibekkum@quicinc.com> > > --- > > drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 37 ++++++++++++++++++++++ > > drivers/iommu/arm/arm-smmu/arm-smmu.h | 2 ++ > > include/linux/adreno-smmu-priv.h | 14 ++++++++ > > 3 files changed, 53 insertions(+) > > > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c > > index d26f5aea248e..0e4f3fbda961 100644 > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c > > @@ -16,6 +16,8 @@ > > > > #define QCOM_DUMMY_VAL -1 > > > > +#define GFX_ACTLR_PRR (1 << 5) > > + > > static struct qcom_smmu *to_qcom_smmu(struct arm_smmu_device *smmu) > > { > > return container_of(smmu, struct qcom_smmu, smmu); > > @@ -99,6 +101,32 @@ static void qcom_adreno_smmu_resume_translation(const void *cookie, bool termina > > arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_RESUME, reg); > > } > > > > +static void qcom_adreno_smmu_set_prr_bit(const void *cookie, bool set) > > +{ > > + struct arm_smmu_domain *smmu_domain = (void *)cookie; > > + struct arm_smmu_device *smmu = smmu_domain->smmu; > > + struct arm_smmu_cfg *cfg = &smmu_domain->cfg; > > + u32 reg = 0; > > + > > + reg = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR); > > + reg &= ~GFX_ACTLR_PRR; > > + if (set) > > + reg |= FIELD_PREP(GFX_ACTLR_PRR, 1); > > + arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR, reg); > > +} > > + > > +static void qcom_adreno_smmu_set_prr_addr(const void *cookie, phys_addr_t page_addr) > > +{ > > + struct arm_smmu_domain *smmu_domain = (void *)cookie; > > + struct arm_smmu_device *smmu = smmu_domain->smmu; > > + > > + writel_relaxed(lower_32_bits(page_addr), > > + smmu->base + ARM_SMMU_GFX_PRR_CFG_LADDR); > > + > > + writel_relaxed(upper_32_bits(page_addr), > > + smmu->base + ARM_SMMU_GFX_PRR_CFG_UADDR); > > +} > > + > > #define QCOM_ADRENO_SMMU_GPU_SID 0 > > > > static bool qcom_adreno_smmu_is_gpu_device(struct device *dev) > > @@ -210,6 +238,7 @@ static bool qcom_adreno_can_do_ttbr1(struct arm_smmu_device *smmu) > > static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain, > > struct io_pgtable_cfg *pgtbl_cfg, struct device *dev) > > { > > + const struct device_node *np = smmu_domain->smmu->dev->of_node; > > struct adreno_smmu_priv *priv; > > > > smmu_domain->cfg.flush_walk_prefer_tlbiasid = true; > > @@ -239,6 +268,14 @@ static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain, > > priv->get_fault_info = qcom_adreno_smmu_get_fault_info; > > priv->set_stall = qcom_adreno_smmu_set_stall; > > priv->resume_translation = qcom_adreno_smmu_resume_translation; > > + priv->set_prr_bit = NULL; > > + priv->set_prr_addr = NULL; > > + > > + if (of_device_is_compatible(np, "qcom,smmu-500") && > > + of_device_is_compatible(np, "qcom,adreno-smmu")) { > > fwiw, it seems like PRR actually works on sc7180, which is _not_ > mmu-500, so I guess the support actually goes back further. > > I'm curious if we can just rely on this being supported by any hw that > has a6xx or newer? Also, unrelated, but we can't assume the smmu is powered when drm driver calls set_prr_bit/addr, could you add in runpm get/put around the register access? Otherwise Conner and I have vk sparse residency working now BR, -R > > > + priv->set_prr_bit = qcom_adreno_smmu_set_prr_bit; > > + priv->set_prr_addr = qcom_adreno_smmu_set_prr_addr; > > + } > > > > return 0; > > } > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h > > index e2aeb511ae90..2dbf3243b5ad 100644 > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h > > @@ -154,6 +154,8 @@ enum arm_smmu_cbar_type { > > #define ARM_SMMU_SCTLR_M BIT(0) > > > > #define ARM_SMMU_CB_ACTLR 0x4 > > +#define ARM_SMMU_GFX_PRR_CFG_LADDR 0x6008 > > +#define ARM_SMMU_GFX_PRR_CFG_UADDR 0x600C > > > > #define ARM_SMMU_CB_RESUME 0x8 > > #define ARM_SMMU_RESUME_TERMINATE BIT(0) > > diff --git a/include/linux/adreno-smmu-priv.h b/include/linux/adreno-smmu-priv.h > > index c637e0997f6d..614665153b3e 100644 > > --- a/include/linux/adreno-smmu-priv.h > > +++ b/include/linux/adreno-smmu-priv.h > > @@ -50,6 +50,18 @@ struct adreno_smmu_fault_info { > > * the GPU driver must call resume_translation() > > * @resume_translation: Resume translation after a fault > > * > > + * *CAUTION* : PRR callbacks (set_prr_bit/set_prr_addr) are NULL terminated for > > + * targets without PRR support. Exercise caution and verify target > > + * capabilities before invoking these callbacks to prevent potential > > + * runtime errors or unexpected behavior. > > + * > > + * @set_prr_bit: Extendible interface to be used by GPU to modify the > > + * ACTLR register bits, currently used to configure > > + * Partially-Resident-Region (PRR) bit for feature's > > + * setup and reset sequence as requested. > > + * @set_prr_addr: Configure the PRR_CFG_*ADDR register with the > > + * physical address of PRR page passed from > > + * GPU driver. > > * > > * The GPU driver (drm/msm) and adreno-smmu work together for controlling > > * the GPU's SMMU instance. This is by necessity, as the GPU is directly > > @@ -67,6 +79,8 @@ struct adreno_smmu_priv { > > void (*get_fault_info)(const void *cookie, struct adreno_smmu_fault_info *info); > > void (*set_stall)(const void *cookie, bool enabled); > > void (*resume_translation)(const void *cookie, bool terminate); > > + void (*set_prr_bit)(const void *cookie, bool set); > > + void (*set_prr_addr)(const void *cookie, phys_addr_t page_addr); > > }; > > > > #endif /* __ADRENO_SMMU_PRIV_H */ > > -- > > 2.34.1 > >
On 11/20/2024 10:47 PM, Rob Clark wrote: > On Thu, Nov 14, 2024 at 8:10 AM Bibek Kumar Patro > <quic_bibekkum@quicinc.com> wrote: >> >> Add an adreno-smmu-priv interface for drm/msm to call >> into arm-smmu-qcom and initiate the PRR bit setup or reset >> sequence as per request. >> >> This will be used by GPU to setup the PRR bit and related >> configuration registers through adreno-smmu private >> interface instead of directly poking the smmu hardware. >> >> Suggested-by: Rob Clark <robdclark@gmail.com> >> Signed-off-by: Bibek Kumar Patro <quic_bibekkum@quicinc.com> >> --- >> drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 37 ++++++++++++++++++++++ >> drivers/iommu/arm/arm-smmu/arm-smmu.h | 2 ++ >> include/linux/adreno-smmu-priv.h | 14 ++++++++ >> 3 files changed, 53 insertions(+) >> >> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c >> index d26f5aea248e..0e4f3fbda961 100644 >> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c >> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c >> @@ -16,6 +16,8 @@ >> >> #define QCOM_DUMMY_VAL -1 >> >> +#define GFX_ACTLR_PRR (1 << 5) >> + >> static struct qcom_smmu *to_qcom_smmu(struct arm_smmu_device *smmu) >> { >> return container_of(smmu, struct qcom_smmu, smmu); >> @@ -99,6 +101,32 @@ static void qcom_adreno_smmu_resume_translation(const void *cookie, bool termina >> arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_RESUME, reg); >> } >> >> +static void qcom_adreno_smmu_set_prr_bit(const void *cookie, bool set) >> +{ >> + struct arm_smmu_domain *smmu_domain = (void *)cookie; >> + struct arm_smmu_device *smmu = smmu_domain->smmu; >> + struct arm_smmu_cfg *cfg = &smmu_domain->cfg; >> + u32 reg = 0; >> + >> + reg = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR); >> + reg &= ~GFX_ACTLR_PRR; >> + if (set) >> + reg |= FIELD_PREP(GFX_ACTLR_PRR, 1); >> + arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR, reg); >> +} >> + >> +static void qcom_adreno_smmu_set_prr_addr(const void *cookie, phys_addr_t page_addr) >> +{ >> + struct arm_smmu_domain *smmu_domain = (void *)cookie; >> + struct arm_smmu_device *smmu = smmu_domain->smmu; >> + >> + writel_relaxed(lower_32_bits(page_addr), >> + smmu->base + ARM_SMMU_GFX_PRR_CFG_LADDR); >> + >> + writel_relaxed(upper_32_bits(page_addr), >> + smmu->base + ARM_SMMU_GFX_PRR_CFG_UADDR); >> +} >> + >> #define QCOM_ADRENO_SMMU_GPU_SID 0 >> >> static bool qcom_adreno_smmu_is_gpu_device(struct device *dev) >> @@ -210,6 +238,7 @@ static bool qcom_adreno_can_do_ttbr1(struct arm_smmu_device *smmu) >> static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain, >> struct io_pgtable_cfg *pgtbl_cfg, struct device *dev) >> { >> + const struct device_node *np = smmu_domain->smmu->dev->of_node; >> struct adreno_smmu_priv *priv; >> >> smmu_domain->cfg.flush_walk_prefer_tlbiasid = true; >> @@ -239,6 +268,14 @@ static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain, >> priv->get_fault_info = qcom_adreno_smmu_get_fault_info; >> priv->set_stall = qcom_adreno_smmu_set_stall; >> priv->resume_translation = qcom_adreno_smmu_resume_translation; >> + priv->set_prr_bit = NULL; >> + priv->set_prr_addr = NULL; >> + >> + if (of_device_is_compatible(np, "qcom,smmu-500") && >> + of_device_is_compatible(np, "qcom,adreno-smmu")) { > > fwiw, it seems like PRR actually works on sc7180, which is _not_ > mmu-500, so I guess the support actually goes back further. > As I checked sc7180 was on previous variant of SMMU, so targets on this variant have different steps to set PRR bit. <Do not have both prr bit and PRR page registers>. It's MMU-500 targets only where PRR support is with both PRR bit and PRR page addr registers. As I was re-visiting our discussions on v13 regarding this - I remember that's why we started using the SMMU- compatible string based PRR procedure selection instead of the reserved- memory node. [1] i.e Based on SMMU variant (as selected by compatible string), specific sequence will be selected. So for now only MMU-500 based selection has been supported as part of this series and will add subsequent support for other SMMU-variants thereafter. > I'm curious if we can just rely on this being supported by any hw that > has a6xx or newer? > I'd need to check on targets which will be based on a6xx onwards, on what will be the procedure planned to support PRR feature. I'll update the information over this space. [1]: https://lore.kernel.org/all/5790afa3-f9c0-4720-9804-8a7ff3d91854@quicinc.com/#:~:text=%3E%20I%20guess%20if,part%20as%20well. Thanks & regards, Bibek > BR, > -R > >> + priv->set_prr_bit = qcom_adreno_smmu_set_prr_bit; >> + priv->set_prr_addr = qcom_adreno_smmu_set_prr_addr; >> + } >> >> return 0; >> } >> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h >> index e2aeb511ae90..2dbf3243b5ad 100644 >> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h >> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h >> @@ -154,6 +154,8 @@ enum arm_smmu_cbar_type { >> #define ARM_SMMU_SCTLR_M BIT(0) >> >> #define ARM_SMMU_CB_ACTLR 0x4 >> +#define ARM_SMMU_GFX_PRR_CFG_LADDR 0x6008 >> +#define ARM_SMMU_GFX_PRR_CFG_UADDR 0x600C >> >> #define ARM_SMMU_CB_RESUME 0x8 >> #define ARM_SMMU_RESUME_TERMINATE BIT(0) >> diff --git a/include/linux/adreno-smmu-priv.h b/include/linux/adreno-smmu-priv.h >> index c637e0997f6d..614665153b3e 100644 >> --- a/include/linux/adreno-smmu-priv.h >> +++ b/include/linux/adreno-smmu-priv.h >> @@ -50,6 +50,18 @@ struct adreno_smmu_fault_info { >> * the GPU driver must call resume_translation() >> * @resume_translation: Resume translation after a fault >> * >> + * *CAUTION* : PRR callbacks (set_prr_bit/set_prr_addr) are NULL terminated for >> + * targets without PRR support. Exercise caution and verify target >> + * capabilities before invoking these callbacks to prevent potential >> + * runtime errors or unexpected behavior. >> + * >> + * @set_prr_bit: Extendible interface to be used by GPU to modify the >> + * ACTLR register bits, currently used to configure >> + * Partially-Resident-Region (PRR) bit for feature's >> + * setup and reset sequence as requested. >> + * @set_prr_addr: Configure the PRR_CFG_*ADDR register with the >> + * physical address of PRR page passed from >> + * GPU driver. >> * >> * The GPU driver (drm/msm) and adreno-smmu work together for controlling >> * the GPU's SMMU instance. This is by necessity, as the GPU is directly >> @@ -67,6 +79,8 @@ struct adreno_smmu_priv { >> void (*get_fault_info)(const void *cookie, struct adreno_smmu_fault_info *info); >> void (*set_stall)(const void *cookie, bool enabled); >> void (*resume_translation)(const void *cookie, bool terminate); >> + void (*set_prr_bit)(const void *cookie, bool set); >> + void (*set_prr_addr)(const void *cookie, phys_addr_t page_addr); >> }; >> >> #endif /* __ADRENO_SMMU_PRIV_H */ >> -- >> 2.34.1 >>
On 11/21/2024 3:40 AM, Rob Clark wrote: > On Wed, Nov 20, 2024 at 9:17 AM Rob Clark <robdclark@gmail.com> wrote: >> >> On Thu, Nov 14, 2024 at 8:10 AM Bibek Kumar Patro >> <quic_bibekkum@quicinc.com> wrote: >>> >>> Add an adreno-smmu-priv interface for drm/msm to call >>> into arm-smmu-qcom and initiate the PRR bit setup or reset >>> sequence as per request. >>> >>> This will be used by GPU to setup the PRR bit and related >>> configuration registers through adreno-smmu private >>> interface instead of directly poking the smmu hardware. >>> >>> Suggested-by: Rob Clark <robdclark@gmail.com> >>> Signed-off-by: Bibek Kumar Patro <quic_bibekkum@quicinc.com> >>> --- >>> drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 37 ++++++++++++++++++++++ >>> drivers/iommu/arm/arm-smmu/arm-smmu.h | 2 ++ >>> include/linux/adreno-smmu-priv.h | 14 ++++++++ >>> 3 files changed, 53 insertions(+) >>> >>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c >>> index d26f5aea248e..0e4f3fbda961 100644 >>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c >>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c >>> @@ -16,6 +16,8 @@ >>> >>> #define QCOM_DUMMY_VAL -1 >>> >>> +#define GFX_ACTLR_PRR (1 << 5) >>> + >>> static struct qcom_smmu *to_qcom_smmu(struct arm_smmu_device *smmu) >>> { >>> return container_of(smmu, struct qcom_smmu, smmu); >>> @@ -99,6 +101,32 @@ static void qcom_adreno_smmu_resume_translation(const void *cookie, bool termina >>> arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_RESUME, reg); >>> } >>> >>> +static void qcom_adreno_smmu_set_prr_bit(const void *cookie, bool set) >>> +{ >>> + struct arm_smmu_domain *smmu_domain = (void *)cookie; >>> + struct arm_smmu_device *smmu = smmu_domain->smmu; >>> + struct arm_smmu_cfg *cfg = &smmu_domain->cfg; >>> + u32 reg = 0; >>> + >>> + reg = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR); >>> + reg &= ~GFX_ACTLR_PRR; >>> + if (set) >>> + reg |= FIELD_PREP(GFX_ACTLR_PRR, 1); >>> + arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR, reg); >>> +} >>> + >>> +static void qcom_adreno_smmu_set_prr_addr(const void *cookie, phys_addr_t page_addr) >>> +{ >>> + struct arm_smmu_domain *smmu_domain = (void *)cookie; >>> + struct arm_smmu_device *smmu = smmu_domain->smmu; >>> + >>> + writel_relaxed(lower_32_bits(page_addr), >>> + smmu->base + ARM_SMMU_GFX_PRR_CFG_LADDR); >>> + >>> + writel_relaxed(upper_32_bits(page_addr), >>> + smmu->base + ARM_SMMU_GFX_PRR_CFG_UADDR); >>> +} >>> + >>> #define QCOM_ADRENO_SMMU_GPU_SID 0 >>> >>> static bool qcom_adreno_smmu_is_gpu_device(struct device *dev) >>> @@ -210,6 +238,7 @@ static bool qcom_adreno_can_do_ttbr1(struct arm_smmu_device *smmu) >>> static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain, >>> struct io_pgtable_cfg *pgtbl_cfg, struct device *dev) >>> { >>> + const struct device_node *np = smmu_domain->smmu->dev->of_node; >>> struct adreno_smmu_priv *priv; >>> >>> smmu_domain->cfg.flush_walk_prefer_tlbiasid = true; >>> @@ -239,6 +268,14 @@ static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain, >>> priv->get_fault_info = qcom_adreno_smmu_get_fault_info; >>> priv->set_stall = qcom_adreno_smmu_set_stall; >>> priv->resume_translation = qcom_adreno_smmu_resume_translation; >>> + priv->set_prr_bit = NULL; >>> + priv->set_prr_addr = NULL; >>> + >>> + if (of_device_is_compatible(np, "qcom,smmu-500") && >>> + of_device_is_compatible(np, "qcom,adreno-smmu")) { >> >> fwiw, it seems like PRR actually works on sc7180, which is _not_ >> mmu-500, so I guess the support actually goes back further. >> >> I'm curious if we can just rely on this being supported by any hw that >> has a6xx or newer? > > > Also, unrelated, but we can't assume the smmu is powered when drm > driver calls set_prr_bit/addr, could you add in runpm get/put around > the register access? > I see, thanks for this observation. It's surely a possible case, when they access these registers SMMU state is off. I will add the suggested runpm ops around the register access. > Otherwise Conner and I have vk sparse residency working now > Sorry, could not get this. Did you mean that vk sparse residency is working now using this patch? Thanks & regards, Bibek > BR, > -R > >> >>> + priv->set_prr_bit = qcom_adreno_smmu_set_prr_bit; >>> + priv->set_prr_addr = qcom_adreno_smmu_set_prr_addr; >>> + } >>> >>> return 0; >>> } >>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h >>> index e2aeb511ae90..2dbf3243b5ad 100644 >>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h >>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h >>> @@ -154,6 +154,8 @@ enum arm_smmu_cbar_type { >>> #define ARM_SMMU_SCTLR_M BIT(0) >>> >>> #define ARM_SMMU_CB_ACTLR 0x4 >>> +#define ARM_SMMU_GFX_PRR_CFG_LADDR 0x6008 >>> +#define ARM_SMMU_GFX_PRR_CFG_UADDR 0x600C >>> >>> #define ARM_SMMU_CB_RESUME 0x8 >>> #define ARM_SMMU_RESUME_TERMINATE BIT(0) >>> diff --git a/include/linux/adreno-smmu-priv.h b/include/linux/adreno-smmu-priv.h >>> index c637e0997f6d..614665153b3e 100644 >>> --- a/include/linux/adreno-smmu-priv.h >>> +++ b/include/linux/adreno-smmu-priv.h >>> @@ -50,6 +50,18 @@ struct adreno_smmu_fault_info { >>> * the GPU driver must call resume_translation() >>> * @resume_translation: Resume translation after a fault >>> * >>> + * *CAUTION* : PRR callbacks (set_prr_bit/set_prr_addr) are NULL terminated for >>> + * targets without PRR support. Exercise caution and verify target >>> + * capabilities before invoking these callbacks to prevent potential >>> + * runtime errors or unexpected behavior. >>> + * >>> + * @set_prr_bit: Extendible interface to be used by GPU to modify the >>> + * ACTLR register bits, currently used to configure >>> + * Partially-Resident-Region (PRR) bit for feature's >>> + * setup and reset sequence as requested. >>> + * @set_prr_addr: Configure the PRR_CFG_*ADDR register with the >>> + * physical address of PRR page passed from >>> + * GPU driver. >>> * >>> * The GPU driver (drm/msm) and adreno-smmu work together for controlling >>> * the GPU's SMMU instance. This is by necessity, as the GPU is directly >>> @@ -67,6 +79,8 @@ struct adreno_smmu_priv { >>> void (*get_fault_info)(const void *cookie, struct adreno_smmu_fault_info *info); >>> void (*set_stall)(const void *cookie, bool enabled); >>> void (*resume_translation)(const void *cookie, bool terminate); >>> + void (*set_prr_bit)(const void *cookie, bool set); >>> + void (*set_prr_addr)(const void *cookie, phys_addr_t page_addr); >>> }; >>> >>> #endif /* __ADRENO_SMMU_PRIV_H */ >>> -- >>> 2.34.1 >>>
On Fri, Nov 22, 2024 at 8:20 AM Bibek Kumar Patro <quic_bibekkum@quicinc.com> wrote: > > > > On 11/21/2024 3:40 AM, Rob Clark wrote: > > On Wed, Nov 20, 2024 at 9:17 AM Rob Clark <robdclark@gmail.com> wrote: > >> > >> On Thu, Nov 14, 2024 at 8:10 AM Bibek Kumar Patro > >> <quic_bibekkum@quicinc.com> wrote: > >>> > >>> Add an adreno-smmu-priv interface for drm/msm to call > >>> into arm-smmu-qcom and initiate the PRR bit setup or reset > >>> sequence as per request. > >>> > >>> This will be used by GPU to setup the PRR bit and related > >>> configuration registers through adreno-smmu private > >>> interface instead of directly poking the smmu hardware. > >>> > >>> Suggested-by: Rob Clark <robdclark@gmail.com> > >>> Signed-off-by: Bibek Kumar Patro <quic_bibekkum@quicinc.com> > >>> --- > >>> drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 37 ++++++++++++++++++++++ > >>> drivers/iommu/arm/arm-smmu/arm-smmu.h | 2 ++ > >>> include/linux/adreno-smmu-priv.h | 14 ++++++++ > >>> 3 files changed, 53 insertions(+) > >>> > >>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c > >>> index d26f5aea248e..0e4f3fbda961 100644 > >>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c > >>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c > >>> @@ -16,6 +16,8 @@ > >>> > >>> #define QCOM_DUMMY_VAL -1 > >>> > >>> +#define GFX_ACTLR_PRR (1 << 5) > >>> + > >>> static struct qcom_smmu *to_qcom_smmu(struct arm_smmu_device *smmu) > >>> { > >>> return container_of(smmu, struct qcom_smmu, smmu); > >>> @@ -99,6 +101,32 @@ static void qcom_adreno_smmu_resume_translation(const void *cookie, bool termina > >>> arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_RESUME, reg); > >>> } > >>> > >>> +static void qcom_adreno_smmu_set_prr_bit(const void *cookie, bool set) > >>> +{ > >>> + struct arm_smmu_domain *smmu_domain = (void *)cookie; > >>> + struct arm_smmu_device *smmu = smmu_domain->smmu; > >>> + struct arm_smmu_cfg *cfg = &smmu_domain->cfg; > >>> + u32 reg = 0; > >>> + > >>> + reg = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR); > >>> + reg &= ~GFX_ACTLR_PRR; > >>> + if (set) > >>> + reg |= FIELD_PREP(GFX_ACTLR_PRR, 1); > >>> + arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR, reg); > >>> +} > >>> + > >>> +static void qcom_adreno_smmu_set_prr_addr(const void *cookie, phys_addr_t page_addr) > >>> +{ > >>> + struct arm_smmu_domain *smmu_domain = (void *)cookie; > >>> + struct arm_smmu_device *smmu = smmu_domain->smmu; > >>> + > >>> + writel_relaxed(lower_32_bits(page_addr), > >>> + smmu->base + ARM_SMMU_GFX_PRR_CFG_LADDR); > >>> + > >>> + writel_relaxed(upper_32_bits(page_addr), > >>> + smmu->base + ARM_SMMU_GFX_PRR_CFG_UADDR); > >>> +} > >>> + > >>> #define QCOM_ADRENO_SMMU_GPU_SID 0 > >>> > >>> static bool qcom_adreno_smmu_is_gpu_device(struct device *dev) > >>> @@ -210,6 +238,7 @@ static bool qcom_adreno_can_do_ttbr1(struct arm_smmu_device *smmu) > >>> static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain, > >>> struct io_pgtable_cfg *pgtbl_cfg, struct device *dev) > >>> { > >>> + const struct device_node *np = smmu_domain->smmu->dev->of_node; > >>> struct adreno_smmu_priv *priv; > >>> > >>> smmu_domain->cfg.flush_walk_prefer_tlbiasid = true; > >>> @@ -239,6 +268,14 @@ static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain, > >>> priv->get_fault_info = qcom_adreno_smmu_get_fault_info; > >>> priv->set_stall = qcom_adreno_smmu_set_stall; > >>> priv->resume_translation = qcom_adreno_smmu_resume_translation; > >>> + priv->set_prr_bit = NULL; > >>> + priv->set_prr_addr = NULL; > >>> + > >>> + if (of_device_is_compatible(np, "qcom,smmu-500") && > >>> + of_device_is_compatible(np, "qcom,adreno-smmu")) { > >> > >> fwiw, it seems like PRR actually works on sc7180, which is _not_ > >> mmu-500, so I guess the support actually goes back further. > >> > >> I'm curious if we can just rely on this being supported by any hw that > >> has a6xx or newer? > > > > > > Also, unrelated, but we can't assume the smmu is powered when drm > > driver calls set_prr_bit/addr, could you add in runpm get/put around > > the register access? > > > > I see, thanks for this observation. > It's surely a possible case, when they access these registers > SMMU state is off. > I will add the suggested runpm ops around the register access. > > > Otherwise Conner and I have vk sparse residency working now > > > > Sorry, could not get this. Did you mean that vk sparse residency > is working now using this patch? Yes, vk sparse residency is working with this patch (plus addition of runpm get/put, plus drm/msm patches plus turnip patches) ;-) BR, -R > Thanks & regards, > Bibek > > > BR, > > -R > > > >> > >>> + priv->set_prr_bit = qcom_adreno_smmu_set_prr_bit; > >>> + priv->set_prr_addr = qcom_adreno_smmu_set_prr_addr; > >>> + } > >>> > >>> return 0; > >>> } > >>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h > >>> index e2aeb511ae90..2dbf3243b5ad 100644 > >>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h > >>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h > >>> @@ -154,6 +154,8 @@ enum arm_smmu_cbar_type { > >>> #define ARM_SMMU_SCTLR_M BIT(0) > >>> > >>> #define ARM_SMMU_CB_ACTLR 0x4 > >>> +#define ARM_SMMU_GFX_PRR_CFG_LADDR 0x6008 > >>> +#define ARM_SMMU_GFX_PRR_CFG_UADDR 0x600C > >>> > >>> #define ARM_SMMU_CB_RESUME 0x8 > >>> #define ARM_SMMU_RESUME_TERMINATE BIT(0) > >>> diff --git a/include/linux/adreno-smmu-priv.h b/include/linux/adreno-smmu-priv.h > >>> index c637e0997f6d..614665153b3e 100644 > >>> --- a/include/linux/adreno-smmu-priv.h > >>> +++ b/include/linux/adreno-smmu-priv.h > >>> @@ -50,6 +50,18 @@ struct adreno_smmu_fault_info { > >>> * the GPU driver must call resume_translation() > >>> * @resume_translation: Resume translation after a fault > >>> * > >>> + * *CAUTION* : PRR callbacks (set_prr_bit/set_prr_addr) are NULL terminated for > >>> + * targets without PRR support. Exercise caution and verify target > >>> + * capabilities before invoking these callbacks to prevent potential > >>> + * runtime errors or unexpected behavior. > >>> + * > >>> + * @set_prr_bit: Extendible interface to be used by GPU to modify the > >>> + * ACTLR register bits, currently used to configure > >>> + * Partially-Resident-Region (PRR) bit for feature's > >>> + * setup and reset sequence as requested. > >>> + * @set_prr_addr: Configure the PRR_CFG_*ADDR register with the > >>> + * physical address of PRR page passed from > >>> + * GPU driver. > >>> * > >>> * The GPU driver (drm/msm) and adreno-smmu work together for controlling > >>> * the GPU's SMMU instance. This is by necessity, as the GPU is directly > >>> @@ -67,6 +79,8 @@ struct adreno_smmu_priv { > >>> void (*get_fault_info)(const void *cookie, struct adreno_smmu_fault_info *info); > >>> void (*set_stall)(const void *cookie, bool enabled); > >>> void (*resume_translation)(const void *cookie, bool terminate); > >>> + void (*set_prr_bit)(const void *cookie, bool set); > >>> + void (*set_prr_addr)(const void *cookie, phys_addr_t page_addr); > >>> }; > >>> > >>> #endif /* __ADRENO_SMMU_PRIV_H */ > >>> -- > >>> 2.34.1 > >>> >
On Fri, Nov 22, 2024 at 8:19 AM Bibek Kumar Patro <quic_bibekkum@quicinc.com> wrote: > > > > On 11/20/2024 10:47 PM, Rob Clark wrote: > > On Thu, Nov 14, 2024 at 8:10 AM Bibek Kumar Patro > > <quic_bibekkum@quicinc.com> wrote: > >> > >> Add an adreno-smmu-priv interface for drm/msm to call > >> into arm-smmu-qcom and initiate the PRR bit setup or reset > >> sequence as per request. > >> > >> This will be used by GPU to setup the PRR bit and related > >> configuration registers through adreno-smmu private > >> interface instead of directly poking the smmu hardware. > >> > >> Suggested-by: Rob Clark <robdclark@gmail.com> > >> Signed-off-by: Bibek Kumar Patro <quic_bibekkum@quicinc.com> > >> --- > >> drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 37 ++++++++++++++++++++++ > >> drivers/iommu/arm/arm-smmu/arm-smmu.h | 2 ++ > >> include/linux/adreno-smmu-priv.h | 14 ++++++++ > >> 3 files changed, 53 insertions(+) > >> > >> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c > >> index d26f5aea248e..0e4f3fbda961 100644 > >> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c > >> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c > >> @@ -16,6 +16,8 @@ > >> > >> #define QCOM_DUMMY_VAL -1 > >> > >> +#define GFX_ACTLR_PRR (1 << 5) > >> + > >> static struct qcom_smmu *to_qcom_smmu(struct arm_smmu_device *smmu) > >> { > >> return container_of(smmu, struct qcom_smmu, smmu); > >> @@ -99,6 +101,32 @@ static void qcom_adreno_smmu_resume_translation(const void *cookie, bool termina > >> arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_RESUME, reg); > >> } > >> > >> +static void qcom_adreno_smmu_set_prr_bit(const void *cookie, bool set) > >> +{ > >> + struct arm_smmu_domain *smmu_domain = (void *)cookie; > >> + struct arm_smmu_device *smmu = smmu_domain->smmu; > >> + struct arm_smmu_cfg *cfg = &smmu_domain->cfg; > >> + u32 reg = 0; > >> + > >> + reg = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR); > >> + reg &= ~GFX_ACTLR_PRR; > >> + if (set) > >> + reg |= FIELD_PREP(GFX_ACTLR_PRR, 1); > >> + arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR, reg); > >> +} > >> + > >> +static void qcom_adreno_smmu_set_prr_addr(const void *cookie, phys_addr_t page_addr) > >> +{ > >> + struct arm_smmu_domain *smmu_domain = (void *)cookie; > >> + struct arm_smmu_device *smmu = smmu_domain->smmu; > >> + > >> + writel_relaxed(lower_32_bits(page_addr), > >> + smmu->base + ARM_SMMU_GFX_PRR_CFG_LADDR); > >> + > >> + writel_relaxed(upper_32_bits(page_addr), > >> + smmu->base + ARM_SMMU_GFX_PRR_CFG_UADDR); > >> +} > >> + > >> #define QCOM_ADRENO_SMMU_GPU_SID 0 > >> > >> static bool qcom_adreno_smmu_is_gpu_device(struct device *dev) > >> @@ -210,6 +238,7 @@ static bool qcom_adreno_can_do_ttbr1(struct arm_smmu_device *smmu) > >> static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain, > >> struct io_pgtable_cfg *pgtbl_cfg, struct device *dev) > >> { > >> + const struct device_node *np = smmu_domain->smmu->dev->of_node; > >> struct adreno_smmu_priv *priv; > >> > >> smmu_domain->cfg.flush_walk_prefer_tlbiasid = true; > >> @@ -239,6 +268,14 @@ static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain, > >> priv->get_fault_info = qcom_adreno_smmu_get_fault_info; > >> priv->set_stall = qcom_adreno_smmu_set_stall; > >> priv->resume_translation = qcom_adreno_smmu_resume_translation; > >> + priv->set_prr_bit = NULL; > >> + priv->set_prr_addr = NULL; > >> + > >> + if (of_device_is_compatible(np, "qcom,smmu-500") && > >> + of_device_is_compatible(np, "qcom,adreno-smmu")) { > > > > fwiw, it seems like PRR actually works on sc7180, which is _not_ > > mmu-500, so I guess the support actually goes back further. > > > > As I checked sc7180 was on previous variant of SMMU, > so targets on this variant have different steps to set PRR bit. > <Do not have both prr bit and PRR page registers>. Experimentally, I have to set both the PRR LADDR/UADDR regs and ACTLR.PRR bit on sc7180 to get the sparse-residency tests passing. So the underlying hw seems to work in the same way as mmu-500. _But_ this is on a sc7180 chromebook, the situation might be different depending on fw on things that have QC hyp. > It's MMU-500 targets only where PRR support is with both PRR bit > and PRR page addr registers. As I was re-visiting our discussions on v13 > regarding this - I remember that's why we started using the SMMU- > compatible string based PRR procedure selection instead of the reserved- > memory node. [1] i.e Based on SMMU variant (as selected by compatible > string), specific sequence will be selected. > > So for now only MMU-500 based selection has been supported as part of > this series and will add subsequent support for other SMMU-variants > thereafter. > > > I'm curious if we can just rely on this being supported by any hw that > > has a6xx or newer? > > > > I'd need to check on targets which will be based on a6xx onwards, on > what will be the procedure planned to support PRR feature. I'll update > the information over this space. One of the open questions about the drm/msm sparse-memory patchset is whether we need to expose to userspace whether PRR is supported, or if we can just rely on sparse-binding support implying sparse residency (ie, PRR) support. All a6xx devices support per-process pgtables, which is the only requirement for basic sparseBinding support. But PRR is required to additionally expose sparseResidencyBuffer/sparseResidencyImage2D. I think, long term, turnip probably will want to drop support for older kernels and remove support for legacy buffer mapping. But if we have some a6xx devices without PRR, then to do that we'd need to decouple sparse binding and sparse residency. (Vulkan allows a driver to expose the former without the latter.) BR, -R > [1]: > https://lore.kernel.org/all/5790afa3-f9c0-4720-9804-8a7ff3d91854@quicinc.com/#:~:text=%3E%20I%20guess%20if,part%20as%20well. > > Thanks & regards, > Bibek > > > BR, > > -R > > > >> + priv->set_prr_bit = qcom_adreno_smmu_set_prr_bit; > >> + priv->set_prr_addr = qcom_adreno_smmu_set_prr_addr; > >> + } > >> > >> return 0; > >> } > >> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h > >> index e2aeb511ae90..2dbf3243b5ad 100644 > >> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h > >> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h > >> @@ -154,6 +154,8 @@ enum arm_smmu_cbar_type { > >> #define ARM_SMMU_SCTLR_M BIT(0) > >> > >> #define ARM_SMMU_CB_ACTLR 0x4 > >> +#define ARM_SMMU_GFX_PRR_CFG_LADDR 0x6008 > >> +#define ARM_SMMU_GFX_PRR_CFG_UADDR 0x600C > >> > >> #define ARM_SMMU_CB_RESUME 0x8 > >> #define ARM_SMMU_RESUME_TERMINATE BIT(0) > >> diff --git a/include/linux/adreno-smmu-priv.h b/include/linux/adreno-smmu-priv.h > >> index c637e0997f6d..614665153b3e 100644 > >> --- a/include/linux/adreno-smmu-priv.h > >> +++ b/include/linux/adreno-smmu-priv.h > >> @@ -50,6 +50,18 @@ struct adreno_smmu_fault_info { > >> * the GPU driver must call resume_translation() > >> * @resume_translation: Resume translation after a fault > >> * > >> + * *CAUTION* : PRR callbacks (set_prr_bit/set_prr_addr) are NULL terminated for > >> + * targets without PRR support. Exercise caution and verify target > >> + * capabilities before invoking these callbacks to prevent potential > >> + * runtime errors or unexpected behavior. > >> + * > >> + * @set_prr_bit: Extendible interface to be used by GPU to modify the > >> + * ACTLR register bits, currently used to configure > >> + * Partially-Resident-Region (PRR) bit for feature's > >> + * setup and reset sequence as requested. > >> + * @set_prr_addr: Configure the PRR_CFG_*ADDR register with the > >> + * physical address of PRR page passed from > >> + * GPU driver. > >> * > >> * The GPU driver (drm/msm) and adreno-smmu work together for controlling > >> * the GPU's SMMU instance. This is by necessity, as the GPU is directly > >> @@ -67,6 +79,8 @@ struct adreno_smmu_priv { > >> void (*get_fault_info)(const void *cookie, struct adreno_smmu_fault_info *info); > >> void (*set_stall)(const void *cookie, bool enabled); > >> void (*resume_translation)(const void *cookie, bool terminate); > >> + void (*set_prr_bit)(const void *cookie, bool set); > >> + void (*set_prr_addr)(const void *cookie, phys_addr_t page_addr); > >> }; > >> > >> #endif /* __ADRENO_SMMU_PRIV_H */ > >> -- > >> 2.34.1 > >> >
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c index d26f5aea248e..0e4f3fbda961 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c @@ -16,6 +16,8 @@ #define QCOM_DUMMY_VAL -1 +#define GFX_ACTLR_PRR (1 << 5) + static struct qcom_smmu *to_qcom_smmu(struct arm_smmu_device *smmu) { return container_of(smmu, struct qcom_smmu, smmu); @@ -99,6 +101,32 @@ static void qcom_adreno_smmu_resume_translation(const void *cookie, bool termina arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_RESUME, reg); } +static void qcom_adreno_smmu_set_prr_bit(const void *cookie, bool set) +{ + struct arm_smmu_domain *smmu_domain = (void *)cookie; + struct arm_smmu_device *smmu = smmu_domain->smmu; + struct arm_smmu_cfg *cfg = &smmu_domain->cfg; + u32 reg = 0; + + reg = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR); + reg &= ~GFX_ACTLR_PRR; + if (set) + reg |= FIELD_PREP(GFX_ACTLR_PRR, 1); + arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_ACTLR, reg); +} + +static void qcom_adreno_smmu_set_prr_addr(const void *cookie, phys_addr_t page_addr) +{ + struct arm_smmu_domain *smmu_domain = (void *)cookie; + struct arm_smmu_device *smmu = smmu_domain->smmu; + + writel_relaxed(lower_32_bits(page_addr), + smmu->base + ARM_SMMU_GFX_PRR_CFG_LADDR); + + writel_relaxed(upper_32_bits(page_addr), + smmu->base + ARM_SMMU_GFX_PRR_CFG_UADDR); +} + #define QCOM_ADRENO_SMMU_GPU_SID 0 static bool qcom_adreno_smmu_is_gpu_device(struct device *dev) @@ -210,6 +238,7 @@ static bool qcom_adreno_can_do_ttbr1(struct arm_smmu_device *smmu) static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain, struct io_pgtable_cfg *pgtbl_cfg, struct device *dev) { + const struct device_node *np = smmu_domain->smmu->dev->of_node; struct adreno_smmu_priv *priv; smmu_domain->cfg.flush_walk_prefer_tlbiasid = true; @@ -239,6 +268,14 @@ static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain, priv->get_fault_info = qcom_adreno_smmu_get_fault_info; priv->set_stall = qcom_adreno_smmu_set_stall; priv->resume_translation = qcom_adreno_smmu_resume_translation; + priv->set_prr_bit = NULL; + priv->set_prr_addr = NULL; + + if (of_device_is_compatible(np, "qcom,smmu-500") && + of_device_is_compatible(np, "qcom,adreno-smmu")) { + priv->set_prr_bit = qcom_adreno_smmu_set_prr_bit; + priv->set_prr_addr = qcom_adreno_smmu_set_prr_addr; + } return 0; } diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h index e2aeb511ae90..2dbf3243b5ad 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h @@ -154,6 +154,8 @@ enum arm_smmu_cbar_type { #define ARM_SMMU_SCTLR_M BIT(0) #define ARM_SMMU_CB_ACTLR 0x4 +#define ARM_SMMU_GFX_PRR_CFG_LADDR 0x6008 +#define ARM_SMMU_GFX_PRR_CFG_UADDR 0x600C #define ARM_SMMU_CB_RESUME 0x8 #define ARM_SMMU_RESUME_TERMINATE BIT(0) diff --git a/include/linux/adreno-smmu-priv.h b/include/linux/adreno-smmu-priv.h index c637e0997f6d..614665153b3e 100644 --- a/include/linux/adreno-smmu-priv.h +++ b/include/linux/adreno-smmu-priv.h @@ -50,6 +50,18 @@ struct adreno_smmu_fault_info { * the GPU driver must call resume_translation() * @resume_translation: Resume translation after a fault * + * *CAUTION* : PRR callbacks (set_prr_bit/set_prr_addr) are NULL terminated for + * targets without PRR support. Exercise caution and verify target + * capabilities before invoking these callbacks to prevent potential + * runtime errors or unexpected behavior. + * + * @set_prr_bit: Extendible interface to be used by GPU to modify the + * ACTLR register bits, currently used to configure + * Partially-Resident-Region (PRR) bit for feature's + * setup and reset sequence as requested. + * @set_prr_addr: Configure the PRR_CFG_*ADDR register with the + * physical address of PRR page passed from + * GPU driver. * * The GPU driver (drm/msm) and adreno-smmu work together for controlling * the GPU's SMMU instance. This is by necessity, as the GPU is directly @@ -67,6 +79,8 @@ struct adreno_smmu_priv { void (*get_fault_info)(const void *cookie, struct adreno_smmu_fault_info *info); void (*set_stall)(const void *cookie, bool enabled); void (*resume_translation)(const void *cookie, bool terminate); + void (*set_prr_bit)(const void *cookie, bool set); + void (*set_prr_addr)(const void *cookie, phys_addr_t page_addr); }; #endif /* __ADRENO_SMMU_PRIV_H */
Add an adreno-smmu-priv interface for drm/msm to call into arm-smmu-qcom and initiate the PRR bit setup or reset sequence as per request. This will be used by GPU to setup the PRR bit and related configuration registers through adreno-smmu private interface instead of directly poking the smmu hardware. Suggested-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Bibek Kumar Patro <quic_bibekkum@quicinc.com> --- drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 37 ++++++++++++++++++++++ drivers/iommu/arm/arm-smmu/arm-smmu.h | 2 ++ include/linux/adreno-smmu-priv.h | 14 ++++++++ 3 files changed, 53 insertions(+) -- 2.34.1