Message ID | 20200918101852.582559-10-jean-philippe@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part) | expand |
On Fri, 18 Sep 2020 12:18:49 +0200 Jean-Philippe Brucker <jean-philippe@linaro.org> wrote: > The SMMU has a single ASID space, the union of shared and private ASID > sets. This means that the SMMU driver competes with the arch allocator > for ASIDs. Shared ASIDs are those of Linux processes, allocated by the > arch, and contribute in broadcast TLB maintenance. Private ASIDs are > allocated by the SMMU driver and used for "classic" map/unmap DMA. They > require command-queue TLB invalidations. > > When we pin down an mm_context and get an ASID that is already in use by > the SMMU, it belongs to a private context. We used to simply abort the > bind, but this is unfair to users that would be unable to bind a few > seemingly random processes. Try to allocate a new private ASID for the > context, and make the old ASID shared. > > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Hi, One totally trivial comment inline that might have ever so slightly improved reviewability of the patch. However it is only minor so don't bother respinning for that. Thanks, Jonathan > --- > v10: fix ASID limit, small comment update > --- > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 ++ > .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 35 +++++++++++++++++-- > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 34 +++++++++++------- > 3 files changed, 57 insertions(+), 15 deletions(-) > > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h > index 6b06a6f19604..90c08f156b43 100644 > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h > @@ -678,6 +678,9 @@ struct arm_smmu_domain { > extern struct xarray arm_smmu_asid_xa; > extern struct mutex arm_smmu_asid_lock; > > +int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid, > + struct arm_smmu_ctx_desc *cd); > +void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid); > bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd); > > #endif /* _ARM_SMMU_V3_H */ > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c > index 6c1113059632..ef3fcfa72187 100644 > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c > @@ -10,10 +10,18 @@ > #include "arm-smmu-v3.h" > #include "../../io-pgtable-arm.h" > > +/* > + * Check if the CPU ASID is available on the SMMU side. If a private context > + * descriptor is using it, try to replace it. > + */ > static struct arm_smmu_ctx_desc * > arm_smmu_share_asid(struct mm_struct *mm, u16 asid) > { > + int ret; > + u32 new_asid; > struct arm_smmu_ctx_desc *cd; > + struct arm_smmu_device *smmu; > + struct arm_smmu_domain *smmu_domain; > > cd = xa_load(&arm_smmu_asid_xa, asid); > if (!cd) > @@ -27,8 +35,31 @@ arm_smmu_share_asid(struct mm_struct *mm, u16 asid) > return cd; > } > > - /* Ouch, ASID is already in use for a private cd. */ > - return ERR_PTR(-EBUSY); > + smmu_domain = container_of(cd, struct arm_smmu_domain, s1_cfg.cd); > + smmu = smmu_domain->smmu; > + > + ret = xa_alloc(&arm_smmu_asid_xa, &new_asid, cd, > + XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL); > + if (ret) > + return ERR_PTR(-ENOSPC); > + /* > + * Race with unmap: TLB invalidations will start targeting the new ASID, > + * which isn't assigned yet. We'll do an invalidate-all on the old ASID > + * later, so it doesn't matter. > + */ > + cd->asid = new_asid; > + /* > + * Update ASID and invalidate CD in all associated masters. There will > + * be some overlap between use of both ASIDs, until we invalidate the > + * TLB. > + */ > + arm_smmu_write_ctx_desc(smmu_domain, 0, cd); > + > + /* Invalidate TLB entries previously associated with that context */ > + arm_smmu_tlb_inv_asid(smmu, asid); > + > + xa_erase(&arm_smmu_asid_xa, asid); > + return NULL; > } > > __maybe_unused > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c > index 19af27fd183b..e99ebdd4c841 100644 > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c > @@ -872,6 +872,17 @@ static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu, > } > > /* Context descriptor manipulation functions */ > +void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid) > +{ > + struct arm_smmu_cmdq_ent cmd = { > + .opcode = CMDQ_OP_TLBI_NH_ASID, > + .tlbi.asid = asid, > + }; > + > + arm_smmu_cmdq_issue_cmd(smmu, &cmd); > + arm_smmu_cmdq_issue_sync(smmu); > +} > + > static void arm_smmu_sync_cd(struct arm_smmu_domain *smmu_domain, > int ssid, bool leaf) > { > @@ -952,8 +963,8 @@ static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_domain *smmu_domain, > return l1_desc->l2ptr + idx * CTXDESC_CD_DWORDS; > } > > -static int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, > - int ssid, struct arm_smmu_ctx_desc *cd) > +int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid, > + struct arm_smmu_ctx_desc *cd) > { > /* > * This function handles the following cases: > @@ -1609,15 +1620,6 @@ static void arm_smmu_tlb_inv_context(void *cookie) > struct arm_smmu_device *smmu = smmu_domain->smmu; > struct arm_smmu_cmdq_ent cmd; > > - if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { > - cmd.opcode = CMDQ_OP_TLBI_NH_ASID; > - cmd.tlbi.asid = smmu_domain->s1_cfg.cd.asid; > - cmd.tlbi.vmid = 0; > - } else { > - cmd.opcode = CMDQ_OP_TLBI_S12_VMALL; > - cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid; > - } > - > /* > * NOTE: when io-pgtable is in non-strict mode, we may get here with > * PTEs previously cleared by unmaps on the current CPU not yet visible > @@ -1625,8 +1627,14 @@ static void arm_smmu_tlb_inv_context(void *cookie) > * insertion to guarantee those are observed before the TLBI. Do be > * careful, 007. > */ > - arm_smmu_cmdq_issue_cmd(smmu, &cmd); > - arm_smmu_cmdq_issue_sync(smmu); > + if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { > + arm_smmu_tlb_inv_asid(smmu, smmu_domain->s1_cfg.cd.asid); > + } else { > + cmd.opcode = CMDQ_OP_TLBI_S12_VMALL; > + cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid; > + arm_smmu_cmdq_issue_cmd(smmu, &cmd); > + arm_smmu_cmdq_issue_sync(smmu); > + } Nothing wrong with the code, but you could perhaps have split out the noop refactoring change from more interesting parts. > arm_smmu_atc_inv_domain(smmu_domain, 0, 0, 0); > } >
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 6b06a6f19604..90c08f156b43 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -678,6 +678,9 @@ struct arm_smmu_domain { extern struct xarray arm_smmu_asid_xa; extern struct mutex arm_smmu_asid_lock; +int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid, + struct arm_smmu_ctx_desc *cd); +void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid); bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd); #endif /* _ARM_SMMU_V3_H */ diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index 6c1113059632..ef3fcfa72187 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -10,10 +10,18 @@ #include "arm-smmu-v3.h" #include "../../io-pgtable-arm.h" +/* + * Check if the CPU ASID is available on the SMMU side. If a private context + * descriptor is using it, try to replace it. + */ static struct arm_smmu_ctx_desc * arm_smmu_share_asid(struct mm_struct *mm, u16 asid) { + int ret; + u32 new_asid; struct arm_smmu_ctx_desc *cd; + struct arm_smmu_device *smmu; + struct arm_smmu_domain *smmu_domain; cd = xa_load(&arm_smmu_asid_xa, asid); if (!cd) @@ -27,8 +35,31 @@ arm_smmu_share_asid(struct mm_struct *mm, u16 asid) return cd; } - /* Ouch, ASID is already in use for a private cd. */ - return ERR_PTR(-EBUSY); + smmu_domain = container_of(cd, struct arm_smmu_domain, s1_cfg.cd); + smmu = smmu_domain->smmu; + + ret = xa_alloc(&arm_smmu_asid_xa, &new_asid, cd, + XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL); + if (ret) + return ERR_PTR(-ENOSPC); + /* + * Race with unmap: TLB invalidations will start targeting the new ASID, + * which isn't assigned yet. We'll do an invalidate-all on the old ASID + * later, so it doesn't matter. + */ + cd->asid = new_asid; + /* + * Update ASID and invalidate CD in all associated masters. There will + * be some overlap between use of both ASIDs, until we invalidate the + * TLB. + */ + arm_smmu_write_ctx_desc(smmu_domain, 0, cd); + + /* Invalidate TLB entries previously associated with that context */ + arm_smmu_tlb_inv_asid(smmu, asid); + + xa_erase(&arm_smmu_asid_xa, asid); + return NULL; } __maybe_unused diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 19af27fd183b..e99ebdd4c841 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -872,6 +872,17 @@ static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu, } /* Context descriptor manipulation functions */ +void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid) +{ + struct arm_smmu_cmdq_ent cmd = { + .opcode = CMDQ_OP_TLBI_NH_ASID, + .tlbi.asid = asid, + }; + + arm_smmu_cmdq_issue_cmd(smmu, &cmd); + arm_smmu_cmdq_issue_sync(smmu); +} + static void arm_smmu_sync_cd(struct arm_smmu_domain *smmu_domain, int ssid, bool leaf) { @@ -952,8 +963,8 @@ static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_domain *smmu_domain, return l1_desc->l2ptr + idx * CTXDESC_CD_DWORDS; } -static int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, - int ssid, struct arm_smmu_ctx_desc *cd) +int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid, + struct arm_smmu_ctx_desc *cd) { /* * This function handles the following cases: @@ -1609,15 +1620,6 @@ static void arm_smmu_tlb_inv_context(void *cookie) struct arm_smmu_device *smmu = smmu_domain->smmu; struct arm_smmu_cmdq_ent cmd; - if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { - cmd.opcode = CMDQ_OP_TLBI_NH_ASID; - cmd.tlbi.asid = smmu_domain->s1_cfg.cd.asid; - cmd.tlbi.vmid = 0; - } else { - cmd.opcode = CMDQ_OP_TLBI_S12_VMALL; - cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid; - } - /* * NOTE: when io-pgtable is in non-strict mode, we may get here with * PTEs previously cleared by unmaps on the current CPU not yet visible @@ -1625,8 +1627,14 @@ static void arm_smmu_tlb_inv_context(void *cookie) * insertion to guarantee those are observed before the TLBI. Do be * careful, 007. */ - arm_smmu_cmdq_issue_cmd(smmu, &cmd); - arm_smmu_cmdq_issue_sync(smmu); + if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { + arm_smmu_tlb_inv_asid(smmu, smmu_domain->s1_cfg.cd.asid); + } else { + cmd.opcode = CMDQ_OP_TLBI_S12_VMALL; + cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid; + arm_smmu_cmdq_issue_cmd(smmu, &cmd); + arm_smmu_cmdq_issue_sync(smmu); + } arm_smmu_atc_inv_domain(smmu_domain, 0, 0, 0); }
The SMMU has a single ASID space, the union of shared and private ASID sets. This means that the SMMU driver competes with the arch allocator for ASIDs. Shared ASIDs are those of Linux processes, allocated by the arch, and contribute in broadcast TLB maintenance. Private ASIDs are allocated by the SMMU driver and used for "classic" map/unmap DMA. They require command-queue TLB invalidations. When we pin down an mm_context and get an ASID that is already in use by the SMMU, it belongs to a private context. We used to simply abort the bind, but this is unfair to users that would be unable to bind a few seemingly random processes. Try to allocate a new private ASID for the context, and make the old ASID shared. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> --- v10: fix ASID limit, small comment update --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 ++ .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 35 +++++++++++++++++-- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 34 +++++++++++------- 3 files changed, 57 insertions(+), 15 deletions(-)