Message ID | 9-v3-e2e16cd7467f+2a6a1-smmuv3_nesting_jgg@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Initial support for SMMUv3 nested translation | expand |
On Wed, Oct 09, 2024 at 01:23:15PM -0300, Jason Gunthorpe wrote: > Force Write Back (FWB) changes how the S2 IOPTE's MemAttr field > works. When S2FWB is supported and enabled the IOPTE will force cachable > access to IOMMU_CACHE memory when nesting with a S1 and deny cachable > access otherwise. > > When using a single stage of translation, a simple S2 domain, it doesn't > change things for PCI devices as it is just a different encoding for the > existing mapping of the IOMMU protection flags to cachability attributes. > For non-PCI it also changes the combining rules when incoming transactions > have inconsistent attributes. > > However, when used with a nested S1, FWB has the effect of preventing the > guest from choosing a MemAttr in it's S1 that would cause ordinary DMA to > bypass the cache. Consistent with KVM we wish to deny the guest the > ability to become incoherent with cached memory the hypervisor believes is > cachable so we don't have to flush it. > > Allow NESTED domains to be created if the SMMU has S2FWB support and use > S2FWB for NESTING_PARENTS. This is an additional option to CANWBS. > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> My HW doesn't support this S2FWB for testing, but the patch LGTM. Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> > @@ -265,6 +266,7 @@ static inline u32 arm_smmu_strtab_l2_idx(u32 sid) > #define STRTAB_STE_1_S1CSH GENMASK_ULL(7, 6) > > #define STRTAB_STE_1_S1STALLD (1UL << 27) > +#define STRTAB_STE_1_S2FWB (1UL << 25) Nit: seems that it should be in ascending order. Thanks Nicolin
On Wed, Oct 09, 2024 at 10:42:04AM -0700, Nicolin Chen wrote: > > @@ -265,6 +266,7 @@ static inline u32 arm_smmu_strtab_l2_idx(u32 sid) > > #define STRTAB_STE_1_S1CSH GENMASK_ULL(7, 6) > > > > #define STRTAB_STE_1_S1STALLD (1UL << 27) > > +#define STRTAB_STE_1_S2FWB (1UL << 25) > > Nit: seems that it should be in ascending order. Done Jason
> From: Jason Gunthorpe <jgg@nvidia.com> > Sent: Thursday, October 10, 2024 12:23 AM > > Force Write Back (FWB) changes how the S2 IOPTE's MemAttr field > works. When S2FWB is supported and enabled the IOPTE will force cachable > access to IOMMU_CACHE memory when nesting with a S1 and deny cachable > access otherwise. didn't get the last part "deny cacheable access otherwise" > @@ -169,7 +169,8 @@ arm_smmu_domain_alloc_nesting(struct device *dev, > u32 flags, > * Must support some way to prevent the VM from bypassing the > cache > * because VFIO currently does not do any cache maintenance. > */ > - if (!arm_smmu_master_canwbs(master)) > + if (!arm_smmu_master_canwbs(master) && > + !(master->smmu->features & ARM_SMMU_FEAT_S2FWB)) > return ERR_PTR(-EOPNOTSUPP); Probably can clarify the difference between CANWBS and S2FWB here by copying some words from the previous commit message. especially about the part of PCI nosnoop. Reviewed-by: Kevin Tian <kevin.tian@intel.com>
On Thu, Oct 24, 2024 at 07:54:10AM +0000, Tian, Kevin wrote: > > From: Jason Gunthorpe <jgg@nvidia.com> > > Sent: Thursday, October 10, 2024 12:23 AM > > > > Force Write Back (FWB) changes how the S2 IOPTE's MemAttr field > > works. When S2FWB is supported and enabled the IOPTE will force cachable > > access to IOMMU_CACHE memory when nesting with a S1 and deny cachable > > access otherwise. > > didn't get the last part "deny cacheable access otherwise" Force Write Back (FWB) changes how the S2 IOPTE's MemAttr field works. When S2FWB is supported and enabled the IOPTE will force cachable access to IOMMU_CACHE memory when nesting with a S1 and deny cachable access when !IOMMU_CACHE. ? > > @@ -169,7 +169,8 @@ arm_smmu_domain_alloc_nesting(struct device *dev, > > u32 flags, > > * Must support some way to prevent the VM from bypassing the > > cache > > * because VFIO currently does not do any cache maintenance. > > */ > > - if (!arm_smmu_master_canwbs(master)) > > + if (!arm_smmu_master_canwbs(master) && > > + !(master->smmu->features & ARM_SMMU_FEAT_S2FWB)) > > return ERR_PTR(-EOPNOTSUPP); > > Probably can clarify the difference between CANWBS and S2FWB here > by copying some words from the previous commit message. especially > about the part of PCI nosnoop. /* * Must support some way to prevent the VM from bypassing the cache * because VFIO currently does not do any cache maintenance. canwbs * indicates the device is fully coherent and no cache maintenance is * ever required, even for PCI No-Snoop. S2FWB means the S1 can't make * things non-coherent using the memattr, but No-Snoop behavior is not * effected. */ if (!arm_smmu_master_canwbs(master) && !(master->smmu->features & ARM_SMMU_FEAT_S2FWB)) return ERR_PTR(-EOPNOTSUPP); ? Thanks, Jason
> From: Jason Gunthorpe <jgg@nvidia.com> > Sent: Friday, October 25, 2024 10:01 PM > > On Thu, Oct 24, 2024 at 07:54:10AM +0000, Tian, Kevin wrote: > > > From: Jason Gunthorpe <jgg@nvidia.com> > > > Sent: Thursday, October 10, 2024 12:23 AM > > > > > > Force Write Back (FWB) changes how the S2 IOPTE's MemAttr field > > > works. When S2FWB is supported and enabled the IOPTE will force > cachable > > > access to IOMMU_CACHE memory when nesting with a S1 and deny > cachable > > > access otherwise. > > > > didn't get the last part "deny cacheable access otherwise" > > Force Write Back (FWB) changes how the S2 IOPTE's MemAttr field > works. When S2FWB is supported and enabled the IOPTE will force cachable > access to IOMMU_CACHE memory when nesting with a S1 and deny > cachable > access when !IOMMU_CACHE. > > ? Good. I read the original words as if cacheable access is denied when S2FWB is not supported or disabled. > > > > @@ -169,7 +169,8 @@ arm_smmu_domain_alloc_nesting(struct device > *dev, > > > u32 flags, > > > * Must support some way to prevent the VM from bypassing the > > > cache > > > * because VFIO currently does not do any cache maintenance. > > > */ > > > - if (!arm_smmu_master_canwbs(master)) > > > + if (!arm_smmu_master_canwbs(master) && > > > + !(master->smmu->features & ARM_SMMU_FEAT_S2FWB)) > > > return ERR_PTR(-EOPNOTSUPP); > > > > Probably can clarify the difference between CANWBS and S2FWB here > > by copying some words from the previous commit message. especially > > about the part of PCI nosnoop. > > /* > * Must support some way to prevent the VM from bypassing the > cache > * because VFIO currently does not do any cache maintenance. > canwbs > * indicates the device is fully coherent and no cache maintenance is > * ever required, even for PCI No-Snoop. S2FWB means the S1 can't > make > * things non-coherent using the memattr, but No-Snoop behavior is > not > * effected. > */ > if (!arm_smmu_master_canwbs(master) && > !(master->smmu->features & ARM_SMMU_FEAT_S2FWB)) > return ERR_PTR(-EOPNOTSUPP); > > ? > Looks good!
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c index a9aa7514e65ce4..44e1b9bef850d9 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c @@ -169,7 +169,8 @@ arm_smmu_domain_alloc_nesting(struct device *dev, u32 flags, * Must support some way to prevent the VM from bypassing the cache * because VFIO currently does not do any cache maintenance. */ - if (!arm_smmu_master_canwbs(master)) + if (!arm_smmu_master_canwbs(master) && + !(master->smmu->features & ARM_SMMU_FEAT_S2FWB)) return ERR_PTR(-EOPNOTSUPP); /* diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index eb401a4adfedc8..4e559e02514983 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1046,7 +1046,8 @@ void arm_smmu_get_ste_used(const __le64 *ent, __le64 *used_bits) /* S2 translates */ if (cfg & BIT(1)) { used_bits[1] |= - cpu_to_le64(STRTAB_STE_1_EATS | STRTAB_STE_1_SHCFG); + cpu_to_le64(STRTAB_STE_1_S2FWB | STRTAB_STE_1_EATS | + STRTAB_STE_1_SHCFG); used_bits[2] |= cpu_to_le64(STRTAB_STE_2_S2VMID | STRTAB_STE_2_VTCR | STRTAB_STE_2_S2AA64 | STRTAB_STE_2_S2ENDI | @@ -1654,6 +1655,8 @@ void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target, FIELD_PREP(STRTAB_STE_1_EATS, ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0)); + if (pgtbl_cfg->quirks & IO_PGTABLE_QUIRK_ARM_S2FWB) + target->data[1] |= cpu_to_le64(STRTAB_STE_1_S2FWB); if (smmu->features & ARM_SMMU_FEAT_ATTR_TYPES_OVR) target->data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING)); @@ -2472,6 +2475,9 @@ static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain, pgtbl_cfg.oas = smmu->oas; fmt = ARM_64_LPAE_S2; finalise_stage_fn = arm_smmu_domain_finalise_s2; + if ((smmu->features & ARM_SMMU_FEAT_S2FWB) && + (flags & IOMMU_HWPT_ALLOC_NEST_PARENT)) + pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_S2FWB; break; default: return -EINVAL; diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index b5dbf5acbfc4db..e394943c0b4bfe 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -57,6 +57,7 @@ struct arm_smmu_device; #define IDR1_SIDSIZE GENMASK(5, 0) #define ARM_SMMU_IDR3 0xc +#define IDR3_FWB (1 << 8) #define IDR3_RIL (1 << 10) #define ARM_SMMU_IDR5 0x14 @@ -265,6 +266,7 @@ static inline u32 arm_smmu_strtab_l2_idx(u32 sid) #define STRTAB_STE_1_S1CSH GENMASK_ULL(7, 6) #define STRTAB_STE_1_S1STALLD (1UL << 27) +#define STRTAB_STE_1_S2FWB (1UL << 25) #define STRTAB_STE_1_EATS GENMASK_ULL(29, 28) #define STRTAB_STE_1_EATS_ABT 0UL @@ -739,6 +741,7 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_ATTR_TYPES_OVR (1 << 20) #define ARM_SMMU_FEAT_HA (1 << 21) #define ARM_SMMU_FEAT_HD (1 << 22) +#define ARM_SMMU_FEAT_S2FWB (1 << 23) u32 features; #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0) diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index 0e67f1721a3d98..74f58c6ac30cbd 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -106,6 +106,18 @@ #define ARM_LPAE_PTE_HAP_FAULT (((arm_lpae_iopte)0) << 6) #define ARM_LPAE_PTE_HAP_READ (((arm_lpae_iopte)1) << 6) #define ARM_LPAE_PTE_HAP_WRITE (((arm_lpae_iopte)2) << 6) +/* + * For !FWB these code to: + * 1111 = Normal outer write back cachable / Inner Write Back Cachable + * Permit S1 to override + * 0101 = Normal Non-cachable / Inner Non-cachable + * 0001 = Device / Device-nGnRE + * For S2FWB these code: + * 0110 Force Normal Write Back + * 0101 Normal* is forced Normal-NC, Device unchanged + * 0001 Force Device-nGnRE + */ +#define ARM_LPAE_PTE_MEMATTR_FWB_WB (((arm_lpae_iopte)0x6) << 2) #define ARM_LPAE_PTE_MEMATTR_OIWB (((arm_lpae_iopte)0xf) << 2) #define ARM_LPAE_PTE_MEMATTR_NC (((arm_lpae_iopte)0x5) << 2) #define ARM_LPAE_PTE_MEMATTR_DEV (((arm_lpae_iopte)0x1) << 2) @@ -458,12 +470,16 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data, */ if (data->iop.fmt == ARM_64_LPAE_S2 || data->iop.fmt == ARM_32_LPAE_S2) { - if (prot & IOMMU_MMIO) + if (prot & IOMMU_MMIO) { pte |= ARM_LPAE_PTE_MEMATTR_DEV; - else if (prot & IOMMU_CACHE) - pte |= ARM_LPAE_PTE_MEMATTR_OIWB; - else + } else if (prot & IOMMU_CACHE) { + if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_S2FWB) + pte |= ARM_LPAE_PTE_MEMATTR_FWB_WB; + else + pte |= ARM_LPAE_PTE_MEMATTR_OIWB; + } else { pte |= ARM_LPAE_PTE_MEMATTR_NC; + } } else { if (prot & IOMMU_MMIO) pte |= (ARM_LPAE_MAIR_ATTR_IDX_DEV @@ -1035,8 +1051,7 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie) struct arm_lpae_io_pgtable *data; typeof(&cfg->arm_lpae_s2_cfg.vtcr) vtcr = &cfg->arm_lpae_s2_cfg.vtcr; - /* The NS quirk doesn't apply at stage 2 */ - if (cfg->quirks) + if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_S2FWB)) return NULL; data = arm_lpae_alloc_pgtable(cfg); diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h index b1ecfc3cd5bcc0..ce86b09ae80f18 100644 --- a/include/linux/io-pgtable.h +++ b/include/linux/io-pgtable.h @@ -87,6 +87,7 @@ struct io_pgtable_cfg { * attributes set in the TCR for a non-coherent page-table walker. * * IO_PGTABLE_QUIRK_ARM_HD: Enables dirty tracking in stage 1 pagetable. + * IO_PGTABLE_QUIRK_ARM_S2FWB: Use the FWB format for the MemAttrs bits */ #define IO_PGTABLE_QUIRK_ARM_NS BIT(0) #define IO_PGTABLE_QUIRK_NO_PERMS BIT(1) @@ -95,6 +96,7 @@ struct io_pgtable_cfg { #define IO_PGTABLE_QUIRK_ARM_TTBR1 BIT(5) #define IO_PGTABLE_QUIRK_ARM_OUTER_WBWA BIT(6) #define IO_PGTABLE_QUIRK_ARM_HD BIT(7) + #define IO_PGTABLE_QUIRK_ARM_S2FWB BIT(8) unsigned long quirks; unsigned long pgsize_bitmap; unsigned int ias;
Force Write Back (FWB) changes how the S2 IOPTE's MemAttr field works. When S2FWB is supported and enabled the IOPTE will force cachable access to IOMMU_CACHE memory when nesting with a S1 and deny cachable access otherwise. When using a single stage of translation, a simple S2 domain, it doesn't change things for PCI devices as it is just a different encoding for the existing mapping of the IOMMU protection flags to cachability attributes. For non-PCI it also changes the combining rules when incoming transactions have inconsistent attributes. However, when used with a nested S1, FWB has the effect of preventing the guest from choosing a MemAttr in it's S1 that would cause ordinary DMA to bypass the cache. Consistent with KVM we wish to deny the guest the ability to become incoherent with cached memory the hypervisor believes is cachable so we don't have to flush it. Allow NESTED domains to be created if the SMMU has S2FWB support and use S2FWB for NESTING_PARENTS. This is an additional option to CANWBS. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> --- .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 3 ++- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 8 +++++- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +++ drivers/iommu/io-pgtable-arm.c | 27 ++++++++++++++----- include/linux/io-pgtable.h | 2 ++ 5 files changed, 35 insertions(+), 8 deletions(-)