| Message ID | 16-v1-416f64558c7c+2a5-iommu_pages_jgg@nvidia.com (mailing list archive) |
|---|---|
| State | New |
| Series | iommu: Further abstract iommu-pages |
On 2025-02-04 6:34 pm, Jason Gunthorpe wrote:
> 1 << (get_order(x) + PAGE_SHIFT) == roundup_pow_of_two()

...unless x < 2048, which does seem possible here if last_bdf is sufficiently small. Not too significant in the cases where the result is only rounded back up again for an allocation or remap anyway, but I do wonder about the use of dev_table_size in iommu_set_device_table(), and whether this change might break it, or whether it's already wrong for that case and this might actually fix it...

Thanks,
Robin.

> Use the shorter version.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/iommu/amd/init.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> index f1c5041647173c..7d77929bc63af3 100644
> --- a/drivers/iommu/amd/init.c
> +++ b/drivers/iommu/amd/init.c
> @@ -247,10 +247,7 @@ static void init_translation_status(struct amd_iommu *iommu)
>
>  static inline unsigned long tbl_size(int entry_size, int last_bdf)
>  {
> -	unsigned shift = PAGE_SHIFT +
> -			 get_order((last_bdf + 1) * entry_size);
> -
> -	return 1UL << shift;
> +	return roundup_pow_of_two((last_bdf + 1) * entry_size);
>  }
>
>  int amd_iommu_get_num_iommus(void)
On Wed, Feb 05, 2025 at 04:11:00PM +0000, Robin Murphy wrote:
> On 2025-02-04 6:34 pm, Jason Gunthorpe wrote:
> > 1 << (get_order(x) + PAGE_SHIFT) == roundup_pow_of_two()
>
> ...unless x < 2048, which does seem possible here if last_bdf is
> sufficiently small.

Yes.. I was thinking that sub-page was what this stuff wanted, however I missed:

> but I do wonder about the use of dev_table_size in
> iommu_set_device_table(), and whether this change might break it, or
> whether it's already wrong for that case and this might actually fix
> it...

Indeed, the spec says 4k pages here:

  This field contains an unsigned value m that specifies the size of the
  Device Table for this segment in 4 Kbyte increments. The size in bytes
  is equal to (m + 1) * 4 Kbytes.

And it certainly has to be aligned, so it does need a PAGE_ALIGN() around it.

I'm going to add another patch to change rlookup_table and irq_lookup_table to use a simple kvzalloc(); they are not used by the HW and just need to be some simple CPU memory of the proper size. Then do:

-	pci_seg->dev_table_size = tbl_size(DEV_TABLE_ENTRY_SIZE, last_bdf);
+	pci_seg->dev_table_size =
+		PAGE_ALIGN(tbl_size(DEV_TABLE_ENTRY_SIZE, last_bdf));

to fix up the HW memory.

Thanks,
Jason
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index f1c5041647173c..7d77929bc63af3 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -247,10 +247,7 @@ static void init_translation_status(struct amd_iommu *iommu)
 
 static inline unsigned long tbl_size(int entry_size, int last_bdf)
 {
-	unsigned shift = PAGE_SHIFT +
-			 get_order((last_bdf + 1) * entry_size);
-
-	return 1UL << shift;
+	return roundup_pow_of_two((last_bdf + 1) * entry_size);
 }
 
 int amd_iommu_get_num_iommus(void)
1 << (get_order(x) + PAGE_SHIFT) == roundup_pow_of_two()

Use the shorter version.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/init.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)