
[16/19] iommu/amd: Use roundup_pow_of_two() instead of get_order()

Message ID 16-v1-416f64558c7c+2a5-iommu_pages_jgg@nvidia.com (mailing list archive)
State New
Series: iommu: Further abstract iommu-pages

Commit Message

Jason Gunthorpe Feb. 4, 2025, 6:34 p.m. UTC
1 << (get_order(x) + PAGE_SHIFT) == roundup_pow_of_two(x)

Use the shorter version.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/init.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

Comments

Robin Murphy Feb. 5, 2025, 4:11 p.m. UTC | #1
On 2025-02-04 6:34 pm, Jason Gunthorpe wrote:
>    1 << (get_order(x) + PAGE_SHIFT) == roundup_pow_of_two(x)

...unless x < 2048, which does seem possible here if last_bdf is 
sufficiently small. Not too significant in the cases where the result is 
only rounded back up again for an allocation or remap anyway, but I do 
wonder about the use of dev_table_size in iommu_set_device_table(), and 
whether this change might break it, or whether it's already wrong for 
that case and this might actually fix it...

Thanks,
Robin.

> Use the shorter version.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   drivers/iommu/amd/init.c | 5 +----
>   1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> index f1c5041647173c..7d77929bc63af3 100644
> --- a/drivers/iommu/amd/init.c
> +++ b/drivers/iommu/amd/init.c
> @@ -247,10 +247,7 @@ static void init_translation_status(struct amd_iommu *iommu)
>   
>   static inline unsigned long tbl_size(int entry_size, int last_bdf)
>   {
> -	unsigned shift = PAGE_SHIFT +
> -			 get_order((last_bdf + 1) * entry_size);
> -
> -	return 1UL << shift;
> +	return roundup_pow_of_two((last_bdf + 1) * entry_size);
>   }
>   
>   int amd_iommu_get_num_iommus(void)
Jason Gunthorpe Feb. 5, 2025, 5:59 p.m. UTC | #2
On Wed, Feb 05, 2025 at 04:11:00PM +0000, Robin Murphy wrote:
> On 2025-02-04 6:34 pm, Jason Gunthorpe wrote:
> >    1 << (get_order(x) + PAGE_SHIFT) == roundup_pow_of_two(x)
> 
> ...unless x < 2048, which does seem possible here if last_bdf is
> sufficiently small. 

Yes.. I was thinking that sub-page sizes were what this stuff wanted,
however I missed:

> but I do wonder about the use of dev_table_size in
> iommu_set_device_table(), and whether this change might break it, or
> whether it's already wrong for that case and this might actually fix
> it...

Indeed, the spec says 4k pages here:

 This field contains an unsigned value m that specifies the size of
 the Device Table for this segment in 4 Kbyte
 increments. The size in bytes is equal to (m + 1) * 4 Kbytes.

And it certainly has to be aligned, so it does need a PAGE_ALIGN()
around it.

I'm going to add another patch to change rlookup_table and
irq_lookup_table to use a simple kvzalloc(); they are not used by HW
and just need to be simple CPU memory of the proper page size.

Then do:

-       pci_seg->dev_table_size     = tbl_size(DEV_TABLE_ENTRY_SIZE, last_bdf);
+       pci_seg->dev_table_size =
+               PAGE_ALIGN(tbl_size(DEV_TABLE_ENTRY_SIZE, last_bdf));

To fixup the HW memory.

Thanks,
Jason

Patch

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index f1c5041647173c..7d77929bc63af3 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -247,10 +247,7 @@ static void init_translation_status(struct amd_iommu *iommu)
 
 static inline unsigned long tbl_size(int entry_size, int last_bdf)
 {
-	unsigned shift = PAGE_SHIFT +
-			 get_order((last_bdf + 1) * entry_size);
-
-	return 1UL << shift;
+	return roundup_pow_of_two((last_bdf + 1) * entry_size);
 }
 
 int amd_iommu_get_num_iommus(void)