Message ID | 20240410024810.1707500-1-arun.r.murthy@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | [PATCHv3] drm/xe/display: use mul_u32_u32 for multiplying operands |
On Wed, Apr 10, 2024 at 08:18:10AM +0530, Arun R Murthy wrote:
> Use mul_u32_u32 to avoid potential overflow in multiplying two u32 and
> store the u64 result.
>
> v2: remove u64 typecast and use mul_u32_u32 (Ville)
> v3: Reframe the commit message <Rodrigo>

I still believe that the message is not that clear, to be honest.
I mean, just by reading it without understanding the flow, one might
simply ask: how can a u32 times a u32 overflow a 64-bit value?

I was going to tweak that while pushing, but I got a conflict, so it
needs a rebase anyway. While at it, please make it clear that the
compiler can use a u32 for storing the result of the multiplication
before it moves that result to the u64, and that this intermediate is
what could overflow. So, this cast forces the compiler to already use a
u64 during the multiplication, before the final store.

>
> Signed-off-by: Arun R Murthy <arun.r.murthy@intel.com>
> ---
>  drivers/gpu/drm/xe/display/xe_fb_pin.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/display/xe_fb_pin.c b/drivers/gpu/drm/xe/display/xe_fb_pin.c
> index 3a584bc3a0a3..c73054c09ae9 100644
> --- a/drivers/gpu/drm/xe/display/xe_fb_pin.c
> +++ b/drivers/gpu/drm/xe/display/xe_fb_pin.c
> @@ -29,7 +29,7 @@ write_dpt_rotated(struct xe_bo *bo, struct iosys_map *map, u32 *dpt_ofs, u32 bo_
>  		u32 src_idx = src_stride * (height - 1) + column + bo_ofs;
>
>  		for (row = 0; row < height; row++) {
> -			u64 pte = ggtt->pt_ops->pte_encode_bo(bo, src_idx * XE_PAGE_SIZE,
> +			u64 pte = ggtt->pt_ops->pte_encode_bo(bo, mul_u32_u32(src_idx, XE_PAGE_SIZE),
>  							      xe->pat.idx[XE_CACHE_NONE]);
>
>  			iosys_map_wr(map, *dpt_ofs, u64, pte);
> @@ -61,8 +61,8 @@ write_dpt_remapped(struct xe_bo *bo, struct iosys_map *map, u32 *dpt_ofs,
>
>  		for (column = 0; column < width; column++) {
>  			iosys_map_wr(map, *dpt_ofs, u64,
> -				     pte_encode_bo(bo, src_idx * XE_PAGE_SIZE,
> -				     xe->pat.idx[XE_CACHE_NONE]));
> +				     pte_encode_bo(bo, mul_u32_u32(src_idx, XE_PAGE_SIZE),
> +						   xe->pat.idx[XE_CACHE_NONE]));
>
>  			*dpt_ofs += 8;
>  			src_idx++;
> @@ -121,7 +121,7 @@ static int __xe_pin_fb_vma_dpt(struct intel_framebuffer *fb,
>  		u32 x;
>
>  		for (x = 0; x < size / XE_PAGE_SIZE; x++) {
> -			u64 pte = ggtt->pt_ops->pte_encode_bo(bo, x * XE_PAGE_SIZE,
> +			u64 pte = ggtt->pt_ops->pte_encode_bo(bo, mul_u32_u32(x, XE_PAGE_SIZE),
>  							      xe->pat.idx[XE_CACHE_NONE]);
>
>  			iosys_map_wr(&dpt->vmap, x * 8, u64, pte);
> @@ -167,7 +167,7 @@ write_ggtt_rotated(struct xe_bo *bo, struct xe_ggtt *ggtt, u32 *ggtt_ofs, u32 bo
>  		u32 src_idx = src_stride * (height - 1) + column + bo_ofs;
>
>  		for (row = 0; row < height; row++) {
> -			u64 pte = ggtt->pt_ops->pte_encode_bo(bo, src_idx * XE_PAGE_SIZE,
> +			u64 pte = ggtt->pt_ops->pte_encode_bo(bo, mul_u32_u32(src_idx, XE_PAGE_SIZE),
>  							      xe->pat.idx[XE_CACHE_NONE]);
>
>  			xe_ggtt_set_pte(ggtt, *ggtt_ofs, pte);
> --
> 2.25.1
>
diff --git a/drivers/gpu/drm/xe/display/xe_fb_pin.c b/drivers/gpu/drm/xe/display/xe_fb_pin.c
index 3a584bc3a0a3..c73054c09ae9 100644
--- a/drivers/gpu/drm/xe/display/xe_fb_pin.c
+++ b/drivers/gpu/drm/xe/display/xe_fb_pin.c
@@ -29,7 +29,7 @@ write_dpt_rotated(struct xe_bo *bo, struct iosys_map *map, u32 *dpt_ofs, u32 bo_
 		u32 src_idx = src_stride * (height - 1) + column + bo_ofs;
 
 		for (row = 0; row < height; row++) {
-			u64 pte = ggtt->pt_ops->pte_encode_bo(bo, src_idx * XE_PAGE_SIZE,
+			u64 pte = ggtt->pt_ops->pte_encode_bo(bo, mul_u32_u32(src_idx, XE_PAGE_SIZE),
 							      xe->pat.idx[XE_CACHE_NONE]);
 
 			iosys_map_wr(map, *dpt_ofs, u64, pte);
@@ -61,8 +61,8 @@ write_dpt_remapped(struct xe_bo *bo, struct iosys_map *map, u32 *dpt_ofs,
 
 		for (column = 0; column < width; column++) {
 			iosys_map_wr(map, *dpt_ofs, u64,
-				     pte_encode_bo(bo, src_idx * XE_PAGE_SIZE,
-				     xe->pat.idx[XE_CACHE_NONE]));
+				     pte_encode_bo(bo, mul_u32_u32(src_idx, XE_PAGE_SIZE),
+						   xe->pat.idx[XE_CACHE_NONE]));
 
 			*dpt_ofs += 8;
 			src_idx++;
@@ -121,7 +121,7 @@ static int __xe_pin_fb_vma_dpt(struct intel_framebuffer *fb,
 		u32 x;
 
 		for (x = 0; x < size / XE_PAGE_SIZE; x++) {
-			u64 pte = ggtt->pt_ops->pte_encode_bo(bo, x * XE_PAGE_SIZE,
+			u64 pte = ggtt->pt_ops->pte_encode_bo(bo, mul_u32_u32(x, XE_PAGE_SIZE),
 							      xe->pat.idx[XE_CACHE_NONE]);
 
 			iosys_map_wr(&dpt->vmap, x * 8, u64, pte);
@@ -167,7 +167,7 @@ write_ggtt_rotated(struct xe_bo *bo, struct xe_ggtt *ggtt, u32 *ggtt_ofs, u32 bo
 		u32 src_idx = src_stride * (height - 1) + column + bo_ofs;
 
 		for (row = 0; row < height; row++) {
-			u64 pte = ggtt->pt_ops->pte_encode_bo(bo, src_idx * XE_PAGE_SIZE,
+			u64 pte = ggtt->pt_ops->pte_encode_bo(bo, mul_u32_u32(src_idx, XE_PAGE_SIZE),
 							      xe->pat.idx[XE_CACHE_NONE]);
 
 			xe_ggtt_set_pte(ggtt, *ggtt_ofs, pte);
Use mul_u32_u32 to avoid potential overflow in multiplying two u32 and
store the u64 result.

v2: remove u64 typecast and use mul_u32_u32 (Ville)
v3: Reframe the commit message <Rodrigo>

Signed-off-by: Arun R Murthy <arun.r.murthy@intel.com>
---
 drivers/gpu/drm/xe/display/xe_fb_pin.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)