Message ID | 20190708135238.651483-1-arnd@arndb.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/amd/display: avoid 64-bit division | expand |
On 7/8/19 9:52 AM, Arnd Bergmann wrote: > On 32-bit architectures, dividing a 64-bit integer in the kernel > leads to a link error: > > ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! > ERROR: "__divdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! > > Change the two recently introduced instances to a multiply+shift > operation that is also much cheaper on 32-bit architectures. > We can do that here, since both of them are really 32-bit numbers > that change a few percent. > > Fixes: bedbbe6af4be ("drm/amd/display: Move link functions from dc to dc_link") > Fixes: f18bc4e53ad6 ("drm/amd/display: update calculated bounding box logic for NV") > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > --- > drivers/gpu/drm/amd/display/dc/core/dc_link.c | 4 ++-- > drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +- > 2 files changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c > index c17db5c144aa..8dbf759eba45 100644 > --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c > +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c > @@ -3072,8 +3072,8 @@ uint32_t dc_link_bandwidth_kbps( > * but the difference is minimal and is in a safe direction, > * which all works well around potential ambiguity of DP 1.4a spec. > */ > - long long fec_link_bw_kbps = link_bw_kbps * 970LL; > - link_bw_kbps = (uint32_t)(fec_link_bw_kbps / 1000LL); > + link_bw_kbps = mul_u64_u32_shr(BIT_ULL(32) * 970LL / 1000, > + link_bw_kbps, 32); > } > #endif > > diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c > index b35327bafbc5..70ac8a95d2db 100644 > --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c > +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c > @@ -2657,7 +2657,7 @@ static void update_bounding_box(struct dc *dc, struct _vcs_dpi_soc_bounding_box_ > calculated_states[i].dram_speed_mts = uclk_states[i] * 16 / 1000; > > // FCLK:UCLK ratio is 1.08 > - min_fclk_required_by_uclk = ((unsigned long long)uclk_states[i]) * 1080 / 1000000; > + min_fclk_required_by_uclk = mul_u64_u32_shr(BIT_ULL(32) * 1080 / 1000000, uclk_states[i], 32); Even though the mul + shift will be faster here, I would prefer that this just be a div_u64 for clarity. Nicholas Kazlauskas > > calculated_states[i].fabricclk_mhz = (min_fclk_required_by_uclk < min_dcfclk) ? > min_dcfclk : min_fclk_required_by_uclk; >
Acked-by: Slava Abramov <slava.abramov@amd.com> Tested-by: Slava Abramov <slava.abramov@amd.com>
Hi Arnd! Thanks for bisecting this issue. I wonder whether you are going to commit your patch or planning to update it and it's still in your work queue. We have one of our 32-bit builds failing because of this issue, so that I would like either to fix it or wait to your fix if it has chances to go upstream today. Sincerely, Slava Abramov
I'll just apply Arnd's patch. If the display team wants to adjust it later to clarify the operation, they should go ahead as a follow up patch. Thanks, Alex
On Tue, Jul 9, 2019 at 6:40 PM Deucher, Alexander <Alexander.Deucher@amd.com> wrote: > > I'll just apply Arnd's patch. If the display team wants to adjust it later to clarify the > operation, they should go ahead as a follow up patch. Thanks! > From: Abramov, Slava > Sent: Tuesday, July 9, 2019 12:31 PM > > Thanks for bisecting this issue. > > > > I wonder whether you are going to commit your patch or planning to update it and it's > > still in your work queue. We have one of our 32-bit builds failing because of this > > issue, so that I would like either to fix it or wait to your fix if it has chances to go > > upstream today. I was going to update the patch, but had not gotten to that yet. Arnd
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c index c17db5c144aa..8dbf759eba45 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c @@ -3072,8 +3072,8 @@ uint32_t dc_link_bandwidth_kbps( * but the difference is minimal and is in a safe direction, * which all works well around potential ambiguity of DP 1.4a spec. */ - long long fec_link_bw_kbps = link_bw_kbps * 970LL; - link_bw_kbps = (uint32_t)(fec_link_bw_kbps / 1000LL); + link_bw_kbps = mul_u64_u32_shr(BIT_ULL(32) * 970LL / 1000, + link_bw_kbps, 32); } #endif diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c index b35327bafbc5..70ac8a95d2db 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c @@ -2657,7 +2657,7 @@ static void update_bounding_box(struct dc *dc, struct _vcs_dpi_soc_bounding_box_ calculated_states[i].dram_speed_mts = uclk_states[i] * 16 / 1000; // FCLK:UCLK ratio is 1.08 - min_fclk_required_by_uclk = ((unsigned long long)uclk_states[i]) * 1080 / 1000000; + min_fclk_required_by_uclk = mul_u64_u32_shr(BIT_ULL(32) * 1080 / 1000000, uclk_states[i], 32); calculated_states[i].fabricclk_mhz = (min_fclk_required_by_uclk < min_dcfclk) ? min_dcfclk : min_fclk_required_by_uclk;
On 32-bit architectures, dividing a 64-bit integer in the kernel leads to a link error: ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! ERROR: "__divdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! Change the two recently introduced instances to a multiply+shift operation that is also much cheaper on 32-bit architectures. We can do that here, since both of them are really 32-bit numbers that change a few percent. Fixes: bedbbe6af4be ("drm/amd/display: Move link functions from dc to dc_link") Fixes: f18bc4e53ad6 ("drm/amd/display: update calculated bounding box logic for NV") Signed-off-by: Arnd Bergmann <arnd@arndb.de> --- drivers/gpu/drm/amd/display/dc/core/dc_link.c | 4 ++-- drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-)