Message ID | 20220908040821.5786-1-alexander.deucher@amd.com (mailing list archive) |
---|---|
State | Handled Elsewhere |
Series | [1/2] drm/amdgpu: make sure to init common IP before gmc |
On 9/8/2022 9:38 AM, Alex Deucher wrote:
> Common is mainly golden register setting and HDP register
> remapping, it shouldn't allocate any GPU memory. Make sure
> common happens before gmc so that the HDP registers are
> remapped before gmc attempts to access them.
>
> This fixes the Unsupported Request error reported through
> AER during driver load. The error happens as a write happens
> to the remap offset before real remapping is done.
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373
>
> The error was unnoticed before and got visible because of the commit
> referenced below. This doesn't fix anything in the commit below, rather
> fixes the issue in amdgpu exposed by the commit. The reference is only
> to associate this commit with below one so that both go together.
>
> Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Series is:
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>

Thanks,
Lijo

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 899564ea8b4b..4da85ce9e3b1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2375,8 +2375,16 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
>  		}
>  		adev->ip_blocks[i].status.sw = true;
>
> -		/* need to do gmc hw init early so we can allocate gpu mem */
> -		if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GMC) {
> +		if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_COMMON) {
> +			/* need to do common hw init early so everything is set up for gmc */
> +			r = adev->ip_blocks[i].version->funcs->hw_init((void *)adev);
> +			if (r) {
> +				DRM_ERROR("hw_init %d failed %d\n", i, r);
> +				goto init_failed;
> +			}
> +			adev->ip_blocks[i].status.hw = true;
> +		} else if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GMC) {
> +			/* need to do gmc hw init early so we can allocate gpu mem */
>  			/* Try to reserve bad pages early */
>  			if (amdgpu_sriov_vf(adev))
>  				amdgpu_virt_exchange_data(adev);
> @@ -3062,8 +3070,8 @@ static int amdgpu_device_ip_reinit_early_sriov(struct amdgpu_device *adev)
>  	int i, r;
>
>  	static enum amd_ip_block_type ip_order[] = {
> -		AMD_IP_BLOCK_TYPE_GMC,
>  		AMD_IP_BLOCK_TYPE_COMMON,
> +		AMD_IP_BLOCK_TYPE_GMC,
>  		AMD_IP_BLOCK_TYPE_PSP,
>  		AMD_IP_BLOCK_TYPE_IH,
>  	};
>
On Thu, Sep 8, 2022 at 1:11 AM Lazar, Lijo <lijo.lazar@amd.com> wrote:
>
>
>
> On 9/8/2022 9:38 AM, Alex Deucher wrote:
> > Common is mainly golden register setting and HDP register
> > remapping, it shouldn't allocate any GPU memory. Make sure
> > common happens before gmc so that the HDP registers are
> > remapped before gmc attempts to access them.
> >
> > This fixes the Unsupported Request error reported through
> > AER during driver load. The error happens as a write happens
> > to the remap offset before real remapping is done.
> >
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373
> >
> > The error was unnoticed before and got visible because of the commit
> > referenced below. This doesn't fix anything in the commit below, rather
> > fixes the issue in amdgpu exposed by the commit. The reference is only
> > to associate this commit with below one so that both go together.
> >
> > Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
> >
> > Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>
> Series is:
>         Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>

@tseewald@gmail.com it would be good if you could verify that this
patch fixes the issue for you as well.

Thanks,

Alex

>
> Thanks,
> Lijo
>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 +++++++++++---
> >  1 file changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index 899564ea8b4b..4da85ce9e3b1 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -2375,8 +2375,16 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
> >  		}
> >  		adev->ip_blocks[i].status.sw = true;
> >
> > -		/* need to do gmc hw init early so we can allocate gpu mem */
> > -		if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GMC) {
> > +		if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_COMMON) {
> > +			/* need to do common hw init early so everything is set up for gmc */
> > +			r = adev->ip_blocks[i].version->funcs->hw_init((void *)adev);
> > +			if (r) {
> > +				DRM_ERROR("hw_init %d failed %d\n", i, r);
> > +				goto init_failed;
> > +			}
> > +			adev->ip_blocks[i].status.hw = true;
> > +		} else if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GMC) {
> > +			/* need to do gmc hw init early so we can allocate gpu mem */
> >  			/* Try to reserve bad pages early */
> >  			if (amdgpu_sriov_vf(adev))
> >  				amdgpu_virt_exchange_data(adev);
> > @@ -3062,8 +3070,8 @@ static int amdgpu_device_ip_reinit_early_sriov(struct amdgpu_device *adev)
> >  	int i, r;
> >
> >  	static enum amd_ip_block_type ip_order[] = {
> > -		AMD_IP_BLOCK_TYPE_GMC,
> >  		AMD_IP_BLOCK_TYPE_COMMON,
> > +		AMD_IP_BLOCK_TYPE_GMC,
> >  		AMD_IP_BLOCK_TYPE_PSP,
> >  		AMD_IP_BLOCK_TYPE_IH,
> >  	};
> >
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 899564ea8b4b..4da85ce9e3b1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2375,8 +2375,16 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
 		}
 		adev->ip_blocks[i].status.sw = true;
 
-		/* need to do gmc hw init early so we can allocate gpu mem */
-		if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GMC) {
+		if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_COMMON) {
+			/* need to do common hw init early so everything is set up for gmc */
+			r = adev->ip_blocks[i].version->funcs->hw_init((void *)adev);
+			if (r) {
+				DRM_ERROR("hw_init %d failed %d\n", i, r);
+				goto init_failed;
+			}
+			adev->ip_blocks[i].status.hw = true;
+		} else if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GMC) {
+			/* need to do gmc hw init early so we can allocate gpu mem */
 			/* Try to reserve bad pages early */
 			if (amdgpu_sriov_vf(adev))
 				amdgpu_virt_exchange_data(adev);
@@ -3062,8 +3070,8 @@ static int amdgpu_device_ip_reinit_early_sriov(struct amdgpu_device *adev)
 	int i, r;
 
 	static enum amd_ip_block_type ip_order[] = {
-		AMD_IP_BLOCK_TYPE_GMC,
 		AMD_IP_BLOCK_TYPE_COMMON,
+		AMD_IP_BLOCK_TYPE_GMC,
 		AMD_IP_BLOCK_TYPE_PSP,
 		AMD_IP_BLOCK_TYPE_IH,
 	};
Common is mainly golden register setting and HDP register
remapping, it shouldn't allocate any GPU memory. Make sure
common happens before gmc so that the HDP registers are
remapped before gmc attempts to access them.

This fixes the Unsupported Request error reported through
AER during driver load. The error happens as a write happens
to the remap offset before real remapping is done.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373

The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.

Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)
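
For readers outside the driver, the ordering issue described in the commit message can be modelled with a small standalone C sketch. This is not amdgpu code: `toy_dev`, `ip_block`, `toy_hw_init` and `toy_init` are hypothetical names invented for illustration, and the "bad write" counter only stands in for the Unsupported Request that AER reports. The sketch assumes a block list in which common comes before gmc, and compares an early pass that brings up only gmc with one that also brings up common.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's IP block types. */
enum ip_type { IP_COMMON, IP_GMC, IP_IH };

struct ip_block {
	enum ip_type type;
	bool hw_ready;
};

/* Toy device state: has the HDP register remap been set up yet? */
struct toy_dev {
	bool hdp_remapped;
	int bad_writes;	/* writes to the remap offset before remapping */
};

static void toy_hw_init(struct toy_dev *dev, struct ip_block *blk)
{
	if (blk->type == IP_COMMON)
		dev->hdp_remapped = true;	/* common performs the remap */
	else if (blk->type == IP_GMC && !dev->hdp_remapped)
		dev->bad_writes++;		/* gmc touched it too early */
	blk->hw_ready = true;
}

/*
 * Early pass brings up gmc (and common too, if common_early is set),
 * then a later pass brings up whatever is left. Returns the number
 * of premature writes observed.
 */
static int toy_init(struct toy_dev *dev, struct ip_block *blks, size_t n,
		    bool common_early)
{
	assert(dev && blks);

	for (size_t i = 0; i < n; i++) {
		bool early = blks[i].type == IP_GMC ||
			     (common_early && blks[i].type == IP_COMMON);
		if (early)
			toy_hw_init(dev, &blks[i]);
	}
	for (size_t i = 0; i < n; i++) {
		if (!blks[i].hw_ready)
			toy_hw_init(dev, &blks[i]);
	}
	return dev->bad_writes;
}
```

Calling `toy_init()` with `common_early` set to false on a list of common, gmc, ih counts one premature write, while setting it to true counts none, which is the behaviour change the patch is after in this simplified model.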