Message ID | 20220913144832.2784012-2-alexander.deucher@amd.com |
---|---|
State | Handled Elsewhere |
Series | fix PCI AER issues |
On 9/13/2022 8:18 PM, Alex Deucher wrote:
> This mirrors what we do for other asics and this way we are
> sure the ih doorbell range is properly initialized.
>
> There is a comment about the way doorbells on gfx9 work that
> requires that they are initialized for other IPs before GFX
> is initialized. In this case IH is initialized before GFX,
> so there should be no issue.
>

Not sure about the association of patch 1 and 2 with AER as in the
comment below. I thought the access would go through (PCIE errors may
not be reported) and the only side effect is doorbell won't be hit/routed.

The comments may not be relevant to patches 1/2, apart from that -

Series is:
	Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>

Thanks,
Lijo

> This fixes the Unsupported Request error reported through
> AER during driver load. The error happens as a write happens
> to the remap offset before real remapping is done.
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373
>
> The error was unnoticed before and got visible because of the commit
> referenced below. This doesn't fix anything in the commit below, rather
> fixes the issue in amdgpu exposed by the commit. The reference is only
> to associate this commit with below one so that both go together.
>
> Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/soc15.c     | 3 ---
>   drivers/gpu/drm/amd/amdgpu/vega10_ih.c | 4 ++++
>   drivers/gpu/drm/amd/amdgpu/vega20_ih.c | 4 ++++
>   3 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
> index 5188da87428d..e6a4002fa67d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> @@ -1224,9 +1224,6 @@ static void soc15_doorbell_range_init(struct amdgpu_device *adev)
>   				ring->use_doorbell, ring->doorbell_index,
>   				adev->doorbell_index.sdma_doorbell_range);
>   		}
> -
> -		adev->nbio.funcs->ih_doorbell_range(adev, adev->irq.ih.use_doorbell,
> -						adev->irq.ih.doorbell_index);
>   	}
>   }
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> index 03b7066471f9..1e83db0c5438 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> @@ -289,6 +289,10 @@ static int vega10_ih_irq_init(struct amdgpu_device *adev)
>   		}
>   	}
>
> +	if (!amdgpu_sriov_vf(adev))
> +		adev->nbio.funcs->ih_doorbell_range(adev, adev->irq.ih.use_doorbell,
> +						    adev->irq.ih.doorbell_index);
> +
>   	pci_set_master(adev->pdev);
>
>   	/* enable interrupts */
> diff --git a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
> index 2022ffbb8dba..59dfca093155 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
> @@ -340,6 +340,10 @@ static int vega20_ih_irq_init(struct amdgpu_device *adev)
>   		}
>   	}
>
> +	if (!amdgpu_sriov_vf(adev))
> +		adev->nbio.funcs->ih_doorbell_range(adev, adev->irq.ih.use_doorbell,
> +						    adev->irq.ih.doorbell_index);
> +
>   	pci_set_master(adev->pdev);
>
>   	/* enable interrupts */
>
On Wed, Sep 14, 2022 at 3:05 AM Lazar, Lijo <lijo.lazar@amd.com> wrote:
>
>
>
> On 9/13/2022 8:18 PM, Alex Deucher wrote:
> > This mirrors what we do for other asics and this way we are
> > sure the ih doorbell range is properly initialized.
> >
> > There is a comment about the way doorbells on gfx9 work that
> > requires that they are initialized for other IPs before GFX
> > is initialized. In this case IH is initialized before GFX,
> > so there should be no issue.
> >
>
> Not sure about the association of patch 1 and 2 with AER as in the
> comment below. I thought the access would go through (PCIE errors may
> not be reported) and the only side effect is doorbell won't be hit/routed.
>
> The comments may not be relevant to patches 1/2, apart from that -

Patches 1 and 2 don't fix the actual issue, but they are prerequisites
for patch 3. Without patches 1 and 2, patch 3 won't work on all
cards. Seemed prudent to just mark all 3, but I could clarify that 1
and 2 are just prerequisites.

Thanks,

Alex

>
> Series is:
>         Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
>
> Thanks,
> Lijo
>
> > This fixes the Unsupported Request error reported through
> > AER during driver load. The error happens as a write happens
> > to the remap offset before real remapping is done.
> >
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373
> >
> > The error was unnoticed before and got visible because of the commit
> > referenced below. This doesn't fix anything in the commit below, rather
> > fixes the issue in amdgpu exposed by the commit. The reference is only
> > to associate this commit with below one so that both go together.
> >
> > Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
> >
> > Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/soc15.c     | 3 ---
> >   drivers/gpu/drm/amd/amdgpu/vega10_ih.c | 4 ++++
> >   drivers/gpu/drm/amd/amdgpu/vega20_ih.c | 4 ++++
> >   3 files changed, 8 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > index 5188da87428d..e6a4002fa67d 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > @@ -1224,9 +1224,6 @@ static void soc15_doorbell_range_init(struct amdgpu_device *adev)
> >   				ring->use_doorbell, ring->doorbell_index,
> >   				adev->doorbell_index.sdma_doorbell_range);
> >   		}
> > -
> > -		adev->nbio.funcs->ih_doorbell_range(adev, adev->irq.ih.use_doorbell,
> > -						adev->irq.ih.doorbell_index);
> >   	}
> >   }
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> > index 03b7066471f9..1e83db0c5438 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> > @@ -289,6 +289,10 @@ static int vega10_ih_irq_init(struct amdgpu_device *adev)
> >   		}
> >   	}
> >
> > +	if (!amdgpu_sriov_vf(adev))
> > +		adev->nbio.funcs->ih_doorbell_range(adev, adev->irq.ih.use_doorbell,
> > +						    adev->irq.ih.doorbell_index);
> > +
> >   	pci_set_master(adev->pdev);
> >
> >   	/* enable interrupts */
> > diff --git a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
> > index 2022ffbb8dba..59dfca093155 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
> > @@ -340,6 +340,10 @@ static int vega20_ih_irq_init(struct amdgpu_device *adev)
> >   		}
> >   	}
> >
> > +	if (!amdgpu_sriov_vf(adev))
> > +		adev->nbio.funcs->ih_doorbell_range(adev, adev->irq.ih.use_doorbell,
> > +						    adev->irq.ih.doorbell_index);
> > +
> >   	pci_set_master(adev->pdev);
> >
> >   	/* enable interrupts */
> >
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
index 5188da87428d..e6a4002fa67d 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -1224,9 +1224,6 @@ static void soc15_doorbell_range_init(struct amdgpu_device *adev)
 				ring->use_doorbell, ring->doorbell_index,
 				adev->doorbell_index.sdma_doorbell_range);
 		}
-
-		adev->nbio.funcs->ih_doorbell_range(adev, adev->irq.ih.use_doorbell,
-						adev->irq.ih.doorbell_index);
 	}
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
index 03b7066471f9..1e83db0c5438 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
@@ -289,6 +289,10 @@ static int vega10_ih_irq_init(struct amdgpu_device *adev)
 		}
 	}
 
+	if (!amdgpu_sriov_vf(adev))
+		adev->nbio.funcs->ih_doorbell_range(adev, adev->irq.ih.use_doorbell,
+						    adev->irq.ih.doorbell_index);
+
 	pci_set_master(adev->pdev);
 
 	/* enable interrupts */
diff --git a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
index 2022ffbb8dba..59dfca093155 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
@@ -340,6 +340,10 @@ static int vega20_ih_irq_init(struct amdgpu_device *adev)
 		}
 	}
 
+	if (!amdgpu_sriov_vf(adev))
+		adev->nbio.funcs->ih_doorbell_range(adev, adev->irq.ih.use_doorbell,
+						    adev->irq.ih.doorbell_index);
+
 	pci_set_master(adev->pdev);
 
 	/* enable interrupts */
This mirrors what we do for other asics and this way we are
sure the ih doorbell range is properly initialized.

There is a comment about the way doorbells on gfx9 work that
requires that they are initialized for other IPs before GFX
is initialized. In this case IH is initialized before GFX,
so there should be no issue.

This fixes the Unsupported Request error reported through
AER during driver load. The error happens as a write happens
to the remap offset before real remapping is done.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373

The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.

Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/soc15.c     | 3 ---
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c | 4 ++++
 drivers/gpu/drm/amd/amdgpu/vega20_ih.c | 4 ++++
 3 files changed, 8 insertions(+), 3 deletions(-)
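
For readers who want to see the ordering argument in isolation, here is a
minimal user-space sketch in C. It is a toy model and not driver code:
every toy_* name is hypothetical, the structs only stand in for the
amdgpu fields named in the patch, and "routed" is a simplification of
what the hardware does. It models only the side effect described in the
review (a doorbell write is not routed until the IH doorbell range has
been programmed) and does not attempt to model AER reporting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the IH doorbell range state held by NBIO (hypothetical). */
struct toy_doorbell_range {
	bool     programmed;	/* true once the range has been set up */
	uint32_t index;		/* doorbell index the range routes */
};

/* Toy stand-in for the few amdgpu_device fields the patch touches. */
struct toy_device {
	bool     is_sriov_vf;		/* models amdgpu_sriov_vf(adev) */
	bool     ih_use_doorbell;	/* models adev->irq.ih.use_doorbell */
	uint32_t ih_doorbell_index;	/* models adev->irq.ih.doorbell_index */
	struct toy_doorbell_range ih_range;
	unsigned int unrouted_writes;	/* doorbell writes that found no range */
};

/* Models the adev->nbio.funcs->ih_doorbell_range() callback. */
static void toy_ih_doorbell_range(struct toy_device *dev, bool use_doorbell,
				  uint32_t index)
{
	assert(dev);
	dev->ih_range.programmed = use_doorbell;
	dev->ih_range.index = index;
}

/*
 * Models the IH irq init path after the patch: the range is programmed
 * here, and only on bare metal (not for an SR-IOV virtual function).
 */
static void toy_ih_irq_init(struct toy_device *dev)
{
	if (!dev->is_sriov_vf)
		toy_ih_doorbell_range(dev, dev->ih_use_doorbell,
				      dev->ih_doorbell_index);
}

/* Models ringing the IH doorbell; returns true when the write is routed. */
static bool toy_ring_ih_doorbell(struct toy_device *dev)
{
	if (!dev->ih_range.programmed ||
	    dev->ih_range.index != dev->ih_doorbell_index) {
		dev->unrouted_writes++;
		return false;
	}
	return true;
}
```

In this model, calling toy_ring_ih_doorbell() before toy_ih_irq_init()
counts an unrouted write, and calling it afterwards succeeds on a
non-VF device. The point of the sketch is only that the range setup now
lives in the IH init path itself, so anything that runs after IH init
can rely on it.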