Message ID | 20240315212104.776936-1-lyude@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | drm/nouveau/dp: Fix incorrect return code in r535_dp_aux_xfer() |
Reviewed-by: Dave Airlie <airlied@redhat.com>

On Sat, Mar 16, 2024 at 7:21 AM Lyude Paul <lyude@redhat.com> wrote:
>
> I've recently been seeing some unexplained GSP errors on my RTX 6000 from
> failed aux transactions:
>
>   [  132.915867] nouveau 0000:1f:00.0: gsp: cli:0xc1d00002 obj:0x00730000
>   ctrl cmd:0x00731341 failed: 0x0000ffff
>
> While the cause of these is not yet clear, these messages made me notice
> that the aux transactions causing these transactions were succeeding - not
> failing. As it turns out, this is because we're currently not returning the
> correct variable when r535_dp_aux_xfer() hits an error - causing us to
> never propagate GSP errors for failed aux transactions to userspace.
>
> So, let's fix that.
>
> Fixes: 4ae3a20102b2 ("nouveau/gsp: don't free ctrl messages on errors")
> Signed-off-by: Lyude Paul <lyude@redhat.com>
> ---
>  drivers/gpu/drm/nouveau/nvkm/engine/disp/r535.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/r535.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/r535.c
> index 6a0a4d3b8902d..027867c2a8c5b 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/r535.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/r535.c
> @@ -1080,7 +1080,7 @@ r535_dp_aux_xfer(struct nvkm_outp *outp, u8 type, u32 addr, u8 *data, u8 *psize)
>         ret = nvkm_gsp_rm_ctrl_push(&disp->rm.objcom, &ctrl, sizeof(*ctrl));
>         if (ret) {
>                 nvkm_gsp_rm_ctrl_done(&disp->rm.objcom, ctrl);
> -               return PTR_ERR(ctrl);
> +               return ret;
>         }
>
>         memcpy(data, ctrl->data, size);
> --
> 2.43.0
>
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/r535.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/r535.c
index 6a0a4d3b8902d..027867c2a8c5b 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/r535.c
@@ -1080,7 +1080,7 @@ r535_dp_aux_xfer(struct nvkm_outp *outp, u8 type, u32 addr, u8 *data, u8 *psize)
 	ret = nvkm_gsp_rm_ctrl_push(&disp->rm.objcom, &ctrl, sizeof(*ctrl));
 	if (ret) {
 		nvkm_gsp_rm_ctrl_done(&disp->rm.objcom, ctrl);
-		return PTR_ERR(ctrl);
+		return ret;
 	}
 
 	memcpy(data, ctrl->data, size);
I've recently been seeing some unexplained GSP errors on my RTX 6000 from
failed aux transactions:

  [  132.915867] nouveau 0000:1f:00.0: gsp: cli:0xc1d00002 obj:0x00730000
  ctrl cmd:0x00731341 failed: 0x0000ffff

While the cause of these is not yet clear, these messages made me notice
that the aux transactions causing these transactions were succeeding - not
failing. As it turns out, this is because we're currently not returning the
correct variable when r535_dp_aux_xfer() hits an error - causing us to
never propagate GSP errors for failed aux transactions to userspace.

So, let's fix that.

Fixes: 4ae3a20102b2 ("nouveau/gsp: don't free ctrl messages on errors")
Signed-off-by: Lyude Paul <lyude@redhat.com>
---
 drivers/gpu/drm/nouveau/nvkm/engine/disp/r535.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
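
For readers unfamiliar with the error-pointer convention, the following is a rough userspace sketch of why returning PTR_ERR(ctrl) loses the error, not the driver code itself. It assumes that ctrl is still an ordinary, valid pointer when the push fails, as the nvkm_gsp_rm_ctrl_done() call in the error path suggests. Every name here (fake_ctrl, fake_ctrl_push, ptr_err, is_err_value, xfer_buggy, xfer_fixed) is a hypothetical stand-in, and the functions return long rather than int so that the sketch does not have to model truncation.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095	/* same bound the kernel uses for encoded error pointers */

/* Hypothetical control message; only here so we have something to point at. */
struct fake_ctrl { unsigned char data[16]; };
static struct fake_ctrl demo_msg;

/* Stand-in for PTR_ERR(): reinterpret the pointer bits as a signed long. */
static long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }

/* Stand-in for IS_ERR_VALUE(): true only for values in [-MAX_ERRNO, -1]. */
static int is_err_value(long v)
{
	return (unsigned long)v >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical push: reports failure via the return code, leaves *pctrl valid. */
static int fake_ctrl_push(struct fake_ctrl **pctrl, int fail)
{
	(void)pctrl;
	return fail ? -EIO : 0;
}

/* Pre-patch shape: the error lives in ret, but the pointer is what gets returned. */
static long xfer_buggy(struct fake_ctrl *ctrl, int fail)
{
	int ret = fake_ctrl_push(&ctrl, fail);

	if (ret)
		return ptr_err(ctrl);	/* an address, not an errno */
	return 0;
}

/* Post-patch shape: propagate the code that actually describes the failure. */
static long xfer_fixed(struct fake_ctrl *ctrl, int fail)
{
	int ret = fake_ctrl_push(&ctrl, fail);

	if (ret)
		return ret;
	return 0;
}

/* Small self-check contrasting the two shapes on a forced failure. */
static void check_demo(void)
{
	assert(xfer_fixed(&demo_msg, 1) == -EIO);
	assert(!is_err_value(xfer_buggy(&demo_msg, 1)));
}
```

In this sketch the buggy variant hands back the message's address, which falls outside the errno range, so a caller that tests for a negative errno has nothing meaningful to act on. The fixed variant returns -EIO unchanged.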