Message ID | 20230713213242.680944-5-andrealmeid@igalia.com (mailing list archive) |
---|---|
State | New, archived |
Series | drm/amdgpu: Add new reset option and rework coredump |
On 13.07.23 at 23:32, André Almeida wrote:
> If a kernel thread caused the reset, the information available to be
> logged will be limited, so return early in the dump function.

Why? The register values and vram lost state should still be valid.

Christian.

>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index e80670420586..07546781b8b8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4988,10 +4988,14 @@ static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset,
>  	drm_printf(&p, "kernel: " UTS_RELEASE "\n");
>  	drm_printf(&p, "module: " KBUILD_MODNAME "\n");
>  	drm_printf(&p, "time: %lld.%09ld\n", coredump->reset_time.tv_sec, coredump->reset_time.tv_nsec);
> -	if (coredump->reset_task_info.pid)
> +	if (coredump->reset_task_info.pid) {
>  		drm_printf(&p, "process_name: %s PID: %d\n",
>  			   coredump->reset_task_info.process_name,
>  			   coredump->reset_task_info.pid);
> +	} else {
> +		drm_printf(&p, "GPU reset caused by a kernel thread\n");
> +		return count - iter.remain;
> +	}
>
>  	if (coredump->reset_vram_lost)
>  		drm_printf(&p, "VRAM is lost due to GPU reset!\n");
On 14/07/2023 04:52, Christian König wrote:
>
>
> On 13.07.23 at 23:32, André Almeida wrote:
>> If a kernel thread caused the reset, the information available to be
>> logged will be limited, so return early in the dump function.
>
> Why? The register values and vram lost state should still be valid.
>

Fair enough, I was thinking about the newly added information, such as ring and job, that won't be around for this type of thread. I'll drop this patch for the next version.

> Christian.
>
>>
>> Signed-off-by: André Almeida <andrealmeid@igalia.com>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index e80670420586..07546781b8b8 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -4988,10 +4988,14 @@ static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset,
>>  	drm_printf(&p, "kernel: " UTS_RELEASE "\n");
>>  	drm_printf(&p, "module: " KBUILD_MODNAME "\n");
>>  	drm_printf(&p, "time: %lld.%09ld\n", coredump->reset_time.tv_sec, coredump->reset_time.tv_nsec);
>> -	if (coredump->reset_task_info.pid)
>> +	if (coredump->reset_task_info.pid) {
>>  		drm_printf(&p, "process_name: %s PID: %d\n",
>>  			   coredump->reset_task_info.process_name,
>>  			   coredump->reset_task_info.pid);
>> +	} else {
>> +		drm_printf(&p, "GPU reset caused by a kernel thread\n");
>> +		return count - iter.remain;
>> +	}
>>
>>  	if (coredump->reset_vram_lost)
>>  		drm_printf(&p, "VRAM is lost due to GPU reset!\n");
>
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index e80670420586..07546781b8b8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4988,10 +4988,14 @@ static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset,
 	drm_printf(&p, "kernel: " UTS_RELEASE "\n");
 	drm_printf(&p, "module: " KBUILD_MODNAME "\n");
 	drm_printf(&p, "time: %lld.%09ld\n", coredump->reset_time.tv_sec, coredump->reset_time.tv_nsec);
-	if (coredump->reset_task_info.pid)
+	if (coredump->reset_task_info.pid) {
 		drm_printf(&p, "process_name: %s PID: %d\n",
 			   coredump->reset_task_info.process_name,
 			   coredump->reset_task_info.pid);
+	} else {
+		drm_printf(&p, "GPU reset caused by a kernel thread\n");
+		return count - iter.remain;
+	}
 
 	if (coredump->reset_vram_lost)
 		drm_printf(&p, "VRAM is lost due to GPU reset!\n");
If a kernel thread caused the reset, the information available to be
logged will be limited, so return early in the dump function.

Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
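
For illustration only, the review point (that the VRAM-lost state stays valid even when no user process is attached to the reset) could be sketched in userspace C roughly as follows. This is not amdgpu code: `struct dump_info` and `format_dump()` are hypothetical stand-ins, and `snprintf` takes the place of `drm_printf`. The sketch notes the kernel-thread case and then keeps going instead of returning early.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the coredump fields discussed in the thread. */
struct dump_info {
	int pid;                  /* 0 when a kernel thread caused the reset */
	const char *process_name; /* only read when pid != 0 */
	bool vram_lost;
};

/* Advance the write position after snprintf(), clamping on truncation. */
static size_t advance(size_t pos, size_t len, int n)
{
	if (n < 0)
		return pos;
	return (size_t)n < len - pos ? pos + (size_t)n : len - 1;
}

/*
 * Format the dump into buf and return the number of bytes written,
 * excluding the terminating NUL. The kernel-thread case is reported,
 * but the function does not return early, so later state is still logged.
 */
size_t format_dump(const struct dump_info *info, char *buf, size_t len)
{
	size_t pos = 0;
	int n;

	assert(info && buf && len > 0);
	buf[0] = '\0';

	if (info->pid)
		n = snprintf(buf + pos, len - pos, "process_name: %s PID: %d\n",
			     info->process_name, info->pid);
	else
		n = snprintf(buf + pos, len - pos,
			     "GPU reset caused by a kernel thread\n");
	pos = advance(pos, len, n);

	/* Still valid regardless of who triggered the reset. */
	if (info->vram_lost) {
		n = snprintf(buf + pos, len - pos,
			     "VRAM is lost due to GPU reset!\n");
		pos = advance(pos, len, n);
	}

	return pos;
}
```

With `pid == 0` and `vram_lost == true`, the buffer would hold both the kernel-thread notice and the VRAM-lost line, which is the behaviour the early `return` in the patch would have suppressed.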