Message ID | 20200911145927.401322-1-Rodrigo.Siqueira@amd.com (mailing list archive) |
---|---|
Headers | show |
Series | Enlarge tracepoints in the display component | expand |
On Fri, Sep 11, 2020 at 10:59:23AM -0400, Rodrigo Siqueira wrote: > Debug issues related to display can be a challenge due to the complexity > around this topic and different source of information might help in this > process. We already have support for tracepoints inside the display > component, i.e., we have the basic functionalities available and we just > need to expand it in order to make it more valuable for debugging. For > this reason, this patchset reworks part of the current tracepoint > options and add different sets of tracing inside amdgpu_dm, display > core, and DCN10. The first patch of this series just rework part of the > current tracepoints and the last set of patches introduces new > tracepoints. > > This first patchset version is functional. Please, let me know what I > can improve in the current version but also let me know what kind of > tracepoint I can add for the next version. > > Finally, I want to highlight that this work is based on a set of patches > originally made by Nicholas Kazlauskas. > > Change in V2: > - I added another patch for capturing the clock state for different display > architecture. Hm I'm not super sure tracepoints for state dumping are the right thing here. We kinda have the atomic state dumping code with all the various callbacks, and you can extend that pretty easily. Gives you full state dump in debugfs, plus a few function to dump into dmesg. Maybe what we need is a function to dump this also into printk tracepoint (otoh with Sean Paul's tracepoint work we'd get that through the dmesg stuff already), and then you could do it there? Upside is that for customers they'd get a much more consistent way to debug display issues across different drivers. For low-level hw debug what we do is give the hw guys an mmio trace, and they replay it on the fancy boxes :-) So for that I think this here is again too high level, but maybe what you have is a bit different. -Daniel > > Rodrigo Siqueira (4): > drm/amd/display: Rework registers tracepoint > drm/amd/display: Add tracepoint for amdgpu_dm > drm/amd/display: Add pipe_state tracepoint > drm/amd/display: Add tracepoint for capturing clocks state > > .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 17 + > .../amd/display/amdgpu_dm/amdgpu_dm_trace.h | 712 +++++++++++++++++- > .../dc/clk_mgr/dce112/dce112_clk_mgr.c | 5 + > .../display/dc/clk_mgr/dcn10/rv1_clk_mgr.c | 4 + > .../display/dc/clk_mgr/dcn20/dcn20_clk_mgr.c | 4 + > .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 4 + > .../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c | 4 + > drivers/gpu/drm/amd/display/dc/core/dc.c | 11 + > .../gpu/drm/amd/display/dc/dce/dce_clk_mgr.c | 5 + > .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 17 +- > 10 files changed, 747 insertions(+), 36 deletions(-) > > -- > 2.28.0 >
On 2020-09-16 5:12 a.m., Daniel Vetter wrote: > On Fri, Sep 11, 2020 at 10:59:23AM -0400, Rodrigo Siqueira wrote: >> Debug issues related to display can be a challenge due to the complexity >> around this topic and different source of information might help in this >> process. We already have support for tracepoints inside the display >> component, i.e., we have the basic functionalities available and we just >> need to expand it in order to make it more valuable for debugging. For >> this reason, this patchset reworks part of the current tracepoint >> options and add different sets of tracing inside amdgpu_dm, display >> core, and DCN10. The first patch of this series just rework part of the >> current tracepoints and the last set of patches introduces new >> tracepoints. >> >> This first patchset version is functional. Please, let me know what I >> can improve in the current version but also let me know what kind of >> tracepoint I can add for the next version. >> >> Finally, I want to highlight that this work is based on a set of patches >> originally made by Nicholas Kazlauskas. >> >> Change in V2: >> - I added another patch for capturing the clock state for different display >> architecture. > > Hm I'm not super sure tracepoints for state dumping are the right thing > here. We kinda have the atomic state dumping code with all the various > callbacks, and you can extend that pretty easily. Gives you full state > dump in debugfs, plus a few function to dump into dmesg. > > Maybe what we need is a function to dump this also into printk tracepoint > (otoh with Sean Paul's tracepoint work we'd get that through the dmesg > stuff already), and then you could do it there? > > Upside is that for customers they'd get a much more consistent way to > debug display issues across different drivers. > > For low-level hw debug what we do is give the hw guys an mmio trace, and > they replay it on the fancy boxes :-) So for that I think this here is > again too high level, but maybe what you have is a bit different. > -Daniel We have raw register traces, but what I find most useful is to be able to see are the incoming DRM IOCTLs, objects and properties per commit. Many of the bugs we see in display code is in the conversion from DRM -> DM -> DC state. The current HW state is kind of useless in most cases, but the sequence helps track down intermittent problems and understand state transitions. Tracepoints provide everything I really need to be able to track down these problems without falling back to a full debugger. The existing DRM prints (even at high logging levels) aren't enough to understand what's going on in most cases in our driver so funneling those into tracepoints to improve perf doesn't really help that much. I think this kind of idea was rejected for DRM core last year with Sean's patch series but if we can't get them into core then I'd like to get them into our driver at least. These are a cleaned up version of Sean's work + my work that I end up applying locally whenever I debug something. Regards, Nicholas Kazlauskas > >> >> Rodrigo Siqueira (4): >> drm/amd/display: Rework registers tracepoint >> drm/amd/display: Add tracepoint for amdgpu_dm >> drm/amd/display: Add pipe_state tracepoint >> drm/amd/display: Add tracepoint for capturing clocks state >> >> .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 17 + >> .../amd/display/amdgpu_dm/amdgpu_dm_trace.h | 712 +++++++++++++++++- >> .../dc/clk_mgr/dce112/dce112_clk_mgr.c | 5 + >> .../display/dc/clk_mgr/dcn10/rv1_clk_mgr.c | 4 + >> .../display/dc/clk_mgr/dcn20/dcn20_clk_mgr.c | 4 + >> .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 4 + >> .../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c | 4 + >> drivers/gpu/drm/amd/display/dc/core/dc.c | 11 + >> .../gpu/drm/amd/display/dc/dce/dce_clk_mgr.c | 5 + >> .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 17 +- >> 10 files changed, 747 insertions(+), 36 deletions(-) >> >> -- >> 2.28.0 >> >
On Wed, Sep 16, 2020 at 11:27:27AM -0400, Kazlauskas, Nicholas wrote: > On 2020-09-16 5:12 a.m., Daniel Vetter wrote: > > On Fri, Sep 11, 2020 at 10:59:23AM -0400, Rodrigo Siqueira wrote: > > > Debug issues related to display can be a challenge due to the complexity > > > around this topic and different source of information might help in this > > > process. We already have support for tracepoints inside the display > > > component, i.e., we have the basic functionalities available and we just > > > need to expand it in order to make it more valuable for debugging. For > > > this reason, this patchset reworks part of the current tracepoint > > > options and add different sets of tracing inside amdgpu_dm, display > > > core, and DCN10. The first patch of this series just rework part of the > > > current tracepoints and the last set of patches introduces new > > > tracepoints. > > > > > > This first patchset version is functional. Please, let me know what I > > > can improve in the current version but also let me know what kind of > > > tracepoint I can add for the next version. > > > > > > Finally, I want to highlight that this work is based on a set of patches > > > originally made by Nicholas Kazlauskas. > > > > > > Change in V2: > > > - I added another patch for capturing the clock state for different display > > > architecture. > > > > Hm I'm not super sure tracepoints for state dumping are the right thing > > here. We kinda have the atomic state dumping code with all the various > > callbacks, and you can extend that pretty easily. Gives you full state > > dump in debugfs, plus a few function to dump into dmesg. > > > > Maybe what we need is a function to dump this also into printk tracepoint > > (otoh with Sean Paul's tracepoint work we'd get that through the dmesg > > stuff already), and then you could do it there? > > > > Upside is that for customers they'd get a much more consistent way to > > debug display issues across different drivers. > > > > For low-level hw debug what we do is give the hw guys an mmio trace, and > > they replay it on the fancy boxes :-) So for that I think this here is > > again too high level, but maybe what you have is a bit different. > > -Daniel > > We have raw register traces, but what I find most useful is to be able to > see are the incoming DRM IOCTLs, objects and properties per commit. > > Many of the bugs we see in display code is in the conversion from DRM -> DM > -> DC state. The current HW state is kind of useless in most cases, but the > sequence helps track down intermittent problems and understand state > transitions. > > Tracepoints provide everything I really need to be able to track down these > problems without falling back to a full debugger. The existing DRM prints > (even at high logging levels) aren't enough to understand what's going on in > most cases in our driver so funneling those into tracepoints to improve perf > doesn't really help that much. > > I think this kind of idea was rejected for DRM core last year with Sean's > patch series but if we can't get them into core then I'd like to get them > into our driver at least. These are a cleaned up version of Sean's work + my > work that I end up applying locally whenever I debug something. Nah, Sean's series wasn't rejected. It's simply stuck waiting for review. So if your goal is to get better dumping going on, I think combining this with Sean's work (and getting that reviewed), plus then tapping into the atomic state dumping code. Then you know what was requested, plus what your atomic_check code computed should be the hw state, and you can compare that with the register dumps you already grab. Feels at least like a more complete and flexible solution than ad-hoc tracepoints for debuggin in each driver. The idea behind Sean's work is also that we'd have a blackbox recorder for any drm issues which distros in the field could use. So driver doing their own debug output doesn't sound super great. I think Siqueira already chatted a bit with Sean. -Daniel > > Regards, > Nicholas Kazlauskas > > > > > > > > > Rodrigo Siqueira (4): > > > drm/amd/display: Rework registers tracepoint > > > drm/amd/display: Add tracepoint for amdgpu_dm > > > drm/amd/display: Add pipe_state tracepoint > > > drm/amd/display: Add tracepoint for capturing clocks state > > > > > > .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 17 + > > > .../amd/display/amdgpu_dm/amdgpu_dm_trace.h | 712 +++++++++++++++++- > > > .../dc/clk_mgr/dce112/dce112_clk_mgr.c | 5 + > > > .../display/dc/clk_mgr/dcn10/rv1_clk_mgr.c | 4 + > > > .../display/dc/clk_mgr/dcn20/dcn20_clk_mgr.c | 4 + > > > .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 4 + > > > .../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c | 4 + > > > drivers/gpu/drm/amd/display/dc/core/dc.c | 11 + > > > .../gpu/drm/amd/display/dc/dce/dce_clk_mgr.c | 5 + > > > .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 17 +- > > > 10 files changed, 747 insertions(+), 36 deletions(-) > > > > > > -- > > > 2.28.0 > > > > > > > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel