Message ID | 20180604193509.29153-1-lyude@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Am 04.06.2018 um 21:35 schrieb Lyude Paul: > So, unfortunately I recently made the discovery that in the upstream > kernel, the only reason that amdgpu is not currently suffering from > issues with runtime PM putting the GPU into suspend while it's driving > displays is due to the fact that on most prime systems, we have sound > devices associated with the GPU that hold their own runtime PM ref for > the GPU. > > What this means however, is that in the event that there isn't any kind > of sound device active (which can easily be reproduced by building a > kernel with sound drivers disabled), the GPU will fall asleep even when > there's displays active. This appears to be in part due to the fact that > amdgpu has not actually ever relied on it's rpm_idle() function to be > the only thing keeping it running, and normally grabs it's own power > references whenever there are displays active (as can be seen with the > original pre-DC codepath in amdgpu_display_crtc_set_config() in > amdgpu_display.c). This means it's very likely that this bug was > introduced during the switch over the DC. > > So to fix this, we start grabbing runtime PM references every time we > enable a previously disabled CRTC in atomic_commit_tail(). This appears > to be the correct solution, as it matches up with what i915 does in > i915/intel_runtime_pm.c. > > The one sideaffect of this is that we ignore the variable that the > pre-DC code used to use for tracking when it needed runtime PM refs, > adev->have_disp_power_ref. This is mainly because there's no way for a > driver to tell whether or not all of it's CRTCs are enabled or disabled > when we've begun committing an atomic state, as there may be CRTC > commits happening in parallel that aren't contained within the atomic > state being committed. So, it's safer to just get/put a reference for > each CRTC being enabled or disabled in the new atomic state. > > Signed-off-by: Lyude Paul <lyude@redhat.com> The final decision is with Harray, but at least the explanation makes perfect sense to me. Acked-by: Christian König <christian.koenig@amd.com>. Harry do you want to pick that up and push it into our internal branch while Alex is on vacation or should I do that? Regards, Christian. > --- > As a note, I'm not entirely happy with this fix and I wouldn't be > surprised if I missed something while looking through amdgpu. So, please > don't hesistate to suggest a better fix :). > > drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > index 1dd1142246c2..361b81ef6997 100644 > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > @@ -46,6 +46,7 @@ > #include <linux/moduleparam.h> > #include <linux/version.h> > #include <linux/types.h> > +#include <linux/pm_runtime.h> > > #include <drm/drmP.h> > #include <drm/drm_atomic.h> > @@ -4211,6 +4212,8 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state) > if (dm_old_crtc_state->stream) > remove_stream(adev, acrtc, dm_old_crtc_state->stream); > > + pm_runtime_get_noresume(dev->dev); > + > acrtc->enabled = true; > acrtc->hw_mode = new_crtc_state->mode; > crtc->hwmode = new_crtc_state->mode; > @@ -4396,6 +4399,16 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state) > drm_atomic_helper_wait_for_flip_done(dev, state); > > drm_atomic_helper_cleanup_planes(dev, state); > + > + /* Finally, drop a runtime PM reference for each newly disabled CRTC, > + * so we can put the GPU into runtime suspend if we're not driving any > + * displays anymore > + */ > + pm_runtime_mark_last_busy(dev->dev); > + for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) { > + if (old_crtc_state->active && !new_crtc_state->active) > + pm_runtime_put_autosuspend(dev->dev); > + } > } > >
On 2018-06-04 03:35 PM, Lyude Paul wrote: > So, unfortunately I recently made the discovery that in the upstream > kernel, the only reason that amdgpu is not currently suffering from > issues with runtime PM putting the GPU into suspend while it's driving > displays is due to the fact that on most prime systems, we have sound > devices associated with the GPU that hold their own runtime PM ref for > the GPU. > > What this means however, is that in the event that there isn't any kind > of sound device active (which can easily be reproduced by building a > kernel with sound drivers disabled), the GPU will fall asleep even when > there's displays active. This appears to be in part due to the fact that > amdgpu has not actually ever relied on it's rpm_idle() function to be > the only thing keeping it running, and normally grabs it's own power > references whenever there are displays active (as can be seen with the > original pre-DC codepath in amdgpu_display_crtc_set_config() in > amdgpu_display.c). This means it's very likely that this bug was > introduced during the switch over the DC. > > So to fix this, we start grabbing runtime PM references every time we > enable a previously disabled CRTC in atomic_commit_tail(). This appears > to be the correct solution, as it matches up with what i915 does in > i915/intel_runtime_pm.c. > > The one sideaffect of this is that we ignore the variable that the > pre-DC code used to use for tracking when it needed runtime PM refs, > adev->have_disp_power_ref. This is mainly because there's no way for a > driver to tell whether or not all of it's CRTCs are enabled or disabled > when we've begun committing an atomic state, as there may be CRTC > commits happening in parallel that aren't contained within the atomic > state being committed. So, it's safer to just get/put a reference for > each CRTC being enabled or disabled in the new atomic state. > > Signed-off-by: Lyude Paul <lyude@redhat.com> I'm not familiar with the runtime_pm stuff, as is painfully obvious from the fact that we missed that with the DC driver. That said, from a cursory look at runtime_pm.txt, this looks right. Reviewed-by: Harry Wentland <harry.wentland@amd.com> I'll pull this in through the amd-stg tree. > --- > As a note, I'm not entirely happy with this fix and I wouldn't be > surprised if I missed something while looking through amdgpu. So, please > don't hesistate to suggest a better fix :). I still don't really like amdgpu_dm_atomic_commit_tail and related functions. We have plans to rework these and make them more straight-forward. Harry > > drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > index 1dd1142246c2..361b81ef6997 100644 > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > @@ -46,6 +46,7 @@ > #include <linux/moduleparam.h> > #include <linux/version.h> > #include <linux/types.h> > +#include <linux/pm_runtime.h> > > #include <drm/drmP.h> > #include <drm/drm_atomic.h> > @@ -4211,6 +4212,8 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state) > if (dm_old_crtc_state->stream) > remove_stream(adev, acrtc, dm_old_crtc_state->stream); > > + pm_runtime_get_noresume(dev->dev); > + > acrtc->enabled = true; > acrtc->hw_mode = new_crtc_state->mode; > crtc->hwmode = new_crtc_state->mode; > @@ -4396,6 +4399,16 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state) > drm_atomic_helper_wait_for_flip_done(dev, state); > > drm_atomic_helper_cleanup_planes(dev, state); > + > + /* Finally, drop a runtime PM reference for each newly disabled CRTC, > + * so we can put the GPU into runtime suspend if we're not driving any > + * displays anymore > + */ > + pm_runtime_mark_last_busy(dev->dev); > + for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) { > + if (old_crtc_state->active && !new_crtc_state->active) > + pm_runtime_put_autosuspend(dev->dev); > + } > } > > >
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 1dd1142246c2..361b81ef6997 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -46,6 +46,7 @@ #include <linux/moduleparam.h> #include <linux/version.h> #include <linux/types.h> +#include <linux/pm_runtime.h> #include <drm/drmP.h> #include <drm/drm_atomic.h> @@ -4211,6 +4212,8 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state) if (dm_old_crtc_state->stream) remove_stream(adev, acrtc, dm_old_crtc_state->stream); + pm_runtime_get_noresume(dev->dev); + acrtc->enabled = true; acrtc->hw_mode = new_crtc_state->mode; crtc->hwmode = new_crtc_state->mode; @@ -4396,6 +4399,16 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state) drm_atomic_helper_wait_for_flip_done(dev, state); drm_atomic_helper_cleanup_planes(dev, state); + + /* Finally, drop a runtime PM reference for each newly disabled CRTC, + * so we can put the GPU into runtime suspend if we're not driving any + * displays anymore + */ + pm_runtime_mark_last_busy(dev->dev); + for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) { + if (old_crtc_state->active && !new_crtc_state->active) + pm_runtime_put_autosuspend(dev->dev); + } }
So, unfortunately I recently made the discovery that in the upstream kernel, the only reason that amdgpu is not currently suffering from issues with runtime PM putting the GPU into suspend while it's driving displays is due to the fact that on most prime systems, we have sound devices associated with the GPU that hold their own runtime PM ref for the GPU. What this means however, is that in the event that there isn't any kind of sound device active (which can easily be reproduced by building a kernel with sound drivers disabled), the GPU will fall asleep even when there's displays active. This appears to be in part due to the fact that amdgpu has not actually ever relied on it's rpm_idle() function to be the only thing keeping it running, and normally grabs it's own power references whenever there are displays active (as can be seen with the original pre-DC codepath in amdgpu_display_crtc_set_config() in amdgpu_display.c). This means it's very likely that this bug was introduced during the switch over the DC. So to fix this, we start grabbing runtime PM references every time we enable a previously disabled CRTC in atomic_commit_tail(). This appears to be the correct solution, as it matches up with what i915 does in i915/intel_runtime_pm.c. The one sideaffect of this is that we ignore the variable that the pre-DC code used to use for tracking when it needed runtime PM refs, adev->have_disp_power_ref. This is mainly because there's no way for a driver to tell whether or not all of it's CRTCs are enabled or disabled when we've begun committing an atomic state, as there may be CRTC commits happening in parallel that aren't contained within the atomic state being committed. So, it's safer to just get/put a reference for each CRTC being enabled or disabled in the new atomic state. Signed-off-by: Lyude Paul <lyude@redhat.com> --- As a note, I'm not entirely happy with this fix and I wouldn't be surprised if I missed something while looking through amdgpu. So, please don't hesistate to suggest a better fix :). drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)