Message ID | 20240226-rpmhpd-enable-corner-fix-v1-1-68c004cec48c@quicinc.com (mailing list archive) |
---|---|
State | Handled Elsewhere, archived |
Series | pmdomain: qcom: rpmhpd: Fix enabled_corner aggregation |
On 2/27/24 02:49, Bjorn Andersson via B4 Relay wrote:
> From: Bjorn Andersson <quic_bjorande@quicinc.com>
>
> Commit 'e3e56c050ab6 ("soc: qcom: rpmhpd: Make power_on actually enable
> the domain")' aimed to make sure that a power-domain that is being
> enabled without any particular performance-state requested will at least
> turn the rail on, to avoid filling DeviceTree with otherwise unnecessary
> required-opps properties.
>
> But in the event that aggregation happens on a disabled power-domain, with
> an enabled peer without performance-state, both the local and peer
> corner are 0. The peer's enabled_corner is not considered, with the
> result that the underlying (shared) resource is disabled.
>
> One case where this can be observed is when the display stack keeps mmcx
> enabled (but without a particular performance-state vote) in order to
> access registers and sync_state happens in the rpmhpd driver. As mmcx_ao
> is flushed the state of the peer (mmcx) is not considered and mmcx_ao
> ends up turning off "mmcx.lvl" underneath mmcx. This has been observed
> several times, but has been painted over in DeviceTree by adding an
> explicit vote for the lowest non-disabled performance-state.
>
> Fixes: e3e56c050ab6 ("soc: qcom: rpmhpd: Make power_on actually enable the domain")
> Reported-by: Johan Hovold <johan@kernel.org>
> Closes: https://lore.kernel.org/linux-arm-msm/ZdMwZa98L23mu3u6@hovoldconsulting.com/
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
> ---
> This issue is the root cause of a display regression on SC8280XP boards,
> resulting in the system often resetting during boot. It was exposed by
> the refactoring of the DisplayPort driver in v6.8-rc1.
> ---

Very good find, thanks!

Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>

Konrad

On Tue, 27 Feb 2024 at 03:45, Bjorn Andersson via B4 Relay
<devnull+quic_bjorande.quicinc.com@kernel.org> wrote:
>
> From: Bjorn Andersson <quic_bjorande@quicinc.com>
>
> [...]
>
> ---
> drivers/pmdomain/qcom/rpmhpd.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)

Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Tested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

Probably once this lands we can drop explicit required-opps properties.

On 2/26/2024 5:49 PM, Bjorn Andersson via B4 Relay wrote:
> From: Bjorn Andersson <quic_bjorande@quicinc.com>
>
> [...]
>
> ---
> drivers/pmdomain/qcom/rpmhpd.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>

Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>

Quoting Bjorn Andersson via B4 Relay (2024-02-26 17:49:57)
> From: Bjorn Andersson <quic_bjorande@quicinc.com>
>
> [...]
>
> Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
> ---

Reviewed-by: Stephen Boyd <swboyd@chromium.org>

On Mon, Feb 26, 2024 at 05:49:57PM -0800, Bjorn Andersson via B4 Relay wrote:
> From: Bjorn Andersson <quic_bjorande@quicinc.com>
>
> [...]
>
> ---
> This issue is the root cause of a display regression on SC8280XP boards,
> resulting in the system often resetting during boot. It was exposed by
> the refactoring of the DisplayPort driver in v6.8-rc1.

This fixes the hard resets I've been seeing since rc1 when initialising
the display subsystem of the Lenovo ThinkPad X13s at boot.

With some instrumentation added I can see the resets coinciding with the
call to rpmhpd_aggregate_corner() for 'mx_ao':

Tested-by: Johan Hovold <johan+linaro@kernel.org>

On Tue, 27 Feb 2024 at 02:45, Bjorn Andersson via B4 Relay
<devnull+quic_bjorande.quicinc.com@kernel.org> wrote:
>
> From: Bjorn Andersson <quic_bjorande@quicinc.com>
>
> [...]
>
> Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>

Applied for fixes, thanks!

Kind regards
Uffe

> ---
> drivers/pmdomain/qcom/rpmhpd.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> [...]
>
> ---
> base-commit: b401b621758e46812da61fa58a67c3fd8d91de0d
> change-id: 20240226-rpmhpd-enable-corner-fix-c5e07fe7b986
>
> Best regards,
> --
> Bjorn Andersson <quic_bjorande@quicinc.com>
>

diff --git a/drivers/pmdomain/qcom/rpmhpd.c b/drivers/pmdomain/qcom/rpmhpd.c
index 3078896b1300..47df910645f6 100644
--- a/drivers/pmdomain/qcom/rpmhpd.c
+++ b/drivers/pmdomain/qcom/rpmhpd.c
@@ -692,6 +692,7 @@ static int rpmhpd_aggregate_corner(struct rpmhpd *pd, unsigned int corner)
 	unsigned int active_corner, sleep_corner;
 	unsigned int this_active_corner = 0, this_sleep_corner = 0;
 	unsigned int peer_active_corner = 0, peer_sleep_corner = 0;
+	unsigned int peer_enabled_corner;
 
 	if (pd->state_synced) {
 		to_active_sleep(pd, corner, &this_active_corner, &this_sleep_corner);
@@ -701,9 +702,11 @@ static int rpmhpd_aggregate_corner(struct rpmhpd *pd, unsigned int corner)
 		this_sleep_corner = pd->level_count - 1;
 	}
 
-	if (peer && peer->enabled)
-		to_active_sleep(peer, peer->corner, &peer_active_corner,
+	if (peer && peer->enabled) {
+		peer_enabled_corner = max(peer->corner, peer->enable_corner);
+		to_active_sleep(peer, peer_enabled_corner, &peer_active_corner,
 				&peer_sleep_corner);
+	}
 
 	active_corner = max(this_active_corner, peer_active_corner);