Message ID | 1711656246-3483-2-git-send-email-quic_khsieh@quicinc.com (mailing list archive) |
---|---|
State | Not Applicable |
Series | [v1] drm/msm/dp: use dp_hpd_plug_handle() and dp_hpd_unplug_handle() directly |
+ Johan and Bjorn for FYI

On 3/28/2024 1:04 PM, Kuogee Hsieh wrote:
> For internal HPD case, hpd_event_thread is created to handle HPD
> interrupts generated by HPD block of DP controller. It converts
> HPD interrupts into events and executes them under hpd_event_thread
> context. For external HPD case, HPD events are delivered by way of
> dp_bridge_hpd_notify() under thread context. Since they are executed
> under thread context already, there is no reason to hand over those
> events to hpd_event_thread. Hence dp_hpd_plug_handle() and
> dp_hpd_unplug_handle() are called directly at dp_bridge_hpd_notify().
>
> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
> ---
>  drivers/gpu/drm/msm/dp/dp_display.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>

Fixes: 542b37efc20e ("drm/msm/dp: Implement hpd_notify()")

Looks right to me,

Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>

Quoting Abhinav Kumar (2024-03-28 13:24:34)
> Fixes: 542b37efc20e ("drm/msm/dp: Implement hpd_notify()")

Is this a bug fix or an optimization? The commit text doesn't tell me.

On 3/28/2024 1:58 PM, Stephen Boyd wrote:
> Quoting Abhinav Kumar (2024-03-28 13:24:34)
>> Fixes: 542b37efc20e ("drm/msm/dp: Implement hpd_notify()")
>
> Is this a bug fix or an optimization? The commit text doesn't tell me.

I would say both.

It is an optimization because it avoids the need to go through the
hpd_event thread processing.

It is a bug fix because going through the hpd event thread processing
exposes, and often breaks, the already fragile hpd handling state
machine, which can be avoided in this case.

On Thu, 28 Mar 2024 at 23:21, Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
> I would say both.
>
> optimization as it avoids the need to go through the hpd_event thread
> processing.
>
> bug fix because once you go through the hpd event thread processing it
> exposes and often breaks the already fragile hpd handling state machine
> which can be avoided in this case.

Please add a description for the particular issue that was observed
and how it is fixed by the patch.

Otherwise consider there to be an implicit NAK for all HPD-related
patches unless it is a series that moves link training to the enable
path and drops the HPD state machine completely.

I really mean it. We should stop beating a dead horse unless there is
a grave bug that must be fixed.

On 3/28/2024 3:50 PM, Dmitry Baryshkov wrote:
> Please add a description for the particular issue that was observed
> and how it is fixed by the patch.
>
> Otherwise consider there to be an implicit NAK for all HPD-related
> patches unless it is a series that moves link training to the enable
> path and drops the HPD state machine completely.
>
> I really mean it. We should stop beating a dead horse unless there is
> a grave bug that must be fixed.

I think the commit message explains the issue well enough.

This was not fixing an issue we observed, so there is no exact scenario
to describe; it comes from a code walkthrough.

As Kuogee wrote, the hpd event thread exists to handle events coming
out of the hpd_isr for internal hpd cases. For hpd_notify coming from
pmic_glink or any other external hpd case, there is no need to put this
through the hpd event thread, because doing so will only make things
worse by exposing the race conditions of the state machine.

Moving link training to enable and removing the hpd event thread will
be worked on, but delaying obvious things we can fix does not make
sense.

On Fri, 29 Mar 2024 at 01:42, Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
> I think the commit message is explaining the issue well enough.
>
> This was not fixing any issue we saw to explain you the exact scenario
> of things which happened but this is just from code walkthrough.
>
> Moving link training to enable and removal of hpd event thread will be
> worked on but delaying obvious things we can fix does not make sense.

From the commit message this feels like an optimisation rather than a
fix. And given the fragility of the HPD state machine, I'd prefer to
stay away from optimisations. As far as I understood from the history
of the last revert, we'd better make sure that HPD handling goes only
through the HPD event thread.

On Thu, Mar 28, 2024 at 02:21:14PM -0700, Abhinav Kumar wrote:
> On 3/28/2024 1:58 PM, Stephen Boyd wrote:
> > Is this a bug fix or an optimization? The commit text doesn't tell me.
>
> I would say both.
>
> optimization as it avoids the need to go through the hpd_event thread
> processing.
>
> bug fix because once you go through the hpd event thread processing it
> exposes and often breaks the already fragile hpd handling state machine
> which can be avoided in this case.

It removes the main users of the thread, but there are still code paths
which will post events on the thread.

I think I like the direction this is taking, but does it really fix the
whole problem, or just patch one case?

PS. Please read go/upstream and switch to b4, to avoid some practical
issues with the way you posted this patch.

Thanks,
Bjorn

On 3/28/2024 5:10 PM, Dmitry Baryshkov wrote:
> From the commit message this feels like an optimisation rather than a
> fix. And granted the fragility of the HPD state machine, I'd prefer to
> stay away from optimisations. As far as I understood from the history
> of the last revert, we'd better make sure that HPD handling goes only
> through the HPD event thread.

I think you are mixing the two. We tried to send the events through
DRM's hpd_notify, which ended badly, and by the way, that is still not
resolved. Even though I have seen reports that things are fine with the
revert, we consistently end up in a disconnected state with all the
reverts and fixes in our x1e80100 DP setup.

I plan to investigate that issue properly in the next week and try to
make some sense of it all.

In fact, this patch removes one more user of the hpd event thread,
which is the direction we all want to head in.

On whether this is an optimization or a bug fix: I think by avoiding
the hpd event thread (which should never have been used for hpd_notify
updates, hence a bug) we are avoiding the possibility of more race
conditions.

So, this has my R-b and it holds. Up to you.

On 3/28/2024 6:46 PM, Bjorn Andersson wrote:
> It removes the main users of the thread, but there's still code paths
> which will post events on the thread.
>
> I think I like the direction this is taking, but does it really fix the
> whole problem, or just patch one case?

Kuogee's idea behind this is that non-hpd_isr events need not go
through the event thread at all.

We did not run into any special scenario or issue without this. It was
a code walkthrough fix.

> PS. Please read go/upstream and switch to b4, to avoid some practical
> issues with the way you posted this patch.

Just to elaborate on the practical issues so that developers know what
you encountered:

-> no need for v1 on the PATCH
-> somehow this patch was linked "in-reply-to" another patch
   https://lore.kernel.org/all/1711656246-3483-2-git-send-email-quic_khsieh@quicinc.com/

This is quite strange and I am not sure how it happened, but we will
double check whether we did something wrong here.

Thanks for sharing these.

On Fri, 29 Mar 2024 at 04:16, Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
> In fact, this patch is removing one more user of the hpd event thread
> which is the direction in which we all want to head towards.

As I stated earlier, from my point of view it doesn't make sense to
rework the HPD thread in small steps.

> On whether this is an optimization or a bug fix. I think by avoiding hpd
> event thread (which should have never been used for hpd_notify updates,
> hence a bug) we are avoiding the possibility of more race conditions.

I think that the HPD event thread serializes handling of events, so
avoiding it increases the possibility of a race condition.

> So, this has my R-b and it holds. Up to you.

I'd wait for a proper description of the issue that was observed and
how it is solved by this patch.

On 3/28/2024 8:23 PM, Dmitry Baryshkov wrote:
> I'd wait for a proper description of the issue that was observed and
> how it is solved by this patch.

This was a code walkthrough fix, as I wrote a few times. If there is no
merit in pushing this, let's ignore it and stop discussing.

On 3/28/2024 10:47 PM, Abhinav Kumar wrote:
> This was a code walkthrough fix as I wrote a few times. If there is no
> merit in pushing this, let's ignore it and stop discussing.

Ok, so after we debugged the HPD issue we have found the issue and why
this change will actually help.

I am going to post a V2 with more details in the commit text.

We can discuss after that.

diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
index 94dd60f..0476ad9 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -1706,7 +1706,8 @@ void dp_bridge_hpd_notify(struct drm_bridge *bridge,
 	status &= ~0x80000000;
 	if (!dp_display->link_ready && status == connector_status_connected)
-		dp_add_event(dp, EV_HPD_PLUG_INT, 0, 0);
+		dp_hpd_plug_handle(dp, 0);
 	else if (dp_display->link_ready && status == connector_status_disconnected)
-		dp_add_event(dp, EV_HPD_UNPLUG_INT, 0, 0);
+		dp_hpd_unplug_handle(dp, 0);
+
 }
For the internal HPD case, hpd_event_thread is created to handle HPD
interrupts generated by the HPD block of the DP controller. It converts
HPD interrupts into events and executes them under hpd_event_thread
context. For the external HPD case, HPD events are delivered by way of
dp_bridge_hpd_notify() under thread context. Since they are already
executed under thread context, there is no reason to hand those events
over to hpd_event_thread. Hence dp_hpd_plug_handle() and
dp_hpd_unplug_handle() are called directly from dp_bridge_hpd_notify().

Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
---
 drivers/gpu/drm/msm/dp/dp_display.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
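To make the two dispatch paths in the commit message concrete, here is a minimal, single-threaded sketch of "queue the event for a worker" versus "call the handler directly". It is not the msm DP driver code: every name in it (struct toy_dp, toy_add_event(), toy_drain_events(), toy_notify_direct(), the HPD_* values) is hypothetical, and the worker thread is modelled as an explicit drain call so that the ordering is easy to follow.

```c
#include <stddef.h>

/* Hypothetical event IDs; these are not the driver's EV_* values. */
enum hpd_event { HPD_PLUG, HPD_UNPLUG };

#define TOY_QUEUE_LEN 8

/* Toy display state: tracks only whether the sink counts as connected. */
struct toy_dp {
	int connected;
	enum hpd_event queue[TOY_QUEUE_LEN];	/* pending events, FIFO */
	size_t head, tail;
};

/* Stand-in for plug/unplug handling: just update the state. */
static void toy_dispatch(struct toy_dp *dp, enum hpd_event ev)
{
	dp->connected = (ev == HPD_PLUG);
}

/* Deferred path: only record the event. Returns 0, or -1 if the queue is full. */
int toy_add_event(struct toy_dp *dp, enum hpd_event ev)
{
	if (dp->tail - dp->head == TOY_QUEUE_LEN)
		return -1;
	dp->queue[dp->tail++ % TOY_QUEUE_LEN] = ev;
	return 0;
}

/*
 * What a worker thread would do: handle queued events one at a time,
 * in arrival order. Returns the number of events handled.
 */
size_t toy_drain_events(struct toy_dp *dp)
{
	size_t handled = 0;

	while (dp->head != dp->tail) {
		toy_dispatch(dp, dp->queue[dp->head++ % TOY_QUEUE_LEN]);
		handled++;
	}
	return handled;
}

/* Direct path: the handler runs in the caller's context before returning. */
void toy_notify_direct(struct toy_dp *dp, enum hpd_event ev)
{
	toy_dispatch(dp, ev);
}
```

In this sketch the queued path leaves the state untouched until the drain runs, while the direct path has already updated the state when the call returns. The sketch also shows the trade-off raised in the review: the queue gives a single ordered consumer, whereas a direct call gets ordering only from whatever serialization its caller provides.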