Message ID | 1435575640-32105-1-git-send-email-sivakumar.thulasimani@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Jun 29, 2015 at 04:30:40PM +0530, Sivakumar Thulasimani wrote: > From: "Thulasimani, Sivakumar" <sivakumar.thulasimani@intel.com> > > HPD storm is detected in intel_hpd_irq_handler and disabled for respective > port immediately but polling is enabled only in i915_hotplug_work_func and > not in i915_digport_work_func. This will result in disabled hpd never enabled > back again. This is fixed by calling the appropriate storm disable function > that will handle the rest of the sequence (both polling enable and reenabling > of HPD later). > --- > drivers/gpu/drm/i915/intel_hotplug.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/drivers/gpu/drm/i915/intel_hotplug.c b/drivers/gpu/drm/i915/intel_hotplug.c > index 3c53aac..8e18587 100644 > --- a/drivers/gpu/drm/i915/intel_hotplug.c > +++ b/drivers/gpu/drm/i915/intel_hotplug.c > @@ -205,6 +205,10 @@ static void i915_digport_work_func(struct work_struct *work) > dev_priv->hotplug.long_port_mask = 0; > short_port_mask = dev_priv->hotplug.short_port_mask; > dev_priv->hotplug.short_port_mask = 0; > + > + /* Disable hotplug on connectors that hit an irq storm. */ > + intel_hpd_irq_storm_disable(dev_priv); digport_work_func schedules the hotplug handler for everything not handled, which should result in this getting called. It really shouldn't matter when exactly it gets called. Can you please provide more data and details for your analysis? Like bug reports, backtraces and dmesg traces showing that the handler is stuck and similar things. Also your patch is missing the s-o-b line. -Daniel > + > spin_unlock_irq(&dev_priv->irq_lock); > > for (i = 0; i < I915_MAX_PORTS; i++) { > -- > 1.7.9.5 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Tested-By: Intel Graphics QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 6660
-------------------------------------Summary-------------------------------------
Platform Delta drm-intel-nightly Series Applied
ILK 302/302 302/302
SNB 312/316 312/316
IVB 343/343 343/343
BYT -1 287/287 286/287
-------------------------------------Detailed-------------------------------------
Platform Test drm-intel-nightly Series Applied
*BYT igt@gem_partial_pwrite_pread@reads-display PASS(1) FAIL(1)
Note: You need to pay more attention to line start with '*'
On 6/29/2015 10:07 PM, Daniel Vetter wrote: > On Mon, Jun 29, 2015 at 04:30:40PM +0530, Sivakumar Thulasimani wrote: >> From: "Thulasimani, Sivakumar" <sivakumar.thulasimani@intel.com> >> >> HPD storm is detected in intel_hpd_irq_handler and disabled for respective >> port immediately but polling is enabled only in i915_hotplug_work_func and >> not in i915_digport_work_func. This will result in disabled hpd never enabled >> back again. This is fixed by calling the appropriate storm disable function >> that will handle the rest of the sequence (both polling enable and reenabling >> of HPD later). >> --- >> drivers/gpu/drm/i915/intel_hotplug.c | 4 ++++ >> 1 file changed, 4 insertions(+) >> >> diff --git a/drivers/gpu/drm/i915/intel_hotplug.c b/drivers/gpu/drm/i915/intel_hotplug.c >> index 3c53aac..8e18587 100644 >> --- a/drivers/gpu/drm/i915/intel_hotplug.c >> +++ b/drivers/gpu/drm/i915/intel_hotplug.c >> @@ -205,6 +205,10 @@ static void i915_digport_work_func(struct work_struct *work) >> dev_priv->hotplug.long_port_mask = 0; >> short_port_mask = dev_priv->hotplug.short_port_mask; >> dev_priv->hotplug.short_port_mask = 0; >> + >> + /* Disable hotplug on connectors that hit an irq storm. */ >> + intel_hpd_irq_storm_disable(dev_priv); > digport_work_func schedules the hotplug handler for everything not > handled, which should result in this getting called. It really shouldn't > matter when exactly it gets called. > > Can you please provide more data and details for your analysis? Like bug > reports, backtraces and dmesg traces showing that the handler is stuck and > similar things. > > Also your patch is missing the s-o-b line. > -Daniel > there is no bug filed for this, it was observed as part of code analysis (that is provided below) I'll try to get more info as soon as i get access to a system. short answer: the issue will be seen during hpd storm, where the last HPD is handled inside intel_dp_hpd_pulse. so i915_hotplug_work_func will not be queued thus missing the storm_disable call. long answer : To give a bit more background, lets assume that we get a call to intel_hpd_irq_handler, with params long pulse for DP panel in PORT_B on a HSW/BDW system during HPD storm scenario. The following sequence will take place *) is_dig_port will be set and will result in queue_dig being set as well *) intel_hpd_irq_storm_detect will detect that this is 6th hpd call within the HPD_STORM_DETECT_PERIOD and so will mark the HPD status of PORT_B to HPD_MARK_DISABLED *) This will result in HPD for PORT_B being disabled immediately(masked in case of LPT) *) i915_digport_work_func will be queued at the end of this function, since queue_dig is set *) once in the i915_digport_work_func, hpd_pulse func pointer will be executed since it is defined for DP *) intel_dp_hpd_pulse, will have long_hpd set and since the panel is plugged in still, ISR will be high and so will return true. *) intel_dp_get_dpcd, will succeed since DP is connected *) finally IRQ_HANDLED will be returned *) once call exits intel_hpd_irq_handler, HPD on port B will never be enabled again (unmasked in case of LPT) and no more hot plug notifications. sorry for the incomplete patch , i'll reupload again once i get some more details. >> spin_unlock_irq(&dev_priv->irq_lock); >> >> for (i = 0; i < I915_MAX_PORTS; i++) { >> -- >> 1.7.9.5 >> >> _______________________________________________ >> Intel-gfx mailing list >> Intel-gfx@lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx > >
On Tue, Jun 30, 2015 at 08:45:48AM +0530, Sivakumar Thulasimani wrote: > > > On 6/29/2015 10:07 PM, Daniel Vetter wrote: > >On Mon, Jun 29, 2015 at 04:30:40PM +0530, Sivakumar Thulasimani wrote: > >>From: "Thulasimani, Sivakumar" <sivakumar.thulasimani@intel.com> > >> > >>HPD storm is detected in intel_hpd_irq_handler and disabled for respective > >>port immediately but polling is enabled only in i915_hotplug_work_func and > >>not in i915_digport_work_func. This will result in disabled hpd never enabled > >>back again. This is fixed by calling the appropriate storm disable function > >>that will handle the rest of the sequence (both polling enable and reenabling > >>of HPD later). > >>--- > >> drivers/gpu/drm/i915/intel_hotplug.c | 4 ++++ > >> 1 file changed, 4 insertions(+) > >> > >>diff --git a/drivers/gpu/drm/i915/intel_hotplug.c b/drivers/gpu/drm/i915/intel_hotplug.c > >>index 3c53aac..8e18587 100644 > >>--- a/drivers/gpu/drm/i915/intel_hotplug.c > >>+++ b/drivers/gpu/drm/i915/intel_hotplug.c > >>@@ -205,6 +205,10 @@ static void i915_digport_work_func(struct work_struct *work) > >> dev_priv->hotplug.long_port_mask = 0; > >> short_port_mask = dev_priv->hotplug.short_port_mask; > >> dev_priv->hotplug.short_port_mask = 0; > >>+ > >>+ /* Disable hotplug on connectors that hit an irq storm. */ > >>+ intel_hpd_irq_storm_disable(dev_priv); > >digport_work_func schedules the hotplug handler for everything not > >handled, which should result in this getting called. It really shouldn't > >matter when exactly it gets called. > > > >Can you please provide more data and details for your analysis? Like bug > >reports, backtraces and dmesg traces showing that the handler is stuck and > >similar things. > > > >Also your patch is missing the s-o-b line. > >-Daniel > > > there is no bug filed for this, it was observed as part of code analysis > (that is provided below) > I'll try to get more info as soon as i get access to a system. > > short answer: > the issue will be seen during hpd storm, where the last HPD is handled > inside intel_dp_hpd_pulse. > so i915_hotplug_work_func will not be queued thus missing the storm_disable > call. > > long answer : > To give a bit more background, lets assume that we get a call to > intel_hpd_irq_handler, with params long pulse for DP panel in PORT_B > on a HSW/BDW system during HPD storm scenario. > The following sequence will take place > *) is_dig_port will be set and will result in queue_dig being set as well > *) intel_hpd_irq_storm_detect will detect that this is 6th hpd call within > the > HPD_STORM_DETECT_PERIOD and so will mark the HPD status of PORT_B to > HPD_MARK_DISABLED > *) This will result in HPD for PORT_B being disabled immediately(masked in > case of LPT) > *) i915_digport_work_func will be queued at the end of this function, since > queue_dig is set > *) once in the i915_digport_work_func, hpd_pulse func pointer will be > executed since it is defined for DP > *) intel_dp_hpd_pulse, will have long_hpd set and since the panel is plugged > in still, > ISR will be high and so will return true. > *) intel_dp_get_dpcd, will succeed since DP is connected > *) finally IRQ_HANDLED will be returned > *) once call exits intel_hpd_irq_handler, HPD on port B will never be > enabled again > (unmasked in case of LPT) and no more hot plug notifications. The assumption of the storm code is that when there is a DP sink, a storm will never happen. We need that since otherwise the mst code (which creates ridiculous amounts of hpds on the DP port) will run into the storm detection code all the time. Might be better to document this design assumption somewhere, but it is baked in. Hence my question whether you've seen this happen in the real world - DP storms haven't been observed yet afaik, and it would be a much more serious problem. -Daniel > > sorry for the incomplete patch , i'll reupload again once i get some more > details. > > >> spin_unlock_irq(&dev_priv->irq_lock); > >> for (i = 0; i < I915_MAX_PORTS; i++) { > >>-- > >>1.7.9.5 > >> > >>_______________________________________________ > >>Intel-gfx mailing list > >>Intel-gfx@lists.freedesktop.org > >>http://lists.freedesktop.org/mailman/listinfo/intel-gfx > > > > > > > -- > regards, > Sivakumar > >
On Tue, 30 Jun 2015, Daniel Vetter <daniel@ffwll.ch> wrote: > On Tue, Jun 30, 2015 at 08:45:48AM +0530, Sivakumar Thulasimani wrote: >> >> >> On 6/29/2015 10:07 PM, Daniel Vetter wrote: >> >On Mon, Jun 29, 2015 at 04:30:40PM +0530, Sivakumar Thulasimani wrote: >> >>From: "Thulasimani, Sivakumar" <sivakumar.thulasimani@intel.com> >> >> >> >>HPD storm is detected in intel_hpd_irq_handler and disabled for respective >> >>port immediately but polling is enabled only in i915_hotplug_work_func and >> >>not in i915_digport_work_func. This will result in disabled hpd never enabled >> >>back again. This is fixed by calling the appropriate storm disable function >> >>that will handle the rest of the sequence (both polling enable and reenabling >> >>of HPD later). >> >>--- >> >> drivers/gpu/drm/i915/intel_hotplug.c | 4 ++++ >> >> 1 file changed, 4 insertions(+) >> >> >> >>diff --git a/drivers/gpu/drm/i915/intel_hotplug.c b/drivers/gpu/drm/i915/intel_hotplug.c >> >>index 3c53aac..8e18587 100644 >> >>--- a/drivers/gpu/drm/i915/intel_hotplug.c >> >>+++ b/drivers/gpu/drm/i915/intel_hotplug.c >> >>@@ -205,6 +205,10 @@ static void i915_digport_work_func(struct work_struct *work) >> >> dev_priv->hotplug.long_port_mask = 0; >> >> short_port_mask = dev_priv->hotplug.short_port_mask; >> >> dev_priv->hotplug.short_port_mask = 0; >> >>+ >> >>+ /* Disable hotplug on connectors that hit an irq storm. */ >> >>+ intel_hpd_irq_storm_disable(dev_priv); >> >digport_work_func schedules the hotplug handler for everything not >> >handled, which should result in this getting called. It really shouldn't >> >matter when exactly it gets called. >> > >> >Can you please provide more data and details for your analysis? Like bug >> >reports, backtraces and dmesg traces showing that the handler is stuck and >> >similar things. >> > >> >Also your patch is missing the s-o-b line. >> >-Daniel >> > >> there is no bug filed for this, it was observed as part of code analysis >> (that is provided below) >> I'll try to get more info as soon as i get access to a system. >> >> short answer: >> the issue will be seen during hpd storm, where the last HPD is handled >> inside intel_dp_hpd_pulse. >> so i915_hotplug_work_func will not be queued thus missing the storm_disable >> call. >> >> long answer : >> To give a bit more background, lets assume that we get a call to >> intel_hpd_irq_handler, with params long pulse for DP panel in PORT_B >> on a HSW/BDW system during HPD storm scenario. >> The following sequence will take place >> *) is_dig_port will be set and will result in queue_dig being set as well >> *) intel_hpd_irq_storm_detect will detect that this is 6th hpd call within >> the >> HPD_STORM_DETECT_PERIOD and so will mark the HPD status of PORT_B to >> HPD_MARK_DISABLED >> *) This will result in HPD for PORT_B being disabled immediately(masked in >> case of LPT) >> *) i915_digport_work_func will be queued at the end of this function, since >> queue_dig is set >> *) once in the i915_digport_work_func, hpd_pulse func pointer will be >> executed since it is defined for DP >> *) intel_dp_hpd_pulse, will have long_hpd set and since the panel is plugged >> in still, >> ISR will be high and so will return true. >> *) intel_dp_get_dpcd, will succeed since DP is connected >> *) finally IRQ_HANDLED will be returned >> *) once call exits intel_hpd_irq_handler, HPD on port B will never be >> enabled again >> (unmasked in case of LPT) and no more hot plug notifications. > > The assumption of the storm code is that when there is a DP sink, a storm > will never happen. We need that since otherwise the mst code (which > creates ridiculous amounts of hpds on the DP port) will run into the storm > detection code all the time. > > Might be better to document this design assumption somewhere, but it is > baked in. Hence my question whether you've seen this happen in the real > world - DP storms haven't been observed yet afaik, and it would be a much > more serious problem. The dp short hotplug irqs (used by mst) are not caught by the irq storm code, but the long hotplug irqs are. BR, Jani. > -Daniel > >> >> sorry for the incomplete patch , i'll reupload again once i get some more >> details. >> >> >> spin_unlock_irq(&dev_priv->irq_lock); >> >> for (i = 0; i < I915_MAX_PORTS; i++) { >> >>-- >> >>1.7.9.5 >> >> >> >>_______________________________________________ >> >>Intel-gfx mailing list >> >>Intel-gfx@lists.freedesktop.org >> >>http://lists.freedesktop.org/mailman/listinfo/intel-gfx >> > >> > >> >> >> -- >> regards, >> Sivakumar >> >> > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
On Tue, Jun 30, 2015 at 01:19:57PM +0300, Jani Nikula wrote: > On Tue, 30 Jun 2015, Daniel Vetter <daniel@ffwll.ch> wrote: > > On Tue, Jun 30, 2015 at 08:45:48AM +0530, Sivakumar Thulasimani wrote: > >> > >> > >> On 6/29/2015 10:07 PM, Daniel Vetter wrote: > >> >On Mon, Jun 29, 2015 at 04:30:40PM +0530, Sivakumar Thulasimani wrote: > >> >>From: "Thulasimani, Sivakumar" <sivakumar.thulasimani@intel.com> > >> >> > >> >>HPD storm is detected in intel_hpd_irq_handler and disabled for respective > >> >>port immediately but polling is enabled only in i915_hotplug_work_func and > >> >>not in i915_digport_work_func. This will result in disabled hpd never enabled > >> >>back again. This is fixed by calling the appropriate storm disable function > >> >>that will handle the rest of the sequence (both polling enable and reenabling > >> >>of HPD later). > >> >>--- > >> >> drivers/gpu/drm/i915/intel_hotplug.c | 4 ++++ > >> >> 1 file changed, 4 insertions(+) > >> >> > >> >>diff --git a/drivers/gpu/drm/i915/intel_hotplug.c b/drivers/gpu/drm/i915/intel_hotplug.c > >> >>index 3c53aac..8e18587 100644 > >> >>--- a/drivers/gpu/drm/i915/intel_hotplug.c > >> >>+++ b/drivers/gpu/drm/i915/intel_hotplug.c > >> >>@@ -205,6 +205,10 @@ static void i915_digport_work_func(struct work_struct *work) > >> >> dev_priv->hotplug.long_port_mask = 0; > >> >> short_port_mask = dev_priv->hotplug.short_port_mask; > >> >> dev_priv->hotplug.short_port_mask = 0; > >> >>+ > >> >>+ /* Disable hotplug on connectors that hit an irq storm. */ > >> >>+ intel_hpd_irq_storm_disable(dev_priv); > >> >digport_work_func schedules the hotplug handler for everything not > >> >handled, which should result in this getting called. It really shouldn't > >> >matter when exactly it gets called. > >> > > >> >Can you please provide more data and details for your analysis? Like bug > >> >reports, backtraces and dmesg traces showing that the handler is stuck and > >> >similar things. > >> > > >> >Also your patch is missing the s-o-b line. > >> >-Daniel > >> > > >> there is no bug filed for this, it was observed as part of code analysis > >> (that is provided below) > >> I'll try to get more info as soon as i get access to a system. > >> > >> short answer: > >> the issue will be seen during hpd storm, where the last HPD is handled > >> inside intel_dp_hpd_pulse. > >> so i915_hotplug_work_func will not be queued thus missing the storm_disable > >> call. > >> > >> long answer : > >> To give a bit more background, lets assume that we get a call to > >> intel_hpd_irq_handler, with params long pulse for DP panel in PORT_B > >> on a HSW/BDW system during HPD storm scenario. > >> The following sequence will take place > >> *) is_dig_port will be set and will result in queue_dig being set as well > >> *) intel_hpd_irq_storm_detect will detect that this is 6th hpd call within > >> the > >> HPD_STORM_DETECT_PERIOD and so will mark the HPD status of PORT_B to > >> HPD_MARK_DISABLED > >> *) This will result in HPD for PORT_B being disabled immediately(masked in > >> case of LPT) > >> *) i915_digport_work_func will be queued at the end of this function, since > >> queue_dig is set > >> *) once in the i915_digport_work_func, hpd_pulse func pointer will be > >> executed since it is defined for DP > >> *) intel_dp_hpd_pulse, will have long_hpd set and since the panel is plugged > >> in still, > >> ISR will be high and so will return true. > >> *) intel_dp_get_dpcd, will succeed since DP is connected > >> *) finally IRQ_HANDLED will be returned > >> *) once call exits intel_hpd_irq_handler, HPD on port B will never be > >> enabled again > >> (unmasked in case of LPT) and no more hot plug notifications. > > > > The assumption of the storm code is that when there is a DP sink, a storm > > will never happen. We need that since otherwise the mst code (which > > creates ridiculous amounts of hpds on the DP port) will run into the storm > > detection code all the time. > > > > Might be better to document this design assumption somewhere, but it is > > baked in. Hence my question whether you've seen this happen in the real > > world - DP storms haven't been observed yet afaik, and it would be a much > > more serious problem. > > The dp short hotplug irqs (used by mst) are not caught by the irq storm > code, but the long hotplug irqs are. We assume there's no DP hotplug storms ever, whether long or short pulses. Trying to fix that will require serious rework since we need to wait until dig_port_work has run to know whether the hpd was a real one or just fluctuation, and only update storm statistic then. And once we do DP is essentially broken, which means we also need to enable polling for dp aux short pulses (which will probably piss off some sink device). In short: If you have a hpd storm, and there's something DP-like connected, you're screwed. Until we have real-world evidence of this happening updating comments is really the only thing we need. -Daniel
On Tue, 30 Jun 2015, Daniel Vetter <daniel@ffwll.ch> wrote: > On Tue, Jun 30, 2015 at 01:19:57PM +0300, Jani Nikula wrote: >> On Tue, 30 Jun 2015, Daniel Vetter <daniel@ffwll.ch> wrote: >> > On Tue, Jun 30, 2015 at 08:45:48AM +0530, Sivakumar Thulasimani wrote: >> >> >> >> >> >> On 6/29/2015 10:07 PM, Daniel Vetter wrote: >> >> >On Mon, Jun 29, 2015 at 04:30:40PM +0530, Sivakumar Thulasimani wrote: >> >> >>From: "Thulasimani, Sivakumar" <sivakumar.thulasimani@intel.com> >> >> >> >> >> >>HPD storm is detected in intel_hpd_irq_handler and disabled for respective >> >> >>port immediately but polling is enabled only in i915_hotplug_work_func and >> >> >>not in i915_digport_work_func. This will result in disabled hpd never enabled >> >> >>back again. This is fixed by calling the appropriate storm disable function >> >> >>that will handle the rest of the sequence (both polling enable and reenabling >> >> >>of HPD later). >> >> >>--- >> >> >> drivers/gpu/drm/i915/intel_hotplug.c | 4 ++++ >> >> >> 1 file changed, 4 insertions(+) >> >> >> >> >> >>diff --git a/drivers/gpu/drm/i915/intel_hotplug.c b/drivers/gpu/drm/i915/intel_hotplug.c >> >> >>index 3c53aac..8e18587 100644 >> >> >>--- a/drivers/gpu/drm/i915/intel_hotplug.c >> >> >>+++ b/drivers/gpu/drm/i915/intel_hotplug.c >> >> >>@@ -205,6 +205,10 @@ static void i915_digport_work_func(struct work_struct *work) >> >> >> dev_priv->hotplug.long_port_mask = 0; >> >> >> short_port_mask = dev_priv->hotplug.short_port_mask; >> >> >> dev_priv->hotplug.short_port_mask = 0; >> >> >>+ >> >> >>+ /* Disable hotplug on connectors that hit an irq storm. */ >> >> >>+ intel_hpd_irq_storm_disable(dev_priv); >> >> >digport_work_func schedules the hotplug handler for everything not >> >> >handled, which should result in this getting called. It really shouldn't >> >> >matter when exactly it gets called. >> >> > >> >> >Can you please provide more data and details for your analysis? Like bug >> >> >reports, backtraces and dmesg traces showing that the handler is stuck and >> >> >similar things. >> >> > >> >> >Also your patch is missing the s-o-b line. >> >> >-Daniel >> >> > >> >> there is no bug filed for this, it was observed as part of code analysis >> >> (that is provided below) >> >> I'll try to get more info as soon as i get access to a system. >> >> >> >> short answer: >> >> the issue will be seen during hpd storm, where the last HPD is handled >> >> inside intel_dp_hpd_pulse. >> >> so i915_hotplug_work_func will not be queued thus missing the storm_disable >> >> call. >> >> >> >> long answer : >> >> To give a bit more background, lets assume that we get a call to >> >> intel_hpd_irq_handler, with params long pulse for DP panel in PORT_B >> >> on a HSW/BDW system during HPD storm scenario. >> >> The following sequence will take place >> >> *) is_dig_port will be set and will result in queue_dig being set as well >> >> *) intel_hpd_irq_storm_detect will detect that this is 6th hpd call within >> >> the >> >> HPD_STORM_DETECT_PERIOD and so will mark the HPD status of PORT_B to >> >> HPD_MARK_DISABLED >> >> *) This will result in HPD for PORT_B being disabled immediately(masked in >> >> case of LPT) >> >> *) i915_digport_work_func will be queued at the end of this function, since >> >> queue_dig is set >> >> *) once in the i915_digport_work_func, hpd_pulse func pointer will be >> >> executed since it is defined for DP >> >> *) intel_dp_hpd_pulse, will have long_hpd set and since the panel is plugged >> >> in still, >> >> ISR will be high and so will return true. >> >> *) intel_dp_get_dpcd, will succeed since DP is connected >> >> *) finally IRQ_HANDLED will be returned >> >> *) once call exits intel_hpd_irq_handler, HPD on port B will never be >> >> enabled again >> >> (unmasked in case of LPT) and no more hot plug notifications. >> > >> > The assumption of the storm code is that when there is a DP sink, a storm >> > will never happen. We need that since otherwise the mst code (which >> > creates ridiculous amounts of hpds on the DP port) will run into the storm >> > detection code all the time. >> > >> > Might be better to document this design assumption somewhere, but it is >> > baked in. Hence my question whether you've seen this happen in the real >> > world - DP storms haven't been observed yet afaik, and it would be a much >> > more serious problem. >> >> The dp short hotplug irqs (used by mst) are not caught by the irq storm >> code, but the long hotplug irqs are. > > We assume there's no DP hotplug storms ever, whether long or short pulses. > Trying to fix that will require serious rework since we need to wait until > dig_port_work has run to know whether the hpd was a real one or just > fluctuation, and only update storm statistic then. And once we do DP is > essentially broken, which means we also need to enable polling for dp aux > short pulses (which will probably piss off some sink device). > > In short: If you have a hpd storm, and there's something DP-like > connected, you're screwed. Until we have real-world evidence of this > happening updating comments is really the only thing we need. In that case we should update the code to never do hotplug irq storm detection on dp long hpd, which we currently do. BR, Jani. > -Daniel > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
On Tue, Jun 30, 2015 at 03:30:16PM +0300, Jani Nikula wrote: > On Tue, 30 Jun 2015, Daniel Vetter <daniel@ffwll.ch> wrote: > > On Tue, Jun 30, 2015 at 01:19:57PM +0300, Jani Nikula wrote: > >> On Tue, 30 Jun 2015, Daniel Vetter <daniel@ffwll.ch> wrote: > >> > On Tue, Jun 30, 2015 at 08:45:48AM +0530, Sivakumar Thulasimani wrote: > >> >> > >> >> > >> >> On 6/29/2015 10:07 PM, Daniel Vetter wrote: > >> >> >On Mon, Jun 29, 2015 at 04:30:40PM +0530, Sivakumar Thulasimani wrote: > >> >> >>From: "Thulasimani, Sivakumar" <sivakumar.thulasimani@intel.com> > >> >> >> > >> >> >>HPD storm is detected in intel_hpd_irq_handler and disabled for respective > >> >> >>port immediately but polling is enabled only in i915_hotplug_work_func and > >> >> >>not in i915_digport_work_func. This will result in disabled hpd never enabled > >> >> >>back again. This is fixed by calling the appropriate storm disable function > >> >> >>that will handle the rest of the sequence (both polling enable and reenabling > >> >> >>of HPD later). > >> >> >>--- > >> >> >> drivers/gpu/drm/i915/intel_hotplug.c | 4 ++++ > >> >> >> 1 file changed, 4 insertions(+) > >> >> >> > >> >> >>diff --git a/drivers/gpu/drm/i915/intel_hotplug.c b/drivers/gpu/drm/i915/intel_hotplug.c > >> >> >>index 3c53aac..8e18587 100644 > >> >> >>--- a/drivers/gpu/drm/i915/intel_hotplug.c > >> >> >>+++ b/drivers/gpu/drm/i915/intel_hotplug.c > >> >> >>@@ -205,6 +205,10 @@ static void i915_digport_work_func(struct work_struct *work) > >> >> >> dev_priv->hotplug.long_port_mask = 0; > >> >> >> short_port_mask = dev_priv->hotplug.short_port_mask; > >> >> >> dev_priv->hotplug.short_port_mask = 0; > >> >> >>+ > >> >> >>+ /* Disable hotplug on connectors that hit an irq storm. */ > >> >> >>+ intel_hpd_irq_storm_disable(dev_priv); > >> >> >digport_work_func schedules the hotplug handler for everything not > >> >> >handled, which should result in this getting called. It really shouldn't > >> >> >matter when exactly it gets called. > >> >> > > >> >> >Can you please provide more data and details for your analysis? Like bug > >> >> >reports, backtraces and dmesg traces showing that the handler is stuck and > >> >> >similar things. > >> >> > > >> >> >Also your patch is missing the s-o-b line. > >> >> >-Daniel > >> >> > > >> >> there is no bug filed for this, it was observed as part of code analysis > >> >> (that is provided below) > >> >> I'll try to get more info as soon as i get access to a system. > >> >> > >> >> short answer: > >> >> the issue will be seen during hpd storm, where the last HPD is handled > >> >> inside intel_dp_hpd_pulse. > >> >> so i915_hotplug_work_func will not be queued thus missing the storm_disable > >> >> call. > >> >> > >> >> long answer : > >> >> To give a bit more background, lets assume that we get a call to > >> >> intel_hpd_irq_handler, with params long pulse for DP panel in PORT_B > >> >> on a HSW/BDW system during HPD storm scenario. > >> >> The following sequence will take place > >> >> *) is_dig_port will be set and will result in queue_dig being set as well > >> >> *) intel_hpd_irq_storm_detect will detect that this is 6th hpd call within > >> >> the > >> >> HPD_STORM_DETECT_PERIOD and so will mark the HPD status of PORT_B to > >> >> HPD_MARK_DISABLED > >> >> *) This will result in HPD for PORT_B being disabled immediately(masked in > >> >> case of LPT) > >> >> *) i915_digport_work_func will be queued at the end of this function, since > >> >> queue_dig is set > >> >> *) once in the i915_digport_work_func, hpd_pulse func pointer will be > >> >> executed since it is defined for DP > >> >> *) intel_dp_hpd_pulse, will have long_hpd set and since the panel is plugged > >> >> in still, > >> >> ISR will be high and so will return true. > >> >> *) intel_dp_get_dpcd, will succeed since DP is connected > >> >> *) finally IRQ_HANDLED will be returned > >> >> *) once call exits intel_hpd_irq_handler, HPD on port B will never be > >> >> enabled again > >> >> (unmasked in case of LPT) and no more hot plug notifications. > >> > > >> > The assumption of the storm code is that when there is a DP sink, a storm > >> > will never happen. We need that since otherwise the mst code (which > >> > creates ridiculous amounts of hpds on the DP port) will run into the storm > >> > detection code all the time. > >> > > >> > Might be better to document this design assumption somewhere, but it is > >> > baked in. Hence my question whether you've seen this happen in the real > >> > world - DP storms haven't been observed yet afaik, and it would be a much > >> > more serious problem. > >> > >> The dp short hotplug irqs (used by mst) are not caught by the irq storm > >> code, but the long hotplug irqs are. > > > > We assume there's no DP hotplug storms ever, whether long or short pulses. > > Trying to fix that will require serious rework since we need to wait until > > dig_port_work has run to know whether the hpd was a real one or just > > fluctuation, and only update storm statistic then. And once we do DP is > > essentially broken, which means we also need to enable polling for dp aux > > short pulses (which will probably piss off some sink device). > > > > In short: If you have a hpd storm, and there's something DP-like > > connected, you're screwed. Until we have real-world evidence of this > > happening updating comments is really the only thing we need. > > In that case we should update the code to never do hotplug irq storm > detection on dp long hpd, which we currently do. The HPD pin is shared for DP and HDMI so we can't disable HPD just for HDMI when a storm is detected.
On Tue, Jun 30, 2015 at 03:47:41PM +0300, Ville Syrjälä wrote: > On Tue, Jun 30, 2015 at 03:30:16PM +0300, Jani Nikula wrote: > > On Tue, 30 Jun 2015, Daniel Vetter <daniel@ffwll.ch> wrote: > > > On Tue, Jun 30, 2015 at 01:19:57PM +0300, Jani Nikula wrote: > > >> On Tue, 30 Jun 2015, Daniel Vetter <daniel@ffwll.ch> wrote: > > >> > On Tue, Jun 30, 2015 at 08:45:48AM +0530, Sivakumar Thulasimani wrote: > > >> >> > > >> >> > > >> >> On 6/29/2015 10:07 PM, Daniel Vetter wrote: > > >> >> >On Mon, Jun 29, 2015 at 04:30:40PM +0530, Sivakumar Thulasimani wrote: > > >> >> >>From: "Thulasimani, Sivakumar" <sivakumar.thulasimani@intel.com> > > >> >> >> > > >> >> >>HPD storm is detected in intel_hpd_irq_handler and disabled for respective > > >> >> >>port immediately but polling is enabled only in i915_hotplug_work_func and > > >> >> >>not in i915_digport_work_func. This will result in disabled hpd never enabled > > >> >> >>back again. This is fixed by calling the appropriate storm disable function > > >> >> >>that will handle the rest of the sequence (both polling enable and reenabling > > >> >> >>of HPD later). > > >> >> >>--- > > >> >> >> drivers/gpu/drm/i915/intel_hotplug.c | 4 ++++ > > >> >> >> 1 file changed, 4 insertions(+) > > >> >> >> > > >> >> >>diff --git a/drivers/gpu/drm/i915/intel_hotplug.c b/drivers/gpu/drm/i915/intel_hotplug.c > > >> >> >>index 3c53aac..8e18587 100644 > > >> >> >>--- a/drivers/gpu/drm/i915/intel_hotplug.c > > >> >> >>+++ b/drivers/gpu/drm/i915/intel_hotplug.c > > >> >> >>@@ -205,6 +205,10 @@ static void i915_digport_work_func(struct work_struct *work) > > >> >> >> dev_priv->hotplug.long_port_mask = 0; > > >> >> >> short_port_mask = dev_priv->hotplug.short_port_mask; > > >> >> >> dev_priv->hotplug.short_port_mask = 0; > > >> >> >>+ > > >> >> >>+ /* Disable hotplug on connectors that hit an irq storm. */ > > >> >> >>+ intel_hpd_irq_storm_disable(dev_priv); > > >> >> >digport_work_func schedules the hotplug handler for everything not > > >> >> >handled, which should result in this getting called. It really shouldn't > > >> >> >matter when exactly it gets called. > > >> >> > > > >> >> >Can you please provide more data and details for your analysis? Like bug > > >> >> >reports, backtraces and dmesg traces showing that the handler is stuck and > > >> >> >similar things. > > >> >> > > > >> >> >Also your patch is missing the s-o-b line. > > >> >> >-Daniel > > >> >> > > > >> >> there is no bug filed for this, it was observed as part of code analysis > > >> >> (that is provided below) > > >> >> I'll try to get more info as soon as i get access to a system. > > >> >> > > >> >> short answer: > > >> >> the issue will be seen during hpd storm, where the last HPD is handled > > >> >> inside intel_dp_hpd_pulse. > > >> >> so i915_hotplug_work_func will not be queued thus missing the storm_disable > > >> >> call. > > >> >> > > >> >> long answer : > > >> >> To give a bit more background, lets assume that we get a call to > > >> >> intel_hpd_irq_handler, with params long pulse for DP panel in PORT_B > > >> >> on a HSW/BDW system during HPD storm scenario. > > >> >> The following sequence will take place > > >> >> *) is_dig_port will be set and will result in queue_dig being set as well > > >> >> *) intel_hpd_irq_storm_detect will detect that this is 6th hpd call within > > >> >> the > > >> >> HPD_STORM_DETECT_PERIOD and so will mark the HPD status of PORT_B to > > >> >> HPD_MARK_DISABLED > > >> >> *) This will result in HPD for PORT_B being disabled immediately(masked in > > >> >> case of LPT) > > >> >> *) i915_digport_work_func will be queued at the end of this function, since > > >> >> queue_dig is set > > >> >> *) once in the i915_digport_work_func, hpd_pulse func pointer will be > > >> >> executed since it is defined for DP > > >> >> *) intel_dp_hpd_pulse, will have long_hpd set and since the panel is plugged > > >> >> in still, > > >> >> ISR will be high and so will return true. > > >> >> *) intel_dp_get_dpcd, will succeed since DP is connected > > >> >> *) finally IRQ_HANDLED will be returned > > >> >> *) once call exits intel_hpd_irq_handler, HPD on port B will never be > > >> >> enabled again > > >> >> (unmasked in case of LPT) and no more hot plug notifications. > > >> > > > >> > The assumption of the storm code is that when there is a DP sink, a storm > > >> > will never happen. We need that since otherwise the mst code (which > > >> > creates ridiculous amounts of hpds on the DP port) will run into the storm > > >> > detection code all the time. > > >> > > > >> > Might be better to document this design assumption somewhere, but it is > > >> > baked in. Hence my question whether you've seen this happen in the real > > >> > world - DP storms haven't been observed yet afaik, and it would be a much > > >> > more serious problem. > > >> > > >> The dp short hotplug irqs (used by mst) are not caught by the irq storm > > >> code, but the long hotplug irqs are. > > > > > > We assume there's no DP hotplug storms ever, whether long or short pulses. > > > Trying to fix that will require serious rework since we need to wait until > > > dig_port_work has run to know whether the hpd was a real one or just > > > fluctuation, and only update storm statistic then. And once we do DP is > > > essentially broken, which means we also need to enable polling for dp aux > > > short pulses (which will probably piss off some sink device). > > > > > > In short: If you have a hpd storm, and there's something DP-like > > > connected, you're screwed. Until we have real-world evidence of this > > > happening updating comments is really the only thing we need. > > > > In that case we should update the code to never do hotplug irq storm > > detection on dp long hpd, which we currently do. > > The HPD pin is shared for DP and HDMI so we can't disable HPD just for > HDMI when a storm is detected. Yup this is the crux. The real fix really is wiring up IRQ_NONE handling all the way to be able to differentiate storms from normal/expected irq load. But that really has to wait until this happens in reality somewhere with a DP sink. -Daniel
On 7/1/2015 6:08 PM, Daniel Vetter wrote: > On Tue, Jun 30, 2015 at 03:47:41PM +0300, Ville Syrjälä wrote: >> On Tue, Jun 30, 2015 at 03:30:16PM +0300, Jani Nikula wrote: >>> On Tue, 30 Jun 2015, Daniel Vetter <daniel@ffwll.ch> wrote: >>>> On Tue, Jun 30, 2015 at 01:19:57PM +0300, Jani Nikula wrote: >>>>> On Tue, 30 Jun 2015, Daniel Vetter <daniel@ffwll.ch> wrote: >>>>>> On Tue, Jun 30, 2015 at 08:45:48AM +0530, Sivakumar Thulasimani wrote: >>>>>>> >>>>>>> On 6/29/2015 10:07 PM, Daniel Vetter wrote: >>>>>>>> On Mon, Jun 29, 2015 at 04:30:40PM +0530, Sivakumar Thulasimani wrote: >>>>>>>>> From: "Thulasimani, Sivakumar" <sivakumar.thulasimani@intel.com> >>>>>>>>> >>>>>>>>> HPD storm is detected in intel_hpd_irq_handler and disabled for respective >>>>>>>>> port immediately but polling is enabled only in i915_hotplug_work_func and >>>>>>>>> not in i915_digport_work_func. This will result in disabled hpd never enabled >>>>>>>>> back again. This is fixed by calling the appropriate storm disable function >>>>>>>>> that will handle the rest of the sequence (both polling enable and reenabling >>>>>>>>> of HPD later). >>>>>>>>> --- >>>>>>>>> drivers/gpu/drm/i915/intel_hotplug.c | 4 ++++ >>>>>>>>> 1 file changed, 4 insertions(+) >>>>>>>>> >>>>>>>>> diff --git a/drivers/gpu/drm/i915/intel_hotplug.c b/drivers/gpu/drm/i915/intel_hotplug.c >>>>>>>>> index 3c53aac..8e18587 100644 >>>>>>>>> --- a/drivers/gpu/drm/i915/intel_hotplug.c >>>>>>>>> +++ b/drivers/gpu/drm/i915/intel_hotplug.c >>>>>>>>> @@ -205,6 +205,10 @@ static void i915_digport_work_func(struct work_struct *work) >>>>>>>>> dev_priv->hotplug.long_port_mask = 0; >>>>>>>>> short_port_mask = dev_priv->hotplug.short_port_mask; >>>>>>>>> dev_priv->hotplug.short_port_mask = 0; >>>>>>>>> + >>>>>>>>> + /* Disable hotplug on connectors that hit an irq storm. */ >>>>>>>>> + intel_hpd_irq_storm_disable(dev_priv); >>>>>>>> digport_work_func schedules the hotplug handler for everything not >>>>>>>> handled, which should result in this getting called. It really shouldn't >>>>>>>> matter when exactly it gets called. >>>>>>>> >>>>>>>> Can you please provide more data and details for your analysis? Like bug >>>>>>>> reports, backtraces and dmesg traces showing that the handler is stuck and >>>>>>>> similar things. >>>>>>>> >>>>>>>> Also your patch is missing the s-o-b line. >>>>>>>> -Daniel >>>>>>>> >>>>>>> there is no bug filed for this, it was observed as part of code analysis >>>>>>> (that is provided below) >>>>>>> I'll try to get more info as soon as i get access to a system. >>>>>>> >>>>>>> short answer: >>>>>>> the issue will be seen during hpd storm, where the last HPD is handled >>>>>>> inside intel_dp_hpd_pulse. >>>>>>> so i915_hotplug_work_func will not be queued thus missing the storm_disable >>>>>>> call. >>>>>>> >>>>>>> long answer : >>>>>>> To give a bit more background, lets assume that we get a call to >>>>>>> intel_hpd_irq_handler, with params long pulse for DP panel in PORT_B >>>>>>> on a HSW/BDW system during HPD storm scenario. >>>>>>> The following sequence will take place >>>>>>> *) is_dig_port will be set and will result in queue_dig being set as well >>>>>>> *) intel_hpd_irq_storm_detect will detect that this is 6th hpd call within >>>>>>> the >>>>>>> HPD_STORM_DETECT_PERIOD and so will mark the HPD status of PORT_B to >>>>>>> HPD_MARK_DISABLED >>>>>>> *) This will result in HPD for PORT_B being disabled immediately(masked in >>>>>>> case of LPT) >>>>>>> *) i915_digport_work_func will be queued at the end of this function, since >>>>>>> queue_dig is set >>>>>>> *) once in the i915_digport_work_func, hpd_pulse func pointer will be >>>>>>> executed since it is defined for DP >>>>>>> *) intel_dp_hpd_pulse, will have long_hpd set and since the panel is plugged >>>>>>> in still, >>>>>>> ISR will be high and so will return true. >>>>>>> *) intel_dp_get_dpcd, will succeed since DP is connected >>>>>>> *) finally IRQ_HANDLED will be returned >>>>>>> *) once call exits intel_hpd_irq_handler, HPD on port B will never be >>>>>>> enabled again >>>>>>> (unmasked in case of LPT) and no more hot plug notifications. >>>>>> The assumption of the storm code is that when there is a DP sink, a storm >>>>>> will never happen. We need that since otherwise the mst code (which >>>>>> creates ridiculous amounts of hpds on the DP port) will run into the storm >>>>>> detection code all the time. >>>>>> >>>>>> Might be better to document this design assumption somewhere, but it is >>>>>> baked in. Hence my question whether you've seen this happen in the real >>>>>> world - DP storms haven't been observed yet afaik, and it would be a much >>>>>> more serious problem. >>>>> The dp short hotplug irqs (used by mst) are not caught by the irq storm >>>>> code, but the long hotplug irqs are. >>>> We assume there's no DP hotplug storms ever, whether long or short pulses. >>>> Trying to fix that will require serious rework since we need to wait until >>>> dig_port_work has run to know whether the hpd was a real one or just >>>> fluctuation, and only update storm statistic then. And once we do DP is >>>> essentially broken, which means we also need to enable polling for dp aux >>>> short pulses (which will probably piss off some sink device). >>>> >>>> In short: If you have a hpd storm, and there's something DP-like >>>> connected, you're screwed. Until we have real-world evidence of this >>>> happening updating comments is really the only thing we need. >>> In that case we should update the code to never do hotplug irq storm >>> detection on dp long hpd, which we currently do. >> The HPD pin is shared for DP and HDMI so we can't disable HPD just for >> HDMI when a storm is detected. > Yup this is the crux. The real fix really is wiring up IRQ_NONE handling > all the way to be able to differentiate storms from normal/expected irq > load. But that really has to wait until this happens in reality somewhere > with a DP sink. > -Daniel > when checking internally i was informed that DP hotplug storm is also normal like HDMI if not more frequent, but for now i'll wait till this comes as a real world issue and drop this patch. -- regards, Sivakumar
diff --git a/drivers/gpu/drm/i915/intel_hotplug.c b/drivers/gpu/drm/i915/intel_hotplug.c index 3c53aac..8e18587 100644 --- a/drivers/gpu/drm/i915/intel_hotplug.c +++ b/drivers/gpu/drm/i915/intel_hotplug.c @@ -205,6 +205,10 @@ static void i915_digport_work_func(struct work_struct *work) dev_priv->hotplug.long_port_mask = 0; short_port_mask = dev_priv->hotplug.short_port_mask; dev_priv->hotplug.short_port_mask = 0; + + /* Disable hotplug on connectors that hit an irq storm. */ + intel_hpd_irq_storm_disable(dev_priv); + spin_unlock_irq(&dev_priv->irq_lock); for (i = 0; i < I915_MAX_PORTS; i++) {
From: "Thulasimani, Sivakumar" <sivakumar.thulasimani@intel.com> HPD storm is detected in intel_hpd_irq_handler and disabled for respective port immediately but polling is enabled only in i915_hotplug_work_func and not in i915_digport_work_func. This will result in disabled hpd never enabled back again. This is fixed by calling the appropriate storm disable function that will handle the rest of the sequence (both polling enable and reenabling of HPD later). --- drivers/gpu/drm/i915/intel_hotplug.c | 4 ++++ 1 file changed, 4 insertions(+)