| Message ID | 20181204094639.15856-1-mika.kahola@intel.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | drm/i915: Wait one vblank before sending hotplug event to userspace |
Quoting Mika Kahola (2018-12-04 09:46:39)
> Occasionally, we get the following error in our CI runs

What's the actual warn here? This looks to be trimmed too much.

> [853.132830] Workqueue: events i915_hotplug_work_func [i915]
> [853.132844] RIP: 0010:drm_wait_one_vblank+0x19b/0x1b0
> 15 ff ff ff 89 ee 48 c7 c7 e8 03 10 82 e8 b5 4b a6 ff <0f> 0b e9 00 ff ff ff 0f 1f 40 00 66
> 2e 0f 1f 84 00 00 00 00 00 8b

-Chris
Quoting Mika Kahola (2018-12-04 09:46:39)
> Occasionally, we get the following error in our CI runs
>
> [853.132830] Workqueue: events i915_hotplug_work_func [i915]
> [853.132844] RIP: 0010:drm_wait_one_vblank+0x19b/0x1b0
> [853.132852] Code: fe ff ff e8 b7 4e a6 ff 48 89 e6 4c 89 ff e8 6c 5f ab ff 45 85 ed 0f 85
> 15 ff ff ff 89 ee 48 c7 c7 e8 03 10 82 e8 b5 4b a6 ff <0f> 0b e9 00 ff ff ff 0f 1f 40 00 66
> 2e 0f 1f 84 00 00 00 00 00 8b
> [853.132859] RSP: 0018:ffffc9000146bca0 EFLAGS: 00010286
> [853.132866] RAX: 0000000000000000 RBX: ffff88849ef00000 RCX: 0000000000000000
> [853.132873] RDX: 0000000000000007 RSI: ffffffff820c6f58 RDI: 00000000ffffffff
> [853.132879] RBP: 0000000000000000 R08: 000000007ffc637a R09: 0000000000000000
> [853.132884] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [853.132890] R13: 0000000000000000 R14: 000000000000d0c2 R15: ffff8884a491e680
> [853.132897] FS: 0000000000000000(0000) GS:ffff8884afe80000(0000) knlGS:0000000000000000
> [853.132904] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [853.132910] CR2: 00007f63bf0df000 CR3: 0000000005210006 CR4: 0000000000760ee0
> [853.132916] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [853.132922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [853.132927] PKRU: 55555554
> [853.132932] Call Trace:
> [853.132949]  ? wait_woken+0xa0/0xa0
> [853.133068]  intel_dp_retrain_link+0x130/0x190 [i915]
> [853.133176]  intel_ddi_hotplug+0x54/0x2e0 [i915]
> [853.133298]  i915_hotplug_work_func+0x1a9/0x240 [i915]
> [853.133324]  process_one_work+0x262/0x630
> [853.133349]  worker_thread+0x37/0x380
> [853.133365]  ? process_one_work+0x630/0x630
> [853.133373]  kthread+0x119/0x130
> [853.133383]  ? kthread_park+0x80/0x80
> [853.133400]  ret_from_fork+0x3a/0x50
> [853.133433] irq event stamp: 1426928
>
> I suspect that this is caused by a racy condition when retraining the
> DisplayPort link. My proposal is to wait for one additional vblank
> event before we send out a hotplug event to userspace for reprobing.

The problem is that just waiting for the next vblank doesn't rule out
hitting the same race with another/delayed retraining. If you want
serialisation, please do add some -- and it may be sensible for the
hotplug to wait for the vblank after any ongoing work has finished.
-Chris
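[Editor's note] To illustrate the kind of serialisation being asked for here, below is a minimal, self-contained C sketch. It is not code from this thread: the dp_link struct, its retrain_lock, and all three helpers are hypothetical names invented for illustration.

#include <linux/mutex.h>

struct dp_link {
	struct mutex retrain_lock;	/* hypothetical: serialises link retraining */
};

/* Assumed to be provided elsewhere in this sketch. */
void do_link_retrain(struct dp_link *link);
void wait_one_vblank(struct dp_link *link);
void send_hotplug_uevent(struct dp_link *link);

static void retrain_then_notify(struct dp_link *link)
{
	mutex_lock(&link->retrain_lock);
	do_link_retrain(link);
	/*
	 * Wait for the vblank after the retrain has finished, still under
	 * the lock, so that a delayed/concurrent retrain cannot slip in
	 * between the retrain and the userspace reprobe below.
	 */
	wait_one_vblank(link);
	mutex_unlock(&link->retrain_lock);

	/* Only now ask userspace to reprobe. */
	send_hotplug_uevent(link);
}

Holding the lock across the vblank wait is what gives the "wait for the vblank after any ongoing work has finished" ordering; a real implementation would have to pick a lock that cannot deadlock against the hotplug work itself.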
On Tue, 2018-12-04 at 11:41 +0000, Chris Wilson wrote:
> Quoting Mika Kahola (2018-12-04 09:46:39)
> > Occasionally, we get the following error in our CI runs
> >
> > [853.132830] Workqueue: events i915_hotplug_work_func [i915]
> > [853.132844] RIP: 0010:drm_wait_one_vblank+0x19b/0x1b0
> > [853.132852] Code: fe ff ff e8 b7 4e a6 ff 48 89 e6 4c 89 ff e8 6c 5f ab ff 45 85 ed 0f 85
> > 15 ff ff ff 89 ee 48 c7 c7 e8 03 10 82 e8 b5 4b a6 ff <0f> 0b e9 00 ff ff ff 0f 1f 40 00 66
> > 2e 0f 1f 84 00 00 00 00 00 8b
> > [853.132859] RSP: 0018:ffffc9000146bca0 EFLAGS: 00010286
> > [853.132866] RAX: 0000000000000000 RBX: ffff88849ef00000 RCX: 0000000000000000
> > [853.132873] RDX: 0000000000000007 RSI: ffffffff820c6f58 RDI: 00000000ffffffff
> > [853.132879] RBP: 0000000000000000 R08: 000000007ffc637a R09: 0000000000000000
> > [853.132884] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> > [853.132890] R13: 0000000000000000 R14: 000000000000d0c2 R15: ffff8884a491e680
> > [853.132897] FS: 0000000000000000(0000) GS:ffff8884afe80000(0000) knlGS:0000000000000000
> > [853.132904] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [853.132910] CR2: 00007f63bf0df000 CR3: 0000000005210006 CR4: 0000000000760ee0
> > [853.132916] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [853.132922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [853.132927] PKRU: 55555554
> > [853.132932] Call Trace:
> > [853.132949]  ? wait_woken+0xa0/0xa0
> > [853.133068]  intel_dp_retrain_link+0x130/0x190 [i915]
> > [853.133176]  intel_ddi_hotplug+0x54/0x2e0 [i915]
> > [853.133298]  i915_hotplug_work_func+0x1a9/0x240 [i915]
> > [853.133324]  process_one_work+0x262/0x630
> > [853.133349]  worker_thread+0x37/0x380
> > [853.133365]  ? process_one_work+0x630/0x630
> > [853.133373]  kthread+0x119/0x130
> > [853.133383]  ? kthread_park+0x80/0x80
> > [853.133400]  ret_from_fork+0x3a/0x50
> > [853.133433] irq event stamp: 1426928
> >
> > I suspect that this is caused by a racy condition when retraining
> > the
> > DisplayPort link. My proposal is to wait for one additional vblank
> > event before we send out a hotplug event to userspace for
> > reprobing.
>
> The problem is that just waiting for the next vblank doesn't rule out
> hitting the same race with another/delayed retraining. If you want
> serialisation, please do add some -- and it may be sensible for the
> hotplug to wait for the vblank after any ongoing work has finished.

This bug rarely happens, so that's why I suspected some racy condition.
Maybe link retraining just takes too much time, so we sometimes hit a
vblank timeout?

Would it be a simple enough solution to wait for one vblank between the
link retrainings?

> -Chris
On Tue, Dec 04, 2018 at 11:46:39AM +0200, Mika Kahola wrote:
> Occasionally, we get the following error in our CI runs
>
> [853.132830] Workqueue: events i915_hotplug_work_func [i915]
> [853.132844] RIP: 0010:drm_wait_one_vblank+0x19b/0x1b0
> [853.132852] Code: fe ff ff e8 b7 4e a6 ff 48 89 e6 4c 89 ff e8 6c 5f ab ff 45 85 ed 0f 85
> 15 ff ff ff 89 ee 48 c7 c7 e8 03 10 82 e8 b5 4b a6 ff <0f> 0b e9 00 ff ff ff 0f 1f 40 00 66
> 2e 0f 1f 84 00 00 00 00 00 8b
> [853.132859] RSP: 0018:ffffc9000146bca0 EFLAGS: 00010286
> [853.132866] RAX: 0000000000000000 RBX: ffff88849ef00000 RCX: 0000000000000000
> [853.132873] RDX: 0000000000000007 RSI: ffffffff820c6f58 RDI: 00000000ffffffff
> [853.132879] RBP: 0000000000000000 R08: 000000007ffc637a R09: 0000000000000000
> [853.132884] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [853.132890] R13: 0000000000000000 R14: 000000000000d0c2 R15: ffff8884a491e680
> [853.132897] FS: 0000000000000000(0000) GS:ffff8884afe80000(0000) knlGS:0000000000000000
> [853.132904] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [853.132910] CR2: 00007f63bf0df000 CR3: 0000000005210006 CR4: 0000000000760ee0
> [853.132916] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [853.132922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [853.132927] PKRU: 55555554
> [853.132932] Call Trace:
> [853.132949]  ? wait_woken+0xa0/0xa0
> [853.133068]  intel_dp_retrain_link+0x130/0x190 [i915]
> [853.133176]  intel_ddi_hotplug+0x54/0x2e0 [i915]
> [853.133298]  i915_hotplug_work_func+0x1a9/0x240 [i915]
> [853.133324]  process_one_work+0x262/0x630
> [853.133349]  worker_thread+0x37/0x380
> [853.133365]  ? process_one_work+0x630/0x630
> [853.133373]  kthread+0x119/0x130
> [853.133383]  ? kthread_park+0x80/0x80
> [853.133400]  ret_from_fork+0x3a/0x50
> [853.133433] irq event stamp: 1426928
>
> I suspect that this is caused by a racy condition when retraining the
> DisplayPort link. My proposal is to wait for one additional vblank
> event before we send out a hotplug event to userspace for reprobing.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108835

The first problem in the log is

<3> [853.020316] [drm:intel_ddi_prepare_link_retrain [i915]] *ERROR* Timeout waiting for DDI BUF A idle bit

That's where one should start.

Some suspects:
- icl_enable/disable_phy_clock_gating()
- intel_ddi_enable/disable_pipe_clock()

>
> Cc: Manasi Navare <manasi.d.navare@intel.com>
> Signed-off-by: Mika Kahola <mika.kahola@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_dp.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index a6907a1761ab..6ce7d54e49af 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -6746,6 +6746,10 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
>  {
>  	struct intel_connector *intel_connector;
>  	struct drm_connector *connector;
> +	struct drm_connector_state *conn_state;
> +	struct drm_i915_private *dev_priv;
> +	struct intel_crtc *crtc;
> +	struct intel_crtc_state *crtc_state;
>  
>  	intel_connector = container_of(work, typeof(*intel_connector),
>  				       modeset_retry_work);
> @@ -6753,6 +6757,14 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
>  	DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n", connector->base.id,
>  		      connector->name);
>  
> +	dev_priv = to_i915(connector->dev);
> +	conn_state = intel_connector->base.state;
> +
> +	crtc = to_intel_crtc(conn_state->crtc);
> +	crtc_state = to_intel_crtc_state(crtc->base.state);
> +
> +	WARN_ON(!intel_crtc_has_dp_encoder(crtc_state));
> +
>  	/* Grab the locks before changing connector property*/
>  	mutex_lock(&connector->dev->mode_config.mutex);
>  	/* Set connector link status to BAD and send a Uevent to notify
> @@ -6761,6 +6773,21 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
>  	drm_connector_set_link_status_property(connector,
>  					       DRM_MODE_LINK_STATUS_BAD);
>  	mutex_unlock(&connector->dev->mode_config.mutex);
> +
> +	/* Suppress underruns caused by re-training */
> +	intel_set_cpu_fifo_underrun_reporting(dev_priv, crtc->pipe, false);
> +	if (crtc_state->has_pch_encoder)
> +		intel_set_pch_fifo_underrun_reporting(dev_priv,
> +						      intel_crtc_pch_transcoder(crtc), false);
> +
> +	/* Keep underrun reporting disabled until things are stable */
> +	intel_wait_for_vblank(dev_priv, crtc->pipe);
> +
> +	intel_set_cpu_fifo_underrun_reporting(dev_priv, crtc->pipe, true);
> +	if (crtc_state->has_pch_encoder)
> +		intel_set_pch_fifo_underrun_reporting(dev_priv,
> +						      intel_crtc_pch_transcoder(crtc), true);
> +
>  	/* Send Hotplug uevent so userspace can reprobe */
>  	drm_kms_helper_hotplug_event(connector->dev);
>  }
> -- 
> 2.17.1
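[Editor's note] For context on the error Ville highlights, here is a rough sketch of what "waiting for the DDI BUF idle bit" amounts to. This is illustrative only and not i915's intel_ddi_prepare_link_retrain: the helper, its MMIO-pointer parameter, and the exact bit position are assumptions for the sketch (the name mirrors i915's DDI_BUF_IS_IDLE define).

#include <linux/delay.h>
#include <linux/errno.h>
#include <linux/io.h>

#define DDI_BUF_IS_IDLE	(1 << 7)	/* assumed bit position for this sketch */

static int wait_ddi_buf_idle(void __iomem *ddi_buf_ctl, unsigned int timeout_us)
{
	unsigned int us;

	for (us = 0; us < timeout_us; us++) {
		/* Hardware sets the bit once the DDI buffer has gone idle. */
		if (readl(ddi_buf_ctl) & DDI_BUF_IS_IDLE)
			return 0;
		udelay(1);
	}

	/* This path corresponds to the "Timeout waiting for DDI BUF A idle bit" error. */
	return -ETIMEDOUT;
}

If the buffer never reports idle, as in the CI log above, the driver logs the error and the retrain continues against a port that is not actually idle, which is why the clock-gating and pipe-clock suspects are worth checking first.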
On Tue, 2018-12-04 at 21:43 +0200, Ville Syrjälä wrote:
> On Tue, Dec 04, 2018 at 11:46:39AM +0200, Mika Kahola wrote:
> > Occasionally, we get the following error in our CI runs
> >
> > [853.132830] Workqueue: events i915_hotplug_work_func [i915]
> > [853.132844] RIP: 0010:drm_wait_one_vblank+0x19b/0x1b0
> > [853.132852] Code: fe ff ff e8 b7 4e a6 ff 48 89 e6 4c 89 ff e8 6c 5f ab ff 45 85 ed 0f 85
> > 15 ff ff ff 89 ee 48 c7 c7 e8 03 10 82 e8 b5 4b a6 ff <0f> 0b e9 00 ff ff ff 0f 1f 40 00 66
> > 2e 0f 1f 84 00 00 00 00 00 8b
> > [853.132859] RSP: 0018:ffffc9000146bca0 EFLAGS: 00010286
> > [853.132866] RAX: 0000000000000000 RBX: ffff88849ef00000 RCX: 0000000000000000
> > [853.132873] RDX: 0000000000000007 RSI: ffffffff820c6f58 RDI: 00000000ffffffff
> > [853.132879] RBP: 0000000000000000 R08: 000000007ffc637a R09: 0000000000000000
> > [853.132884] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> > [853.132890] R13: 0000000000000000 R14: 000000000000d0c2 R15: ffff8884a491e680
> > [853.132897] FS: 0000000000000000(0000) GS:ffff8884afe80000(0000) knlGS:0000000000000000
> > [853.132904] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [853.132910] CR2: 00007f63bf0df000 CR3: 0000000005210006 CR4: 0000000000760ee0
> > [853.132916] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [853.132922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [853.132927] PKRU: 55555554
> > [853.132932] Call Trace:
> > [853.132949]  ? wait_woken+0xa0/0xa0
> > [853.133068]  intel_dp_retrain_link+0x130/0x190 [i915]
> > [853.133176]  intel_ddi_hotplug+0x54/0x2e0 [i915]
> > [853.133298]  i915_hotplug_work_func+0x1a9/0x240 [i915]
> > [853.133324]  process_one_work+0x262/0x630
> > [853.133349]  worker_thread+0x37/0x380
> > [853.133365]  ? process_one_work+0x630/0x630
> > [853.133373]  kthread+0x119/0x130
> > [853.133383]  ? kthread_park+0x80/0x80
> > [853.133400]  ret_from_fork+0x3a/0x50
> > [853.133433] irq event stamp: 1426928
> >
> > I suspect that this is caused by a racy condition when retraining
> > the
> > DisplayPort link. My proposal is to wait for one additional vblank
> > event before we send out a hotplug event to userspace for
> > reprobing.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108835
>
> The first problem in the log is
> <3> [853.020316] [drm:intel_ddi_prepare_link_retrain [i915]] *ERROR*
> Timeout waiting for DDI BUF A idle bit
> That's where one should start.
>
> Some suspects:
> - icl_enable/disable_phy_clock_gating()
> - intel_ddi_enable/disable_pipe_clock()

Thanks! I will have a look at those too.
On the other hand, this test failure marked as INCOMPLETE in CI might have been caused by a Jenkins issue:

[21/79] ( 873s left) kms_flip (blocking-absolute-wf_vblank-interruptible)
FATAL: command execution failed
java.io.EOFException
    at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2681)
    at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3156)
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:862)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358)
    at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
    at hudson.remoting.Command.readFrom(Command.java:140)
    at hudson.remoting.Command.readFrom(Command.java:126)
    at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36)
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Caused: java.io.IOException: Unexpected termination of the channel
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
Caused: java.io.IOException: Backing channel 'shard-iclb7' is disconnected.
    at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:214)
    at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:283)
    at com.sun.proxy.$Proxy64.isAlive(Unknown Source)
    at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1144)
    at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1136)
    at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155)
    at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109)
    at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
    at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
    at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
    at hudson.model.Build$BuildExecution.build(Build.java:206)
    at hudson.model.Build$BuildExecution.doRun(Build.java:163)
    at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
    at hudson.model.Run.execute(Run.java:1810)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
    at hudson.model.ResourceController.execute(ResourceController.java:97)
    at hudson.model.Executor.run(Executor.java:429)
FATAL: Unable to delete script file /tmp/jenkins9130735500847889838.sh
java.io.EOFException
    at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2681)
    at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3156)
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:862)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358)
    at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
    at hudson.remoting.Command.readFrom(Command.java:140)
    at hudson.remoting.Command.readFrom(Command.java:126)
    at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36)
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Caused: java.io.IOException: Unexpected termination of the channel
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
Caused: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on shard-iclb7 failed.
The channel is closing down or has closed down
    at hudson.remoting.Channel.call(Channel.java:948)
    at hudson.FilePath.act(FilePath.java:1070)
    at hudson.FilePath.act(FilePath.java:1059)
    at hudson.FilePath.delete(FilePath.java:1563)
    at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:123)
    at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
    at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
    at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
    at hudson.model.Build$BuildExecution.build(Build.java:206)
    at hudson.model.Build$BuildExecution.doRun(Build.java:163)
    at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504)
    at hudson.model.Run.execute(Run.java:1810)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
    at hudson.model.ResourceController.execute(ResourceController.java:97)
    at hudson.model.Executor.run(Executor.java:429)

> >
> > Cc: Manasi Navare <manasi.d.navare@intel.com>
> > Signed-off-by: Mika Kahola <mika.kahola@intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_dp.c | 27 +++++++++++++++++++++++++++
> >  1 file changed, 27 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> > index a6907a1761ab..6ce7d54e49af 100644
> > --- a/drivers/gpu/drm/i915/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/intel_dp.c
> > @@ -6746,6 +6746,10 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
> >  {
> >  	struct intel_connector *intel_connector;
> >  	struct drm_connector *connector;
> > +	struct drm_connector_state *conn_state;
> > +	struct drm_i915_private *dev_priv;
> > +	struct intel_crtc *crtc;
> > +	struct intel_crtc_state *crtc_state;
> >  
> >  	intel_connector = container_of(work, typeof(*intel_connector),
> >  				       modeset_retry_work);
> > @@ -6753,6 +6757,14 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
> >  	DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n", connector->base.id,
> >  		      connector->name);
> >  
> > +	dev_priv = to_i915(connector->dev);
> > +	conn_state = intel_connector->base.state;
> > +
> > +	crtc = to_intel_crtc(conn_state->crtc);
> > +	crtc_state = to_intel_crtc_state(crtc->base.state);
> > +
> > +	WARN_ON(!intel_crtc_has_dp_encoder(crtc_state));
> > +
> >  	/* Grab the locks before changing connector property*/
> >  	mutex_lock(&connector->dev->mode_config.mutex);
> >  	/* Set connector link status to BAD and send a Uevent to notify
> > @@ -6761,6 +6773,21 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
> >  	drm_connector_set_link_status_property(connector,
> >  					       DRM_MODE_LINK_STATUS_BAD);
> >  	mutex_unlock(&connector->dev->mode_config.mutex);
> > +
> > +	/* Suppress underruns caused by re-training */
> > +	intel_set_cpu_fifo_underrun_reporting(dev_priv, crtc->pipe, false);
> > +	if (crtc_state->has_pch_encoder)
> > +		intel_set_pch_fifo_underrun_reporting(dev_priv,
> > +						      intel_crtc_pch_transcoder(crtc), false);
> > +
> > +	/* Keep underrun reporting disabled until things are stable */
> > +	intel_wait_for_vblank(dev_priv, crtc->pipe);
> > +
> > +	intel_set_cpu_fifo_underrun_reporting(dev_priv, crtc->pipe, true);
> > +	if (crtc_state->has_pch_encoder)
> > +		intel_set_pch_fifo_underrun_reporting(dev_priv,
> > +						      intel_crtc_pch_transcoder(crtc), true);
> > +
> >  	/* Send Hotplug uevent so userspace can reprobe */
> >  	drm_kms_helper_hotplug_event(connector->dev);
> >  }
> > -- 
> > 2.17.1
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index a6907a1761ab..6ce7d54e49af 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -6746,6 +6746,10 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
 {
 	struct intel_connector *intel_connector;
 	struct drm_connector *connector;
+	struct drm_connector_state *conn_state;
+	struct drm_i915_private *dev_priv;
+	struct intel_crtc *crtc;
+	struct intel_crtc_state *crtc_state;
 
 	intel_connector = container_of(work, typeof(*intel_connector),
 				       modeset_retry_work);
@@ -6753,6 +6757,14 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
 	DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n", connector->base.id,
 		      connector->name);
 
+	dev_priv = to_i915(connector->dev);
+	conn_state = intel_connector->base.state;
+
+	crtc = to_intel_crtc(conn_state->crtc);
+	crtc_state = to_intel_crtc_state(crtc->base.state);
+
+	WARN_ON(!intel_crtc_has_dp_encoder(crtc_state));
+
 	/* Grab the locks before changing connector property*/
 	mutex_lock(&connector->dev->mode_config.mutex);
 	/* Set connector link status to BAD and send a Uevent to notify
@@ -6761,6 +6773,21 @@ static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
 	drm_connector_set_link_status_property(connector,
 					       DRM_MODE_LINK_STATUS_BAD);
 	mutex_unlock(&connector->dev->mode_config.mutex);
+
+	/* Suppress underruns caused by re-training */
+	intel_set_cpu_fifo_underrun_reporting(dev_priv, crtc->pipe, false);
+	if (crtc_state->has_pch_encoder)
+		intel_set_pch_fifo_underrun_reporting(dev_priv,
+						      intel_crtc_pch_transcoder(crtc), false);
+
+	/* Keep underrun reporting disabled until things are stable */
+	intel_wait_for_vblank(dev_priv, crtc->pipe);
+
+	intel_set_cpu_fifo_underrun_reporting(dev_priv, crtc->pipe, true);
+	if (crtc_state->has_pch_encoder)
+		intel_set_pch_fifo_underrun_reporting(dev_priv,
+						      intel_crtc_pch_transcoder(crtc), true);
+
 	/* Send Hotplug uevent so userspace can reprobe */
 	drm_kms_helper_hotplug_event(connector->dev);
 }
Occasionally, we get the following error in our CI runs

[853.132830] Workqueue: events i915_hotplug_work_func [i915]
[853.132844] RIP: 0010:drm_wait_one_vblank+0x19b/0x1b0
[853.132852] Code: fe ff ff e8 b7 4e a6 ff 48 89 e6 4c 89 ff e8 6c 5f ab ff 45 85 ed 0f 85 15 ff ff ff 89 ee 48 c7 c7 e8 03 10 82 e8 b5 4b a6 ff <0f> 0b e9 00 ff ff ff 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 8b
[853.132859] RSP: 0018:ffffc9000146bca0 EFLAGS: 00010286
[853.132866] RAX: 0000000000000000 RBX: ffff88849ef00000 RCX: 0000000000000000
[853.132873] RDX: 0000000000000007 RSI: ffffffff820c6f58 RDI: 00000000ffffffff
[853.132879] RBP: 0000000000000000 R08: 000000007ffc637a R09: 0000000000000000
[853.132884] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[853.132890] R13: 0000000000000000 R14: 000000000000d0c2 R15: ffff8884a491e680
[853.132897] FS: 0000000000000000(0000) GS:ffff8884afe80000(0000) knlGS:0000000000000000
[853.132904] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[853.132910] CR2: 00007f63bf0df000 CR3: 0000000005210006 CR4: 0000000000760ee0
[853.132916] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[853.132922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[853.132927] PKRU: 55555554
[853.132932] Call Trace:
[853.132949]  ? wait_woken+0xa0/0xa0
[853.133068]  intel_dp_retrain_link+0x130/0x190 [i915]
[853.133176]  intel_ddi_hotplug+0x54/0x2e0 [i915]
[853.133298]  i915_hotplug_work_func+0x1a9/0x240 [i915]
[853.133324]  process_one_work+0x262/0x630
[853.133349]  worker_thread+0x37/0x380
[853.133365]  ? process_one_work+0x630/0x630
[853.133373]  kthread+0x119/0x130
[853.133383]  ? kthread_park+0x80/0x80
[853.133400]  ret_from_fork+0x3a/0x50
[853.133433] irq event stamp: 1426928

I suspect that this is caused by a racy condition when retraining the
DisplayPort link. My proposal is to wait for one additional vblank
event before we send out a hotplug event to userspace for reprobing.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108835
Cc: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Mika Kahola <mika.kahola@intel.com>
---
 drivers/gpu/drm/i915/intel_dp.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)