Message ID | 20160205145831.GA14084@wunner.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Feb 05, 2016 at 03:58:31PM +0100, Lukas Wunner wrote: > Hi Chris, > > On Fri, Feb 05, 2016 at 11:09:27AM +0000, Chris Wilson wrote: > > On Fri, Feb 05, 2016 at 01:27:10AM +0100, Lukas Wunner wrote: > > > On Thu, Feb 04, 2016 at 09:21:17AM +0000, Li, Weinan Z wrote: > > > > We still need this patch. Seems 54632abe8ca3 ("drm/i915: Fix oops caused by fbdev initialization > > > > failure") as well as 366e39b4d2c5 ("drm/i915: Tear down fbdev if > > > > initialization fails") this 2 patches can???t cover this case. It???s access ifbdev->fb before the initialization > > > > finished, but not initialization failed. If don???t have any other patches or code update relative, it may still have in 4.5. > > > > > > Okay, I see. > > > > > > > > > > > add info NULL check should be better, it is also initialized in the async queue > > > > > info = ifbdev->helper.fbdev; > > > > > + if (info == NULL) > > > > > + return false; > > > > > if (!info->screen_base) > > > > > > So if there's indeed a race between fbdev initialization and the call to > > > intel_fbdev_restore_mode() (on lastclose), there's more that can go awry: > > > - intel_fbdev_restore_mode() dereferences ifbdev->fb->obj > > > - it calls __drm_atomic_helper_set_config() which WARNs if ifbdev->fb is NULL > > > - it calls drm_fb_helper_set_par() which dereferences ifbdev->helper.fbdev > > > > > > Instead of checking all that it's probably simpler to call > > > async_synchronize_full() at the beginning of intel_fbdev_restore_mode(), > > > as Li Weinan suggested. I'll submit the corresponding one-liner patch > > > tomorrow if noone else does. > > > > > > Chris' patch also modified intel_fbdev_set_suspend() as well as > > > intel_fbdev_output_poll_changed(), not sure if these are racy as well. > > > At least the stack traces posted by Li Weinan and Gustav Fägerlind > > > only indicate that lastclose is racy. > > > > We called set-suspend internally, but we don't do any checks before > > hand. I was concerned we may be able to get into a situation where > > .fb_probe fails and leaves a dangling dev_priv->ifbdev for which we > > would then trip over the NULL info->screen_base. So I was looking for a > > suitable guard. > > Yes, looking at this with a fresh pair of eyeballs I think you were > totally right, i915_pm_ops is declared as part of i915_pci_driver and > it might indeed happen that i915_pm_suspend() is called before the fbdev > is fully initialized. > > As for intel_fbdev_output_poll_changed(), there's even a comment in > i915_load_modeset_init() stating that hotplug events might occur > during fdbev initial config. > > Below patch adds the requisite async_synchronize_full() to the three > functions that you also modified and updates that comment. > > However AFAICT a corner case remains where we're still vulnerable to races: > async_schedule() runs synchronously "if we're out of memory or if there's > too much work pending already" (see __async_schedule() in kernel/async.c). > In this case no entry is added to the pending list and > async_synchronize_full() might theoretically return before the synchronously > executed function has finished. > > The existing call to async_synchronize_full() in intel_fbdev_fini() > seems to be susceptible to the same race condition, but it's probably > very hard to trigger it. (Unload the module before the fbdev is fully > initialized.) > > To make it bullet proof we could declare a completion in intel_fbdev.c > and wait for that instead. Opinions? > > Thanks, > > Lukas > > -- >8 -- > > Subject: [PATCH] drm/i915: Fix races on fbdev > > The ->lastclose callback invokes intel_fbdev_restore_mode() and has > been witnessed to run before intel_fbdev_initial_config_async() > has finished. > > We might likewise receive hotplug events or be suspended before > we've had a chance to fully set up the fbdev. > > Fix by waiting for the asynchronous thread to finish. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93580 > Reported-by: Gustav FÀgerlind <gustav.fagerlind@gmail.com> > Reported-by: "Li, Weinan Z" <weinan.z.li@intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Cc: stable@vger.kernel.org > Signed-off-by: Lukas Wunner <lukas@wunner.de> Can you pls resubmit this patch stand-alone? Patchwork (and therefore CI) won't pick up inlined patches like this one here unfortunately. Thanks, Daniel > --- > drivers/gpu/drm/i915/i915_dma.c | 8 +++----- > drivers/gpu/drm/i915/intel_fbdev.c | 4 ++++ > 2 files changed, 7 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > index a0f5659..a76b528 100644 > --- a/drivers/gpu/drm/i915/i915_dma.c > +++ b/drivers/gpu/drm/i915/i915_dma.c > @@ -437,11 +437,9 @@ static int i915_load_modeset_init(struct drm_device *dev) > * Some ports require correctly set-up hpd registers for detection to > * work properly (leading to ghost connected connector status), e.g. VGA > * on gm45. Hence we can only set up the initial fbdev config after hpd > - * irqs are fully enabled. Now we should scan for the initial config > - * only once hotplug handling is enabled, but due to screwed-up locking > - * around kms/fbdev init we can't protect the fdbev initial config > - * scanning against hotplug events. Hence do this first and ignore the > - * tiny window where we will loose hotplug notifactions. > + * irqs are fully enabled. We protect the fbdev initial config scanning > + * against hotplug events by waiting in intel_fbdev_output_poll_changed > + * until the asynchronous thread has finished. > */ > intel_fbdev_initial_config_async(dev); > > diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c > index 09840f4..2002b13 100644 > --- a/drivers/gpu/drm/i915/intel_fbdev.c > +++ b/drivers/gpu/drm/i915/intel_fbdev.c > @@ -749,6 +749,7 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous > struct intel_fbdev *ifbdev = dev_priv->fbdev; > struct fb_info *info; > > + async_synchronize_full(); > if (!ifbdev) > return; > > @@ -795,6 +796,8 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous > void intel_fbdev_output_poll_changed(struct drm_device *dev) > { > struct drm_i915_private *dev_priv = dev->dev_private; > + > + async_synchronize_full(); > if (dev_priv->fbdev) > drm_fb_helper_hotplug_event(&dev_priv->fbdev->helper); > } > @@ -806,6 +809,7 @@ void intel_fbdev_restore_mode(struct drm_device *dev) > struct intel_fbdev *ifbdev = dev_priv->fbdev; > struct drm_fb_helper *fb_helper; > > + async_synchronize_full(); > if (!ifbdev) > return; > > -- > 1.8.5.2 (Apple Git-48) > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Originally submitted inline with <20160205145831.GA14084@wunner.de>, hereby resubmitted standalone for CI as requested by Daniel with <20160215163251.GR11240@phenom.ffwll.local>. Above-mentioned message contained the following important note: > However AFAICT a corner case remains where we're still vulnerable to races: > async_schedule() runs synchronously "if we're out of memory or if there's > too much work pending already" (see __async_schedule() in kernel/async.c). > In this case no entry is added to the pending list and > async_synchronize_full() might theoretically return before the synchronously > executed function has finished. > > The existing call to async_synchronize_full() in intel_fbdev_fini() > seems to be susceptible to the same race condition, but it's probably > very hard to trigger it. (Unload the module before the fbdev is fully > initialized.) > > To make it bullet proof we could declare a completion in intel_fbdev.c > and wait for that instead. Opinions? Lukas Wunner (1): drm/i915: Fix races on fbdev drivers/gpu/drm/i915/i915_dma.c | 8 +++----- drivers/gpu/drm/i915/intel_fbdev.c | 4 ++++ 2 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index a0f5659..a76b528 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -437,11 +437,9 @@ static int i915_load_modeset_init(struct drm_device *dev) * Some ports require correctly set-up hpd registers for detection to * work properly (leading to ghost connected connector status), e.g. VGA * on gm45. Hence we can only set up the initial fbdev config after hpd - * irqs are fully enabled. Now we should scan for the initial config - * only once hotplug handling is enabled, but due to screwed-up locking - * around kms/fbdev init we can't protect the fdbev initial config - * scanning against hotplug events. Hence do this first and ignore the - * tiny window where we will loose hotplug notifactions. + * irqs are fully enabled. We protect the fbdev initial config scanning + * against hotplug events by waiting in intel_fbdev_output_poll_changed + * until the asynchronous thread has finished. */ intel_fbdev_initial_config_async(dev); diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c index 09840f4..2002b13 100644 --- a/drivers/gpu/drm/i915/intel_fbdev.c +++ b/drivers/gpu/drm/i915/intel_fbdev.c @@ -749,6 +749,7 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous struct intel_fbdev *ifbdev = dev_priv->fbdev; struct fb_info *info; + async_synchronize_full(); if (!ifbdev) return; @@ -795,6 +796,8 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous void intel_fbdev_output_poll_changed(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev->dev_private; + + async_synchronize_full(); if (dev_priv->fbdev) drm_fb_helper_hotplug_event(&dev_priv->fbdev->helper); } @@ -806,6 +809,7 @@ void intel_fbdev_restore_mode(struct drm_device *dev) struct intel_fbdev *ifbdev = dev_priv->fbdev; struct drm_fb_helper *fb_helper; + async_synchronize_full(); if (!ifbdev) return;