Message ID | 20161020150530.5787-1-tiwai@suse.de (mailing list archive) |
---|---|
State | New, archived |
On Thu, Oct 20, 2016 at 05:05:30PM +0200, Takashi Iwai wrote:
> Since 4.7 kernel, we've seen the error messages like
>
>  kernel: [TTM] Buffer eviction failed
>  kernel: qxl 0000:00:02.0: object_init failed for (4026540032, 0x00000001)
>  kernel: [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO
>
> on QXL when switching and accessing on VT.  The culprit was the
> generic deferred_io code (qxl driver switched to it since 4.7).
> There is a race between the dirty clip update and the call of
> callback.
>
> In drm_fb_helper_dirty(), the dirty clip is updated in the spinlock,
> while it kicks off the update worker outside the spinlock.  Meanwhile
> the update worker clears the dirty clip in the spinlock, too.  Thus,
> when drm_fb_helper_dirty() is called concurrently, schedule_work() is
> called after the clip is cleared in the first worker call.
>
> This patch addresses it by validating the clip before calling the
> dirty fb callback.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98322
> Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1003298
> Fixes: eaa434defaca ('drm/fb-helper: Add fb_deferred_io support')
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Takashi Iwai <tiwai@suse.de>

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

> ---
> v1->v2: simplified the code as suggested by Ville
>
>  drivers/gpu/drm/drm_fb_helper.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> index 03414bde1f15..aae7df01864d 100644
> --- a/drivers/gpu/drm/drm_fb_helper.c
> +++ b/drivers/gpu/drm/drm_fb_helper.c
> @@ -644,7 +644,9 @@ static void drm_fb_helper_dirty_work(struct work_struct *work)
>  	clip->x2 = clip->y2 = 0;
>  	spin_unlock_irqrestore(&helper->dirty_lock, flags);
>
> -	helper->fb->funcs->dirty(helper->fb, NULL, 0, 0, &clip_copy, 1);
> +	/* call dirty callback only when it has been really touched */
> +	if (clip_copy.x1 < clip_copy.x2 && clip_copy.y1 < clip_copy.y2)
> +		helper->fb->funcs->dirty(helper->fb, NULL, 0, 0, &clip_copy, 1);
>  }
>
>  /**
> --
> 2.10.1

On Fri, 21 Oct 2016 14:52:07 +0200,
Ville Syrjälä wrote:
>
> On Thu, Oct 20, 2016 at 05:05:30PM +0200, Takashi Iwai wrote:
> > [...]
> > Signed-off-by: Takashi Iwai <tiwai@suse.de>
>
> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

Daniel, could you pick this if it's OK as a quick fix?  Currently
qxl driver is utterly broken, and we should recover it ASAP.  On top
of this, we can put a more comprehensive fix covering both this and
dirtyfb ioctl code paths.


thanks,

Takashi

> [...]
> --
> Ville Syrjälä
> Intel OTC

On Tue, Oct 25, 2016 at 08:46:28AM +0200, Takashi Iwai wrote:
> On Fri, 21 Oct 2016 14:52:07 +0200,
> Ville Syrjälä wrote:
> > [...]
> > Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Daniel, could you pick this if it's OK as a quick fix?  Currently
> qxl driver is utterly broken, and we should recover it ASAP.  On top
> of this, we can put a more comprehensive fix covering both this and
> dirtyfb ioctl code paths.

I thought I've pinged Dave already to pick up, I'll poke him again.
-Daniel

On Tue, 25 Oct 2016 10:09:30 +0200,
Daniel Vetter wrote:
>
> On Tue, Oct 25, 2016 at 08:46:28AM +0200, Takashi Iwai wrote:
> > [...]
> > Daniel, could you pick this if it's OK as a quick fix?  Currently
> > qxl driver is utterly broken, and we should recover it ASAP.  On top
> > of this, we can put a more comprehensive fix covering both this and
> > dirtyfb ioctl code paths.
>
> I thought I've pinged Dave already to pick up, I'll poke him again.

Thanks!

Takashi

On Tue, Oct 25, 2016 at 10:17:47AM +0200, Takashi Iwai wrote:
> On Tue, 25 Oct 2016 10:09:30 +0200,
> Daniel Vetter wrote:
> > [...]
> > I thought I've pinged Dave already to pick up, I'll poke him again.
>
> Thanks!

Ok, trying something new, thrown into drm-misc-fixes.
-Daniel

diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 03414bde1f15..aae7df01864d 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -644,7 +644,9 @@ static void drm_fb_helper_dirty_work(struct work_struct *work)
 	clip->x2 = clip->y2 = 0;
 	spin_unlock_irqrestore(&helper->dirty_lock, flags);
 
-	helper->fb->funcs->dirty(helper->fb, NULL, 0, 0, &clip_copy, 1);
+	/* call dirty callback only when it has been really touched */
+	if (clip_copy.x1 < clip_copy.x2 && clip_copy.y1 < clip_copy.y2)
+		helper->fb->funcs->dirty(helper->fb, NULL, 0, 0, &clip_copy, 1);
 }
 
 /**

Since 4.7 kernel, we've seen the error messages like

 kernel: [TTM] Buffer eviction failed
 kernel: qxl 0000:00:02.0: object_init failed for (4026540032, 0x00000001)
 kernel: [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO

on QXL when switching and accessing on VT.  The culprit was the
generic deferred_io code (qxl driver switched to it since 4.7).
There is a race between the dirty clip update and the call of
callback.

In drm_fb_helper_dirty(), the dirty clip is updated in the spinlock,
while it kicks off the update worker outside the spinlock.  Meanwhile
the update worker clears the dirty clip in the spinlock, too.  Thus,
when drm_fb_helper_dirty() is called concurrently, schedule_work() is
called after the clip is cleared in the first worker call.

This patch addresses it by validating the clip before calling the
dirty fb callback.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98322
Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1003298
Fixes: eaa434defaca ('drm/fb-helper: Add fb_deferred_io support')
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
v1->v2: simplified the code as suggested by Ville

 drivers/gpu/drm/drm_fb_helper.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)