diff mbox

[1/2] drm/i915: introduce & use i915_gem_object_mark_dirty()

Message ID 1461860781-4504-1-git-send-email-david.s.gordon@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Dave Gordon April 28, 2016, 4:26 p.m. UTC
This just hides the existing obj->dirty flag inside a trivial inline
setter, to discourage non-GEM code from looking too closely.

Existing code that sets obj->dirty is then changed to use the function
instead.

Inspired-by: http://www.spinics.net/lists/intel-gfx/msg92390.html
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            | 11 +++++++++++
 drivers/gpu/drm/i915/i915_gem.c            |  6 +++---
 drivers/gpu/drm/i915/i915_gem_context.c    |  2 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c           |  6 +++---
 5 files changed, 19 insertions(+), 8 deletions(-)

Comments

Chris Wilson April 28, 2016, 4:34 p.m. UTC | #1
On Thu, Apr 28, 2016 at 05:26:20PM +0100, Dave Gordon wrote:
> This just hides the existing obj->dirty flag inside a trivial inline
> setter, to discourage non-GEM code from looking too closely.
> 
> Existing code that sets obj->dirty is then changed to use the function
> instead.

I prefer set_dirty, unset_dirty, is_dirty which is what I used in my
patches.
-Chris
Dave Gordon April 28, 2016, 5:32 p.m. UTC | #2
On 28/04/16 17:34, Chris Wilson wrote:
> On Thu, Apr 28, 2016 at 05:26:20PM +0100, Dave Gordon wrote:
>> This just hides the existing obj->dirty flag inside a trivial inline
>> setter, to discourage non-GEM code from looking too closely.
>>
>> Existing code that sets obj->dirty is then changed to use the function
>> instead.
>
> I prefer set_dirty, unset_dirty, is_dirty which is what I used in my
> patches.
> -Chris

I wasn't going to abstract the test and clear operation because only GEM 
code needs those. In fact (apart from debugfs and error capture), only 
put_pages() implementations ever test or clear them.

Anyway, the real reason for sending this patchset was to see whether my 
local result was reproducible, namely that it will expose at least one 
path where an object is marked dirty while not pinned, in defiance of 
your previous comment about the invalidity of such an operation.

.Dave.
Dave Gordon April 28, 2016, 6:36 p.m. UTC | #3
On 28/04/16 18:48, Patchwork wrote:
> == Series Details ==
>
> Series: series starting with [1/2] drm/i915: introduce & use i915_gem_object_mark_dirty()
> URL   : https://patchwork.freedesktop.org/series/6491/
> State : warning
>
> == Summary ==
>
> Series 6491v1 Series without cover letter
> http://patchwork.freedesktop.org/api/1.0/series/6491/revisions/1/mbox/
>
> Test gem_busy:
>          Subgroup basic-blt:
>                  pass       -> DMESG-WARN (bsw-nuc-2)
>                  pass       -> DMESG-WARN (skl-nuci5)
>                  pass       -> DMESG-WARN (bdw-nuci7-2)
>                  pass       -> DMESG-WARN (ivb-t430s)
>                  pass       -> DMESG-WARN (bdw-ultra)
>                  pass       -> DMESG-WARN (skl-i7k-2)
>                  pass       -> DMESG-WARN (byt-nuc)
>                  pass       -> DMESG-WARN (snb-x220t)
>                  pass       -> DMESG-WARN (hsw-brixbox)

Well, that's as expected: it's hitting the WARN_ON() that I put in there 
to check on usage of obj->dirty vs. pages_pin_count. Stack traces are 
all the same, like this one:

[   72.459223] ------------[ cut here ]------------
[   72.459254] WARNING: CPU: 0 PID: 6012 at 
drivers/gpu/drm/i915/i915_drv.h:3027 
i915_gem_object_set_to_gtt_domain+0x21c/0x280 [i915]
[   72.459255] WARN_ON(obj->pages_pin_count == 0)
[   72.459256] Modules linked in:
[   72.459257]  snd_hda_codec_hdmi snd_hda_intel i915 
x86_pkg_temp_thermal snd_hda_codec snd_hwdep snd_hda_core 
intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul mei_me lpc_ich 
ghash_clmulni_intel mei snd_pcm r8169 mii
[   72.459266] CPU: 0 PID: 6012 Comm: gem_busy Tainted: G        W 
  4.6.0-rc5-CI-Patchwork_2105+ #1
[   72.459267] Hardware name: Gigabyte Technology Co., Ltd. 
H87M-D3H/H87M-D3H, BIOS F11 08/18/2015
[   72.459268]  0000000000000000 ffff880212053c80 ffffffff8140de35 
ffff880212053cd0
[   72.459270]  0000000000000000 ffff880212053cc0 ffffffff81079c8c 
00000bd312e5a980
[   72.459272]  ffff880212e5a980 0000000000000001 ffff8800d7c70000 
0000000000000000
[   72.459274] Call Trace:
[   72.459277]  [<ffffffff8140de35>] dump_stack+0x67/0x92
[   72.459280]  [<ffffffff81079c8c>] __warn+0xcc/0xf0
[   72.459281]  [<ffffffff81079cfa>] warn_slowpath_fmt+0x4a/0x50
[   72.459293]  [<ffffffffa01efacb>] ? 
i915_gem_object_flush_cpu_write_domain.part.47+0x14b/0x1b0 [i915]
[   72.459303]  [<ffffffffa01f113c>] 
i915_gem_object_set_to_gtt_domain+0x21c/0x280 [i915]
[   72.459313]  [<ffffffffa01f128e>] 
i915_gem_set_domain_ioctl+0xee/0x160 [i915]
[   72.459315]  [<ffffffff815282ed>] drm_ioctl+0x13d/0x590
[   72.459325]  [<ffffffffa01f11a0>] ? 
i915_gem_object_set_to_gtt_domain+0x280/0x280 [i915]
[   72.459327]  [<ffffffff81199ba7>] ? handle_mm_fault+0x47/0x1e90
[   72.459329]  [<ffffffff811ee38a>] do_vfs_ioctl+0x8a/0x670
[   72.459331]  [<ffffffff811fa21a>] ? __fget_light+0x6a/0x90
[   72.459332]  [<ffffffff811ee9ac>] SyS_ioctl+0x3c/0x70
[   72.459333]  [<ffffffff817dc7a9>] entry_SYSCALL_64_fastpath+0x1c/0xac
[   72.459334] ---[ end trace 156adc997a22f992 ]---

So, is that a bug, marking an object dirty when pages_pin_count is 0? 
Does that mean that a program can set a BO to the GTT domain (or the CPU 
domain?), update its contents, and then it gets paged out due to memory 
pressure and the updates are lost?

Or ... no, I think the problem scenario would be
* set to GTT => mark dirty
* BO paged out => flushed to swap, marked clean
* BO paged in => still clean
* update contents => still clean?
* get paged out => not written out?

Or are we guaranteed to hit another mark_dirty during the process of 
updating the contents of the paged-in buffer?

.Dave.
Daniel Vetter May 2, 2016, 8:58 a.m. UTC | #4
On Thu, Apr 28, 2016 at 07:36:32PM +0100, Dave Gordon wrote:
> On 28/04/16 18:48, Patchwork wrote:
> >== Series Details ==
> >
> >Series: series starting with [1/2] drm/i915: introduce & use i915_gem_object_mark_dirty()
> >URL   : https://patchwork.freedesktop.org/series/6491/
> >State : warning
> >
> >== Summary ==
> >
> >Series 6491v1 Series without cover letter
> >http://patchwork.freedesktop.org/api/1.0/series/6491/revisions/1/mbox/
> >
> >Test gem_busy:
> >         Subgroup basic-blt:
> >                 pass       -> DMESG-WARN (bsw-nuc-2)
> >                 pass       -> DMESG-WARN (skl-nuci5)
> >                 pass       -> DMESG-WARN (bdw-nuci7-2)
> >                 pass       -> DMESG-WARN (ivb-t430s)
> >                 pass       -> DMESG-WARN (bdw-ultra)
> >                 pass       -> DMESG-WARN (skl-i7k-2)
> >                 pass       -> DMESG-WARN (byt-nuc)
> >                 pass       -> DMESG-WARN (snb-x220t)
> >                 pass       -> DMESG-WARN (hsw-brixbox)
> 
> Well, that's as expected: it's hitting the WARN_ON() that I put in there to
> check on usage of obj->dirty vs. pages_pin_count. Stack traces are all the
> same, like this one:
> 
> [   72.459223] ------------[ cut here ]------------
> [   72.459254] WARNING: CPU: 0 PID: 6012 at
> drivers/gpu/drm/i915/i915_drv.h:3027
> i915_gem_object_set_to_gtt_domain+0x21c/0x280 [i915]
> [   72.459255] WARN_ON(obj->pages_pin_count == 0)
> [   72.459256] Modules linked in:
> [   72.459257]  snd_hda_codec_hdmi snd_hda_intel i915 x86_pkg_temp_thermal
> snd_hda_codec snd_hwdep snd_hda_core intel_powerclamp coretemp
> crct10dif_pclmul crc32_pclmul mei_me lpc_ich ghash_clmulni_intel mei snd_pcm
> r8169 mii
> [   72.459266] CPU: 0 PID: 6012 Comm: gem_busy Tainted: G        W
> 4.6.0-rc5-CI-Patchwork_2105+ #1
> [   72.459267] Hardware name: Gigabyte Technology Co., Ltd.
> H87M-D3H/H87M-D3H, BIOS F11 08/18/2015
> [   72.459268]  0000000000000000 ffff880212053c80 ffffffff8140de35
> ffff880212053cd0
> [   72.459270]  0000000000000000 ffff880212053cc0 ffffffff81079c8c
> 00000bd312e5a980
> [   72.459272]  ffff880212e5a980 0000000000000001 ffff8800d7c70000
> 0000000000000000
> [   72.459274] Call Trace:
> [   72.459277]  [<ffffffff8140de35>] dump_stack+0x67/0x92
> [   72.459280]  [<ffffffff81079c8c>] __warn+0xcc/0xf0
> [   72.459281]  [<ffffffff81079cfa>] warn_slowpath_fmt+0x4a/0x50
> [   72.459293]  [<ffffffffa01efacb>] ?
> i915_gem_object_flush_cpu_write_domain.part.47+0x14b/0x1b0 [i915]
> [   72.459303]  [<ffffffffa01f113c>]
> i915_gem_object_set_to_gtt_domain+0x21c/0x280 [i915]
> [   72.459313]  [<ffffffffa01f128e>] i915_gem_set_domain_ioctl+0xee/0x160
> [i915]
> [   72.459315]  [<ffffffff815282ed>] drm_ioctl+0x13d/0x590
> [   72.459325]  [<ffffffffa01f11a0>] ?
> i915_gem_object_set_to_gtt_domain+0x280/0x280 [i915]
> [   72.459327]  [<ffffffff81199ba7>] ? handle_mm_fault+0x47/0x1e90
> [   72.459329]  [<ffffffff811ee38a>] do_vfs_ioctl+0x8a/0x670
> [   72.459331]  [<ffffffff811fa21a>] ? __fget_light+0x6a/0x90
> [   72.459332]  [<ffffffff811ee9ac>] SyS_ioctl+0x3c/0x70
> [   72.459333]  [<ffffffff817dc7a9>] entry_SYSCALL_64_fastpath+0x1c/0xac
> [   72.459334] ---[ end trace 156adc997a22f992 ]---
> 
> So, is that a bug, marking an object dirty when pages_pin_count is 0? Does
> that mean that a program can set a BO to the GTT domain (or the CPU
> domain?), update its contents, and then it gets paged out due to memory
> pressure and the updates are lost?
> 
> Or ... no, I think the problem scenario would be
> * set to GTT => mark dirty
> * BO paged out => flushed to swap, marked clean
> * BO paged in => still clean
> * update contents => still clean?
> * get paged out => not written out?
> 
> Or are we guaranteed to hit another mark_dirty during the process of
> updating the contents of the paged-in buffer?

I didn't think through the details, but these kind of scenarios are
general the really "fun" gem bugs that bite hard 2 years later. Typing an
igt to repro your scenario and just check whether there's a problem would
be great. Worst case it's a good exercise in what kind of tricks are
needed to pull the kernel over the table.
-Daniel
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 32f0597..dfa45ef 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3004,6 +3004,17 @@  static inline void i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
 	obj->pages_pin_count++;
 }
 
+/*
+ * Flag the object content as having changed since the last call to
+ * i915_gem_object_pin_pages() above, so that the new content is not
+ * lost after the next call to i915_gem_object_unpin_pages() below
+ */
+static inline void i915_gem_object_mark_dirty(struct drm_i915_gem_object *obj)
+{
+	WARN_ON(obj->pages_pin_count == 0);
+	obj->dirty = true;
+}
+
 static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 {
 	BUG_ON(obj->pages_pin_count == 0);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d493e79..3fd8cb4 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -931,9 +931,9 @@  int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj,
 	intel_fb_obj_invalidate(obj, ORIGIN_CPU);
 
 	i915_gem_object_pin_pages(obj);
+	i915_gem_object_mark_dirty(obj);
 
 	offset = args->offset;
-	obj->dirty = 1;
 
 	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents,
 			 offset >> PAGE_SHIFT) {
@@ -3791,7 +3791,7 @@  static bool i915_gem_valid_gtt_space(struct i915_vma *vma,
 	if (write) {
 		obj->base.read_domains = I915_GEM_DOMAIN_GTT;
 		obj->base.write_domain = I915_GEM_DOMAIN_GTT;
-		obj->dirty = 1;
+		i915_gem_object_mark_dirty(obj);
 	}
 
 	trace_i915_gem_object_change_domain(obj,
@@ -5362,7 +5362,7 @@  struct drm_i915_gem_object *
 	i915_gem_object_pin_pages(obj);
 	sg = obj->pages;
 	bytes = sg_copy_from_buffer(sg->sgl, sg->nents, (void *)data, size);
-	obj->dirty = 1;		/* Backing store is now out of date */
+	i915_gem_object_mark_dirty(obj); /* Backing store is out of date */
 	i915_gem_object_unpin_pages(obj);
 
 	if (WARN_ON(bytes != size)) {
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 4e12bae..69b8391 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -747,7 +747,7 @@  static int do_rcs_switch(struct drm_i915_gem_request *req)
 		 * able to defer doing this until we know the object would be
 		 * swapped, but there is no way to do that yet.
 		 */
-		from->legacy_hw_ctx.rcs_state->dirty = 1;
+		i915_gem_object_mark_dirty(from->legacy_hw_ctx.rcs_state);
 
 		/* obj is kept alive until the next request by its active ref */
 		i915_gem_object_ggtt_unpin(from->legacy_hw_ctx.rcs_state);
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 6f4f2a6..584b329 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1109,7 +1109,7 @@  static bool only_mappable_for_reloc(unsigned int flags)
 		u32 old_read = obj->base.read_domains;
 		u32 old_write = obj->base.write_domain;
 
-		obj->dirty = 1; /* be paranoid  */
+		i915_gem_object_mark_dirty(obj); /* be paranoid  */
 		obj->base.write_domain = obj->base.pending_write_domain;
 		if (obj->base.write_domain == 0)
 			obj->base.pending_read_domains |= obj->base.read_domains;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 2b7e6bb..15d341e 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2451,7 +2451,7 @@  static u32 intel_lr_indirect_ctx_offset(struct intel_engine_cs *engine)
 		DRM_DEBUG_DRIVER("Could not map object pages! (%d)\n", ret);
 		return ret;
 	}
-	ctx_obj->dirty = true;
+	i915_gem_object_mark_dirty(ctx_obj);
 
 	/* The second page of the context object contains some fields which must
 	 * be set up prior to the first execution. */
@@ -2735,9 +2735,9 @@  void intel_lr_context_reset(struct drm_i915_private *dev_priv,
 		if (WARN_ON(IS_ERR(vaddr)))
 			continue;
 
-		reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE;
-		ctx_obj->dirty = true;
+		i915_gem_object_mark_dirty(ctx_obj);
 
+		reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE;
 		reg_state[CTX_RING_HEAD+1] = 0;
 		reg_state[CTX_RING_TAIL+1] = 0;