Message ID | 20210519074323.665872-1-daniel.vetter@ffwll.ch (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/2] drm/i915/cmdparser: No-op failed batches on all platforms | expand |
On Wed, May 19, 2021 at 2:43 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote: > > On gen9 for blt cmd parser we relied on the magic fence error > propagation which: > - doesn't work on gen7, because there's no scheduler with ringbuffers > there yet > - fence error propagation can be weaponized to attack other things, so > not a good design idea > > Instead of magic, do the same thing on gen9 as on gen7. I think the commit message could be improved. Maybe something like this? When we re-introduced the command parser on Gen9 platforms to protect against BLT CS register writes, we did things a bit differently than on previous platforms. On Gen7 platforms, if a batch contains unsupported commands, we smash the start of the shadow batch to MI_BATCH_BUFFER_END to cancel the batch. If it's mostly ok (-EACCESS), we trampoline to run in unprivileged mode and let the limited HW parser handle security. On Gen9, we only care about rejecting batches because we don't trust the HW parser for a few cases so we don't need this second trampoline case. However, instead of stopping there and avoiding the trampoline, we chose to avoid executing the new batch all together on Gen9 by use of dma-fence error propagation. When the batch parser fails, it returns a non-zero error and we would propgate that through the chain of fences and trust the scheduler to know to cancel anything dependent on a fence with an error. However, fence error propagation is sketchy at best and can be weaponized to attack other things so it's not really a good design. This commit restores a bit of the Gen7 functionality on Gen9 (smashing the start of the shadow batch to MI_BB_END) so that it's always safe to run the batch post-parser. A later commit will get rid of the error propagation nonsense. > > Kudos to Jason for figuring this out. > > Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled fences") > Cc: <stable@vger.kernel.org> # v5.6+ > Cc: Jason Ekstrand <jason.ekstrand@intel.com> > Cc: Marcin Slusarz <marcin.slusarz@intel.com> > Cc: Jon Bloomfield <jon.bloomfield@intel.com> > Relates: https://gitlab.freedesktop.org/drm/intel/-/issues/3080 > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> > --- > drivers/gpu/drm/i915/i915_cmd_parser.c | 34 +++++++++++++------------- > 1 file changed, 17 insertions(+), 17 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c > index 5b4b2bd46e7c..2d3336ab7ba3 100644 > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c > @@ -1509,6 +1509,12 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine, > } > } > > + /* Batch unsafe to execute with privileges, cancel! */ > + if (ret) { > + cmd = page_mask_bits(shadow->obj->mm.mapping); > + *cmd = MI_BATCH_BUFFER_END; > + } > + > if (trampoline) { > /* > * With the trampoline, the shadow is executed twice. > @@ -1524,26 +1530,20 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine, > */ > *batch_end = MI_BATCH_BUFFER_END; Bit of a bike shed but, given the new structure of the code, I think it makes it more clear if we do if (ret == -EACCESS) { /* stuff */ __gen6_emit_bb_start(...); } else { *batch_end = MI_BATCH_BUFFER_END; } That way it's clear that we're making a choice between firing off the client batch in privileged mode and ending early. > > - if (ret) { > - /* Batch unsafe to execute with privileges, cancel! */ > - cmd = page_mask_bits(shadow->obj->mm.mapping); > - *cmd = MI_BATCH_BUFFER_END; > + /* If batch is unsafe but valid, jump to the original */ > + if (ret == -EACCES) { > + unsigned int flags; > > - /* If batch is unsafe but valid, jump to the original */ > - if (ret == -EACCES) { > - unsigned int flags; > + flags = MI_BATCH_NON_SECURE_I965; > + if (IS_HASWELL(engine->i915)) > + flags = MI_BATCH_NON_SECURE_HSW; > > - flags = MI_BATCH_NON_SECURE_I965; > - if (IS_HASWELL(engine->i915)) > - flags = MI_BATCH_NON_SECURE_HSW; > + GEM_BUG_ON(!IS_GEN_RANGE(engine->i915, 6, 7)); > + __gen6_emit_bb_start(batch_end, > + batch_addr, > + flags); > > - GEM_BUG_ON(!IS_GEN_RANGE(engine->i915, 6, 7)); > - __gen6_emit_bb_start(batch_end, > - batch_addr, > - flags); > - > - ret = 0; /* allow execution */ > - } > + ret = 0; /* allow execution */ > } > } > > -- > 2.31.0 >
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 5b4b2bd46e7c..2d3336ab7ba3 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -1509,6 +1509,12 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine, } } + /* Batch unsafe to execute with privileges, cancel! */ + if (ret) { + cmd = page_mask_bits(shadow->obj->mm.mapping); + *cmd = MI_BATCH_BUFFER_END; + } + if (trampoline) { /* * With the trampoline, the shadow is executed twice. @@ -1524,26 +1530,20 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine, */ *batch_end = MI_BATCH_BUFFER_END; - if (ret) { - /* Batch unsafe to execute with privileges, cancel! */ - cmd = page_mask_bits(shadow->obj->mm.mapping); - *cmd = MI_BATCH_BUFFER_END; + /* If batch is unsafe but valid, jump to the original */ + if (ret == -EACCES) { + unsigned int flags; - /* If batch is unsafe but valid, jump to the original */ - if (ret == -EACCES) { - unsigned int flags; + flags = MI_BATCH_NON_SECURE_I965; + if (IS_HASWELL(engine->i915)) + flags = MI_BATCH_NON_SECURE_HSW; - flags = MI_BATCH_NON_SECURE_I965; - if (IS_HASWELL(engine->i915)) - flags = MI_BATCH_NON_SECURE_HSW; + GEM_BUG_ON(!IS_GEN_RANGE(engine->i915, 6, 7)); + __gen6_emit_bb_start(batch_end, + batch_addr, + flags); - GEM_BUG_ON(!IS_GEN_RANGE(engine->i915, 6, 7)); - __gen6_emit_bb_start(batch_end, - batch_addr, - flags); - - ret = 0; /* allow execution */ - } + ret = 0; /* allow execution */ } }
On gen9 for blt cmd parser we relied on the magic fence error propagation which: - doesn't work on gen7, because there's no scheduler with ringbuffers there yet - fence error propagation can be weaponized to attack other things, so not a good design idea Instead of magic, do the same thing on gen9 as on gen7. Kudos to Jason for figuring this out. Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled fences") Cc: <stable@vger.kernel.org> # v5.6+ Cc: Jason Ekstrand <jason.ekstrand@intel.com> Cc: Marcin Slusarz <marcin.slusarz@intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Relates: https://gitlab.freedesktop.org/drm/intel/-/issues/3080 Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> --- drivers/gpu/drm/i915/i915_cmd_parser.c | 34 +++++++++++++------------- 1 file changed, 17 insertions(+), 17 deletions(-)