Message ID | 20180316183105.16027-3-jeff.mcgee@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Quoting jeff.mcgee@intel.com (2018-03-16 18:30:59) > From: Jeff McGee <jeff.mcgee@intel.com> > > Engine reset is fast. A context switch interrupt may be generated just > prior to the reset such that the top half handler is racing with reset > post-processing. The handler may set the irq_posted bit again after > the reset code has cleared it to start fresh. Then the re-enabled > tasklet will read the CSB head and tail from MMIO, which will be at > the hardware reset values of 0 and 7 respectively, given that no > actual CSB event has occurred since the reset. Mayhem then ensues as > the tasklet starts processing invalid CSB entries. > > We can handle this corner case without adding any new synchronization > between the irq top half and the reset work item. The tasklet can > just skip CSB processing if the tail is not sane. > > This patch is required to support the force preemption feature. Please note how this is handled currently, because this is not a new problem. -Chris
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 7d93fcd56d34..5f63d1d6a2d6 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -851,6 +851,14 @@ static void intel_lrc_irq_handler(unsigned long data) head = readl(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine))); tail = GEN8_CSB_WRITE_PTR(head); head = GEN8_CSB_READ_PTR(head); + + /* The MMIO read CSB tail may be at the reset value of + * 0x7 if there hasn't been a valid CSB event since + * the engine reset. Skip on to dequeue if so. + */ + if (tail >= GEN8_CSB_ENTRIES) + break; + execlists->csb_head = head; } else { const int write_idx = @@ -859,6 +867,7 @@ static void intel_lrc_irq_handler(unsigned long data) head = execlists->csb_head; tail = READ_ONCE(buf[write_idx]); + GEM_BUG_ON(tail >= GEN8_CSB_ENTRIES); } while (head != tail) {