Message ID | 1434393394-21002-9-git-send-email-david.s.gordon@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Jun 15, 2015 at 07:36:26PM +0100, Dave Gordon wrote: > From: "Michael H. Nguyen" <michael.h.nguyen@intel.com> > > Move defines from intel_lrc.c to i915_reg.h so they are accessible > to the GuC submission code; and expose a previously static function > in the execlist code which will also be required for GuC submission. What would have been better would have to been to split the lrc code from the execlists code so that the sharing is more obvious and the overloading separate from the common code. -Chris
On 16/06/15 10:37, Chris Wilson wrote: > On Mon, Jun 15, 2015 at 07:36:26PM +0100, Dave Gordon wrote: >> From: "Michael H. Nguyen" <michael.h.nguyen@intel.com> >> >> Move defines from intel_lrc.c to i915_reg.h so they are accessible >> to the GuC submission code; and expose a previously static function >> in the execlist code which will also be required for GuC submission. > > What would have been better would have to been to split the lrc code > from the execlists code so that the sharing is more obvious and the > overloading separate from the common code. > -Chris What would have been better is not to have put these fairly generic details about the hardware into a C file in the first place. And not to have split execlist and ringbuffer modes into two entirely different paths. And various other historical decisions. But we can only fix the code as it stands, not as it ought to have been. Anyway, this is just a bulk cut-n-paste, so I'm not inclined to do any restructuring on it during this process. But someone working on execlists could certainly tidy it up later, perhaps as part of a general drive towards deduplicating the code paths and partitioning (context vs ringbuffer vs engine) functionality in a more coherent way. .Dave.
On Wed, Jun 17, 2015 at 08:31:59AM +0100, Dave Gordon wrote: > On 16/06/15 10:37, Chris Wilson wrote: > > On Mon, Jun 15, 2015 at 07:36:26PM +0100, Dave Gordon wrote: > >> From: "Michael H. Nguyen" <michael.h.nguyen@intel.com> > >> > >> Move defines from intel_lrc.c to i915_reg.h so they are accessible > >> to the GuC submission code; and expose a previously static function > >> in the execlist code which will also be required for GuC submission. > > > > What would have been better would have to been to split the lrc code > > from the execlists code so that the sharing is more obvious and the > > overloading separate from the common code. > > -Chris > > What would have been better is not to have put these fairly generic > details about the hardware into a C file in the first place. And not to > have split execlist and ringbuffer modes into two entirely different > paths. And various other historical decisions. But we can only fix the > code as it stands, not as it ought to have been. You know I sent patches... -Chris
On Wed, Jun 17, 2015 at 08:31:59AM +0100, Dave Gordon wrote: > On 16/06/15 10:37, Chris Wilson wrote: > > On Mon, Jun 15, 2015 at 07:36:26PM +0100, Dave Gordon wrote: > >> From: "Michael H. Nguyen" <michael.h.nguyen@intel.com> > >> > >> Move defines from intel_lrc.c to i915_reg.h so they are accessible > >> to the GuC submission code; and expose a previously static function > >> in the execlist code which will also be required for GuC submission. > > > > What would have been better would have to been to split the lrc code > > from the execlists code so that the sharing is more obvious and the > > overloading separate from the common code. > > -Chris > > What would have been better is not to have put these fairly generic > details about the hardware into a C file in the first place. And not to > have split execlist and ringbuffer modes into two entirely different > paths. And various other historical decisions. But we can only fix the > code as it stands, not as it ought to have been. > > Anyway, this is just a bulk cut-n-paste, so I'm not inclined to do any > restructuring on it during this process. But someone working on > execlists could certainly tidy it up later, perhaps as part of a general > drive towards deduplicating the code paths and partitioning (context vs > ringbuffer vs engine) functionality in a more coherent way. More to the point, you are increasing the technical debt of the code rather than reducing it. Code will just become less and less maintainable. -Chris
On 17/06/15 08:59, Chris Wilson wrote: > On Wed, Jun 17, 2015 at 08:31:59AM +0100, Dave Gordon wrote: >> On 16/06/15 10:37, Chris Wilson wrote: >>> On Mon, Jun 15, 2015 at 07:36:26PM +0100, Dave Gordon wrote: >>>> From: "Michael H. Nguyen" <michael.h.nguyen@intel.com> >>>> >>>> Move defines from intel_lrc.c to i915_reg.h so they are accessible >>>> to the GuC submission code; and expose a previously static function >>>> in the execlist code which will also be required for GuC submission. >>> >>> What would have been better would have to been to split the lrc code >>> from the execlists code so that the sharing is more obvious and the >>> overloading separate from the common code. >>> -Chris >> >> What would have been better is not to have put these fairly generic >> details about the hardware into a C file in the first place. And not to >> have split execlist and ringbuffer modes into two entirely different >> paths. And various other historical decisions. But we can only fix the >> code as it stands, not as it ought to have been. >> >> Anyway, this is just a bulk cut-n-paste, so I'm not inclined to do any >> restructuring on it during this process. But someone working on >> execlists could certainly tidy it up later, perhaps as part of a general >> drive towards deduplicating the code paths and partitioning (context vs >> ringbuffer vs engine) functionality in a more coherent way. > > More to the point, you are increasing the technical debt of the code > rather than reducing it. Code will just become less and less > maintainable. > -Chris OK, I have abolished the bulk cut'n'paste :) Turns out we really only needed a bit of it, and then I spotted a way to reuse some of the "execlists" code (which is really LRC code) instead of having a GuC version of same (which is what needed some of these #defines). So in the end, this patch is replaced by simply renaming-and-exposing the /two/ functions in intel_lrc.c that we actually need for GuC submission. Better still, one of them may go away entirely once we eat some of Chris' low-hanging fruit :) .Dave.
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 0e4589e..56f81de 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -7833,4 +7833,81 @@ enum skl_disp_power_wells { #define _PALETTE_A (dev_priv->info.display_mmio_offset + 0xa000) #define _PALETTE_B (dev_priv->info.display_mmio_offset + 0xa800) +/* Exec Lists */ +#define GEN9_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE) +#define GEN8_LR_CONTEXT_RENDER_SIZE (20 * PAGE_SIZE) +#define GEN8_LR_CONTEXT_OTHER_SIZE (2 * PAGE_SIZE) + +#define RING_EXECLIST_QFULL (1 << 0x2) +#define RING_EXECLIST1_VALID (1 << 0x3) +#define RING_EXECLIST0_VALID (1 << 0x4) +#define RING_EXECLIST_ACTIVE_STATUS (3 << 0xE) +#define RING_EXECLIST1_ACTIVE (1 << 0x11) +#define RING_EXECLIST0_ACTIVE (1 << 0x12) + +#define GEN8_CTX_STATUS_IDLE_ACTIVE (1 << 0) +#define GEN8_CTX_STATUS_PREEMPTED (1 << 1) +#define GEN8_CTX_STATUS_ELEMENT_SWITCH (1 << 2) +#define GEN8_CTX_STATUS_ACTIVE_IDLE (1 << 3) +#define GEN8_CTX_STATUS_COMPLETE (1 << 4) +#define GEN8_CTX_STATUS_LITE_RESTORE (1 << 15) + +#define CTX_LRI_HEADER_0 0x01 +#define CTX_CONTEXT_CONTROL 0x02 +#define CTX_RING_HEAD 0x04 +#define CTX_RING_TAIL 0x06 +#define CTX_RING_BUFFER_START 0x08 +#define CTX_RING_BUFFER_CONTROL 0x0a +#define CTX_BB_HEAD_U 0x0c +#define CTX_BB_HEAD_L 0x0e +#define CTX_BB_STATE 0x10 +#define CTX_SECOND_BB_HEAD_U 0x12 +#define CTX_SECOND_BB_HEAD_L 0x14 +#define CTX_SECOND_BB_STATE 0x16 +#define CTX_BB_PER_CTX_PTR 0x18 +#define CTX_RCS_INDIRECT_CTX 0x1a +#define CTX_RCS_INDIRECT_CTX_OFFSET 0x1c +#define CTX_LRI_HEADER_1 0x21 +#define CTX_CTX_TIMESTAMP 0x22 +#define CTX_PDP3_UDW 0x24 +#define CTX_PDP3_LDW 0x26 +#define CTX_PDP2_UDW 0x28 +#define CTX_PDP2_LDW 0x2a +#define CTX_PDP1_UDW 0x2c +#define CTX_PDP1_LDW 0x2e +#define CTX_PDP0_UDW 0x30 +#define CTX_PDP0_LDW 0x32 +#define CTX_LRI_HEADER_2 0x41 +#define CTX_R_PWR_CLK_STATE 0x42 +#define CTX_GPGPU_CSR_BASE_ADDRESS 0x44 + +#define GEN8_CTX_VALID (1<<0) +#define GEN8_CTX_FORCE_PD_RESTORE (1<<1) +#define GEN8_CTX_FORCE_RESTORE (1<<2) +#define GEN8_CTX_L3LLC_COHERENT (1<<5) +#define GEN8_CTX_PRIVILEGE (1<<8) + +#define ASSIGN_CTX_PDP(ppgtt, reg_state, n) { \ + const u64 _addr = test_bit(n, ppgtt->pdp.used_pdpes) ? \ + ppgtt->pdp.page_directory[n]->daddr : \ + ppgtt->scratch_pd->daddr; \ + reg_state[CTX_PDP ## n ## _UDW+1] = upper_32_bits(_addr); \ + reg_state[CTX_PDP ## n ## _LDW+1] = lower_32_bits(_addr); \ +} + +enum { + ADVANCED_CONTEXT = 0, + LEGACY_CONTEXT, + ADVANCED_AD_CONTEXT, + LEGACY_64B_CONTEXT +}; +#define GEN8_CTX_MODE_SHIFT 3 +enum { + FAULT_AND_HANG = 0, + FAULT_AND_HALT, /* Debug only */ + FAULT_AND_STREAM, + FAULT_AND_CONTINUE /* Unsupported */ +}; +#define GEN8_CTX_ID_SHIFT 32 + #endif /* _I915_REG_H_ */ diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index d5cfab3..4fd1941 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -136,82 +136,6 @@ #include <drm/i915_drm.h> #include "i915_drv.h" -#define GEN9_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE) -#define GEN8_LR_CONTEXT_RENDER_SIZE (20 * PAGE_SIZE) -#define GEN8_LR_CONTEXT_OTHER_SIZE (2 * PAGE_SIZE) - -#define RING_EXECLIST_QFULL (1 << 0x2) -#define RING_EXECLIST1_VALID (1 << 0x3) -#define RING_EXECLIST0_VALID (1 << 0x4) -#define RING_EXECLIST_ACTIVE_STATUS (3 << 0xE) -#define RING_EXECLIST1_ACTIVE (1 << 0x11) -#define RING_EXECLIST0_ACTIVE (1 << 0x12) - -#define GEN8_CTX_STATUS_IDLE_ACTIVE (1 << 0) -#define GEN8_CTX_STATUS_PREEMPTED (1 << 1) -#define GEN8_CTX_STATUS_ELEMENT_SWITCH (1 << 2) -#define GEN8_CTX_STATUS_ACTIVE_IDLE (1 << 3) -#define GEN8_CTX_STATUS_COMPLETE (1 << 4) -#define GEN8_CTX_STATUS_LITE_RESTORE (1 << 15) - -#define CTX_LRI_HEADER_0 0x01 -#define CTX_CONTEXT_CONTROL 0x02 -#define CTX_RING_HEAD 0x04 -#define CTX_RING_TAIL 0x06 -#define CTX_RING_BUFFER_START 0x08 -#define CTX_RING_BUFFER_CONTROL 0x0a -#define CTX_BB_HEAD_U 0x0c -#define CTX_BB_HEAD_L 0x0e -#define CTX_BB_STATE 0x10 -#define CTX_SECOND_BB_HEAD_U 0x12 -#define CTX_SECOND_BB_HEAD_L 0x14 -#define CTX_SECOND_BB_STATE 0x16 -#define CTX_BB_PER_CTX_PTR 0x18 -#define CTX_RCS_INDIRECT_CTX 0x1a -#define CTX_RCS_INDIRECT_CTX_OFFSET 0x1c -#define CTX_LRI_HEADER_1 0x21 -#define CTX_CTX_TIMESTAMP 0x22 -#define CTX_PDP3_UDW 0x24 -#define CTX_PDP3_LDW 0x26 -#define CTX_PDP2_UDW 0x28 -#define CTX_PDP2_LDW 0x2a -#define CTX_PDP1_UDW 0x2c -#define CTX_PDP1_LDW 0x2e -#define CTX_PDP0_UDW 0x30 -#define CTX_PDP0_LDW 0x32 -#define CTX_LRI_HEADER_2 0x41 -#define CTX_R_PWR_CLK_STATE 0x42 -#define CTX_GPGPU_CSR_BASE_ADDRESS 0x44 - -#define GEN8_CTX_VALID (1<<0) -#define GEN8_CTX_FORCE_PD_RESTORE (1<<1) -#define GEN8_CTX_FORCE_RESTORE (1<<2) -#define GEN8_CTX_L3LLC_COHERENT (1<<5) -#define GEN8_CTX_PRIVILEGE (1<<8) - -#define ASSIGN_CTX_PDP(ppgtt, reg_state, n) { \ - const u64 _addr = test_bit(n, ppgtt->pdp.used_pdpes) ? \ - ppgtt->pdp.page_directory[n]->daddr : \ - ppgtt->scratch_pd->daddr; \ - reg_state[CTX_PDP ## n ## _UDW+1] = upper_32_bits(_addr); \ - reg_state[CTX_PDP ## n ## _LDW+1] = lower_32_bits(_addr); \ -} - -enum { - ADVANCED_CONTEXT = 0, - LEGACY_CONTEXT, - ADVANCED_AD_CONTEXT, - LEGACY_64B_CONTEXT -}; -#define GEN8_CTX_MODE_SHIFT 3 -enum { - FAULT_AND_HANG = 0, - FAULT_AND_HALT, /* Debug only */ - FAULT_AND_STREAM, - FAULT_AND_CONTINUE /* Unsupported */ -}; -#define GEN8_CTX_ID_SHIFT 32 - static int intel_lr_context_pin(struct intel_engine_cs *ring, struct intel_context *ctx); @@ -263,8 +187,8 @@ u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj) return lrca >> 12; } -static uint64_t execlists_ctx_descriptor(struct intel_engine_cs *ring, - struct drm_i915_gem_object *ctx_obj) +uint64_t execlists_ctx_descriptor(struct intel_engine_cs *ring, + struct drm_i915_gem_object *ctx_obj) { struct drm_device *dev = ring->dev; uint64_t desc; diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h index 04d3a6d..19c9a02 100644 --- a/drivers/gpu/drm/i915/intel_lrc.h +++ b/drivers/gpu/drm/i915/intel_lrc.h @@ -85,6 +85,8 @@ int intel_execlists_submission(struct drm_device *dev, struct drm_file *file, struct drm_i915_gem_object *batch_obj, u64 exec_start, u32 dispatch_flags); u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj); +uint64_t execlists_ctx_descriptor(struct intel_engine_cs *ring, + struct drm_i915_gem_object *ctx_obj); void intel_lrc_irq_handler(struct intel_engine_cs *ring); void intel_execlists_retire_requests(struct intel_engine_cs *ring);