From patchwork Tue Jul 7 15:14:52 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michel Thierry X-Patchwork-Id: 6734191 Return-Path: X-Original-To: patchwork-intel-gfx@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 2D3DE9F2F0 for ; Tue, 7 Jul 2015 15:15:17 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id EBB8620751 for ; Tue, 7 Jul 2015 15:15:15 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.kernel.org (Postfix) with ESMTP id A3B35206FC for ; Tue, 7 Jul 2015 15:15:14 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 311F26EA43; Tue, 7 Jul 2015 08:15:14 -0700 (PDT) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by gabe.freedesktop.org (Postfix) with ESMTP id 9DC7A6EA45 for ; Tue, 7 Jul 2015 08:15:12 -0700 (PDT) Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga103.jf.intel.com with ESMTP; 07 Jul 2015 08:15:13 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.15,424,1432623600"; d="scan'208";a="601641186" Received: from michelth-linux2.isw.intel.com ([10.102.226.189]) by orsmga003.jf.intel.com with ESMTP; 07 Jul 2015 08:15:12 -0700 From: Michel Thierry To: intel-gfx@lists.freedesktop.org Date: Tue, 7 Jul 2015 16:14:52 +0100 Message-Id: <1436282103-5854-8-git-send-email-michel.thierry@intel.com> X-Mailer: git-send-email 2.4.5 In-Reply-To: <1436282103-5854-1-git-send-email-michel.thierry@intel.com> References: <1436282103-5854-1-git-send-email-michel.thierry@intel.com> Cc: akash.goel@intel.com Subject: [Intel-gfx] [PATCH v4 07/18] drm/i915/gen8: Add 4 level switching infrastructure and lrc support X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Spam-Status: No, score=-5.0 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP In 64b (48bit canonical) PPGTT addressing, the PDP0 register contains the base address to PML4, while the other PDP registers are ignored. In LRC, the addressing mode must be specified in every context descriptor. v2: PML4 update in legacy context switch is left for historic reasons, the preferred mode of operation is with lrc context based submission. v3: s/gen8_map_page_directory/gen8_setup_page_directory and s/gen8_map_page_directory_pointer/gen8_setup_page_directory_pointer. Also, clflush will be needed for bxt. (Akash) v4: Squashed lrc-specific code and use a macro to set PML4 register. v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series. PDP update in bb_start is only for legacy 32b mode. v6: Rebase after final merged version of Mika's ppgtt/scratch patches. Cc: Akash Goel Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry (v2+) --- drivers/gpu/drm/i915/i915_gem_gtt.c | 54 +++++++++++++++++++++++++++--- drivers/gpu/drm/i915/i915_gem_gtt.h | 2 ++ drivers/gpu/drm/i915/i915_reg.h | 1 + drivers/gpu/drm/i915/intel_lrc.c | 65 +++++++++++++++++++++++++++---------- 4 files changed, 99 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 33df9b7..844ca6e 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -211,6 +211,9 @@ static gen8_pde_t gen8_pde_encode(const dma_addr_t addr, return pde; } +#define gen8_pdpe_encode gen8_pde_encode +#define gen8_pml4e_encode gen8_pde_encode + static gen6_pte_t snb_pte_encode(dma_addr_t addr, enum i915_cache_level level, bool valid, u32 unused) @@ -600,6 +603,35 @@ static void free_pdp(struct drm_device *dev, } } +static void +gen8_setup_page_directory(struct i915_hw_ppgtt *ppgtt, + struct i915_page_directory_pointer *pdp, + struct i915_page_directory *pd, + int index) +{ + gen8_ppgtt_pdpe_t *page_directorypo; + + if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) + return; + + page_directorypo = kmap_px(pdp); + page_directorypo[index] = gen8_pdpe_encode(px_dma(pd), I915_CACHE_LLC); + kunmap_px(ppgtt, page_directorypo); +} + +static void +gen8_setup_page_directory_pointer(struct i915_hw_ppgtt *ppgtt, + struct i915_pml4 *pml4, + struct i915_page_directory_pointer *pdp, + int index) +{ + gen8_ppgtt_pml4e_t *pagemap = kmap_px(pml4); + + WARN_ON(!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)); + pagemap[index] = gen8_pml4e_encode(px_dma(pdp), I915_CACHE_LLC); + kunmap_px(ppgtt, pagemap); +} + /* Broadwell Page Directory Pointer Descriptors */ static int gen8_write_pdp(struct drm_i915_gem_request *req, unsigned entry, @@ -625,8 +657,8 @@ static int gen8_write_pdp(struct drm_i915_gem_request *req, return 0; } -static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt, - struct drm_i915_gem_request *req) +static int gen8_legacy_mm_switch(struct i915_hw_ppgtt *ppgtt, + struct drm_i915_gem_request *req) { int i, ret; @@ -641,6 +673,12 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt, return 0; } +static int gen8_48b_mm_switch(struct i915_hw_ppgtt *ppgtt, + struct drm_i915_gem_request *req) +{ + return gen8_write_pdp(req, 0, px_dma(&ppgtt->pml4)); +} + static void gen8_ppgtt_clear_range(struct i915_address_space *vm, uint64_t start, uint64_t length, @@ -1100,6 +1138,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, kunmap_px(ppgtt, page_directory); __set_bit(pdpe, pdp->used_pdpes); + gen8_setup_page_directory(ppgtt, pdp, pd, pdpe); } free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes); @@ -1126,6 +1165,8 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm, uint64_t length) { DECLARE_BITMAP(new_pdps, GEN8_PML4ES_PER_PML4); + struct i915_hw_ppgtt *ppgtt = + container_of(vm, struct i915_hw_ppgtt, base); struct i915_page_directory_pointer *pdp; const uint64_t orig_start = start; const uint64_t orig_length = length; @@ -1168,6 +1209,8 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm, ret = gen8_alloc_va_range_3lvl(vm, pdp, start, length); if (ret) goto err_out; + + gen8_setup_page_directory_pointer(ppgtt, pml4, pdp, pml4e); } bitmap_or(pml4->used_pml4es, new_pdps, pml4->used_pml4es, @@ -1217,14 +1260,13 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt) ppgtt->base.unbind_vma = ppgtt_unbind_vma; ppgtt->base.bind_vma = ppgtt_bind_vma; - ppgtt->switch_mm = gen8_mm_switch; - if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) { ret = setup_px(ppgtt->base.dev, &ppgtt->pml4); if (ret) goto free_scratch; ppgtt->base.total = 1ULL << 48; + ppgtt->switch_mm = gen8_48b_mm_switch; } else { ret = __pdp_init(false, &ppgtt->pdp); if (ret) @@ -1239,6 +1281,7 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt) */ ppgtt->base.total = to_i915(ppgtt->base.dev)->gtt.base.total; + ppgtt->switch_mm = gen8_legacy_mm_switch; trace_i915_page_directory_pointer_entry_alloc(&ppgtt->base, 0, 0, GEN8_PML4E_SHIFT); @@ -1436,8 +1479,9 @@ static void gen8_ppgtt_enable(struct drm_device *dev) int j; for_each_ring(ring, dev_priv, j) { + u32 four_level = USES_FULL_48BIT_PPGTT(dev) ? GEN8_GFX_PPGTT_48B : 0; I915_WRITE(RING_MODE_GEN7(ring), - _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE)); + _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE | four_level)); } } diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index 660c9a5..ae9463e 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -39,6 +39,8 @@ struct drm_i915_file_private; typedef uint32_t gen6_pte_t; typedef uint64_t gen8_pte_t; typedef uint64_t gen8_pde_t; +typedef uint64_t gen8_ppgtt_pdpe_t; +typedef uint64_t gen8_ppgtt_pml4e_t; #define gtt_total_entries(gtt) ((gtt).base.total >> PAGE_SHIFT) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 2a29bcc..d7301fc 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -1669,6 +1669,7 @@ enum skl_disp_power_wells { #define GFX_REPLAY_MODE (1<<11) #define GFX_PSMI_GRANULARITY (1<<10) #define GFX_PPGTT_ENABLE (1<<9) +#define GEN8_GFX_PPGTT_48B (1<<7) #define VLV_DISPLAY_BASE 0x180000 #define VLV_MIPI_BASE VLV_DISPLAY_BASE diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index d4f8b43..d2a95af 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -195,13 +195,21 @@ reg_state[CTX_PDP ## n ## _LDW+1] = lower_32_bits(_addr); \ } +#define ASSIGN_CTX_PML4(ppgtt, reg_state) { \ + reg_state[CTX_PDP0_UDW + 1] = upper_32_bits(px_dma(&ppgtt->pml4)); \ + reg_state[CTX_PDP0_LDW + 1] = lower_32_bits(px_dma(&ppgtt->pml4)); \ +} + enum { ADVANCED_CONTEXT = 0, - LEGACY_CONTEXT, + LEGACY_32B_CONTEXT, ADVANCED_AD_CONTEXT, LEGACY_64B_CONTEXT }; -#define GEN8_CTX_MODE_SHIFT 3 +#define GEN8_CTX_ADDRESSING_MODE_SHIFT 3 +#define GEN8_CTX_ADDRESSING_MODE(dev) (USES_FULL_48BIT_PPGTT(dev) ?\ + LEGACY_64B_CONTEXT :\ + LEGACY_32B_CONTEXT) enum { FAULT_AND_HANG = 0, FAULT_AND_HALT, /* Debug only */ @@ -272,7 +280,7 @@ static uint64_t execlists_ctx_descriptor(struct drm_i915_gem_request *rq) WARN_ON(lrca & 0xFFFFFFFF00000FFFULL); desc = GEN8_CTX_VALID; - desc |= LEGACY_CONTEXT << GEN8_CTX_MODE_SHIFT; + desc |= GEN8_CTX_ADDRESSING_MODE(dev) << GEN8_CTX_ADDRESSING_MODE_SHIFT; if (IS_GEN8(ctx_obj->base.dev)) desc |= GEN8_CTX_L3LLC_COHERENT; desc |= GEN8_CTX_PRIVILEGE; @@ -347,10 +355,16 @@ static int execlists_update_context(struct drm_i915_gem_request *rq) reg_state[CTX_RING_TAIL+1] = rq->tail; reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(rb_obj); - /* True PPGTT with dynamic page allocation: update PDP registers and - * point the unallocated PDPs to the scratch page - */ - if (ppgtt) { + if (ppgtt && USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) { + /* True 64b PPGTT (48bit canonical) + * PDP0_DESCRIPTOR contains the base address to PML4 and + * other PDP Descriptors are ignored + */ + ASSIGN_CTX_PML4(ppgtt, reg_state); + } else if (ppgtt) { + /* True 32b PPGTT with dynamic page allocation: update PDP + * registers and point the unallocated PDPs to the scratch page + */ ASSIGN_CTX_PDP(ppgtt, reg_state, 3); ASSIGN_CTX_PDP(ppgtt, reg_state, 2); ASSIGN_CTX_PDP(ppgtt, reg_state, 1); @@ -1432,12 +1446,16 @@ static int gen8_emit_bb_start(struct drm_i915_gem_request *req, * Ideally, we should set Force PD Restore in ctx descriptor, * but we can't. Force Restore would be a second option, but * it is unsafe in case of lite-restore (because the ctx is - * not idle). */ + * not idle). PML4 is allocated during ppgtt init so this is + * not needed in 48-bit.*/ if (req->ctx->ppgtt && (intel_ring_flag(req->ring) & req->ctx->ppgtt->pd_dirty_rings)) { - ret = intel_logical_ring_emit_pdps(req); - if (ret) - return ret; + if (GEN8_CTX_ADDRESSING_MODE(req->i915) == LEGACY_32B_CONTEXT){ + ret = intel_logical_ring_emit_pdps(req); + + if (ret) + return ret; + } req->ctx->ppgtt->pd_dirty_rings &= ~intel_ring_flag(req->ring); } @@ -2104,13 +2122,24 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o reg_state[CTX_PDP0_UDW] = GEN8_RING_PDP_UDW(ring, 0); reg_state[CTX_PDP0_LDW] = GEN8_RING_PDP_LDW(ring, 0); - /* With dynamic page allocation, PDPs may not be allocated at this point, - * Point the unallocated PDPs to the scratch page - */ - ASSIGN_CTX_PDP(ppgtt, reg_state, 3); - ASSIGN_CTX_PDP(ppgtt, reg_state, 2); - ASSIGN_CTX_PDP(ppgtt, reg_state, 1); - ASSIGN_CTX_PDP(ppgtt, reg_state, 0); + if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) { + /* 64b PPGTT (48bit canonical) + * PDP0_DESCRIPTOR contains the base address to PML4 and + * other PDP Descriptors are ignored. + */ + ASSIGN_CTX_PML4(ppgtt, reg_state); + } else { + /* 32b PPGTT + * PDP*_DESCRIPTOR contains the base address of space supported. + * With dynamic page allocation, PDPs may not be allocated at + * this point. Point the unallocated PDPs to the scratch page + */ + ASSIGN_CTX_PDP(ppgtt, reg_state, 3); + ASSIGN_CTX_PDP(ppgtt, reg_state, 2); + ASSIGN_CTX_PDP(ppgtt, reg_state, 1); + ASSIGN_CTX_PDP(ppgtt, reg_state, 0); + } + if (ring->id == RCS) { reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1); reg_state[CTX_R_PWR_CLK_STATE] = GEN8_R_PWR_CLK_STATE;