From patchwork Fri Oct 7 13:17:52 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Micha=C5=82_Winiarski?= X-Patchwork-Id: 9366093 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 68B966075E for ; Fri, 7 Oct 2016 13:18:17 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5B01C295F4 for ; Fri, 7 Oct 2016 13:18:17 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4CD3C295EA; Fri, 7 Oct 2016 13:18:17 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id CE404295EA for ; Fri, 7 Oct 2016 13:18:16 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 564436EB8F; Fri, 7 Oct 2016 13:18:16 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTPS id 459D56EB8C for ; Fri, 7 Oct 2016 13:18:15 +0000 (UTC) Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga101.jf.intel.com with ESMTP; 07 Oct 2016 06:18:14 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.31,308,1473145200"; d="scan'208";a="17684480" Received: from irsmsx102.ger.corp.intel.com ([163.33.3.155]) by orsmga004.jf.intel.com with ESMTP; 07 Oct 2016 06:18:13 -0700 Received: from mwiniars-desk1.ger.corp.intel.com (172.28.171.143) by IRSMSX102.ger.corp.intel.com (163.33.3.155) with Microsoft SMTP Server id 14.3.248.2; Fri, 7 Oct 2016 14:18:13 +0100 From: =?UTF-8?q?Micha=C5=82=20Winiarski?= To: Date: Fri, 7 Oct 2016 15:17:52 +0200 Message-ID: <1475846272-16963-3-git-send-email-michal.winiarski@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1475846272-16963-1-git-send-email-michal.winiarski@intel.com> References: <1475846272-16963-1-git-send-email-michal.winiarski@intel.com> MIME-Version: 1.0 X-Originating-IP: [172.28.171.143] Cc: Mika Kuoppala Subject: [Intel-gfx] [PATCH v2 3/3] drm/i915/gtt: Free unused lower-level page tables X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP Since "Dynamic page table allocations" were introduced, our page tables can grow (being dynamically allocated) with address space range usage. Unfortunately, their lifetime is bound to vm. This is not a huge problem when we're not using softpin - drm_mm is creating an upper bound on used range by causing addresses for our VMAs to eventually be reused. With softpin, long lived contexts can drain the system out of memory even with a single "small" object. For example: bo = bo_alloc(size); while(true) offset += size; exec(bo, offset); Will cause us to create new allocations until all memory in the system is used for tracking GPU pages (even though almost all PTEs in this vm are pointing to scratch). Let's free unused page tables in clear_range to prevent this - if no entries are used, we can safely free it and return this information to the caller (so that higher-level entry is pointing to scratch). v2: Document return value and free semantics (Joonas) Cc: Chris Wilson Cc: Joonas Lahtinen Cc: Michel Thierry Cc: Mika Kuoppala Signed-off-by: MichaƂ Winiarski --- drivers/gpu/drm/i915/i915_gem_gtt.c | 89 ++++++++++++++++++++++++++++++++++--- 1 file changed, 82 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index adabf58..7850094 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -704,7 +704,9 @@ static int gen8_48b_mm_switch(struct i915_hw_ppgtt *ppgtt, return gen8_write_pdp(req, 0, px_dma(&ppgtt->pml4)); } -static void gen8_ppgtt_clear_pt(struct i915_address_space *vm, +/* Removes entries from a single page table, releasing it if it's empty. + * Caller can use the return value to update higher-level entries */ +static bool gen8_ppgtt_clear_pt(struct i915_address_space *vm, struct i915_page_table *pt, uint64_t start, uint64_t length) @@ -719,63 +721,136 @@ static void gen8_ppgtt_clear_pt(struct i915_address_space *vm, I915_CACHE_LLC); if (WARN_ON(!px_page(pt))) - return; + return false; bitmap_clear(pt->used_ptes, pte_start, num_entries); + if (bitmap_empty(pt->used_ptes, GEN8_PTES)) { + free_pt(vm->dev, pt); + return true; + } + pt_vaddr = kmap_px(pt); for (pte = pte_start; pte < num_entries; pte++) pt_vaddr[pte] = scratch_pte; kunmap_px(ppgtt, pt_vaddr); + + return false; } -static void gen8_ppgtt_clear_pd(struct i915_address_space *vm, +/* Removes entries from a single page dir, releasing it if it's empty. + * Caller can use the return value to update higher-level entries + */ +static bool gen8_ppgtt_clear_pd(struct i915_address_space *vm, struct i915_page_directory *pd, uint64_t start, uint64_t length) { + struct i915_hw_ppgtt *ppgtt = i915_vm_to_ppgtt(vm); + struct i915_page_table *pt; uint64_t pde; + gen8_pde_t *pde_vaddr; + gen8_pde_t scratch_pde = gen8_pde_encode(px_dma(vm->scratch_pt), + I915_CACHE_LLC); + bool reduce; + gen8_for_each_pde(pt, pd, start, length, pde) { if (WARN_ON(!pd->page_table[pde])) break; - gen8_ppgtt_clear_pt(vm, pt, start, length); + reduce = gen8_ppgtt_clear_pt(vm, pt, start, length); + + if (reduce) { + __clear_bit(pde, pd->used_pdes); + pde_vaddr = kmap_px(pd); + pde_vaddr[pde] = scratch_pde; + kunmap_px(ppgtt, pde_vaddr); + } } + + if (bitmap_empty(pd->used_pdes, I915_PDES)) { + free_pd(vm->dev, pd); + return true; + } + + return false; } -static void gen8_ppgtt_clear_pdp(struct i915_address_space *vm, +/* Removes entries from a single page dir pointer, releasing it if it's empty. + * Caller can use the return value to update higher-level entries + */ +static bool gen8_ppgtt_clear_pdp(struct i915_address_space *vm, struct i915_page_directory_pointer *pdp, uint64_t start, uint64_t length) { + struct i915_hw_ppgtt *ppgtt = i915_vm_to_ppgtt(vm); + struct i915_page_directory *pd; uint64_t pdpe; + gen8_ppgtt_pdpe_t *pdpe_vaddr; + gen8_ppgtt_pdpe_t scratch_pdpe = + gen8_pdpe_encode(px_dma(vm->scratch_pd), I915_CACHE_LLC); + bool reduce; + gen8_for_each_pdpe(pd, pdp, start, length, pdpe) { if (WARN_ON(!pdp->page_directory[pdpe])) break; - gen8_ppgtt_clear_pd(vm, pd, start, length); + reduce = gen8_ppgtt_clear_pd(vm, pd, start, length); + + if (reduce) { + __clear_bit(pdpe, pdp->used_pdpes); + pdpe_vaddr = kmap_px(pdp); + pdpe_vaddr[pdpe] = scratch_pdpe; + kunmap_px(ppgtt, pdpe_vaddr); + } + } + + if (bitmap_empty(pdp->used_pdpes, I915_PDPES_PER_PDP(vm->dev))) { + free_pdp(vm->dev, pdp); + return true; } + + return false; } +/* Removes entries from a single pml4. + * This is the top-level structure in 4-level page tables used on gen8+. + * Empty entries are always scratch pml4e. + */ static void gen8_ppgtt_clear_pml4(struct i915_address_space *vm, struct i915_pml4 *pml4, uint64_t start, uint64_t length) { + struct i915_hw_ppgtt *ppgtt = i915_vm_to_ppgtt(vm); + struct i915_page_directory_pointer *pdp; uint64_t pml4e; + gen8_ppgtt_pml4e_t *pml4e_vaddr; + gen8_ppgtt_pml4e_t scratch_pml4e = + gen8_pml4e_encode(px_dma(vm->scratch_pdp), I915_CACHE_LLC); + bool reduce; + gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) { if (WARN_ON(!pml4->pdps[pml4e])) break; - gen8_ppgtt_clear_pdp(vm, pdp, start, length); + reduce = gen8_ppgtt_clear_pdp(vm, pdp, start, length); + + if (reduce) { + __clear_bit(pml4e, pml4->used_pml4es); + pml4e_vaddr = kmap_px(pml4); + pml4e_vaddr[pml4e] = scratch_pml4e; + kunmap_px(ppgtt, pml4e_vaddr); + } } }