From patchwork Thu Feb 20 06:05:45 2014
X-Patchwork-Submitter: Ben Widawsky
X-Patchwork-Id: 3684411
From: Ben Widawsky
To: Intel GFX
Date: Wed, 19 Feb 2014 22:05:45 -0800
Message-Id: <1392876349-24684-6-git-send-email-benjamin.widawsky@intel.com>
In-Reply-To: <1392876349-24684-1-git-send-email-benjamin.widawsky@intel.com>
References: <1392876349-24684-1-git-send-email-benjamin.widawsky@intel.com>
X-Mailer: git-send-email 1.9.0
Subject: [Intel-gfx] [PATCH 5/9] drm/i915/bdw: Reorganize PT allocations

The previous allocation mechanism would get 2 contiguous allocations,
one for the page directories, and one for the page tables. As each page
table is 1 page, and there are 512 of these per page directory, this
amounts to 2MB. An unfriendly request at best. Worse still, our HW now
supports 4 page directories, and a 2MB allocation is not allowed.

To fix this, this patch splits up the monolithic page table allocation
so that each page table becomes a single, discrete allocation. There is
nothing really fancy about the patch itself; it just has to manage an
extra pointer indirection, and a fancier bit of logic to free up the
pages. To accommodate some of the added complexity, two new helpers are
introduced to allocate and free the page table pages.
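Aside for the reviewer (not part of the patch): the shape of the change
may be easier to see in isolation. Below is a minimal, hypothetical
userspace sketch of the scheme this patch moves to - an array of
per-PDP pointer tables whose entries are independently allocated single
pages, instead of one large contiguous block. The names (PDPS,
PDES_PER_PAGE, alloc_page_tables) are illustrative stand-ins, not the
driver's identifiers.

    #include <stdlib.h>

    #define PDPS          4    /* hw page directory pointer entries */
    #define PDES_PER_PAGE 512  /* page tables per page directory */
    #define PAGE_SZ       4096

    /* Allocate one directory's worth of page tables, one page each,
     * unwinding on failure so a partial allocation never leaks. */
    static void **alloc_page_tables(void)
    {
            void **pt = calloc(PDES_PER_PAGE, sizeof(*pt));
            int i;

            if (!pt)
                    return NULL;
            for (i = 0; i < PDES_PER_PAGE; i++) {
                    pt[i] = malloc(PAGE_SZ); /* stand-in for alloc_page() */
                    if (!pt[i]) {
                            while (i--)
                                    free(pt[i]);
                            free(pt);
                            return NULL;
                    }
            }
            return pt;
    }

    int main(void)
    {
            /* Mirrors struct page **gen8_pt_pages[GEN8_LEGACY_PDPS]:
             * the worst single request is now one page. */
            void **tables[PDPS] = { NULL };
            int i, j;

            for (i = 0; i < PDPS; i++)
                    tables[i] = alloc_page_tables();

            for (i = 0; i < PDPS; i++) {
                    if (!tables[i])
                            continue;
                    for (j = 0; j < PDES_PER_PAGE; j++)
                            free(tables[i][j]);
                    free(tables[i]);
            }
            return 0;
    }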
NOTE: I really wanted to split the way we do allocations from the way
in which we identify the page table/page directory being used, but I
found splitting this functionality up to be too unwieldy. I apologize
in advance to the reviewer. I'd recommend looking at the result, rather
than the diff.

v2/NOTE2: This patch predates commit:

    commit 6f1cc993518462ccf039e195fabd47e7aa5bfd13
    Author: Chris Wilson
    Date:   Tue Dec 31 15:50:31 2013 +0000

        drm/i915: Avoid dereference past end of page arr

It fixed the same issue as that patch, but because of the limbo state
of PPGTT, Chris' patch was merged instead. The excess churn is a result
of my using my original patch, which has my preferred naming. Primarily
act_* is changed to which_*, but it's mostly the same otherwise. I've
kept the convention Chris used for the pte wrap (I had something
slightly different, and broken - but fixable).

v3: Rename which_p[..]e to drop which_ (Chris)
    Remove BUG_ON in inner loop (Chris)
    Redo the pde/pdpe wrap logic (Chris)

v4: s/1MB/2MB in commit message (Imre)
    Plug leaking gen8_pt_pages in both the error path, as well as
    general free case (Imre)

Signed-off-by: Ben Widawsky
Reviewed-by: Imre Deak
---
 drivers/gpu/drm/i915/i915_drv.h     |   5 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c | 130 ++++++++++++++++++++++++++++--------
 2 files changed, 106 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f3379ea..2dbdd34 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -691,6 +691,7 @@ struct i915_gtt {
 };
 #define gtt_total_entries(gtt) ((gtt).base.total >> PAGE_SHIFT)
 
+#define GEN8_LEGACY_PDPS 4
 struct i915_hw_ppgtt {
 	struct i915_address_space base;
 	struct kref ref;
@@ -698,14 +699,14 @@ struct i915_hw_ppgtt {
 	unsigned num_pd_entries;
 	union {
 		struct page **pt_pages;
-		struct page *gen8_pt_pages;
+		struct page **gen8_pt_pages[GEN8_LEGACY_PDPS];
 	};
 	struct page *pd_pages;
 	int num_pd_pages;
 	int num_pt_pages;
 	union {
 		uint32_t pd_offset;
-		dma_addr_t pd_dma_addr[4];
+		dma_addr_t pd_dma_addr[GEN8_LEGACY_PDPS];
 	};
 	union {
 		dma_addr_t *pt_dma_addr;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index ef5e90c..fcde3c7 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -64,7 +64,19 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
 
 #define GEN8_PTES_PER_PAGE		(PAGE_SIZE / sizeof(gen8_gtt_pte_t))
 #define GEN8_PDES_PER_PAGE		(PAGE_SIZE / sizeof(gen8_ppgtt_pde_t))
-#define GEN8_LEGACY_PDPS		4
+
+/* GEN8 legacy style address is defined as a 3 level page table:
+ * 31:30 | 29:21 | 20:12 |  11:0
+ * PDPE  |  PDE  |  PTE  | offset
+ * The difference as compared to normal x86 3 level page table is the PDPEs
+ * are programmed via register.
+ */
+#define GEN8_PDPE_SHIFT			30
+#define GEN8_PDPE_MASK			0x3
+#define GEN8_PDE_SHIFT			21
+#define GEN8_PDE_MASK			0x1ff
+#define GEN8_PTE_SHIFT			12
+#define GEN8_PTE_MASK			0x1ff
 
 #define PPAT_UNCACHED_INDEX		(_PAGE_PWT | _PAGE_PCD)
 #define PPAT_CACHED_PDE_INDEX		0 /* WB LLC */
@@ -261,32 +273,36 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
 	gen8_gtt_pte_t *pt_vaddr, scratch_pte;
-	unsigned first_entry = start >> PAGE_SHIFT;
+	unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
+	unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
+	unsigned pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK;
 	unsigned num_entries = length >> PAGE_SHIFT;
-	unsigned act_pt = first_entry / GEN8_PTES_PER_PAGE;
-	unsigned first_pte = first_entry % GEN8_PTES_PER_PAGE;
 	unsigned last_pte, i;
 
 	scratch_pte = gen8_pte_encode(ppgtt->base.scratch.addr,
 				      I915_CACHE_LLC, use_scratch);
 
 	while (num_entries) {
-		struct page *page_table = &ppgtt->gen8_pt_pages[act_pt];
+		struct page *page_table = ppgtt->gen8_pt_pages[pdpe][pde];
 
-		last_pte = first_pte + num_entries;
+		last_pte = pte + num_entries;
 		if (last_pte > GEN8_PTES_PER_PAGE)
 			last_pte = GEN8_PTES_PER_PAGE;
 
 		pt_vaddr = kmap_atomic(page_table);
 
-		for (i = first_pte; i < last_pte; i++)
+		for (i = pte; i < last_pte; i++) {
 			pt_vaddr[i] = scratch_pte;
+			num_entries--;
+		}
 
 		kunmap_atomic(pt_vaddr);
 
-		num_entries -= last_pte - first_pte;
-		first_pte = 0;
-		act_pt++;
+		pte = 0;
+		if (++pde == GEN8_PDES_PER_PAGE) {
+			pdpe++;
+			pde = 0;
+		}
 	}
 }
 
@@ -298,38 +314,57 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
 	gen8_gtt_pte_t *pt_vaddr;
-	unsigned first_entry = start >> PAGE_SHIFT;
-	unsigned act_pt = first_entry / GEN8_PTES_PER_PAGE;
-	unsigned act_pte = first_entry % GEN8_PTES_PER_PAGE;
+	unsigned which_pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
+	unsigned which_pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
+	unsigned which_pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK;
 	struct sg_page_iter sg_iter;
 
 	pt_vaddr = NULL;
+
 	for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
+		if (WARN_ON(which_pdpe >= GEN8_LEGACY_PDPS))
+			break;
+
 		if (pt_vaddr == NULL)
-			pt_vaddr = kmap_atomic(&ppgtt->gen8_pt_pages[act_pt]);
+			pt_vaddr = kmap_atomic(ppgtt->gen8_pt_pages[which_pdpe][which_pde]);
 
-		pt_vaddr[act_pte] =
+		pt_vaddr[which_pte] =
 			gen8_pte_encode(sg_page_iter_dma_address(&sg_iter),
 					cache_level, true);
-		if (++act_pte == GEN8_PTES_PER_PAGE) {
+		if (++which_pte == GEN8_PTES_PER_PAGE) {
 			kunmap_atomic(pt_vaddr);
 			pt_vaddr = NULL;
-			act_pt++;
-			act_pte = 0;
+			if (which_pde + 1 == GEN8_PDES_PER_PAGE)
+				which_pdpe++;
+			which_pte = 0;
 		}
 	}
 	if (pt_vaddr)
 		kunmap_atomic(pt_vaddr);
 }
 
-static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
+static void gen8_free_page_tables(struct page **pt_pages)
+{
+	int i;
+
+	if (pt_pages == NULL)
+		return;
+
+	for (i = 0; i < GEN8_PDES_PER_PAGE; i++)
+		if (pt_pages[i])
+			__free_pages(pt_pages[i], 0);
+}
+
+static void gen8_ppgtt_free(const struct i915_hw_ppgtt *ppgtt)
 {
 	int i;
 
-	for (i = 0; i < ppgtt->num_pd_pages ; i++)
+	for (i = 0; i < ppgtt->num_pd_pages; i++) {
+		gen8_free_page_tables(ppgtt->gen8_pt_pages[i]);
+		kfree(ppgtt->gen8_pt_pages[i]);
 		kfree(ppgtt->gen8_pt_dma_addr[i]);
+	}
 
-	__free_pages(ppgtt->gen8_pt_pages, get_order(ppgtt->num_pt_pages << PAGE_SHIFT));
 	__free_pages(ppgtt->pd_pages, get_order(ppgtt->num_pd_pages << PAGE_SHIFT));
 	kfree(ppgtt);
 }
@@ -369,20 +404,61 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
 	gen8_ppgtt_free(ppgtt);
 }
 
+static struct page **__gen8_alloc_page_tables(void)
+{
+	struct page **pt_pages;
+	int i;
+
+	pt_pages = kcalloc(GEN8_PDES_PER_PAGE, sizeof(struct page *), GFP_KERNEL);
+	if (!pt_pages)
+		return ERR_PTR(-ENOMEM);
+
+	for (i = 0; i < GEN8_PDES_PER_PAGE; i++) {
+		pt_pages[i] = alloc_page(GFP_KERNEL);
+		if (!pt_pages[i])
+			goto bail;
+	}
+
+	return pt_pages;
+
+bail:
+	gen8_free_page_tables(pt_pages);
+	kfree(pt_pages);
+	return ERR_PTR(-ENOMEM);
+}
+
 static int gen8_ppgtt_allocate_page_tables(struct i915_hw_ppgtt *ppgtt,
 					   const int max_pdp)
 {
-	struct page *pt_pages;
+	struct page **pt_pages[GEN8_LEGACY_PDPS];
 	const int num_pt_pages = GEN8_PDES_PER_PAGE * max_pdp;
+	int i, ret;
 
-	pt_pages = alloc_pages(GFP_KERNEL, get_order(num_pt_pages << PAGE_SHIFT));
-	if (!pt_pages)
-		return -ENOMEM;
+	for (i = 0; i < max_pdp; i++) {
+		pt_pages[i] = __gen8_alloc_page_tables();
+		if (IS_ERR(pt_pages[i])) {
+			ret = PTR_ERR(pt_pages[i]);
+			goto unwind_out;
+		}
+	}
+
+	/* NB: Avoid touching gen8_pt_pages until last to keep the allocation,
+	 * "atomic" - for cleanup purposes.
+	 */
+	for (i = 0; i < max_pdp; i++)
+		ppgtt->gen8_pt_pages[i] = pt_pages[i];
 
-	ppgtt->gen8_pt_pages = pt_pages;
 	ppgtt->num_pt_pages = 1 << get_order(num_pt_pages << PAGE_SHIFT);
 
 	return 0;
+
+unwind_out:
+	while (i--) {
+		gen8_free_page_tables(pt_pages[i]);
+		kfree(pt_pages[i]);
+	}
+
+	return ret;
 }
 
 static int gen8_ppgtt_allocate_dma(struct i915_hw_ppgtt *ppgtt)
@@ -464,7 +540,7 @@ static int gen8_ppgtt_setup_page_tables(struct i915_hw_ppgtt *ppgtt,
 	struct page *p;
 	int ret;
 
-	p = &ppgtt->gen8_pt_pages[pd * GEN8_PDES_PER_PAGE + pt];
+	p = ppgtt->gen8_pt_pages[pd][pt];
 	pt_addr = pci_map_page(ppgtt->base.dev->pdev,
 			       p, 0, PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
 	ret = pci_dma_mapping_error(ppgtt->base.dev->pdev, pt_addr);
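
Aside for the reviewer (not part of the patch): a worked example of the
new address decomposition. The shift/mask values are copied from the
hunk above that adds GEN8_PDPE_SHIFT and friends; the sample address
and the standalone scaffolding are illustrative only.

    #include <stdio.h>
    #include <stdint.h>

    #define GEN8_PDPE_SHIFT 30
    #define GEN8_PDPE_MASK  0x3
    #define GEN8_PDE_SHIFT  21
    #define GEN8_PDE_MASK   0x1ff
    #define GEN8_PTE_SHIFT  12
    #define GEN8_PTE_MASK   0x1ff

    int main(void)
    {
            uint32_t start = 0x8abcd123; /* arbitrary GPU virtual address */

            /* Same expressions as gen8_ppgtt_clear_range(); note that
             * >> binds tighter than &, so no parentheses are needed. */
            unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
            unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
            unsigned pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK;

            /* Prints: pdpe=2 pde=85 pte=461 offset=0x123 */
            printf("pdpe=%u pde=%u pte=%u offset=0x%x\n",
                   pdpe, pde, pte, (unsigned)(start & 0xfff));
            return 0;
    }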