
i915 regression in kernel 4.10

Message ID 20161219122934.GM29871@nuc-i3427.alporthouse.com (mailing list archive)
State New, archived

Commit Message

Chris Wilson Dec. 19, 2016, 12:29 p.m. UTC
On Mon, Dec 19, 2016 at 12:39:16PM +0100, Juergen Gross wrote:
> With recent 4.10 kernel the graphics isn't coming up under Xen. First
> failure message is:
> 
> [   46.656649] i915 0000:00:02.0: swiotlb buffer is full (sz: 1630208 bytes)

Do we get a silent failure? i915_gem_gtt_prepare_pages() is where we
call dma_map_sg() and pass the sg to swiotlb (in this case) for
remapping, and we do check for an error value of 0. After that error,
SWIOTLB_MAP_ERROR is propagated back and converted to 0 for
dma_map_sg(). That looks valid, and we should report ENOMEM back to the
caller.
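
A minimal userspace sketch of the error convention described above. The names `fake_dma_map_sg()` and `prepare_pages()` and the byte budget are assumptions made for illustration, not the kernel's functions; only the convention itself (a return of 0 entries means failure, and the caller turns that into -ENOMEM) comes from the message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a DMA mapping backend: it returns the number
 * of entries it mapped, or 0 if the request does not fit in its budget. */
static int fake_dma_map_sg(const size_t *seg_len, int nents, size_t budget)
{
	size_t total = 0;

	assert(seg_len != NULL);

	for (int i = 0; i < nents; i++)
		total += seg_len[i];

	return total <= budget ? nents : 0;
}

/* Caller side: a result of 0 is the only error signal, so it has to be
 * converted into a negative errno instead of being ignored. */
static int prepare_pages(const size_t *seg_len, int nents, size_t budget)
{
	if (fake_dma_map_sg(seg_len, nents, budget) == 0)
		return -ENOMEM;

	return 0;
}
```

With two segments of 4096 and 8192 bytes, `prepare_pages()` returns 0 for a 16384-byte budget and -ENOMEM for an 8192-byte one.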

> Later I see splats like:
> 
> [   49.393583] general protection fault: 0000 [#1] SMP

What was the faulting address? RAX is particularly non-pointer-like so I
wonder if we walked onto an uninitialised portion of the sgtable. We may
have tripped over a bug in our sg_page iterator.

The attached patch should prevent an early ENOMEM following the swiotlb
allocation failure. But I suspect that we will still be tripping up the
failure in the sg walker when binding to the GPU.
-Chris
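
The control flow of the fallback can be sketched in userspace as follows. This is only an illustration under stated assumptions: `sketch_map()` and its `map_limit` parameter are hypothetical and stand in for a mapper that rejects segments above some size; the real patch below rebuilds the sg table instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical mapper: succeeds only if no segment exceeds map_limit. */
static bool sketch_map(size_t max_segment, size_t map_limit)
{
	return max_segment <= map_limit;
}

/* Try large coalesced segments first; on failure, rebuild once with one
 * page per segment. Returns the segment size that worked, or 0. */
static size_t map_with_fallback(size_t max_segment, size_t map_limit)
{
	assert(max_segment > 0);

	for (;;) {
		if (sketch_map(max_segment, map_limit))
			return max_segment;
		if (max_segment <= SKETCH_PAGE_SIZE)
			return 0;	/* already at the minimum: give up */
		max_segment = SKETCH_PAGE_SIZE;	/* retry with page-sized entries */
	}
}
```

The retry happens at most once, because the second attempt already uses the smallest segment size the sketch allows.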

Comments

Jürgen Groß Dec. 19, 2016, 2:16 p.m. UTC | #1
On 19/12/16 13:29, Chris Wilson wrote:
> On Mon, Dec 19, 2016 at 12:39:16PM +0100, Juergen Gross wrote:
>> With recent 4.10 kernel the graphics isn't coming up under Xen. First
>> failure message is:
>>
>> [   46.656649] i915 0000:00:02.0: swiotlb buffer is full (sz: 1630208 bytes)
> 
> Do we get a silent failure? i915_gem_gtt_prepare_pages() is where we
> call dma_map_sg() and pass the sg to swiotlb (in this case) for
> remapping, and we do check for an error value of 0. After that error,
> SWIOTLB_MAP_ERROR is propagated back and converted to 0 for
> dma_map_sg(). That looks valid, and we should report ENOMEM back to the
> caller.
> 
>> Later I see splats like:
>>
>> [   49.393583] general protection fault: 0000 [#1] SMP
> 
> What was the faulting address? RAX is particularly non-pointer-like so I
> wonder if we walked onto an uninitialised portion of the sgtable. We may
> have tripped over a bug in our sg_page iterator.

During the bisect process there have been either GP or NULL pointer
dereferences or other page faults. Typical addresses were:

xen_swiotlb_unmap_sg_attrs+0x1f/0x50: access to 0000000000000018
xen_swiotlb_unmap_sg_attrs+0x1f/0x50: access to 0000000003020118
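
One possible reading of the first address, offered as an assumption and not as a diagnosis: a fault at a small offset such as 0x18 is what a field read through a NULL structure pointer looks like. The structure below is a hypothetical mirror of a scatterlist-like entry chosen for this sketch; the real kernel layout depends on configuration options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of a scatterlist-like entry, for illustration only. */
struct sketch_sg {
	uint64_t page_link;
	uint32_t offset;
	uint32_t length;
	uint64_t dma_address;
	uint32_t dma_length;
};

/* With this assumed layout the last field sits 0x18 bytes into the entry. */
static_assert(offsetof(struct sketch_sg, dma_length) == 0x18,
	      "assumed layout places dma_length at offset 0x18");

/* Address touched when reading dma_length through a given base pointer. */
static uintptr_t dma_length_addr(uintptr_t base)
{
	return base + offsetof(struct sketch_sg, dma_length);
}
```

Under that assumption a base of 0 yields exactly 0x18, which would fit a walk that ran past the initialised part of the table.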

> 
> The attached patch should prevent an early ENOMEM following the swiotlb
> allocation failure. But I suspect that we will still be tripping up the
> failure in the sg walker when binding to the GPU.
> -Chris
> 

The patch is working not too bad. :-)

Still several "swiotlb buffer is full" messages (some with sz:, most
without), but no faults any more (neither GP nor NULL pointer
dereference). Graphical login is working now.

What I do see, however, is (no idea whether this is related):

[  735.826492] INFO: task systemd-udevd:484 blocked for more than 120 seconds.
[  735.826497]       Tainted: G        W       4.9.0-pv+ #767
[  735.826499] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  735.826501] systemd-udevd   D    0   484    443 0x00000000
[  735.826507] Call Trace:
[  735.826522]  ? __schedule+0x192/0x640
[  735.826530]  ? kmem_cache_free+0x45/0x150
[  735.826535]  ? schedule+0x2d/0x80
[  735.826539]  ? schedule_timeout+0x1f3/0x380
[  735.826545]  ? error_exit+0x9/0x20
[  735.826555]  ? sg_pool_index.part.0+0x2/0x2
[  735.826561]  ? wait_for_completion+0xa4/0x110
[  735.826569]  ? wake_up_q+0x70/0x70
[  735.826577]  ? cpufreq_boost_online+0x10/0x10 [acpi_cpufreq]
[  735.826585]  ? cpuhp_issue_call+0x9c/0xe0
[  735.826590]  ? __cpuhp_setup_state+0xd5/0x1d0
[  735.826599]  ? acpi_cpufreq_init+0x1cd/0x1000 [acpi_cpufreq]
[  735.826601]  ? 0xffffffffa00b1000
[  735.826607]  ? do_one_initcall+0x38/0x180
[  735.826611]  ? kmem_cache_alloc_trace+0x98/0x1e0
[  735.826620]  ? do_init_module+0x55/0x1e5
[  735.826629]  ? load_module+0x2088/0x26b0
[  735.826633]  ? __symbol_put+0x30/0x30
[  735.826639]  ? SYSC_finit_module+0x80/0xb0
[  735.826644]  ? entry_SYSCALL_64_fastpath+0x1e/0xad

I guess it is _not_ related, OTOH there is sg_pool_index() involved...


Juergen
Konrad Rzeszutek Wilk Dec. 20, 2016, 2:42 p.m. UTC | #2
On Mon, Dec 19, 2016 at 03:16:44PM +0100, Juergen Gross wrote:
> On 19/12/16 13:29, Chris Wilson wrote:
> > On Mon, Dec 19, 2016 at 12:39:16PM +0100, Juergen Gross wrote:
> >> With recent 4.10 kernel the graphics isn't coming up under Xen. First
> >> failure message is:
> >>
> >> [   46.656649] i915 0000:00:02.0: swiotlb buffer is full (sz: 1630208 bytes)
> > 
> > Do we get a silent failure? i915_gem_gtt_prepare_pages() is where we
> > call dma_map_sg() and pass the sg to swiotlb (in this case) for
> > remapping, and we do check for an error value of 0. After that error,
> > SWIOTLB_MAP_ERROR is propagated back and converted to 0 for
> > dma_map_sg(). That looks valid, and we should report ENOMEM back to the
> > caller.
> > 
> >> Later I see splats like:
> >>
> >> [   49.393583] general protection fault: 0000 [#1] SMP
> > 
> > What was the faulting address? RAX is particularly non-pointer-like so I
> > wonder if we walked onto an uninitialised portion of the sgtable. We may
> > have tripped over a bug in our sg_page iterator.
> 
> During the bisect process there have been either GP or NULL pointer
> dereferences or other page faults. Typical addresses were:
> 
> xen_swiotlb_unmap_sg_attrs+0x1f/0x50: access to 0000000000000018
> xen_swiotlb_unmap_sg_attrs+0x1f/0x50: access to 0000000003020118
> 
> > 
> > The attached patch should prevent an early ENOMEM following the swiotlb
> > allocation failure. But I suspect that we will still be tripping up the
> > failure in the sg walker when binding to the GPU.
> > -Chris
> > 
> 
> The patch is working not too bad. :-)
> 
> Still several "swiotlb buffer is full" messages (some with sz:, most
> without), but no faults any more (neither GP nor NULL pointer
> dereference). Graphical login is working now.


I think I know why. The optimization that was added assumes that
bus addresses are the same as physical addresses. Hence it packs all
of the virtual addresses into the sg and hands it off to SWIOTLB,
which walks each one and realizes that it has to use the bounce
buffer.

I am wondering if it would make sense to pull 'swiotlb_max_size' inside
of SWIOTLB and make it library-like, so that Xen-SWIOTLB can register
as well and report, say, that it can only provide one page
(unless it is running on bare metal).

Or make the usage of 'max_segment' and 'page_to_pfn(page) != last_pfn + 1'
in i915_gem_object_get_pages_gtt use something similar to xen_biovec_phys_mergeable?
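
The second suggestion can be sketched in userspace like this. Everything here is an assumption made for illustration: `sketch_pfn_to_bfn()` is a made-up translation that scatters odd frames, and `bfn_mergeable()` only imitates the idea of checking bus-frame adjacency; neither is the Xen or i915 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u

/* Decides whether two adjacent page frames may share a segment. */
typedef bool (*mergeable_fn)(unsigned long prev_pfn, unsigned long pfn);

/* Bare-metal style check: consecutive pfns are assumed bus-contiguous. */
static bool pfn_mergeable(unsigned long prev_pfn, unsigned long pfn)
{
	return pfn == prev_pfn + 1;
}

/* Hypothetical pfn -> bus frame translation that scatters odd frames,
 * standing in for a guest whose pages are not machine-contiguous. */
static unsigned long sketch_pfn_to_bfn(unsigned long pfn)
{
	return (pfn & 1) ? pfn + 1000 : pfn;
}

/* Stricter check: the pfns and the translated frames must both be adjacent. */
static bool bfn_mergeable(unsigned long prev_pfn, unsigned long pfn)
{
	return pfn == prev_pfn + 1 &&
	       sketch_pfn_to_bfn(pfn) == sketch_pfn_to_bfn(prev_pfn) + 1;
}

/* Count the segments produced when coalescing pages under a size cap. */
static size_t count_segments(const unsigned long *pfns, size_t n,
			     size_t max_segment, mergeable_fn mergeable)
{
	size_t segments = 0, seg_len = 0;

	assert(max_segment >= SKETCH_PAGE_SIZE);

	for (size_t i = 0; i < n; i++) {
		if (i == 0 || seg_len >= max_segment ||
		    !mergeable(pfns[i - 1], pfns[i])) {
			segments++;	/* start a new segment */
			seg_len = 0;
		}
		seg_len += SKETCH_PAGE_SIZE;
	}

	return segments;
}
```

For four consecutive pfns, the pfn-only check coalesces everything into one segment, while the stricter check falls back to one segment per page in this made-up translation.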

Patch

From e3f9268d467768a31e19d21e2f45e5c9ddd9a0f8 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon, 19 Dec 2016 12:23:43 +0000
Subject: [PATCH] drm/i915: Fallback to single PAGE_SIZE segments for DMA
 remapping

If we at first do not succeed with attempting to remap our physical
pages using a coalesced scattergather list, try again with one
scattergather entry per page. This should help with swiotlb as it uses a
limited buffer size and only searches for contiguous chunks within its
buffer aligned up to the next boundary - i.e. we may prematurely cause a
failure as we are unable to utilize the unused space between large
chunks and trigger an error such as:

	 i915 0000:00:02.0: swiotlb buffer is full (sz: 1630208 bytes)

Reported-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 412f3513f269..509d98887e04 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2326,7 +2326,8 @@  static struct sg_table *
 i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 {
 	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
-	int page_count, i;
+	const unsigned long page_count = obj->base.size / PAGE_SIZE;
+	unsigned long i;
 	struct address_space *mapping;
 	struct sg_table *st;
 	struct scatterlist *sg;
@@ -2352,7 +2353,7 @@  i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	if (st == NULL)
 		return ERR_PTR(-ENOMEM);
 
-	page_count = obj->base.size / PAGE_SIZE;
+rebuild_st:
 	if (sg_alloc_table(st, page_count, GFP_KERNEL)) {
 		kfree(st);
 		return ERR_PTR(-ENOMEM);
@@ -2411,8 +2412,21 @@  i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	i915_sg_trim(st);
 
 	ret = i915_gem_gtt_prepare_pages(obj, st);
-	if (ret)
-		goto err_pages;
+	if (ret) {
+		/* DMA remapping failed? One possible cause is that
+		 * it could not reserve enough large entries, asking
+		 * for PAGE_SIZE chunks may be helpful.
+		 */
+		if (max_segment > PAGE_SIZE) {
+			for_each_sgt_page(page, sgt_iter, st)
+				put_page(page);
+			sg_free_table(st);
+
+			max_segment = PAGE_SIZE;
+			goto rebuild_st;
+		} else
+			goto err_pages;
+	}
 
 	if (i915_gem_object_needs_bit17_swizzle(obj))
 		i915_gem_object_do_bit_17_swizzle(obj, st);
-- 
2.11.0
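
The commit message's point about contiguous chunks can be illustrated with a toy allocator. This is a sketch under loud assumptions: `struct sketch_pool` and `sketch_alloc()` are hypothetical, the pool has only eight slots, and the alignment rules of the real swiotlb are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_SLOTS 8

/* Toy bounce buffer: a fixed array of equally sized slots. */
struct sketch_pool {
	bool used[SKETCH_SLOTS];
};

/* First-fit search for `want` contiguous free slots. Returns the index of
 * the first slot claimed, or -1 if no contiguous run is long enough. */
static int sketch_alloc(struct sketch_pool *pool, int want)
{
	int run = 0;

	assert(pool != NULL);

	if (want <= 0 || want > SKETCH_SLOTS)
		return -1;

	for (int i = 0; i < SKETCH_SLOTS; i++) {
		run = pool->used[i] ? 0 : run + 1;
		if (run == want) {
			int start = i - want + 1;

			for (int j = start; j <= i; j++)
				pool->used[j] = true;
			return start;
		}
	}

	return -1;
}
```

With slots 2 and 5 already in use, six slots are free but the longest free run is two, so a three-slot request fails while six single-slot requests all succeed. That is the gap the page-sized fallback is meant to exploit.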