From patchwork Tue May 27 21:55:21 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?b?TWFyZWsgT2zFocOhaw==?=
X-Patchwork-Id: 4251801
Return-Path:
X-Original-To: patchwork-dri-devel@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.19.201])
 by patchwork2.web.kernel.org (Postfix) with ESMTP id 5B2BCBF90B
 for ; Tue, 27 May 2014 21:56:11 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
 by mail.kernel.org (Postfix) with ESMTP id 5BFE4201EC
 for ; Tue, 27 May 2014 21:56:10 +0000 (UTC)
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
 by mail.kernel.org (Postfix) with ESMTP id C7936201DE
 for ; Tue, 27 May 2014 21:56:06 +0000 (UTC)
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
 by gabe.freedesktop.org (Postfix) with ESMTP id CF8086E796;
 Tue, 27 May 2014 14:56:04 -0700 (PDT)
X-Original-To: dri-devel@lists.freedesktop.org
Delivered-To: dri-devel@lists.freedesktop.org
Received: from mail-vc0-f180.google.com (mail-vc0-f180.google.com
 [209.85.220.180])
 by gabe.freedesktop.org (Postfix) with ESMTP id 091616E796
 for ; Tue, 27 May 2014 14:56:03 -0700 (PDT)
Received: by mail-vc0-f180.google.com with SMTP id hy4so11450888vcb.11
 for ; Tue, 27 May 2014 14:56:03 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:from:date:message-id:subject:to
 :cc:content-type;
 bh=au8CSVVQSO1Mo/WNNm3AmkNh2k2rKlQ+FsSP+kyL2iQ=;
 b=ffW50L4CGH2zerGBJFBhEwJruv4+IQrKvRZpQAAq0qpRhZD5GF/UNIrDRrZUdjgjna
 BEs+izt0uI6bOF98i+iaNWtYeevhqC1ZGX/Ib6Ek0ysBfYUh0d6iDBjs7buWOMZ2h6wh
 QvwvLHI4Qk1n1wET4zYLiK+lpCQKJKYi3lS3gfpnTBVyLs3vLGVqwCpGe9dITuQPhH97
 Mnf2H5bs9gKCzs70aweQ+F76Xf8Bp2yFIgIf93NF6TbAHrJvHUNnaRdYKL6kpVq7fIM+
 G8e1j7fuBP1wvZ+njMPOtcYPP9OsIF6h+f9aYO/aSBX8ieV/R63rUt1bFPkGhFHDL9so
 69rA==
X-Received: by 10.220.98.143 with SMTP id q15mr4677098vcn.38.1401227763248;
 Tue, 27 May 2014 14:56:03 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.58.49.5 with HTTP; Tue, 27 May 2014 14:55:21 -0700 (PDT)
In-Reply-To: <53735D79.6050904@vodafone.de>
References: <53735D79.6050904@vodafone.de>
From: =?UTF-8?B?TWFyZWsgT2zFocOhaw==?=
Date: Tue, 27 May 2014 23:55:21 +0200
Message-ID:
Subject: Re: CIK hangs with kernel 3.15, bisected
To: =?UTF-8?Q?Christian_K=C3=B6nig?=
Cc: dri-devel
X-BeenThere: dri-devel@lists.freedesktop.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Direct Rendering Infrastructure - Development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"
X-Spam-Status: No, score=-4.7 required=5.0 tests=BAYES_00,
 DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, FREEMAIL_FROM, RCVD_IN_DNSWL_MED,
 RP_MATCHES_RCVD, T_DKIM_INVALID, UNPARSEABLE_RELAY autolearn=unavailable
 version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

Hi Christian,

I test on Bonaire (ChipID = 0x665c). Unfortunately, the hangs are not
fixed yet. They are very rare and very random. Therefore, I have come up
with a patch which evicts page tables between IBs. See the attachment.
With that patch applied, the system starts fine, compiz and glxgears
work, but once I start playing openarena, it locks up pretty quickly.

The patch shouldn't do anything in theory, because pages are moved back
to VRAM immediately after that. However, the VRAM address of page tables
may end up being different from before, which might be the root cause.

Marek

On Wed, May 14, 2014 at 2:11 PM, Christian König wrote:
> Crap, any chance you can narrow it down a bit more?
>
> I've just tried a piglit quick test on my Bonaire and it seems to work
> perfectly fine.
>
> What hw do you test on?
>
> Regards,
> Christian.
>
> On 13.05.2014 23:21, Marek Olšák wrote:
>
>> Hi Christian,
>>
>> Even though some regressions are fixed by these patches:
>>
>> drm/radeon: fix page directory update size estimation
>> drm/radeon: fix buffer placement under memory pressure v2
>>
>> and indeed, the texelFetch tests no longer hang, there is one more
>> hang which needs to be fixed. :( All I know is the exact same commit
>> causes it and it can only be reproduced by running whole piglit with
>> concurrency enabled.
>>
>> My kernel git log:
>>
>> * 2ba22c8 - drm/radeon: fix buffer placement under memory pressure v2
>> (10 hours ago)
>> * 3af91e5 - drm/radeon: fix page directory update size estimation (21
>> hours ago)
>> * 6d2f294 - drm/radeon: use normal BOs for the page tables v4 (2
>> months ago)
>> * fa68834 - drm/radeon: further cleanup vm flushing & fencing (2
>> months ago)
>>
>> fa68834 doesn't hang, but 2ba22c8 hangs, which means 6d2f294 or either
>> of the two fixes is the first bad commit.
>>
>> Marek
>>
>> On Fri, May 9, 2014 at 8:03 PM, Marek Olšák wrote:
>>>
>>> Hi Christian,
>>>
>>> This commit which first appeared in 3.15-rc1 causes hangs on Bonaire:
>>>
>>> commit 6d2f2944e95e504a7d33385eeeb9bb7fcca72592
>>> Author: Christian König
>>> Date: Thu Feb 20 13:42:17 2014 +0100
>>>
>>> drm/radeon: use normal BOs for the page tables v4
>>>
>>> No need to make it more complicated than necessary,
>>> just allocate the page tables as normal BO and
>>> flush whenever the address change.
>>>
>>> v2: update comments and function name
>>> v3: squash bug fixes, page directory and tables patch
>>> v4: rebased on Mareks changes
>>>
>>> Signed-off-by: Christian König
>>>
>>>
>>> Reverting the commit gives me a lot of merge conflicts.
>>>
>>> The simplest way to reproduce the hangs is to run piglit with these
>>> parameters:
>>> -t texelFetch.fs
>>>
>>> Some of the tests allocate a lot of MSAA textures and the tests also
>>> run in parallel, which creates a lot of memory pressure and probably
>>> causes buffer evictions.
>>>
>>> Any idea what is wrong with it?
>>>
>>> Thanks,
>>>
>>> Marek
>
>

diff --git a/drivers/gpu/drm/radeon/radeon_vm.c b/drivers/gpu/drm/radeon/radeon_vm.c
index d9ab99f..365e36f 100644
--- a/drivers/gpu/drm/radeon/radeon_vm.c
+++ b/drivers/gpu/drm/radeon/radeon_vm.c
@@ -116,6 +116,19 @@ void radeon_vm_manager_fini(struct radeon_device *rdev)
 	rdev->vm_manager.enabled = false;
 }
 
+static void force_gtt(struct radeon_bo *bo)
+{
+	if (radeon_bo_reserve(bo, false))
+		return;
+
+	radeon_ttm_placement_from_domain(bo, RADEON_GEM_DOMAIN_GTT);
+
+	if (ttm_bo_validate(&bo->tbo, &bo->placement, true, false)) {
+		DRM_ERROR("failed to force a GTT placement\n");
+	}
+	radeon_bo_unreserve(bo);
+}
+
 /**
  * radeon_vm_get_bos - add the vm BOs to a validation list
  *
@@ -147,6 +160,8 @@ struct radeon_cs_reloc *radeon_vm_get_bos(struct radeon_device *rdev,
 	list[0].handle = 0;
 	list_add(&list[0].tv.head, head);
 
+	force_gtt(vm->page_directory);
+
 	for (i = 0, idx = 1; i <= vm->max_pde_used; i++) {
 		if (!vm->page_tables[i].bo)
 			continue;
@@ -159,6 +174,8 @@ struct radeon_cs_reloc *radeon_vm_get_bos(struct radeon_device *rdev,
 		list[idx].tiling_flags = 0;
 		list[idx].handle = 0;
 		list_add(&list[idx++].tv.head, head);
+
+		force_gtt(vm->page_tables[i].bo);
 	}
 
 	return list;