From patchwork Tue Jan 13 20:46:52 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kenneth Graunke X-Patchwork-Id: 5625811 Return-Path: X-Original-To: patchwork-intel-gfx@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id AB34DC058D for ; Wed, 14 Jan 2015 04:51:46 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id DF7732038A for ; Wed, 14 Jan 2015 04:51:41 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.kernel.org (Postfix) with ESMTP id F08322039C for ; Wed, 14 Jan 2015 04:51:40 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 775486E701; Tue, 13 Jan 2015 20:51:40 -0800 (PST) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from smtp145.dfw.emailsrvr.com (smtp145.dfw.emailsrvr.com [67.192.241.145]) by gabe.freedesktop.org (Postfix) with ESMTP id 7713C6E701 for ; Tue, 13 Jan 2015 20:51:39 -0800 (PST) Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp23.relay.dfw1a.emailsrvr.com (SMTP Server) with ESMTP id 2BC46380171; Tue, 13 Jan 2015 23:51:38 -0500 (EST) X-Virus-Scanned: OK Received: by smtp23.relay.dfw1a.emailsrvr.com (Authenticated sender: kenneth-AT-whitecape.org) with ESMTPSA id 7B0EE380165; Tue, 13 Jan 2015 23:51:37 -0500 (EST) X-Sender-Id: kenneth@whitecape.org Received: from shale.shinigami (static-50-43-36-85.bvtn.or.frontiernet.net [50.43.36.85]) (using TLSv1.2 with cipher AES128-SHA256) by 0.0.0.0:465 (trex/5.4.2); Wed, 14 Jan 2015 04:51:38 GMT From: Kenneth Graunke To: intel-gfx@lists.freedesktop.org Date: Tue, 13 Jan 2015 12:46:52 -0800 Message-Id: <1421182013-751-2-git-send-email-kenneth@whitecape.org> X-Mailer: git-send-email 2.2.1 In-Reply-To: <1421182013-751-1-git-send-email-kenneth@whitecape.org> References: <1421182013-751-1-git-send-email-kenneth@whitecape.org> Subject: [Intel-gfx] [PATCH v2 2/3] drm/i915: Enable the HiZ RAW Stall Optimization on Broadwell. X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Spam-Status: No, score=-2.7 required=5.0 tests=BAYES_00, DATE_IN_PAST_06_12, RCVD_IN_DNSWL_MED,T_RP_MATCHES_RCVD,UNPARSEABLE_RELAY autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This is an important optimization for avoiding read-after-write (RAW) stalls in the HiZ buffer. Certain workloads would run very slowly with HiZ enabled, but run much faster with the "hiz=false" driconf option. With this patch, they run at full speed even with HiZ. Improves performance in OglVSInstancing by 3.2x on Broadwell GT3e (Iris Pro 6200). Thanks to Jesse Barnes and Ben Widawsky for their help in tracking this down. Thanks to Chris Wilson for showing me the new workarounds system. Signed-off-by: Kenneth Graunke Cc: Jesse Barnes --- drivers/gpu/drm/i915/intel_ringbuffer.c | 10 ++++++++++ 1 file changed, 10 insertions(+) Split, as requested by Ben. Fix the thankyous. diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index dabc1d8..0df15a4 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -796,6 +796,16 @@ static int bdw_init_workarounds(struct intel_engine_cs *ring) HDC_DONOT_FETCH_MEM_WHEN_MASKED | (IS_BDW_GT3(dev) ? HDC_FENCE_DEST_SLM_DISABLE : 0)); + /* From the Haswell PRM, Command Reference: Registers, CACHE_MODE_0: + * "The Hierarchical Z RAW Stall Optimization allows non-overlapping + * polygons in the same 8x4 pixel/sample area to be processed without + * stalling waiting for the earlier ones to write to Hierarchical Z + * buffer." + * + * This optimization is off by default for Broadwell; turn it on. + */ + WA_CLR_BIT_MASKED(CACHE_MODE_0_GEN7, HIZ_RAW_STALL_OPT_DISABLE); + /* Wa4x4STCOptimizationDisable:bdw */ WA_SET_BIT_MASKED(CACHE_MODE_1, GEN8_4x4_STC_OPTIMIZATION_DISABLE);