From patchwork Thu Sep 8 10:39:16 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: akash.goel@intel.com
X-Patchwork-Id: 9320923
From: akash.goel@intel.com
To: intel-gfx@lists.freedesktop.org
Date: Thu, 8 Sep 2016 16:09:16 +0530
Message-Id: <1473331158-21082-17-git-send-email-akash.goel@intel.com>
In-Reply-To: <1473331158-21082-1-git-send-email-akash.goel@intel.com>
References: <1473331158-21082-1-git-send-email-akash.goel@intel.com>
Cc: Akash Goel
Subject: [Intel-gfx] [PATCH 16/18] drm/i915: Use SSE4.1 movntdqa based memcpy
 for sampling GuC log buffer

From: Akash Goel

To ensure that we always get up-to-date data from the log buffer, it is
better to access the buffer through an uncached CPU mapping. Also, given
the way the buffer is accessed from the GuC and host sides, manually
flushing the cache may not always be effective if a cached CPU mapping
is used.

To avoid any performance drop and keep reads from the GuC log buffer
fast, use the SSE4.1 movntdqa based memcpy function i915_memcpy_from_wc,
as copying with movntdqa from WC type memory is almost as fast as
reading from WB memory. This way the log buffer sampling time does not
increase, so the driver can still deal with the flush interrupt storm
when GuC is generating logs at a very high rate.

Ideally SSE4.1 should be present on all chipsets supporting GuC based
submissions, but if it is not, logging will not be enabled.

v2: Rebase.
v3: Squash the WC type vmalloc mapping patch with this patch.
(Chris)

Suggested-by: Chris Wilson
Signed-off-by: Akash Goel
Reviewed-by: Tvrtko Ursulin
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 617ded1..9ca9e0d 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1127,18 +1127,16 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 
 		/* Just copy the newly written data */
 		if (read_offset > write_offset) {
-			memcpy(dst_data, src_data, write_offset);
+			i915_memcpy_from_wc(dst_data, src_data, write_offset);
 			bytes_to_copy = buffer_size - read_offset;
 		} else {
 			bytes_to_copy = write_offset - read_offset;
 		}
-		memcpy(dst_data + read_offset,
-		       src_data + read_offset, bytes_to_copy);
+		i915_memcpy_from_wc(dst_data + read_offset,
+				    src_data + read_offset, bytes_to_copy);
 
 		src_data += buffer_size;
 		dst_data += buffer_size;
-
-		/* FIXME: invalidate/flush for log buffer needed */
 	}
 
 	if (log_buf_snapshot_state)
@@ -1198,8 +1196,11 @@ static int guc_create_log_extras(struct intel_guc *guc)
 		return 0;
 
 	if (!guc->log.buf_addr) {
-		/* Create a vmalloc mapping of log buffer pages */
-		vaddr = i915_gem_object_pin_map(guc->log.vma->obj, I915_MAP_WB);
+		/* Create a WC (Uncached for read) vmalloc mapping of log
+		 * buffer pages, so that we can directly get the data
+		 * (up-to-date) from memory.
+		 */
+		vaddr = i915_gem_object_pin_map(guc->log.vma->obj, I915_MAP_WC);
 		if (IS_ERR(vaddr)) {
 			ret = PTR_ERR(vaddr);
 			DRM_ERROR("Couldn't map log buffer pages %d\n", ret);
@@ -1242,6 +1243,16 @@ static void guc_create_log(struct intel_guc *guc)
 
 	vma = guc->log.vma;
 	if (!vma) {
+		/* We require SSE 4.1 for fast reads from the GuC log buffer and
+		 * it should be present on the chipsets supporting GuC based
+		 * submissions.
+		 */
+		if (WARN_ON(!i915_memcpy_from_wc(NULL, NULL, 0))) {
+			/* logging will not be enabled */
+			i915.guc_log_level = -1;
+			return;
+		}
+
 		vma = guc_allocate_vma(guc, size);
 		if (IS_ERR(vma)) {
 			/* logging will be off */
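
For readers following the first hunk, the wraparound copy in
guc_read_update_log_buffer can be sketched in user space roughly as
below. This is only an illustration under stated assumptions, not the
driver code: copy_from_wc and copy_new_data are hypothetical names, and
plain memcpy stands in for the movntdqa based i915_memcpy_from_wc, which
is only available in the kernel. As in the patch, a zero-length call is
treated as a pure capability probe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for i915_memcpy_from_wc(). The kernel helper uses
 * movntdqa; plain memcpy is used here so the sketch builds in user space.
 * A zero-length call copies nothing and only reports availability.
 */
static bool copy_from_wc(void *dst, const void *src, size_t len)
{
	if (len)
		memcpy(dst, src, len);
	return true;
}

/*
 * Copy only the bytes written since the last read of a circular buffer,
 * keeping each byte at the same offset in dst as it had in src.
 * Returns the number of bytes copied.
 */
static size_t copy_new_data(unsigned char *dst, const unsigned char *src,
			    size_t buffer_size, size_t read_offset,
			    size_t write_offset)
{
	size_t bytes_to_copy;
	size_t copied = 0;

	assert(read_offset <= buffer_size && write_offset <= buffer_size);

	if (read_offset > write_offset) {
		/* Writer wrapped around: copy the head [0, write_offset). */
		copy_from_wc(dst, src, write_offset);
		copied = write_offset;
		/* Then the tail [read_offset, buffer_size). */
		bytes_to_copy = buffer_size - read_offset;
	} else {
		/* No wrap: copy [read_offset, write_offset). */
		bytes_to_copy = write_offset - read_offset;
	}
	copy_from_wc(dst + read_offset, src + read_offset, bytes_to_copy);

	return copied + bytes_to_copy;
}
```

For example, with buffer_size 8, read_offset 6 and write_offset 2, the
sketch copies bytes 0-1 and 6-7 and returns 4; with read_offset 2 and
write_offset 5 it copies bytes 2-4 and returns 3.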