From patchwork Wed Jul 15 08:51:43 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: sourab.gupta@intel.com
X-Patchwork-Id: 6794681
From: sourab.gupta@intel.com
To: intel-gfx@lists.freedesktop.org
Date: Wed, 15 Jul 2015 14:21:43 +0530
Message-Id: <1436950306-14147-6-git-send-email-sourab.gupta@intel.com>
X-Mailer: git-send-email 1.8.5.1
In-Reply-To: <1436950306-14147-1-git-send-email-sourab.gupta@intel.com>
References: <1436950306-14147-1-git-send-email-sourab.gupta@intel.com>
Cc: Insoo Woo, Peter Zijlstra, Jabin Wu, Sourab Gupta
Subject: [Intel-gfx] [RFC 5/8] drm/i915: Add support for forwarding ring id in sample metadata through perf

From: Sourab Gupta

This patch adds support for reporting the ring id alongside the
timestamp samples forwarded through perf. When userspace requests the
ring id through a field of the gen pmu attr during event init, each
generated sample carries an additional field holding the ring id.

The patch also puts the supporting framework in place: further fields
can be added to the gen pmu attr to request additional metadata to be
appended to the samples.

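For illustration only, the resulting sample layout can be sketched from
the userspace side as below. This is a minimal sketch and not part of
the patch. The layout of struct ts_data is an assumption (this patch
does not show drm_i915_ts_data), and the names SAMPLE_RING,
sample_size() and sample_ring() are hypothetical helpers that mirror
I915_GEN_PMU_SAMPLE_RING and the node_size computation in
init_gen_pmu_buffer().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SAMPLE_RING (1 << 0)	/* mirrors I915_GEN_PMU_SAMPLE_RING */

/* Assumed 8-byte layout; drm_i915_ts_data is not shown in this patch. */
struct ts_data { uint32_t ts_low; uint32_t ts_high; };
/* Mirror drm_i915_ts_node_ctx_id and drm_i915_ts_node_ring_id. */
struct ts_node_ctx_id { uint32_t ctx_id; uint32_t pad; };
struct ts_node_ring_id { uint32_t ring; uint32_t pad; };

/* Size of one sample for the given flags, aligned to 8 bytes. */
static size_t sample_size(int flags)
{
	size_t size = sizeof(struct ts_data) + sizeof(struct ts_node_ctx_id);

	if (flags & SAMPLE_RING)
		size += sizeof(struct ts_node_ring_id);
	return (size + 7) & ~(size_t)7;
}

/* Read the ring field of a raw sample; returns 0 if it was not requested. */
static uint32_t sample_ring(const uint8_t *sample, int flags)
{
	struct ts_node_ring_id ring_info;
	size_t offset = sizeof(struct ts_data) + sizeof(struct ts_node_ctx_id);

	assert(sample != NULL);
	if (!(flags & SAMPLE_RING))
		return 0;
	/* The ring node directly follows the ctx id node. */
	memcpy(&ring_info, sample + offset, sizeof(ring_info));
	return ring_info.ring;
}
```

Under the assumed 8-byte timestamp layout, a sample grows from 16 to 24
bytes once the ring id is requested. The value read back is the one
produced by ring_id_mask(), i.e. ring->id + 1.
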
Signed-off-by: Sourab Gupta
---
 drivers/gpu/drm/i915/i915_drv.h     |  3 ++
 drivers/gpu/drm/i915/i915_oa_perf.c | 90 ++++++++++++++++++++++++++++++++++++-
 include/uapi/drm/i915_drm.h         | 13 ++++++
 3 files changed, 105 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 59d23d0..cf0528e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1682,6 +1682,7 @@ struct i915_gen_pmu_node {
 	u32 offset;
 	bool discard;
 	u32 ctx_id;
+	u32 ring;
 };
 
 extern const struct i915_oa_reg i915_oa_3d_mux_config_hsw[];
@@ -2011,6 +2012,8 @@ struct drm_i915_private {
 		struct list_head node_list;
 		struct work_struct work_timer;
 		struct work_struct work_event_destroy;
+#define I915_GEN_PMU_SAMPLE_RING		(1<<0)
+		int sample_info_flags;
 	} gen_pmu;
 
 	void (*insert_profile_cmd[I915_PROFILE_MAX])
diff --git a/drivers/gpu/drm/i915/i915_oa_perf.c b/drivers/gpu/drm/i915/i915_oa_perf.c
index 1780de42..5915720 100644
--- a/drivers/gpu/drm/i915/i915_oa_perf.c
+++ b/drivers/gpu/drm/i915/i915_oa_perf.c
@@ -102,6 +102,9 @@ void i915_oa_insert_cmd(struct intel_ringbuffer *ringbuf, u32 ctx_id, int tag)
 	i915_vma_move_to_active(i915_gem_obj_to_ggtt(obj), ring);
 }
 
+/* Returns the ring's ID mask (i.e. I915_EXEC_) */
+#define ring_id_mask(ring) ((ring)->id + 1)
+
 void i915_gen_insert_cmd_ts(struct intel_ringbuffer *ringbuf, u32 ctx_id,
 				int tag)
 {
@@ -119,6 +122,8 @@ void i915_gen_insert_cmd_ts(struct intel_ringbuffer *ringbuf, u32 ctx_id,
 		return;
 	}
 	entry->ctx_id = ctx_id;
+	if (dev_priv->gen_pmu.sample_info_flags & I915_GEN_PMU_SAMPLE_RING)
+		entry->ring = ring_id_mask(ring);
 	i915_gem_request_assign(&entry->req, ring->outstanding_lazy_request);
 
 	spin_lock_irqsave(&dev_priv->gen_pmu.lock, lock_flags);
@@ -548,8 +553,9 @@ static void forward_one_gen_pmu_sample(struct drm_i915_private *dev_priv,
 	struct perf_sample_data data;
 	struct perf_event *event = dev_priv->gen_pmu.exclusive_event;
 	int ts_size, snapshot_size;
-	u8 *snapshot;
+	u8 *snapshot, *current_ptr;
 	struct drm_i915_ts_node_ctx_id *ctx_info;
+	struct drm_i915_ts_node_ring_id *ring_info;
 	struct perf_raw_record raw;
 
 	ts_size = sizeof(struct drm_i915_ts_data);
@@ -558,6 +564,14 @@ static void forward_one_gen_pmu_sample(struct drm_i915_private *dev_priv,
 
 	ctx_info = (struct drm_i915_ts_node_ctx_id *)(snapshot + ts_size);
 	ctx_info->ctx_id = node->ctx_id;
+	current_ptr = snapshot + snapshot_size;
+
+	if (dev_priv->gen_pmu.sample_info_flags & I915_GEN_PMU_SAMPLE_RING) {
+		ring_info = (struct drm_i915_ts_node_ring_id *)current_ptr;
+		ring_info->ring = node->ring;
+		snapshot_size += sizeof(*ring_info);
+		current_ptr = snapshot + snapshot_size;
+	}
 
 	perf_sample_data_init(&data, 0, event->hw.last_period);
 
@@ -1010,6 +1024,9 @@ static int init_gen_pmu_buffer(struct perf_event *event)
 	node_size = sizeof(struct drm_i915_ts_data) +
 			sizeof(struct drm_i915_ts_node_ctx_id);
 
+	if (dev_priv->gen_pmu.sample_info_flags & I915_GEN_PMU_SAMPLE_RING)
+		node_size += sizeof(struct drm_i915_ts_node_ring_id);
+
 	/* size has to be aligned to 8 bytes (required by relevant gpu cmds) */
 	node_size = ALIGN(node_size, 8);
 	dev_priv->gen_pmu.buffer.node_size = node_size;
@@ -1544,16 +1561,87 @@ static int i915_oa_event_event_idx(struct perf_event *event)
 	return 0;
 }
 
+static int i915_gen_pmu_copy_attr(struct drm_i915_gen_pmu_attr __user *uattr,
+			     struct drm_i915_gen_pmu_attr *attr)
+{
+	u32 size;
+	int ret;
+
+	if (!access_ok(VERIFY_WRITE, uattr, I915_GEN_PMU_ATTR_SIZE_VER0))
+		return -EFAULT;
+
+	/*
+	 * zero the full structure, so that a short copy will be nice.
+	 */
+	memset(attr, 0, sizeof(*attr));
+
+	ret = get_user(size, &uattr->size);
+	if (ret)
+		return ret;
+
+	if (size > PAGE_SIZE)	/* silly large */
+		goto err_size;
+
+	if (size < I915_GEN_PMU_ATTR_SIZE_VER0)
+		goto err_size;
+
+	/*
+	 * If we're handed a bigger struct than we know of,
+	 * ensure all the unknown bits are 0 - i.e. new
+	 * user-space does not rely on any kernel feature
+	 * extensions we dont know about yet.
+	 */
+	if (size > sizeof(*attr)) {
+		unsigned char __user *addr;
+		unsigned char __user *end;
+		unsigned char val;
+
+		addr = (void __user *)uattr + sizeof(*attr);
+		end = (void __user *)uattr + size;
+
+		for (; addr < end; addr++) {
+			ret = get_user(val, addr);
+			if (ret)
+				return ret;
+			if (val)
+				goto err_size;
+		}
+		size = sizeof(*attr);
+	}
+
+	ret = copy_from_user(attr, uattr, size);
+	if (ret)
+		return -EFAULT;
+
+out:
+	return ret;
+
+err_size:
+	put_user(sizeof(*attr), &uattr->size);
+	ret = -E2BIG;
+	goto out;
+}
+
 static int i915_gen_event_init(struct perf_event *event)
 {
 	struct drm_i915_private *dev_priv =
 		container_of(event->pmu, typeof(*dev_priv), gen_pmu.pmu);
+	struct drm_i915_gen_pmu_attr gen_attr;
 	unsigned long lock_flags;
 	int ret = 0;
 
 	if (event->attr.type != event->pmu->type)
 		return -ENOENT;
 
+	ret = i915_gen_pmu_copy_attr(to_user_ptr(event->attr.config),
+				&gen_attr);
+	if (ret)
+		return ret;
+
+	if (gen_attr.sample_ring)
+		dev_priv->gen_pmu.sample_info_flags |=
+				I915_GEN_PMU_SAMPLE_RING;
+
 	/* To avoid the complexity of having to accurately filter
 	 * data and marshal to the appropriate client
 	 * we currently only allow exclusive access */
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 9c083a2..fd52926 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -81,6 +81,8 @@
 
 #define I915_OA_ATTR_SIZE_VER0		32  /* sizeof first published struct */
 
+#define I915_GEN_PMU_ATTR_SIZE_VER0	8  /* sizeof first published struct */
+
 typedef struct _drm_i915_oa_attr {
 	__u32 size;
 
@@ -98,6 +100,12 @@ typedef struct _drm_i915_oa_attr {
 		__reserved_1:60;
 } drm_i915_oa_attr_t;
 
+struct drm_i915_gen_pmu_attr {
+	__u32 size;
+	__u32 sample_ring:1,
+		__reserved_1:31;
+};
+
 /* Header for PERF_RECORD_DEVICE type events */
 typedef struct _drm_i915_oa_event_header {
 	__u32 type;
@@ -150,6 +158,11 @@ struct drm_i915_ts_node_ctx_id {
 	__u32 pad;
 };
 
+struct drm_i915_ts_node_ring_id {
+	__u32 ring;
+	__u32 pad;
+};
+
 /* Each region is a minimum of 16k, and there are at most 255 of them.
  */
 #define I915_NR_TEX_REGIONS 255	/* table size 2k - maximum due to use