From patchwork Wed Dec 7 21:40:33 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Robert Bragg X-Patchwork-Id: 9465265 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 54AEC6022E for ; Wed, 7 Dec 2016 21:40:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 36ECE28533 for ; Wed, 7 Dec 2016 21:40:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2802B28583; Wed, 7 Dec 2016 21:40:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.1 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_MED,T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 3247E28533 for ; Wed, 7 Dec 2016 21:40:46 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0C7146E195; Wed, 7 Dec 2016 21:40:46 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-pg0-x244.google.com (mail-pg0-x244.google.com [IPv6:2607:f8b0:400e:c05::244]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6F93F6E195 for ; Wed, 7 Dec 2016 21:40:44 +0000 (UTC) Received: by mail-pg0-x244.google.com with SMTP id 3so24632985pgd.0 for ; Wed, 07 Dec 2016 13:40:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:from:to:cc:subject:date:message-id:in-reply-to:references; bh=LPqb0FEKWjdxAP9NXafZ5p9duAg65vuOQmt3Z5cf63k=; b=kA69CTXFzOl7Qy+A33mv7Do85NGfpNAkgkggzObyqH3lKnYBXGzrgx5z8BiuD2KTpw zR0/Hd8O7dKwUykExQUVUJboff1feghoGiyhmPvUWOqBc/QPpYM09OF+FFvxbGbxX4NE dJ7BBP/JW08lh5mHwlPQGwIDatEJ7erEy08xlolFfNwRpHoAGMvFi5HLfEjuEOGbOMCc wTQjbfmL9ezGFFy9jSL/Lj9qHAMLlG+sSRfF/q+CXaY5ve0WsfFzcHdDsFP0zsx6KhWS ib8JeB0OlsE3S7NYJCafRUPfQ3oN8cA2etiHwrFBclZ0OGO9OV5If7oWRcAngkOfUE3Q tutA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:sender:from:to:cc:subject:date:message-id :in-reply-to:references; bh=LPqb0FEKWjdxAP9NXafZ5p9duAg65vuOQmt3Z5cf63k=; b=GT2TdTbJZ+76deJwBRFAfn+1WCzZrTo52FDJQZ7easko7eEbng/6HPIBhMsx3VCLGv qRgKh5FRqNydY4+YakJW/gLOMmBQbGUX3YH4yrckixHLyjU/DG1lMKPNp2IS8Tihl+JE CJwK5Yk6uX6y9+Yvedrzjv2UWBrcnD7IIsFQNNuzqoLAC5uRVGIsWLw76Ul6lmSdX/X4 n0Iu7CKBVsFl0OeqoWdjgEM7dPSu3TpdpXRZb4YPc1t5UszF21n4/KNSH9wY/gC9vZo2 fwhE7+UmagXf5HVctfEsYTFRgyQsnTW3lGSS5HCJpNcJx+sPt/HS+sNWGOOwt1u5RVbq r30g== X-Gm-Message-State: AKaTC02OWZP9hsLxebjU27wqzVdCVLeBbybiAUDHcMdSrUS0DM5r6glRUvdWSE0A0VnKzg== X-Received: by 10.98.25.205 with SMTP id 196mr71470857pfz.46.1481146842695; Wed, 07 Dec 2016 13:40:42 -0800 (PST) Received: from sixbynine.org ([192.55.55.41]) by smtp.gmail.com with ESMTPSA id a7sm44704177pfl.87.2016.12.07.13.40.40 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 07 Dec 2016 13:40:42 -0800 (PST) From: Robert Bragg To: intel-gfx@lists.freedesktop.org Date: Wed, 7 Dec 2016 21:40:33 +0000 Message-Id: <20161207214033.3581-1-robert@sixbynine.org> X-Mailer: git-send-email 2.10.2 In-Reply-To: References: Cc: Daniel Vetter Subject: [Intel-gfx] [PATCH] drm/i915/perf: More documentation hooked to i915.rst X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP This adds a 'Perf' section to i915.rst with the following sub sections: - Overview - Comparison with Core Perf - i915 Driver Entry Points - i915 Perf Stream - i915 Perf Observation Architecture Stream - All i915 Perf Internals v2: section headers in i915.rst (Daniel Vetter) missing symbol docs + other fixups (Matthew Auld) Signed-off-by: Robert Bragg Reviewed-by: Matthew Auld Cc: Daniel Vetter --- Documentation/gpu/i915.rst | 91 +++++++++ drivers/gpu/drm/i915/i915_drv.h | 151 +++++++++++--- drivers/gpu/drm/i915/i915_perf.c | 412 ++++++++++++++++++++++++++++++++++++--- 3 files changed, 598 insertions(+), 56 deletions(-) diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst index 117d2ab..3843ef6 100644 --- a/Documentation/gpu/i915.rst +++ b/Documentation/gpu/i915.rst @@ -356,4 +356,95 @@ switch_mm .. kernel-doc:: drivers/gpu/drm/i915/i915_trace.h :doc: switch_mm tracepoint +Perf +==== + +Overview +-------- +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :doc: i915 Perf Overview + +Comparison with Core Perf +------------------------- +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :doc: i915 Perf History and Comparison with Core Perf + +i915 Driver Entry Points +------------------------ + +This section covers the entrypoints exported outside of i915_perf.c to +integrate with drm/i915 and to handle the `DRM_I915_PERF_OPEN` ioctl. + +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_perf_init +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_perf_fini +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_perf_register +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_perf_unregister +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_perf_open_ioctl +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_perf_release + +i915 Perf Stream +---------------- + +This section covers the stream-semantics-agnostic structures and functions +for representing an i915 perf stream FD and associated file operations. + +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h + :functions: i915_perf_stream +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h + :functions: i915_perf_stream_ops + +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: read_properties_unlocked +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_perf_open_ioctl_locked +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_perf_destroy_locked +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_perf_read +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_perf_ioctl +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_perf_enable_locked +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_perf_disable_locked +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_perf_poll +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_perf_poll_locked + +i915 Perf Observation Architecture Stream +----------------------------------------- + +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h + :functions: i915_oa_ops + +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_oa_stream_init +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_oa_read +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_oa_stream_enable +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_oa_stream_disable +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_oa_wait_unlocked +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :functions: i915_oa_poll_wait + +All i915 Perf Internals +----------------------- + +This section simply includes all currently documented i915 perf internals, in +no particular order, but may include some more minor utilities or platform +specific details than found in the more high-level sections. + +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c + :internal: + .. WARNING: DOCPROC directive not supported: !Cdrivers/gpu/drm/i915/i915_irq.c diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 33758ac..49c7651 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1847,89 +1847,186 @@ struct i915_oa_reg { struct i915_perf_stream; +/** + * struct i915_perf_stream_ops - the OPs to support a specific stream type + */ struct i915_perf_stream_ops { - /* Enables the collection of HW samples, either in response to - * I915_PERF_IOCTL_ENABLE or implicitly called when stream is - * opened without I915_PERF_FLAG_DISABLED. + /** + * @enable: Enables the collection of HW samples, either in response to + * `I915_PERF_IOCTL_ENABLE` or implicitly called when stream is opened + * without `I915_PERF_FLAG_DISABLED`. */ void (*enable)(struct i915_perf_stream *stream); - /* Disables the collection of HW samples, either in response to - * I915_PERF_IOCTL_DISABLE or implicitly called before - * destroying the stream. + /** + * @disable: Disables the collection of HW samples, either in response + * to `I915_PERF_IOCTL_DISABLE` or implicitly called before destroying + * the stream. */ void (*disable)(struct i915_perf_stream *stream); - /* Call poll_wait, passing a wait queue that will be woken + /** + * @poll_wait: Call poll_wait, passing a wait queue that will be woken * once there is something ready to read() for the stream */ void (*poll_wait)(struct i915_perf_stream *stream, struct file *file, poll_table *wait); - /* For handling a blocking read, wait until there is something - * to ready to read() for the stream. E.g. wait on the same + /** + * @wait_unlocked: For handling a blocking read, wait until there is + * something to ready to read() for the stream. E.g. wait on the same * wait queue that would be passed to poll_wait(). */ int (*wait_unlocked)(struct i915_perf_stream *stream); - /* read - Copy buffered metrics as records to userspace - * @buf: the userspace, destination buffer - * @count: the number of bytes to copy, requested by userspace - * @offset: zero at the start of the read, updated as the read - * proceeds, it represents how many bytes have been - * copied so far and the buffer offset for copying the - * next record. + /** + * @read: Copy buffered metrics as records to userspace + * **buf**: the userspace, destination buffer + * **count**: the number of bytes to copy, requested by userspace + * **offset**: zero at the start of the read, updated as the read + * proceeds, it represents how many bytes have been copied so far and + * the buffer offset for copying the next record. * - * Copy as many buffered i915 perf samples and records for - * this stream to userspace as will fit in the given buffer. + * Copy as many buffered i915 perf samples and records for this stream + * to userspace as will fit in the given buffer. * - * Only write complete records; returning -ENOSPC if there - * isn't room for a complete record. + * Only write complete records; returning -%ENOSPC if there isn't room + * for a complete record. * - * Return any error condition that results in a short read - * such as -ENOSPC or -EFAULT, even though these may be - * squashed before returning to userspace. + * Return any error condition that results in a short read such as + * -%ENOSPC or -%EFAULT, even though these may be squashed before + * returning to userspace. */ int (*read)(struct i915_perf_stream *stream, char __user *buf, size_t count, size_t *offset); - /* Cleanup any stream specific resources. + /** + * @destroy: Cleanup any stream specific resources. * * The stream will always be disabled before this is called. */ void (*destroy)(struct i915_perf_stream *stream); }; +/** + * struct i915_perf_stream - state for a single open stream FD + */ struct i915_perf_stream { + /** + * @dev_priv: i915 drm device + */ struct drm_i915_private *dev_priv; + /** + * @link: Links the stream into ``&drm_i915_private->streams`` + */ struct list_head link; + /** + * @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*` + * properties given when opening a stream, representing the contents + * of a single sample as read() by userspace. + */ u32 sample_flags; + + /** + * @sample_size: Considering the configured contents of a sample + * combined with the required header size, this is the total size + * of a single sample record. + */ int sample_size; + /** + * @ctx: %NULL if measuring system-wide across all contexts or a + * specific context that is being monitored. + */ struct i915_gem_context *ctx; + + /** + * @enabled: Whether the stream is currently enabled, considering + * whether the stream was opened in a disabled state and based + * on `I915_PERF_IOCTL_ENABLE` and `I915_PERF_IOCTL_DISABLE` calls. + */ bool enabled; + /** + * @ops: The callbacks providing the implementation of this specific + * type of configured stream. + */ const struct i915_perf_stream_ops *ops; }; +/** + * struct i915_oa_ops - Gen specific implementation of an OA unit stream + */ struct i915_oa_ops { + /** + * @init_oa_buffer: Resets the head and tail pointers of the + * circular buffer for periodic OA reports. + * + * Called when first opening a stream for OA metrics, but also may be + * called in response to an OA buffer overflow or other error + * condition. + * + * Note it may be necessary to clear the full OA buffer here as part of + * maintaining the invariable that new reports must be written to + * zeroed memory for us to be able to reliable detect if an expected + * report has not yet landed in memory. (At least on Haswell the OA + * buffer tail pointer is not synchronized with reports being visible + * to the CPU) + */ void (*init_oa_buffer)(struct drm_i915_private *dev_priv); + + /** + * @enable_metric_set: Applies any MUX configuration to set up the + * Boolean and Custom (B/C) counters that are part of the counter + * reports being sampled. May apply system constraints such as + * disabling EU clock gating as required. + */ int (*enable_metric_set)(struct drm_i915_private *dev_priv); + + /** + * @disable_metric_set: Remove system constraints associated with using + * the OA unit. + */ void (*disable_metric_set)(struct drm_i915_private *dev_priv); + + /** + * @oa_enable: Enable periodic sampling + */ void (*oa_enable)(struct drm_i915_private *dev_priv); + + /** + * @oa_disable: Disable periodic sampling + */ void (*oa_disable)(struct drm_i915_private *dev_priv); - void (*update_oacontrol)(struct drm_i915_private *dev_priv); - void (*update_hw_ctx_id_locked)(struct drm_i915_private *dev_priv, - u32 ctx_id); + + /** + * @read: Copy data from the circular OA buffer into a given userspace + * buffer. + */ int (*read)(struct i915_perf_stream *stream, char __user *buf, size_t count, size_t *offset); + + /** + * @oa_buffer_is_empty: Check if OA buffer empty (false positives OK) + * + * This is either called via fops or the poll check hrtimer (atomic + * ctx) without any locks taken. + * + * It's safe to read OA config state here unlocked, assuming that this + * is only called while the stream is enabled, while the global OA + * configuration can't be modified. + * + * Efficiency is more important than avoiding some false positives + * here, which will be handled gracefully - likely resulting in an + * %EAGAIN error for userspace. + */ bool (*oa_buffer_is_empty)(struct drm_i915_private *dev_priv); }; diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 0c6e124..ae7bd0e 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -26,7 +26,7 @@ /** - * DOC: i915 Perf, streaming API for GPU metrics + * DOC: i915 Perf Overview * * Gen graphics supports a large number of performance counters that can help * driver and application developers understand and optimize their use of the @@ -45,6 +45,10 @@ * privileges by default, unless changed via the dev.i915.perf_event_paranoid * sysctl option. * + */ + +/** + * DOC: i915 Perf History and Comparison with Core Perf * * The interface was initially inspired by the core Perf infrastructure but * some notable differences are: @@ -75,8 +79,8 @@ * gets copied from the GPU mapped buffers to userspace buffers. * * - * Some notes regarding Linux Perf: - * -------------------------------- + * Issues hit with first prototype based on Core Perf + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ * * The first prototype of this driver was based on the core perf * infrastructure, and while we did make that mostly work, with some changes to @@ -135,7 +139,7 @@ * for combining with the side-band raw reports it captures using * MI_REPORT_PERF_COUNT commands. * - * _ As a side note on perf's grouping feature; there was also some concern + * - As a side note on perf's grouping feature; there was also some concern * that using PERF_FORMAT_GROUP as a way to pack together counter values * would quite drastically inflate our sample sizes, which would likely * lower the effective sampling resolutions we could use when the available @@ -277,6 +281,20 @@ static struct i915_oa_format hsw_oa_formats[I915_OA_FORMAT_MAX] = { #define SAMPLE_OA_REPORT (1<<0) +/** + * struct perf_open_properties - for validated properties given to open a stream + * @sample_flags: `DRM_I915_PERF_PROP_SAMPLE_*` properties are tracked as flags + * @single_context: Whether a single or all gpu contexts should be monitored + * @ctx_handle: A gem ctx handle for use with @single_context + * @metrics_set: An ID for an OA unit metric set advertised via sysfs + * @oa_format: An OA unit HW report format + * @oa_periodic: Whether to enable periodic OA unit sampling + * @oa_period_exponent: The OA unit sampling period is derived from this + * + * As read_properties_unlocked() enumerates and validates the properties given + * to open a stream of metrics the configuration is built up in the structure + * which starts out zero initialized. + */ struct perf_open_properties { u32 sample_flags; @@ -314,7 +332,19 @@ static bool gen7_oa_buffer_is_empty_fop_unlocked(struct drm_i915_private *dev_pr } /** - * Appends a status record to a userspace read() buffer. + * append_oa_status - Appends a status record to a userspace read() buffer. + * @stream: An i915-perf stream opened for OA metrics + * @buf: destination buffer given by userspace + * @count: the number of bytes userspace wants to read + * @offset: (inout): the current position for writing into @buf + * @type: The kind of status to report to userspace + * + * Writes a status record (such as `DRM_I915_PERF_RECORD_OA_REPORT_LOST`) + * into the userspace read() buffer. + * + * The @buf @offset will only be updated on success. + * + * Returns: 0 on success, negative error code on failure. */ static int append_oa_status(struct i915_perf_stream *stream, char __user *buf, @@ -336,7 +366,21 @@ static int append_oa_status(struct i915_perf_stream *stream, } /** - * Copies single OA report into userspace read() buffer. + * append_oa_sample - Copies single OA report into userspace read() buffer. + * @stream: An i915-perf stream opened for OA metrics + * @buf: destination buffer given by userspace + * @count: the number of bytes userspace wants to read + * @offset: (inout): the current position for writing into @buf + * @report: A single OA report to (optionally) include as part of the sample + * + * The contents of a sample are configured through `DRM_I915_PERF_PROP_SAMPLE_*` + * properties when opening a stream, tracked as `stream->sample_flags`. This + * function copies the requested components of a single sample to the given + * read() @buf. + * + * The @buf @offset will only be updated on success. + * + * Returns: 0 on success, negative error code on failure. */ static int append_oa_sample(struct i915_perf_stream *stream, char __user *buf, @@ -380,10 +424,8 @@ static int append_oa_sample(struct i915_perf_stream *stream, * @head_ptr: (inout): the current oa buffer cpu read position * @tail: the current oa buffer gpu write position * - * Returns 0 on success, negative error code on failure. - * - * Notably any error condition resulting in a short read (-ENOSPC or - * -EFAULT) will be returned even though one or more records may + * Notably any error condition resulting in a short read (-%ENOSPC or + * -%EFAULT) will be returned even though one or more records may * have been successfully copied. In this case it's up to the caller * to decide if the error should be squashed before returning to * userspace. @@ -392,6 +434,8 @@ static int append_oa_sample(struct i915_perf_stream *stream, * tail, so the head chases the tail?... If you think that's mad * and back-to-front you're not alone, but this follows the * Gen PRM naming convention. + * + * Returns: 0 on success, negative error code on failure. */ static int gen7_append_oa_reports(struct i915_perf_stream *stream, char __user *buf, @@ -496,6 +540,22 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream, return ret; } +/** + * gen7_oa_read - copy status records then buffered OA reports + * @stream: An i915-perf stream opened for OA metrics + * @buf: destination buffer given by userspace + * @count: the number of bytes userspace wants to read + * @offset: (inout): the current position for writing into @buf + * + * Checks Gen 7 specific OA unit status registers and if necessary appends + * corresponding status records for userspace (such as for a buffer full + * condition) and then initiate appending any buffered OA reports. + * + * Updates @offset according to the number of bytes successfully copied into + * the userspace buffer. + * + * Returns: zero on success or a negative error code + */ static int gen7_oa_read(struct i915_perf_stream *stream, char __user *buf, size_t count, @@ -597,6 +657,20 @@ static int gen7_oa_read(struct i915_perf_stream *stream, return ret; } +/** + * i915_oa_wait_unlocked - handles blocking IO until OA data available + * @stream: An i915-perf stream opened for OA metrics + * + * Called when userspace tries to read() from a blocking stream FD opened + * for OA metrics. It waits until the hrtimer callback finds a non-empty + * OA buffer and wakes us. + * + * Note: it's acceptable to have this return with some false positives + * since any subsequent read handling will return -EAGAIN if there isn't + * really data ready for userspace yet. + * + * Returns: zero on success or a negative error code + */ static int i915_oa_wait_unlocked(struct i915_perf_stream *stream) { struct drm_i915_private *dev_priv = stream->dev_priv; @@ -615,6 +689,16 @@ static int i915_oa_wait_unlocked(struct i915_perf_stream *stream) !dev_priv->perf.oa.ops.oa_buffer_is_empty(dev_priv)); } +/** + * i915_oa_poll_wait - call poll_wait() for an OA stream poll() + * @stream: An i915-perf stream opened for OA metrics + * @file: An i915 perf stream file + * @wait: poll() state table + * + * For handling userspace polling on an i915 perf stream opened for OA metrics, + * this starts a poll_wait with the wait queue that our hrtimer callback wakes + * when it sees data ready to read in the circular OA buffer. + */ static void i915_oa_poll_wait(struct i915_perf_stream *stream, struct file *file, poll_table *wait) @@ -624,6 +708,18 @@ static void i915_oa_poll_wait(struct i915_perf_stream *stream, poll_wait(file, &dev_priv->perf.oa.poll_wq, wait); } +/** + * i915_oa_read - just calls through to &i915_oa_ops->read + * @stream: An i915-perf stream opened for OA metrics + * @buf: destination buffer given by userspace + * @count: the number of bytes userspace wants to read + * @offset: (inout): the current position for writing into @buf + * + * Updates @offset according to the number of bytes successfully copied into + * the userspace buffer. + * + * Returns: zero on success or a negative error code + */ static int i915_oa_read(struct i915_perf_stream *stream, char __user *buf, size_t count, @@ -634,9 +730,15 @@ static int i915_oa_read(struct i915_perf_stream *stream, return dev_priv->perf.oa.ops.read(stream, buf, count, offset); } -/* Determine the render context hw id, and ensure it remains fixed for the +/** + * oa_get_render_ctx_id - determine and hold ctx hw id + * @stream: An i915-perf stream opened for OA metrics + * + * Determine the render context hw id, and ensure it remains fixed for the * lifetime of the stream. This ensures that we don't have to worry about * updating the context ID in OACONTROL on the fly. + * + * Returns: zero on success or a negative error code */ static int oa_get_render_ctx_id(struct i915_perf_stream *stream) { @@ -673,6 +775,13 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream) return ret; } +/** + * oa_put_render_ctx_id - counterpart to oa_get_render_ctx_id releases hold + * @stream: An i915-perf stream opened for OA metrics + * + * In case anything needed doing to ensure the context HW ID would remain valid + * for the lifetime of the stream, then that can be undone here. + */ static void oa_put_render_ctx_id(struct i915_perf_stream *stream) { struct drm_i915_private *dev_priv = stream->dev_priv; @@ -945,6 +1054,15 @@ static void gen7_oa_enable(struct drm_i915_private *dev_priv) spin_unlock_irqrestore(&dev_priv->perf.hook_lock, flags); } +/** + * i915_oa_stream_enable - handle `I915_PERF_IOCTL_ENABLE` for OA stream + * @stream: An i915 perf stream opened for OA metrics + * + * [Re]enables hardware periodic sampling according to the period configured + * when opening the stream. This also starts a hrtimer that will periodically + * check for data in the circular OA buffer for notifying userspace (e.g. + * during a read() or poll()). + */ static void i915_oa_stream_enable(struct i915_perf_stream *stream) { struct drm_i915_private *dev_priv = stream->dev_priv; @@ -962,6 +1080,14 @@ static void gen7_oa_disable(struct drm_i915_private *dev_priv) I915_WRITE(GEN7_OACONTROL, 0); } +/** + * i915_oa_stream_disable - handle `I915_PERF_IOCTL_DISABLE` for OA stream + * @stream: An i915 perf stream opened for OA metrics + * + * Stops the OA unit from periodically writing counter reports into the + * circular OA buffer. This also stops the hrtimer that periodically checks for + * data in the circular OA buffer, for notifying userspace. + */ static void i915_oa_stream_disable(struct i915_perf_stream *stream) { struct drm_i915_private *dev_priv = stream->dev_priv; @@ -987,6 +1113,24 @@ static const struct i915_perf_stream_ops i915_oa_stream_ops = { .read = i915_oa_read, }; +/** + * i915_oa_stream_init - validate combined props for OA stream and init + * @stream: An i915 perf stream + * @param: The open parameters passed to `DRM_I915_PERF_OPEN` + * @props: The property state that configures stream (individually validated) + * + * While read_properties_unlocked() validates properties in isolation it + * doesn't ensure that the combination necessarily makes sense. + * + * At this point it has been determined that userspace wants a stream of + * OA metrics, but still we need to further validate the combined + * properties are OK. + * + * If the configuration makes sense then we can allocate memory for + * a circular OA buffer and apply the requested metric set configuration. + * + * Returns: zero on success or a negative error code. + */ static int i915_oa_stream_init(struct i915_perf_stream *stream, struct drm_i915_perf_open_param *param, struct perf_open_properties *props) @@ -1111,6 +1255,31 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, return ret; } +/** + * i915_perf_read_locked - &i915_perf_stream_ops->read with error normalisation + * @stream: An i915 perf stream + * @file: An i915 perf stream file + * @buf: destination buffer given by userspace + * @count: the number of bytes userspace wants to read + * @ppos: (inout) file seek position (unused) + * + * Besides wrapping &i915_perf_stream_ops->read this provides a common place to + * ensure that if we've successfully copied any data then reporting that takes + * precedence over any internal error status, so the data isn't lost. + * + * For example ret will be -ENOSPC whenever there is more buffered data than + * can be copied to userspace, but that's only interesting if we weren't able + * to copy some data because it implies the userspace buffer is too small to + * receive a single record (and we never split records). + * + * Another case with ret == -EFAULT is more of a grey area since it would seem + * like bad form for userspace to ask us to overrun its buffer, but the user + * knows best: + * + * http://yarchive.net/comp/linux/partial_reads_writes.html + * + * Returns: The number of bytes copied or a negative error code on failure. + */ static ssize_t i915_perf_read_locked(struct i915_perf_stream *stream, struct file *file, char __user *buf, @@ -1126,25 +1295,27 @@ static ssize_t i915_perf_read_locked(struct i915_perf_stream *stream, size_t offset = 0; int ret = stream->ops->read(stream, buf, count, &offset); - /* If we've successfully copied any data then reporting that - * takes precedence over any internal error status, so the - * data isn't lost. - * - * For example ret will be -ENOSPC whenever there is more - * buffered data than can be copied to userspace, but that's - * only interesting if we weren't able to copy some data - * because it implies the userspace buffer is too small to - * receive a single record (and we never split records). - * - * Another case with ret == -EFAULT is more of a grey area - * since it would seem like bad form for userspace to ask us - * to overrun its buffer, but the user knows best: - * - * http://yarchive.net/comp/linux/partial_reads_writes.html - */ return offset ?: (ret ?: -EAGAIN); } +/** + * i915_perf_read - handles read() FOP for i915 perf stream FDs + * @file: An i915 perf stream file + * @buf: destination buffer given by userspace + * @count: the number of bytes userspace wants to read + * @ppos: (inout) file seek position (unused) + * + * The entry point for handling a read() on a stream file descriptor from + * userspace. Most of the work is left to the i915_perf_read_locked() and + * &i915_perf_stream_ops->read but to save having stream implementations (of + * which we might have multiple later) we handle blocking read here. + * + * We can also consistently treat trying to read from a disabled stream + * as an IO error so implementations can assume the stream is enabled + * while reading. + * + * Returns: The number of bytes copied or a negative error code on failure. + */ static ssize_t i915_perf_read(struct file *file, char __user *buf, size_t count, @@ -1211,6 +1382,22 @@ static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer) return HRTIMER_RESTART; } +/** + * i915_perf_poll_locked - poll_wait() with a suitable wait queue for stream + * @dev_priv: i915 device instance + * @stream: An i915 perf stream + * @file: An i915 perf stream file + * @wait: poll() state table + * + * For handling userspace polling on an i915 perf stream, this calls through to + * &i915_perf_stream_ops->poll_wait to call poll_wait() with a wait queue that + * will be woken for new stream data. + * + * Note: The &drm_i915_private->perf.lock mutex has been taken to serialize + * with any non-file-operation driver hooks. + * + * Returns: any poll events that are ready without sleeping + */ static unsigned int i915_perf_poll_locked(struct drm_i915_private *dev_priv, struct i915_perf_stream *stream, struct file *file, @@ -1232,6 +1419,19 @@ static unsigned int i915_perf_poll_locked(struct drm_i915_private *dev_priv, return events; } +/** + * i915_perf_poll - call poll_wait() with a suitable wait queue for stream + * @file: An i915 perf stream file + * @wait: poll() state table + * + * For handling userspace polling on an i915 perf stream, this ensures + * poll_wait() gets called with a wait queue that will be woken for new stream + * data. + * + * Note: Implementation deferred to i915_perf_poll_locked() + * + * Returns: any poll events that are ready without sleeping + */ static unsigned int i915_perf_poll(struct file *file, poll_table *wait) { struct i915_perf_stream *stream = file->private_data; @@ -1245,6 +1445,16 @@ static unsigned int i915_perf_poll(struct file *file, poll_table *wait) return ret; } +/** + * i915_perf_enable_locked - handle `I915_PERF_IOCTL_ENABLE` ioctl + * @stream: A disabled i915 perf stream + * + * [Re]enables the associated capture of data for this stream. + * + * If a stream was previously enabled then there's currently no intention + * to provide userspace any guarantee about the preservation of previously + * buffered data. + */ static void i915_perf_enable_locked(struct i915_perf_stream *stream) { if (stream->enabled) @@ -1257,6 +1467,20 @@ static void i915_perf_enable_locked(struct i915_perf_stream *stream) stream->ops->enable(stream); } +/** + * i915_perf_disable_locked - handle `I915_PERF_IOCTL_DISABLE` ioctl + * @stream: An enabled i915 perf stream + * + * Disables the associated capture of data for this stream. + * + * The intention is that disabling an re-enabling a stream will ideally be + * cheaper than destroying and re-opening a stream with the same configuration, + * though there are no formal guarantees about what state or buffered data + * must be retained between disabling and re-enabling a stream. + * + * Note: while a stream is disabled it's considered an error for userspace + * to attempt to read from the stream (-EIO). + */ static void i915_perf_disable_locked(struct i915_perf_stream *stream) { if (!stream->enabled) @@ -1269,6 +1493,18 @@ static void i915_perf_disable_locked(struct i915_perf_stream *stream) stream->ops->disable(stream); } +/** + * i915_perf_ioctl - support ioctl() usage with i915 perf stream FDs + * @stream: An i915 perf stream + * @cmd: the ioctl request + * @arg: the ioctl data + * + * Note: The &drm_i915_private->perf.lock mutex has been taken to serialize + * with any non-file-operation driver hooks. + * + * Returns: zero on success or a negative error code. Returns -EINVAL for + * an unknown ioctl request. + */ static long i915_perf_ioctl_locked(struct i915_perf_stream *stream, unsigned int cmd, unsigned long arg) @@ -1285,6 +1521,17 @@ static long i915_perf_ioctl_locked(struct i915_perf_stream *stream, return -EINVAL; } +/** + * i915_perf_ioctl - support ioctl() usage with i915 perf stream FDs + * @file: An i915 perf stream file + * @cmd: the ioctl request + * @arg: the ioctl data + * + * Implementation deferred to i915_perf_ioctl_locked(). + * + * Returns: zero on success or a negative error code. Returns -EINVAL for + * an unknown ioctl request. + */ static long i915_perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg) @@ -1300,6 +1547,16 @@ static long i915_perf_ioctl(struct file *file, return ret; } +/** + * i915_perf_destroy_locked - destroy an i915 perf stream + * @stream: An i915 perf stream + * + * Frees all resources associated with the given i915 perf @stream, disabling + * any associated data capture in the process. + * + * Note: The &drm_i915_private->perf.lock mutex has been taken to serialize + * with any non-file-operation driver hooks. + */ static void i915_perf_destroy_locked(struct i915_perf_stream *stream) { struct drm_i915_private *dev_priv = stream->dev_priv; @@ -1321,6 +1578,17 @@ static void i915_perf_destroy_locked(struct i915_perf_stream *stream) kfree(stream); } +/** + * i915_perf_release - handles userspace close() of a stream file + * @inode: anonymous inode associated with file + * @file: An i915 perf stream file + * + * Cleans up any resources associated with an open i915 perf stream file. + * + * NB: close() can't really fail from the userspace point of view. + * + * Returns: zero on success or a negative error code. + */ static int i915_perf_release(struct inode *inode, struct file *file) { struct i915_perf_stream *stream = file->private_data; @@ -1365,6 +1633,30 @@ lookup_context(struct drm_i915_private *dev_priv, return ctx; } +/** + * i915_perf_open_ioctl_locked - DRM ioctl() for userspace to open a stream FD + * @dev_priv: i915 device instance + * @param: The open parameters passed to 'DRM_I915_PERF_OPEN` + * @props: individually validated u64 property value pairs + * @file: drm file + * + * See i915_perf_ioctl_open() for interface details. + * + * Implements further stream config validation and stream initialization on + * behalf of i915_perf_open_ioctl() with the &drm_i915_private->perf.lock mutex + * taken to serialize with any non-file-operation driver hooks. + * + * Note: at this point the @props have only been validated in isolation and + * it's still necessary to validate that the combination of properties makes + * sense. + * + * In the case where userspace is interested in OA unit metrics then further + * config validation and stream initialization details will be handled by + * i915_oa_stream_init(). The code here should only validate config state that + * will be relevant to all stream types / backends. + * + * Returns: zero on success or a negative error code. + */ static int i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv, struct drm_i915_perf_open_param *param, @@ -1459,12 +1751,20 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv, return ret; } -/* Note we copy the properties from userspace outside of the i915 perf - * mutex to avoid an awkward lockdep with mmap_sem. +/** + * read_properties_unlocked - validate + copy userspace stream open properties + * @dev_priv: i915 device instance + * @uprops: The array of u64 key value pairs given by userspace + * @n_props: The number of key value pairs expected in @uprops + * @props: The stream configuration built up while validating properties * * Note this function only validates properties in isolation it doesn't * validate that the combination of properties makes sense or that all * properties necessary for a particular kind of stream have been set. + * + * Note that there currently aren't any ordering requirements for properties so + * we shouldn't validate or assume anything about ordering here. This doesn't + * rule out defining new properties with ordering requirements in the future. */ static int read_properties_unlocked(struct drm_i915_private *dev_priv, u64 __user *uprops, @@ -1586,6 +1886,30 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv, return 0; } +/** + * i915_perf_open_ioctl - DRM ioctl() for userspace to open a stream FD + * @dev: drm device + * @data: ioctl data copied from userspace (unvalidated) + * @file: drm file + * + * Validates the stream open parameters given by userspace including flags + * and an array of u64 key, value pair properties. + * + * Very little is assumed up front about the nature of the stream being + * opened (for instance we don't assume it's for periodic OA unit metrics). An + * i915-perf stream is expected to be a suitable interface for other forms of + * buffered data written by the GPU besides periodic OA metrics. + * + * Note we copy the properties from userspace outside of the i915 perf + * mutex to avoid an awkward lockdep with mmap_sem. + * + * Most of the implementation details are handled by + * i915_perf_open_ioctl_locked() after taking the &drm_i915_private->perf.lock + * mutex for serializing with any non-file-operation driver hooks. + * + * Return: A newly opened i915 Perf stream file descriptor or negative + * error code on failure. + */ int i915_perf_open_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { @@ -1622,6 +1946,14 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data, return ret; } +/** + * i915_perf_register - exposes i915-perf to userspace + * @dev_priv: i915 device instance + * + * In particular OA metric sets are advertised under a sysfs metrics/ + * directory allowing userspace to enumerate valid IDs that can be + * used to open an i915-perf stream. + */ void i915_perf_register(struct drm_i915_private *dev_priv) { if (!IS_HASWELL(dev_priv)) @@ -1651,6 +1983,15 @@ void i915_perf_register(struct drm_i915_private *dev_priv) mutex_unlock(&dev_priv->perf.lock); } +/** + * i915_perf_unregister - hide i915-perf from userspace + * @dev_priv: i915 device instance + * + * i915-perf state cleanup is split up into an 'unregister' and + * 'deinit' phase where the interface is first hidden from + * userspace by i915_perf_unregister() before cleaning up + * remaining state in i915_perf_fini(). + */ void i915_perf_unregister(struct drm_i915_private *dev_priv) { if (!IS_HASWELL(dev_priv)) @@ -1707,6 +2048,15 @@ static struct ctl_table dev_root[] = { {} }; +/** + * i915_perf_init - initialize i915-perf state on module load + * @dev_priv: i915 device instance + * + * Initializes i915-perf state without exposing anything to userspace. + * + * Note: i915-perf initialization is split into an 'init' and 'register' + * phase with the i915_perf_register() exposing state to userspace. + */ void i915_perf_init(struct drm_i915_private *dev_priv) { if (!IS_HASWELL(dev_priv)) @@ -1742,6 +2092,10 @@ void i915_perf_init(struct drm_i915_private *dev_priv) dev_priv->perf.initialized = true; } +/** + * i915_perf_fini - Counter part to i915_perf_init() + * @dev_priv: i915 device instance + */ void i915_perf_fini(struct drm_i915_private *dev_priv) { if (!dev_priv->perf.initialized)