From patchwork Wed Oct 3 12:07:13 2018
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 10624733
From: Tvrtko Ursulin
To: igt-dev@lists.freedesktop.org
Date: Wed, 3 Oct 2018 13:07:13 +0100
Message-Id: <20181003120718.6898-2-tvrtko.ursulin@linux.intel.com>
In-Reply-To: <20181003120718.6898-1-tvrtko.ursulin@linux.intel.com>
References: <20181003120718.6898-1-tvrtko.ursulin@linux.intel.com>
Subject: [Intel-gfx] [RFC i-g-t 1/6] include: DRM uAPI headers update
List-Id: Intel graphics driver community
testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Intel-gfx@lists.freedesktop.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin Signed-off-by: Tvrtko Ursulin --- include/drm-uapi/amdgpu_drm.h | 52 +++- include/drm-uapi/drm.h | 16 ++ include/drm-uapi/drm_fourcc.h | 215 ++++++++++++++ include/drm-uapi/drm_mode.h | 26 +- include/drm-uapi/etnaviv_drm.h | 6 + include/drm-uapi/exynos_drm.h | 240 ++++++++++++++++ include/drm-uapi/i915_drm.h | 49 +++- include/drm-uapi/msm_drm.h | 2 + include/drm-uapi/tegra_drm.h | 492 ++++++++++++++++++++++++++++++++- include/drm-uapi/v3d_drm.h | 194 +++++++++++++ include/drm-uapi/vc4_drm.h | 13 +- include/drm-uapi/virtgpu_drm.h | 1 + include/drm-uapi/vmwgfx_drm.h | 166 ++++++++--- 13 files changed, 1415 insertions(+), 57 deletions(-) create mode 100644 include/drm-uapi/v3d_drm.h diff --git a/include/drm-uapi/amdgpu_drm.h b/include/drm-uapi/amdgpu_drm.h index 1816bd8200d1..370e9a5536ef 100644 --- a/include/drm-uapi/amdgpu_drm.h +++ b/include/drm-uapi/amdgpu_drm.h @@ -72,12 +72,41 @@ extern "C" { #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle) #define DRM_IOCTL_AMDGPU_SCHED DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched) +/** + * DOC: memory domains + * + * %AMDGPU_GEM_DOMAIN_CPU System memory that is not GPU accessible. + * Memory in this pool could be swapped out to disk if there is pressure. + * + * %AMDGPU_GEM_DOMAIN_GTT GPU accessible system memory, mapped into the + * GPU's virtual address space via gart. Gart memory linearizes non-contiguous + * pages of system memory, allows GPU access system memory in a linezrized + * fashion. + * + * %AMDGPU_GEM_DOMAIN_VRAM Local video memory. For APUs, it is memory + * carved out by the BIOS. + * + * %AMDGPU_GEM_DOMAIN_GDS Global on-chip data storage used to share data + * across shader threads. + * + * %AMDGPU_GEM_DOMAIN_GWS Global wave sync, used to synchronize the + * execution of all the waves on a device. + * + * %AMDGPU_GEM_DOMAIN_OA Ordered append, used by 3D or Compute engines + * for appending data. + */ #define AMDGPU_GEM_DOMAIN_CPU 0x1 #define AMDGPU_GEM_DOMAIN_GTT 0x2 #define AMDGPU_GEM_DOMAIN_VRAM 0x4 #define AMDGPU_GEM_DOMAIN_GDS 0x8 #define AMDGPU_GEM_DOMAIN_GWS 0x10 #define AMDGPU_GEM_DOMAIN_OA 0x20 +#define AMDGPU_GEM_DOMAIN_MASK (AMDGPU_GEM_DOMAIN_CPU | \ + AMDGPU_GEM_DOMAIN_GTT | \ + AMDGPU_GEM_DOMAIN_VRAM | \ + AMDGPU_GEM_DOMAIN_GDS | \ + AMDGPU_GEM_DOMAIN_GWS | \ + AMDGPU_GEM_DOMAIN_OA) /* Flag that CPU access will be required for the case of VRAM domain */ #define AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED (1 << 0) @@ -95,6 +124,10 @@ extern "C" { #define AMDGPU_GEM_CREATE_VM_ALWAYS_VALID (1 << 6) /* Flag that BO sharing will be explicitly synchronized */ #define AMDGPU_GEM_CREATE_EXPLICIT_SYNC (1 << 7) +/* Flag that indicates allocating MQD gart on GFX9, where the mtype + * for the second page onward should be set to NC. 
+ */ +#define AMDGPU_GEM_CREATE_MQD_GFX9 (1 << 8) struct drm_amdgpu_gem_create_in { /** the requested memory size */ @@ -473,7 +506,8 @@ struct drm_amdgpu_gem_va { #define AMDGPU_HW_IP_UVD_ENC 5 #define AMDGPU_HW_IP_VCN_DEC 6 #define AMDGPU_HW_IP_VCN_ENC 7 -#define AMDGPU_HW_IP_NUM 8 +#define AMDGPU_HW_IP_VCN_JPEG 8 +#define AMDGPU_HW_IP_NUM 9 #define AMDGPU_HW_IP_INSTANCE_MAX_COUNT 1 @@ -482,6 +516,7 @@ struct drm_amdgpu_gem_va { #define AMDGPU_CHUNK_ID_DEPENDENCIES 0x03 #define AMDGPU_CHUNK_ID_SYNCOBJ_IN 0x04 #define AMDGPU_CHUNK_ID_SYNCOBJ_OUT 0x05 +#define AMDGPU_CHUNK_ID_BO_HANDLES 0x06 struct drm_amdgpu_cs_chunk { __u32 chunk_id; @@ -520,6 +555,10 @@ union drm_amdgpu_cs { /* Preempt flag, IB should set Pre_enb bit if PREEMPT flag detected */ #define AMDGPU_IB_FLAG_PREEMPT (1<<2) +/* The IB fence should do the L2 writeback but not invalidate any shader + * caches (L2/vL1/sL1/I$). */ +#define AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE (1 << 3) + struct drm_amdgpu_cs_chunk_ib { __u32 _pad; /** AMDGPU_IB_FLAG_* */ @@ -618,6 +657,16 @@ struct drm_amdgpu_cs_chunk_data { #define AMDGPU_INFO_FW_SOS 0x0c /* Subquery id: Query PSP ASD firmware version */ #define AMDGPU_INFO_FW_ASD 0x0d + /* Subquery id: Query VCN firmware version */ + #define AMDGPU_INFO_FW_VCN 0x0e + /* Subquery id: Query GFX RLC SRLC firmware version */ + #define AMDGPU_INFO_FW_GFX_RLC_RESTORE_LIST_CNTL 0x0f + /* Subquery id: Query GFX RLC SRLG firmware version */ + #define AMDGPU_INFO_FW_GFX_RLC_RESTORE_LIST_GPM_MEM 0x10 + /* Subquery id: Query GFX RLC SRLS firmware version */ + #define AMDGPU_INFO_FW_GFX_RLC_RESTORE_LIST_SRM_MEM 0x11 + /* Subquery id: Query DMCU firmware version */ + #define AMDGPU_INFO_FW_DMCU 0x12 /* number of bytes moved for TTM migration */ #define AMDGPU_INFO_NUM_BYTES_MOVED 0x0f /* the used VRAM size */ @@ -806,6 +855,7 @@ struct drm_amdgpu_info_firmware { #define AMDGPU_VRAM_TYPE_GDDR5 5 #define AMDGPU_VRAM_TYPE_HBM 6 #define AMDGPU_VRAM_TYPE_DDR3 7 +#define AMDGPU_VRAM_TYPE_DDR4 8 struct drm_amdgpu_info_device { /** PCI Device ID */ diff --git a/include/drm-uapi/drm.h b/include/drm-uapi/drm.h index f0bd91de0cf9..85c685a2075e 100644 --- a/include/drm-uapi/drm.h +++ b/include/drm-uapi/drm.h @@ -674,6 +674,22 @@ struct drm_get_cap { */ #define DRM_CLIENT_CAP_ATOMIC 3 +/** + * DRM_CLIENT_CAP_ASPECT_RATIO + * + * If set to 1, the DRM core will provide aspect ratio information in modes. + */ +#define DRM_CLIENT_CAP_ASPECT_RATIO 4 + +/** + * DRM_CLIENT_CAP_WRITEBACK_CONNECTORS + * + * If set to 1, the DRM core will expose special connectors to be used for + * writing back to memory the scene setup in the commit. Depends on client + * also supporting DRM_CLIENT_CAP_ATOMIC + */ +#define DRM_CLIENT_CAP_WRITEBACK_CONNECTORS 5 + /** DRM_IOCTL_SET_CLIENT_CAP ioctl argument type */ struct drm_set_client_cap { __u64 capability; diff --git a/include/drm-uapi/drm_fourcc.h b/include/drm-uapi/drm_fourcc.h index e04613d30a13..139632b87181 100644 --- a/include/drm-uapi/drm_fourcc.h +++ b/include/drm-uapi/drm_fourcc.h @@ -30,11 +30,50 @@ extern "C" { #endif +/** + * DOC: overview + * + * In the DRM subsystem, framebuffer pixel formats are described using the + * fourcc codes defined in `include/uapi/drm/drm_fourcc.h`. In addition to the + * fourcc code, a Format Modifier may optionally be provided, in order to + * further describe the buffer's format - for example tiling or compression. 
+ * + * Format Modifiers + * ---------------- + * + * Format modifiers are used in conjunction with a fourcc code, forming a + * unique fourcc:modifier pair. This format:modifier pair must fully define the + * format and data layout of the buffer, and should be the only way to describe + * that particular buffer. + * + * Having multiple fourcc:modifier pairs which describe the same layout should + * be avoided, as such aliases run the risk of different drivers exposing + * different names for the same data format, forcing userspace to understand + * that they are aliases. + * + * Format modifiers may change any property of the buffer, including the number + * of planes and/or the required allocation size. Format modifiers are + * vendor-namespaced, and as such the relationship between a fourcc code and a + * modifier is specific to the modifer being used. For example, some modifiers + * may preserve meaning - such as number of planes - from the fourcc code, + * whereas others may not. + * + * Vendors should document their modifier usage in as much detail as + * possible, to ensure maximum compatibility across devices, drivers and + * applications. + * + * The authoritative list of format modifier codes is found in + * `include/uapi/drm/drm_fourcc.h` + */ + #define fourcc_code(a, b, c, d) ((__u32)(a) | ((__u32)(b) << 8) | \ ((__u32)(c) << 16) | ((__u32)(d) << 24)) #define DRM_FORMAT_BIG_ENDIAN (1<<31) /* format is big endian instead of little endian */ +/* Reserve 0 for the invalid format specifier */ +#define DRM_FORMAT_INVALID 0 + /* color index */ #define DRM_FORMAT_C8 fourcc_code('C', '8', ' ', ' ') /* [7:0] C */ @@ -183,6 +222,7 @@ extern "C" { #define DRM_FORMAT_MOD_VENDOR_QCOM 0x05 #define DRM_FORMAT_MOD_VENDOR_VIVANTE 0x06 #define DRM_FORMAT_MOD_VENDOR_BROADCOM 0x07 +#define DRM_FORMAT_MOD_VENDOR_ARM 0x08 /* add more to the end as needed */ #define DRM_FORMAT_RESERVED ((1ULL << 56) - 1) @@ -298,6 +338,19 @@ extern "C" { */ #define DRM_FORMAT_MOD_SAMSUNG_64_32_TILE fourcc_mod_code(SAMSUNG, 1) +/* + * Qualcomm Compressed Format + * + * Refers to a compressed variant of the base format that is compressed. + * Implementation may be platform and base-format specific. + * + * Each macrotile consists of m x n (mostly 4 x 4) tiles. + * Pixel data pitch/stride is aligned with macrotile width. + * Pixel data height is aligned with macrotile height. + * Entire pixel data buffer is aligned with 4k(bytes). + */ +#define DRM_FORMAT_MOD_QCOM_COMPRESSED fourcc_mod_code(QCOM, 1) + /* Vivante framebuffer modifiers */ /* @@ -384,6 +437,23 @@ extern "C" { #define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_THIRTYTWO_GOB \ fourcc_mod_code(NVIDIA, 0x15) +/* + * Some Broadcom modifiers take parameters, for example the number of + * vertical lines in the image. Reserve the lower 32 bits for modifier + * type, and the next 24 bits for parameters. Top 8 bits are the + * vendor code. 
+ */ +#define __fourcc_mod_broadcom_param_shift 8 +#define __fourcc_mod_broadcom_param_bits 48 +#define fourcc_mod_broadcom_code(val, params) \ + fourcc_mod_code(BROADCOM, ((((__u64)params) << __fourcc_mod_broadcom_param_shift) | val)) +#define fourcc_mod_broadcom_param(m) \ + ((int)(((m) >> __fourcc_mod_broadcom_param_shift) & \ + ((1ULL << __fourcc_mod_broadcom_param_bits) - 1))) +#define fourcc_mod_broadcom_mod(m) \ + ((m) & ~(((1ULL << __fourcc_mod_broadcom_param_bits) - 1) << \ + __fourcc_mod_broadcom_param_shift)) + /* * Broadcom VC4 "T" format * @@ -405,6 +475,151 @@ extern "C" { */ #define DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED fourcc_mod_code(BROADCOM, 1) +/* + * Broadcom SAND format + * + * This is the native format that the H.264 codec block uses. For VC4 + * HVS, it is only valid for H.264 (NV12/21) and RGBA modes. + * + * The image can be considered to be split into columns, and the + * columns are placed consecutively into memory. The width of those + * columns can be either 32, 64, 128, or 256 pixels, but in practice + * only 128 pixel columns are used. + * + * The pitch between the start of each column is set to optimally + * switch between SDRAM banks. This is passed as the number of lines + * of column width in the modifier (we can't use the stride value due + * to various core checks that look at it , so you should set the + * stride to width*cpp). + * + * Note that the column height for this format modifier is the same + * for all of the planes, assuming that each column contains both Y + * and UV. Some SAND-using hardware stores UV in a separate tiled + * image from Y to reduce the column height, which is not supported + * with these modifiers. + */ + +#define DRM_FORMAT_MOD_BROADCOM_SAND32_COL_HEIGHT(v) \ + fourcc_mod_broadcom_code(2, v) +#define DRM_FORMAT_MOD_BROADCOM_SAND64_COL_HEIGHT(v) \ + fourcc_mod_broadcom_code(3, v) +#define DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT(v) \ + fourcc_mod_broadcom_code(4, v) +#define DRM_FORMAT_MOD_BROADCOM_SAND256_COL_HEIGHT(v) \ + fourcc_mod_broadcom_code(5, v) + +#define DRM_FORMAT_MOD_BROADCOM_SAND32 \ + DRM_FORMAT_MOD_BROADCOM_SAND32_COL_HEIGHT(0) +#define DRM_FORMAT_MOD_BROADCOM_SAND64 \ + DRM_FORMAT_MOD_BROADCOM_SAND64_COL_HEIGHT(0) +#define DRM_FORMAT_MOD_BROADCOM_SAND128 \ + DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT(0) +#define DRM_FORMAT_MOD_BROADCOM_SAND256 \ + DRM_FORMAT_MOD_BROADCOM_SAND256_COL_HEIGHT(0) + +/* Broadcom UIF format + * + * This is the common format for the current Broadcom multimedia + * blocks, including V3D 3.x and newer, newer video codecs, and + * displays. + * + * The image consists of utiles (64b blocks), UIF blocks (2x2 utiles), + * and macroblocks (4x4 UIF blocks). Those 4x4 UIF block groups are + * stored in columns, with padding between the columns to ensure that + * moving from one column to the next doesn't hit the same SDRAM page + * bank. + * + * To calculate the padding, it is assumed that each hardware block + * and the software driving it knows the platform's SDRAM page size, + * number of banks, and XOR address, and that it's identical between + * all blocks using the format. This tiling modifier will use XOR as + * necessary to reduce the padding. If a hardware block can't do XOR, + * the assumption is that a no-XOR tiling modifier will be created. + */ +#define DRM_FORMAT_MOD_BROADCOM_UIF fourcc_mod_code(BROADCOM, 6) + +/* + * Arm Framebuffer Compression (AFBC) modifiers + * + * AFBC is a proprietary lossless image compression protocol and format. 
+ * It provides fine-grained random access and minimizes the amount of data + * transferred between IP blocks. + * + * AFBC has several features which may be supported and/or used, which are + * represented using bits in the modifier. Not all combinations are valid, + * and different devices or use-cases may support different combinations. + */ +#define DRM_FORMAT_MOD_ARM_AFBC(__afbc_mode) fourcc_mod_code(ARM, __afbc_mode) + +/* + * AFBC superblock size + * + * Indicates the superblock size(s) used for the AFBC buffer. The buffer + * size (in pixels) must be aligned to a multiple of the superblock size. + * Four lowest significant bits(LSBs) are reserved for block size. + */ +#define AFBC_FORMAT_MOD_BLOCK_SIZE_MASK 0xf +#define AFBC_FORMAT_MOD_BLOCK_SIZE_16x16 (1ULL) +#define AFBC_FORMAT_MOD_BLOCK_SIZE_32x8 (2ULL) + +/* + * AFBC lossless colorspace transform + * + * Indicates that the buffer makes use of the AFBC lossless colorspace + * transform. + */ +#define AFBC_FORMAT_MOD_YTR (1ULL << 4) + +/* + * AFBC block-split + * + * Indicates that the payload of each superblock is split. The second + * half of the payload is positioned at a predefined offset from the start + * of the superblock payload. + */ +#define AFBC_FORMAT_MOD_SPLIT (1ULL << 5) + +/* + * AFBC sparse layout + * + * This flag indicates that the payload of each superblock must be stored at a + * predefined position relative to the other superblocks in the same AFBC + * buffer. This order is the same order used by the header buffer. In this mode + * each superblock is given the same amount of space as an uncompressed + * superblock of the particular format would require, rounding up to the next + * multiple of 128 bytes in size. + */ +#define AFBC_FORMAT_MOD_SPARSE (1ULL << 6) + +/* + * AFBC copy-block restrict + * + * Buffers with this flag must obey the copy-block restriction. The restriction + * is such that there are no copy-blocks referring across the border of 8x8 + * blocks. For the subsampled data the 8x8 limitation is also subsampled. + */ +#define AFBC_FORMAT_MOD_CBR (1ULL << 7) + +/* + * AFBC tiled layout + * + * The tiled layout groups superblocks in 8x8 or 4x4 tiles, where all + * superblocks inside a tile are stored together in memory. 8x8 tiles are used + * for pixel formats up to and including 32 bpp while 4x4 tiles are used for + * larger bpp formats. The order between the tiles is scan line. + * When the tiled layout is used, the buffer size (in pixels) must be aligned + * to the tile size. + */ +#define AFBC_FORMAT_MOD_TILED (1ULL << 8) + +/* + * AFBC solid color blocks + * + * Indicates that the buffer makes use of solid-color blocks, whereby bandwidth + * can be reduced if a whole superblock is a single color. 
+ */ +#define AFBC_FORMAT_MOD_SC (1ULL << 9) + #if defined(__cplusplus) } #endif diff --git a/include/drm-uapi/drm_mode.h b/include/drm-uapi/drm_mode.h index 2c575794fb52..d3e0fe31efc5 100644 --- a/include/drm-uapi/drm_mode.h +++ b/include/drm-uapi/drm_mode.h @@ -93,6 +93,15 @@ extern "C" { #define DRM_MODE_PICTURE_ASPECT_NONE 0 #define DRM_MODE_PICTURE_ASPECT_4_3 1 #define DRM_MODE_PICTURE_ASPECT_16_9 2 +#define DRM_MODE_PICTURE_ASPECT_64_27 3 +#define DRM_MODE_PICTURE_ASPECT_256_135 4 + +/* Content type options */ +#define DRM_MODE_CONTENT_TYPE_NO_DATA 0 +#define DRM_MODE_CONTENT_TYPE_GRAPHICS 1 +#define DRM_MODE_CONTENT_TYPE_PHOTO 2 +#define DRM_MODE_CONTENT_TYPE_CINEMA 3 +#define DRM_MODE_CONTENT_TYPE_GAME 4 /* Aspect ratio flag bitmask (4 bits 22:19) */ #define DRM_MODE_FLAG_PIC_AR_MASK (0x0F<<19) @@ -102,6 +111,10 @@ extern "C" { (DRM_MODE_PICTURE_ASPECT_4_3<<19) #define DRM_MODE_FLAG_PIC_AR_16_9 \ (DRM_MODE_PICTURE_ASPECT_16_9<<19) +#define DRM_MODE_FLAG_PIC_AR_64_27 \ + (DRM_MODE_PICTURE_ASPECT_64_27<<19) +#define DRM_MODE_FLAG_PIC_AR_256_135 \ + (DRM_MODE_PICTURE_ASPECT_256_135<<19) #define DRM_MODE_FLAG_ALL (DRM_MODE_FLAG_PHSYNC | \ DRM_MODE_FLAG_NHSYNC | \ @@ -173,8 +186,9 @@ extern "C" { /* * DRM_MODE_REFLECT_ * - * Signals that the contents of a drm plane is reflected in the axis, + * Signals that the contents of a drm plane is reflected along the axis, * in the same way as mirroring. + * See kerneldoc chapter "Plane Composition Properties" for more details. * * This define is provided as a convenience, looking up the property id * using the name->prop id lookup is the preferred method. @@ -338,6 +352,7 @@ enum drm_mode_subconnector { #define DRM_MODE_CONNECTOR_VIRTUAL 15 #define DRM_MODE_CONNECTOR_DSI 16 #define DRM_MODE_CONNECTOR_DPI 17 +#define DRM_MODE_CONNECTOR_WRITEBACK 18 struct drm_mode_get_connector { @@ -363,7 +378,7 @@ struct drm_mode_get_connector { __u32 pad; }; -#define DRM_MODE_PROP_PENDING (1<<0) +#define DRM_MODE_PROP_PENDING (1<<0) /* deprecated, do not use */ #define DRM_MODE_PROP_RANGE (1<<1) #define DRM_MODE_PROP_IMMUTABLE (1<<2) #define DRM_MODE_PROP_ENUM (1<<3) /* enumerated type with text strings */ @@ -598,8 +613,11 @@ struct drm_mode_crtc_lut { }; struct drm_color_ctm { - /* Conversion matrix in S31.32 format. */ - __s64 matrix[9]; + /* + * Conversion matrix in S31.32 sign-magnitude + * (not two's complement!) format. + */ + __u64 matrix[9]; }; struct drm_color_lut { diff --git a/include/drm-uapi/etnaviv_drm.h b/include/drm-uapi/etnaviv_drm.h index e9b997a0ef27..0d5c49dc478c 100644 --- a/include/drm-uapi/etnaviv_drm.h +++ b/include/drm-uapi/etnaviv_drm.h @@ -55,6 +55,12 @@ struct drm_etnaviv_timespec { #define ETNAVIV_PARAM_GPU_FEATURES_4 0x07 #define ETNAVIV_PARAM_GPU_FEATURES_5 0x08 #define ETNAVIV_PARAM_GPU_FEATURES_6 0x09 +#define ETNAVIV_PARAM_GPU_FEATURES_7 0x0a +#define ETNAVIV_PARAM_GPU_FEATURES_8 0x0b +#define ETNAVIV_PARAM_GPU_FEATURES_9 0x0c +#define ETNAVIV_PARAM_GPU_FEATURES_10 0x0d +#define ETNAVIV_PARAM_GPU_FEATURES_11 0x0e +#define ETNAVIV_PARAM_GPU_FEATURES_12 0x0f #define ETNAVIV_PARAM_GPU_STREAM_COUNT 0x10 #define ETNAVIV_PARAM_GPU_REGISTER_MAX 0x11 diff --git a/include/drm-uapi/exynos_drm.h b/include/drm-uapi/exynos_drm.h index a00116b5cc5c..7414cfd76419 100644 --- a/include/drm-uapi/exynos_drm.h +++ b/include/drm-uapi/exynos_drm.h @@ -135,6 +135,219 @@ struct drm_exynos_g2d_exec { __u64 async; }; +/* Exynos DRM IPP v2 API */ + +/** + * Enumerate available IPP hardware modules. 
+ * + * @count_ipps: size of ipp_id array / number of ipp modules (set by driver) + * @reserved: padding + * @ipp_id_ptr: pointer to ipp_id array or NULL + */ +struct drm_exynos_ioctl_ipp_get_res { + __u32 count_ipps; + __u32 reserved; + __u64 ipp_id_ptr; +}; + +enum drm_exynos_ipp_format_type { + DRM_EXYNOS_IPP_FORMAT_SOURCE = 0x01, + DRM_EXYNOS_IPP_FORMAT_DESTINATION = 0x02, +}; + +struct drm_exynos_ipp_format { + __u32 fourcc; + __u32 type; + __u64 modifier; +}; + +enum drm_exynos_ipp_capability { + DRM_EXYNOS_IPP_CAP_CROP = 0x01, + DRM_EXYNOS_IPP_CAP_ROTATE = 0x02, + DRM_EXYNOS_IPP_CAP_SCALE = 0x04, + DRM_EXYNOS_IPP_CAP_CONVERT = 0x08, +}; + +/** + * Get IPP hardware capabilities and supported image formats. + * + * @ipp_id: id of IPP module to query + * @capabilities: bitmask of drm_exynos_ipp_capability (set by driver) + * @reserved: padding + * @formats_count: size of formats array (in entries) / number of filled + * formats (set by driver) + * @formats_ptr: pointer to formats array or NULL + */ +struct drm_exynos_ioctl_ipp_get_caps { + __u32 ipp_id; + __u32 capabilities; + __u32 reserved; + __u32 formats_count; + __u64 formats_ptr; +}; + +enum drm_exynos_ipp_limit_type { + /* size (horizontal/vertial) limits, in pixels (min, max, alignment) */ + DRM_EXYNOS_IPP_LIMIT_TYPE_SIZE = 0x0001, + /* scale ratio (horizonta/vertial), 16.16 fixed point (min, max) */ + DRM_EXYNOS_IPP_LIMIT_TYPE_SCALE = 0x0002, + + /* image buffer area */ + DRM_EXYNOS_IPP_LIMIT_SIZE_BUFFER = 0x0001 << 16, + /* src/dst rectangle area */ + DRM_EXYNOS_IPP_LIMIT_SIZE_AREA = 0x0002 << 16, + /* src/dst rectangle area when rotation enabled */ + DRM_EXYNOS_IPP_LIMIT_SIZE_ROTATED = 0x0003 << 16, + + DRM_EXYNOS_IPP_LIMIT_TYPE_MASK = 0x000f, + DRM_EXYNOS_IPP_LIMIT_SIZE_MASK = 0x000f << 16, +}; + +struct drm_exynos_ipp_limit_val { + __u32 min; + __u32 max; + __u32 align; + __u32 reserved; +}; + +/** + * IPP module limitation. + * + * @type: limit type (see drm_exynos_ipp_limit_type enum) + * @reserved: padding + * @h: horizontal limits + * @v: vertical limits + */ +struct drm_exynos_ipp_limit { + __u32 type; + __u32 reserved; + struct drm_exynos_ipp_limit_val h; + struct drm_exynos_ipp_limit_val v; +}; + +/** + * Get IPP limits for given image format. 
+ * + * @ipp_id: id of IPP module to query + * @fourcc: image format code (see DRM_FORMAT_* in drm_fourcc.h) + * @modifier: image format modifier (see DRM_FORMAT_MOD_* in drm_fourcc.h) + * @type: source/destination identifier (drm_exynos_ipp_format_flag enum) + * @limits_count: size of limits array (in entries) / number of filled entries + * (set by driver) + * @limits_ptr: pointer to limits array or NULL + */ +struct drm_exynos_ioctl_ipp_get_limits { + __u32 ipp_id; + __u32 fourcc; + __u64 modifier; + __u32 type; + __u32 limits_count; + __u64 limits_ptr; +}; + +enum drm_exynos_ipp_task_id { + /* buffer described by struct drm_exynos_ipp_task_buffer */ + DRM_EXYNOS_IPP_TASK_BUFFER = 0x0001, + /* rectangle described by struct drm_exynos_ipp_task_rect */ + DRM_EXYNOS_IPP_TASK_RECTANGLE = 0x0002, + /* transformation described by struct drm_exynos_ipp_task_transform */ + DRM_EXYNOS_IPP_TASK_TRANSFORM = 0x0003, + /* alpha configuration described by struct drm_exynos_ipp_task_alpha */ + DRM_EXYNOS_IPP_TASK_ALPHA = 0x0004, + + /* source image data (for buffer and rectangle chunks) */ + DRM_EXYNOS_IPP_TASK_TYPE_SOURCE = 0x0001 << 16, + /* destination image data (for buffer and rectangle chunks) */ + DRM_EXYNOS_IPP_TASK_TYPE_DESTINATION = 0x0002 << 16, +}; + +/** + * Memory buffer with image data. + * + * @id: must be DRM_EXYNOS_IPP_TASK_BUFFER + * other parameters are same as for AddFB2 generic DRM ioctl + */ +struct drm_exynos_ipp_task_buffer { + __u32 id; + __u32 fourcc; + __u32 width, height; + __u32 gem_id[4]; + __u32 offset[4]; + __u32 pitch[4]; + __u64 modifier; +}; + +/** + * Rectangle for processing. + * + * @id: must be DRM_EXYNOS_IPP_TASK_RECTANGLE + * @reserved: padding + * @x,@y: left corner in pixels + * @w,@h: width/height in pixels + */ +struct drm_exynos_ipp_task_rect { + __u32 id; + __u32 reserved; + __u32 x; + __u32 y; + __u32 w; + __u32 h; +}; + +/** + * Image tranformation description. + * + * @id: must be DRM_EXYNOS_IPP_TASK_TRANSFORM + * @rotation: DRM_MODE_ROTATE_* and DRM_MODE_REFLECT_* values + */ +struct drm_exynos_ipp_task_transform { + __u32 id; + __u32 rotation; +}; + +/** + * Image global alpha configuration for formats without alpha values. + * + * @id: must be DRM_EXYNOS_IPP_TASK_ALPHA + * @value: global alpha value (0-255) + */ +struct drm_exynos_ipp_task_alpha { + __u32 id; + __u32 value; +}; + +enum drm_exynos_ipp_flag { + /* generate DRM event after processing */ + DRM_EXYNOS_IPP_FLAG_EVENT = 0x01, + /* dry run, only check task parameters */ + DRM_EXYNOS_IPP_FLAG_TEST_ONLY = 0x02, + /* non-blocking processing */ + DRM_EXYNOS_IPP_FLAG_NONBLOCK = 0x04, +}; + +#define DRM_EXYNOS_IPP_FLAGS (DRM_EXYNOS_IPP_FLAG_EVENT |\ + DRM_EXYNOS_IPP_FLAG_TEST_ONLY | DRM_EXYNOS_IPP_FLAG_NONBLOCK) + +/** + * Perform image processing described by array of drm_exynos_ipp_task_* + * structures (parameters array). 
+ * + * @ipp_id: id of IPP module to run the task + * @flags: bitmask of drm_exynos_ipp_flag values + * @reserved: padding + * @params_size: size of parameters array (in bytes) + * @params_ptr: pointer to parameters array or NULL + * @user_data: (optional) data for drm event + */ +struct drm_exynos_ioctl_ipp_commit { + __u32 ipp_id; + __u32 flags; + __u32 reserved; + __u32 params_size; + __u64 params_ptr; + __u64 user_data; +}; + #define DRM_EXYNOS_GEM_CREATE 0x00 #define DRM_EXYNOS_GEM_MAP 0x01 /* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */ @@ -147,6 +360,11 @@ struct drm_exynos_g2d_exec { #define DRM_EXYNOS_G2D_EXEC 0x22 /* Reserved 0x30 ~ 0x33 for obsolete Exynos IPP ioctls */ +/* IPP - Image Post Processing */ +#define DRM_EXYNOS_IPP_GET_RESOURCES 0x40 +#define DRM_EXYNOS_IPP_GET_CAPS 0x41 +#define DRM_EXYNOS_IPP_GET_LIMITS 0x42 +#define DRM_EXYNOS_IPP_COMMIT 0x43 #define DRM_IOCTL_EXYNOS_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + \ DRM_EXYNOS_GEM_CREATE, struct drm_exynos_gem_create) @@ -165,8 +383,20 @@ struct drm_exynos_g2d_exec { #define DRM_IOCTL_EXYNOS_G2D_EXEC DRM_IOWR(DRM_COMMAND_BASE + \ DRM_EXYNOS_G2D_EXEC, struct drm_exynos_g2d_exec) +#define DRM_IOCTL_EXYNOS_IPP_GET_RESOURCES DRM_IOWR(DRM_COMMAND_BASE + \ + DRM_EXYNOS_IPP_GET_RESOURCES, \ + struct drm_exynos_ioctl_ipp_get_res) +#define DRM_IOCTL_EXYNOS_IPP_GET_CAPS DRM_IOWR(DRM_COMMAND_BASE + \ + DRM_EXYNOS_IPP_GET_CAPS, struct drm_exynos_ioctl_ipp_get_caps) +#define DRM_IOCTL_EXYNOS_IPP_GET_LIMITS DRM_IOWR(DRM_COMMAND_BASE + \ + DRM_EXYNOS_IPP_GET_LIMITS, \ + struct drm_exynos_ioctl_ipp_get_limits) +#define DRM_IOCTL_EXYNOS_IPP_COMMIT DRM_IOWR(DRM_COMMAND_BASE + \ + DRM_EXYNOS_IPP_COMMIT, struct drm_exynos_ioctl_ipp_commit) + /* EXYNOS specific events */ #define DRM_EXYNOS_G2D_EVENT 0x80000000 +#define DRM_EXYNOS_IPP_EVENT 0x80000002 struct drm_exynos_g2d_event { struct drm_event base; @@ -177,6 +407,16 @@ struct drm_exynos_g2d_event { __u32 reserved; }; +struct drm_exynos_ipp_event { + struct drm_event base; + __u64 user_data; + __u32 tv_sec; + __u32 tv_usec; + __u32 ipp_id; + __u32 sequence; + __u64 reserved; +}; + #if defined(__cplusplus) } #endif diff --git a/include/drm-uapi/i915_drm.h b/include/drm-uapi/i915_drm.h index 16e452aa12d4..cf9dcf040321 100644 --- a/include/drm-uapi/i915_drm.h +++ b/include/drm-uapi/i915_drm.h @@ -110,9 +110,17 @@ enum drm_i915_gem_engine_class { enum drm_i915_pmu_engine_sample { I915_SAMPLE_BUSY = 0, I915_SAMPLE_WAIT = 1, - I915_SAMPLE_SEMA = 2 + I915_SAMPLE_SEMA = 2, + I915_SAMPLE_QUEUED = 3, + I915_SAMPLE_RUNNABLE = 4, + I915_SAMPLE_RUNNING = 5, }; + /* Divide counter value by divisor to get the real value. 
*/ +#define I915_SAMPLE_QUEUED_DIVISOR (1000) +#define I915_SAMPLE_RUNNABLE_DIVISOR (1000) +#define I915_SAMPLE_RUNNING_DIVISOR (1000) + #define I915_PMU_SAMPLE_BITS (4) #define I915_PMU_SAMPLE_MASK (0xf) #define I915_PMU_SAMPLE_INSTANCE_BITS (8) @@ -133,6 +141,15 @@ enum drm_i915_pmu_engine_sample { #define I915_PMU_ENGINE_SEMA(class, instance) \ __I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA) +#define I915_PMU_ENGINE_QUEUED(class, instance) \ + __I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED) + +#define I915_PMU_ENGINE_RUNNABLE(class, instance) \ + __I915_PMU_ENGINE(class, instance, I915_SAMPLE_RUNNABLE) + +#define I915_PMU_ENGINE_RUNNING(class, instance) \ + __I915_PMU_ENGINE(class, instance, I915_SAMPLE_RUNNING) + #define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x)) #define I915_PMU_ACTUAL_FREQUENCY __I915_PMU_OTHER(0) @@ -412,6 +429,14 @@ typedef struct drm_i915_irq_wait { int irq_seq; } drm_i915_irq_wait_t; +/* + * Different modes of per-process Graphics Translation Table, + * see I915_PARAM_HAS_ALIASING_PPGTT + */ +#define I915_GEM_PPGTT_NONE 0 +#define I915_GEM_PPGTT_ALIASING 1 +#define I915_GEM_PPGTT_FULL 2 + /* Ioctl to query kernel params: */ #define I915_PARAM_IRQ_ACTIVE 1 @@ -529,6 +554,28 @@ typedef struct drm_i915_irq_wait { */ #define I915_PARAM_CS_TIMESTAMP_FREQUENCY 51 +/* + * Once upon a time we supposed that writes through the GGTT would be + * immediately in physical memory (once flushed out of the CPU path). However, + * on a few different processors and chipsets, this is not necessarily the case + * as the writes appear to be buffered internally. Thus a read of the backing + * storage (physical memory) via a different path (with different physical tags + * to the indirect write via the GGTT) will see stale values from before + * the GGTT write. Inside the kernel, we can for the most part keep track of + * the different read/write domains in use (e.g. set-domain), but the assumption + * of coherency is baked into the ABI, hence reporting its true state in this + * parameter. + * + * Reports true when writes via mmap_gtt are immediately visible following an + * lfence to flush the WCB. + * + * Reports false when writes via mmap_gtt are indeterminately delayed in an in + * internal buffer and are _not_ immediately visible to third parties accessing + * directly via mmap_cpu/mmap_wc. Use of mmap_gtt as part of an IPC + * communications channel when reporting false is strongly disadvised. 
+ */ +#define I915_PARAM_MMAP_GTT_COHERENT 52 + typedef struct drm_i915_getparam { __s32 param; /* diff --git a/include/drm-uapi/msm_drm.h b/include/drm-uapi/msm_drm.h index bbbaffad772d..c06d0a5bdd80 100644 --- a/include/drm-uapi/msm_drm.h +++ b/include/drm-uapi/msm_drm.h @@ -201,10 +201,12 @@ struct drm_msm_gem_submit_bo { #define MSM_SUBMIT_NO_IMPLICIT 0x80000000 /* disable implicit sync */ #define MSM_SUBMIT_FENCE_FD_IN 0x40000000 /* enable input fence_fd */ #define MSM_SUBMIT_FENCE_FD_OUT 0x20000000 /* enable output fence_fd */ +#define MSM_SUBMIT_SUDO 0x10000000 /* run submitted cmds from RB */ #define MSM_SUBMIT_FLAGS ( \ MSM_SUBMIT_NO_IMPLICIT | \ MSM_SUBMIT_FENCE_FD_IN | \ MSM_SUBMIT_FENCE_FD_OUT | \ + MSM_SUBMIT_SUDO | \ 0) /* Each cmdstream submit consists of a table of buffers involved, and diff --git a/include/drm-uapi/tegra_drm.h b/include/drm-uapi/tegra_drm.h index 12f9bf848db1..6c07919c04e9 100644 --- a/include/drm-uapi/tegra_drm.h +++ b/include/drm-uapi/tegra_drm.h @@ -32,143 +32,615 @@ extern "C" { #define DRM_TEGRA_GEM_CREATE_TILED (1 << 0) #define DRM_TEGRA_GEM_CREATE_BOTTOM_UP (1 << 1) +/** + * struct drm_tegra_gem_create - parameters for the GEM object creation IOCTL + */ struct drm_tegra_gem_create { + /** + * @size: + * + * The size, in bytes, of the buffer object to be created. + */ __u64 size; + + /** + * @flags: + * + * A bitmask of flags that influence the creation of GEM objects: + * + * DRM_TEGRA_GEM_CREATE_TILED + * Use the 16x16 tiling format for this buffer. + * + * DRM_TEGRA_GEM_CREATE_BOTTOM_UP + * The buffer has a bottom-up layout. + */ __u32 flags; + + /** + * @handle: + * + * The handle of the created GEM object. Set by the kernel upon + * successful completion of the IOCTL. + */ __u32 handle; }; +/** + * struct drm_tegra_gem_mmap - parameters for the GEM mmap IOCTL + */ struct drm_tegra_gem_mmap { + /** + * @handle: + * + * Handle of the GEM object to obtain an mmap offset for. + */ __u32 handle; + + /** + * @pad: + * + * Structure padding that may be used in the future. Must be 0. + */ __u32 pad; + + /** + * @offset: + * + * The mmap offset for the given GEM object. Set by the kernel upon + * successful completion of the IOCTL. + */ __u64 offset; }; +/** + * struct drm_tegra_syncpt_read - parameters for the read syncpoint IOCTL + */ struct drm_tegra_syncpt_read { + /** + * @id: + * + * ID of the syncpoint to read the current value from. + */ __u32 id; + + /** + * @value: + * + * The current syncpoint value. Set by the kernel upon successful + * completion of the IOCTL. + */ __u32 value; }; +/** + * struct drm_tegra_syncpt_incr - parameters for the increment syncpoint IOCTL + */ struct drm_tegra_syncpt_incr { + /** + * @id: + * + * ID of the syncpoint to increment. + */ __u32 id; + + /** + * @pad: + * + * Structure padding that may be used in the future. Must be 0. + */ __u32 pad; }; +/** + * struct drm_tegra_syncpt_wait - parameters for the wait syncpoint IOCTL + */ struct drm_tegra_syncpt_wait { + /** + * @id: + * + * ID of the syncpoint to wait on. + */ __u32 id; + + /** + * @thresh: + * + * Threshold value for which to wait. + */ __u32 thresh; + + /** + * @timeout: + * + * Timeout, in milliseconds, to wait. + */ __u32 timeout; + + /** + * @value: + * + * The new syncpoint value after the wait. Set by the kernel upon + * successful completion of the IOCTL. 
+ */ __u32 value; }; #define DRM_TEGRA_NO_TIMEOUT (0xffffffff) +/** + * struct drm_tegra_open_channel - parameters for the open channel IOCTL + */ struct drm_tegra_open_channel { + /** + * @client: + * + * The client ID for this channel. + */ __u32 client; + + /** + * @pad: + * + * Structure padding that may be used in the future. Must be 0. + */ __u32 pad; + + /** + * @context: + * + * The application context of this channel. Set by the kernel upon + * successful completion of the IOCTL. This context needs to be passed + * to the DRM_TEGRA_CHANNEL_CLOSE or the DRM_TEGRA_SUBMIT IOCTLs. + */ __u64 context; }; +/** + * struct drm_tegra_close_channel - parameters for the close channel IOCTL + */ struct drm_tegra_close_channel { + /** + * @context: + * + * The application context of this channel. This is obtained from the + * DRM_TEGRA_OPEN_CHANNEL IOCTL. + */ __u64 context; }; +/** + * struct drm_tegra_get_syncpt - parameters for the get syncpoint IOCTL + */ struct drm_tegra_get_syncpt { + /** + * @context: + * + * The application context identifying the channel for which to obtain + * the syncpoint ID. + */ __u64 context; + + /** + * @index: + * + * Index of the client syncpoint for which to obtain the ID. + */ __u32 index; + + /** + * @id: + * + * The ID of the given syncpoint. Set by the kernel upon successful + * completion of the IOCTL. + */ __u32 id; }; +/** + * struct drm_tegra_get_syncpt_base - parameters for the get wait base IOCTL + */ struct drm_tegra_get_syncpt_base { + /** + * @context: + * + * The application context identifying for which channel to obtain the + * wait base. + */ __u64 context; + + /** + * @syncpt: + * + * ID of the syncpoint for which to obtain the wait base. + */ __u32 syncpt; + + /** + * @id: + * + * The ID of the wait base corresponding to the client syncpoint. Set + * by the kernel upon successful completion of the IOCTL. + */ __u32 id; }; +/** + * struct drm_tegra_syncpt - syncpoint increment operation + */ struct drm_tegra_syncpt { + /** + * @id: + * + * ID of the syncpoint to operate on. + */ __u32 id; + + /** + * @incrs: + * + * Number of increments to perform for the syncpoint. + */ __u32 incrs; }; +/** + * struct drm_tegra_cmdbuf - structure describing a command buffer + */ struct drm_tegra_cmdbuf { + /** + * @handle: + * + * Handle to a GEM object containing the command buffer. + */ __u32 handle; + + /** + * @offset: + * + * Offset, in bytes, into the GEM object identified by @handle at + * which the command buffer starts. + */ __u32 offset; + + /** + * @words: + * + * Number of 32-bit words in this command buffer. + */ __u32 words; + + /** + * @pad: + * + * Structure padding that may be used in the future. Must be 0. + */ __u32 pad; }; +/** + * struct drm_tegra_reloc - GEM object relocation structure + */ struct drm_tegra_reloc { struct { + /** + * @cmdbuf.handle: + * + * Handle to the GEM object containing the command buffer for + * which to perform this GEM object relocation. + */ __u32 handle; + + /** + * @cmdbuf.offset: + * + * Offset, in bytes, into the command buffer at which to + * insert the relocated address. + */ __u32 offset; } cmdbuf; struct { + /** + * @target.handle: + * + * Handle to the GEM object to be relocated. + */ __u32 handle; + + /** + * @target.offset: + * + * Offset, in bytes, into the target GEM object at which the + * relocated data starts. + */ __u32 offset; } target; + + /** + * @shift: + * + * The number of bits by which to shift relocated addresses. 
+ */ __u32 shift; + + /** + * @pad: + * + * Structure padding that may be used in the future. Must be 0. + */ __u32 pad; }; +/** + * struct drm_tegra_waitchk - wait check structure + */ struct drm_tegra_waitchk { + /** + * @handle: + * + * Handle to the GEM object containing a command stream on which to + * perform the wait check. + */ __u32 handle; + + /** + * @offset: + * + * Offset, in bytes, of the location in the command stream to perform + * the wait check on. + */ __u32 offset; + + /** + * @syncpt: + * + * ID of the syncpoint to wait check. + */ __u32 syncpt; + + /** + * @thresh: + * + * Threshold value for which to check. + */ __u32 thresh; }; +/** + * struct drm_tegra_submit - job submission structure + */ struct drm_tegra_submit { + /** + * @context: + * + * The application context identifying the channel to use for the + * execution of this job. + */ __u64 context; + + /** + * @num_syncpts: + * + * The number of syncpoints operated on by this job. This defines the + * length of the array pointed to by @syncpts. + */ __u32 num_syncpts; + + /** + * @num_cmdbufs: + * + * The number of command buffers to execute as part of this job. This + * defines the length of the array pointed to by @cmdbufs. + */ __u32 num_cmdbufs; + + /** + * @num_relocs: + * + * The number of relocations to perform before executing this job. + * This defines the length of the array pointed to by @relocs. + */ __u32 num_relocs; + + /** + * @num_waitchks: + * + * The number of wait checks to perform as part of this job. This + * defines the length of the array pointed to by @waitchks. + */ __u32 num_waitchks; + + /** + * @waitchk_mask: + * + * Bitmask of valid wait checks. + */ __u32 waitchk_mask; + + /** + * @timeout: + * + * Timeout, in milliseconds, before this job is cancelled. + */ __u32 timeout; + + /** + * @syncpts: + * + * A pointer to an array of &struct drm_tegra_syncpt structures that + * specify the syncpoint operations performed as part of this job. + * The number of elements in the array must be equal to the value + * given by @num_syncpts. + */ __u64 syncpts; + + /** + * @cmdbufs: + * + * A pointer to an array of &struct drm_tegra_cmdbuf structures that + * define the command buffers to execute as part of this job. The + * number of elements in the array must be equal to the value given + * by @num_syncpts. + */ __u64 cmdbufs; + + /** + * @relocs: + * + * A pointer to an array of &struct drm_tegra_reloc structures that + * specify the relocations that need to be performed before executing + * this job. The number of elements in the array must be equal to the + * value given by @num_relocs. + */ __u64 relocs; + + /** + * @waitchks: + * + * A pointer to an array of &struct drm_tegra_waitchk structures that + * specify the wait checks to be performed while executing this job. + * The number of elements in the array must be equal to the value + * given by @num_waitchks. + */ __u64 waitchks; - __u32 fence; /* Return value */ - __u32 reserved[5]; /* future expansion */ + /** + * @fence: + * + * The threshold of the syncpoint associated with this job after it + * has been completed. Set by the kernel upon successful completion of + * the IOCTL. This can be used with the DRM_TEGRA_SYNCPT_WAIT IOCTL to + * wait for this job to be finished. + */ + __u32 fence; + + /** + * @reserved: + * + * This field is reserved for future use. Must be 0. 
+ */ + __u32 reserved[5]; }; #define DRM_TEGRA_GEM_TILING_MODE_PITCH 0 #define DRM_TEGRA_GEM_TILING_MODE_TILED 1 #define DRM_TEGRA_GEM_TILING_MODE_BLOCK 2 +/** + * struct drm_tegra_gem_set_tiling - parameters for the set tiling IOCTL + */ struct drm_tegra_gem_set_tiling { - /* input */ + /** + * @handle: + * + * Handle to the GEM object for which to set the tiling parameters. + */ __u32 handle; + + /** + * @mode: + * + * The tiling mode to set. Must be one of: + * + * DRM_TEGRA_GEM_TILING_MODE_PITCH + * pitch linear format + * + * DRM_TEGRA_GEM_TILING_MODE_TILED + * 16x16 tiling format + * + * DRM_TEGRA_GEM_TILING_MODE_BLOCK + * 16Bx2 tiling format + */ __u32 mode; + + /** + * @value: + * + * The value to set for the tiling mode parameter. + */ __u32 value; + + /** + * @pad: + * + * Structure padding that may be used in the future. Must be 0. + */ __u32 pad; }; +/** + * struct drm_tegra_gem_get_tiling - parameters for the get tiling IOCTL + */ struct drm_tegra_gem_get_tiling { - /* input */ + /** + * @handle: + * + * Handle to the GEM object for which to query the tiling parameters. + */ __u32 handle; - /* output */ + + /** + * @mode: + * + * The tiling mode currently associated with the GEM object. Set by + * the kernel upon successful completion of the IOCTL. + */ __u32 mode; + + /** + * @value: + * + * The tiling mode parameter currently associated with the GEM object. + * Set by the kernel upon successful completion of the IOCTL. + */ __u32 value; + + /** + * @pad: + * + * Structure padding that may be used in the future. Must be 0. + */ __u32 pad; }; #define DRM_TEGRA_GEM_BOTTOM_UP (1 << 0) #define DRM_TEGRA_GEM_FLAGS (DRM_TEGRA_GEM_BOTTOM_UP) +/** + * struct drm_tegra_gem_set_flags - parameters for the set flags IOCTL + */ struct drm_tegra_gem_set_flags { - /* input */ + /** + * @handle: + * + * Handle to the GEM object for which to set the flags. + */ __u32 handle; - /* output */ + + /** + * @flags: + * + * The flags to set for the GEM object. + */ __u32 flags; }; +/** + * struct drm_tegra_gem_get_flags - parameters for the get flags IOCTL + */ struct drm_tegra_gem_get_flags { - /* input */ + /** + * @handle: + * + * Handle to the GEM object for which to query the flags. + */ __u32 handle; - /* output */ + + /** + * @flags: + * + * The flags currently associated with the GEM object. Set by the + * kernel upon successful completion of the IOCTL. 
+ */ __u32 flags; }; @@ -193,7 +665,7 @@ struct drm_tegra_gem_get_flags { #define DRM_IOCTL_TEGRA_SYNCPT_INCR DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_INCR, struct drm_tegra_syncpt_incr) #define DRM_IOCTL_TEGRA_SYNCPT_WAIT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_WAIT, struct drm_tegra_syncpt_wait) #define DRM_IOCTL_TEGRA_OPEN_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_OPEN_CHANNEL, struct drm_tegra_open_channel) -#define DRM_IOCTL_TEGRA_CLOSE_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_CLOSE_CHANNEL, struct drm_tegra_open_channel) +#define DRM_IOCTL_TEGRA_CLOSE_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_CLOSE_CHANNEL, struct drm_tegra_close_channel) #define DRM_IOCTL_TEGRA_GET_SYNCPT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GET_SYNCPT, struct drm_tegra_get_syncpt) #define DRM_IOCTL_TEGRA_SUBMIT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SUBMIT, struct drm_tegra_submit) #define DRM_IOCTL_TEGRA_GET_SYNCPT_BASE DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GET_SYNCPT_BASE, struct drm_tegra_get_syncpt_base) diff --git a/include/drm-uapi/v3d_drm.h b/include/drm-uapi/v3d_drm.h new file mode 100644 index 000000000000..7b6627783608 --- /dev/null +++ b/include/drm-uapi/v3d_drm.h @@ -0,0 +1,194 @@ +/* + * Copyright © 2014-2018 Broadcom + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#ifndef _V3D_DRM_H_ +#define _V3D_DRM_H_ + +#include "drm.h" + +#if defined(__cplusplus) +extern "C" { +#endif + +#define DRM_V3D_SUBMIT_CL 0x00 +#define DRM_V3D_WAIT_BO 0x01 +#define DRM_V3D_CREATE_BO 0x02 +#define DRM_V3D_MMAP_BO 0x03 +#define DRM_V3D_GET_PARAM 0x04 +#define DRM_V3D_GET_BO_OFFSET 0x05 + +#define DRM_IOCTL_V3D_SUBMIT_CL DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_CL, struct drm_v3d_submit_cl) +#define DRM_IOCTL_V3D_WAIT_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_WAIT_BO, struct drm_v3d_wait_bo) +#define DRM_IOCTL_V3D_CREATE_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_CREATE_BO, struct drm_v3d_create_bo) +#define DRM_IOCTL_V3D_MMAP_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_MMAP_BO, struct drm_v3d_mmap_bo) +#define DRM_IOCTL_V3D_GET_PARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_PARAM, struct drm_v3d_get_param) +#define DRM_IOCTL_V3D_GET_BO_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_BO_OFFSET, struct drm_v3d_get_bo_offset) + +/** + * struct drm_v3d_submit_cl - ioctl argument for submitting commands to the 3D + * engine. + * + * This asks the kernel to have the GPU execute an optional binner + * command list, and a render command list. 
+ */ +struct drm_v3d_submit_cl { + /* Pointer to the binner command list. + * + * This is the first set of commands executed, which runs the + * coordinate shader to determine where primitives land on the screen, + * then writes out the state updates and draw calls necessary per tile + * to the tile allocation BO. + */ + __u32 bcl_start; + + /** End address of the BCL (first byte after the BCL) */ + __u32 bcl_end; + + /* Offset of the render command list. + * + * This is the second set of commands executed, which will either + * execute the tiles that have been set up by the BCL, or a fixed set + * of tiles (in the case of RCL-only blits). + */ + __u32 rcl_start; + + /** End address of the RCL (first byte after the RCL) */ + __u32 rcl_end; + + /** An optional sync object to wait on before starting the BCL. */ + __u32 in_sync_bcl; + /** An optional sync object to wait on before starting the RCL. */ + __u32 in_sync_rcl; + /** An optional sync object to place the completion fence in. */ + __u32 out_sync; + + /* Offset of the tile alloc memory + * + * This is optional on V3D 3.3 (where the CL can set the value) but + * required on V3D 4.1. + */ + __u32 qma; + + /** Size of the tile alloc memory. */ + __u32 qms; + + /** Offset of the tile state data array. */ + __u32 qts; + + /* Pointer to a u32 array of the BOs that are referenced by the job. + */ + __u64 bo_handles; + + /* Number of BO handles passed in (size is that times 4). */ + __u32 bo_handle_count; + + /* Pad, must be zero-filled. */ + __u32 pad; +}; + +/** + * struct drm_v3d_wait_bo - ioctl argument for waiting for + * completion of the last DRM_V3D_SUBMIT_CL on a BO. + * + * This is useful for cases where multiple processes might be + * rendering to a BO and you want to wait for all rendering to be + * completed. + */ +struct drm_v3d_wait_bo { + __u32 handle; + __u32 pad; + __u64 timeout_ns; +}; + +/** + * struct drm_v3d_create_bo - ioctl argument for creating V3D BOs. + * + * There are currently no values for the flags argument, but it may be + * used in a future extension. + */ +struct drm_v3d_create_bo { + __u32 size; + __u32 flags; + /** Returned GEM handle for the BO. */ + __u32 handle; + /** + * Returned offset for the BO in the V3D address space. This offset + * is private to the DRM fd and is valid for the lifetime of the GEM + * handle. + * + * This offset value will always be nonzero, since various HW + * units treat 0 specially. + */ + __u32 offset; +}; + +/** + * struct drm_v3d_mmap_bo - ioctl argument for mapping V3D BOs. + * + * This doesn't actually perform an mmap. Instead, it returns the + * offset you need to use in an mmap on the DRM device node. This + * means that tools like valgrind end up knowing about the mapped + * memory. + * + * There are currently no values for the flags argument, but it may be + * used in a future extension. + */ +struct drm_v3d_mmap_bo { + /** Handle for the object being mapped. */ + __u32 handle; + __u32 flags; + /** offset into the drm node to use for subsequent mmap call. */ + __u64 offset; +}; + +enum drm_v3d_param { + DRM_V3D_PARAM_V3D_UIFCFG, + DRM_V3D_PARAM_V3D_HUB_IDENT1, + DRM_V3D_PARAM_V3D_HUB_IDENT2, + DRM_V3D_PARAM_V3D_HUB_IDENT3, + DRM_V3D_PARAM_V3D_CORE0_IDENT0, + DRM_V3D_PARAM_V3D_CORE0_IDENT1, + DRM_V3D_PARAM_V3D_CORE0_IDENT2, +}; + +struct drm_v3d_get_param { + __u32 param; + __u32 pad; + __u64 value; +}; + +/** + * Returns the offset for the BO in the V3D address space for this DRM fd. 
+ * This is the same value returned by drm_v3d_create_bo, if that was called + * from this DRM fd. + */ +struct drm_v3d_get_bo_offset { + __u32 handle; + __u32 offset; +}; + +#if defined(__cplusplus) +} +#endif + +#endif /* _V3D_DRM_H_ */ diff --git a/include/drm-uapi/vc4_drm.h b/include/drm-uapi/vc4_drm.h index 4117117b4204..31f50de39acb 100644 --- a/include/drm-uapi/vc4_drm.h +++ b/include/drm-uapi/vc4_drm.h @@ -183,10 +183,17 @@ struct drm_vc4_submit_cl { /* ID of the perfmon to attach to this job. 0 means no perfmon. */ __u32 perfmonid; - /* Unused field to align this struct on 64 bits. Must be set to 0. - * If one ever needs to add an u32 field to this struct, this field - * can be used. + /* Syncobj handle to wait on. If set, processing of this render job + * will not start until the syncobj is signaled. 0 means ignore. */ + __u32 in_sync; + + /* Syncobj handle to export fence to. If set, the fence in the syncobj + * will be replaced with a fence that signals upon completion of this + * render job. 0 means ignore. + */ + __u32 out_sync; + __u32 pad2; }; diff --git a/include/drm-uapi/virtgpu_drm.h b/include/drm-uapi/virtgpu_drm.h index 91a31ffed828..9a781f0611df 100644 --- a/include/drm-uapi/virtgpu_drm.h +++ b/include/drm-uapi/virtgpu_drm.h @@ -63,6 +63,7 @@ struct drm_virtgpu_execbuffer { }; #define VIRTGPU_PARAM_3D_FEATURES 1 /* do we have 3D features in the hw */ +#define VIRTGPU_PARAM_CAPSET_QUERY_FIX 2 /* do we have the capset fix */ struct drm_virtgpu_getparam { __u64 param; diff --git a/include/drm-uapi/vmwgfx_drm.h b/include/drm-uapi/vmwgfx_drm.h index 0bc784f5e0db..399f58317cff 100644 --- a/include/drm-uapi/vmwgfx_drm.h +++ b/include/drm-uapi/vmwgfx_drm.h @@ -40,6 +40,7 @@ extern "C" { #define DRM_VMW_GET_PARAM 0 #define DRM_VMW_ALLOC_DMABUF 1 +#define DRM_VMW_ALLOC_BO 1 #define DRM_VMW_UNREF_DMABUF 2 #define DRM_VMW_HANDLE_CLOSE 2 #define DRM_VMW_CURSOR_BYPASS 3 @@ -68,6 +69,8 @@ extern "C" { #define DRM_VMW_GB_SURFACE_REF 24 #define DRM_VMW_SYNCCPU 25 #define DRM_VMW_CREATE_EXTENDED_CONTEXT 26 +#define DRM_VMW_GB_SURFACE_CREATE_EXT 27 +#define DRM_VMW_GB_SURFACE_REF_EXT 28 /*************************************************************************/ /** @@ -79,6 +82,9 @@ extern "C" { * * DRM_VMW_PARAM_OVERLAY_IOCTL: * Does the driver support the overlay ioctl. + * + * DRM_VMW_PARAM_SM4_1 + * SM4_1 support is enabled. */ #define DRM_VMW_PARAM_NUM_STREAMS 0 @@ -94,6 +100,8 @@ extern "C" { #define DRM_VMW_PARAM_MAX_MOB_SIZE 10 #define DRM_VMW_PARAM_SCREEN_TARGET 11 #define DRM_VMW_PARAM_DX 12 +#define DRM_VMW_PARAM_HW_CAPS2 13 +#define DRM_VMW_PARAM_SM4_1 14 /** * enum drm_vmw_handle_type - handle type for ref ioctls @@ -356,9 +364,9 @@ struct drm_vmw_fence_rep { /*************************************************************************/ /** - * DRM_VMW_ALLOC_DMABUF + * DRM_VMW_ALLOC_BO * - * Allocate a DMA buffer that is visible also to the host. + * Allocate a buffer object that is visible also to the host. * NOTE: The buffer is * identified by a handle and an offset, which are private to the guest, but * useable in the command stream. The guest kernel may translate these @@ -366,27 +374,28 @@ struct drm_vmw_fence_rep { * be zero at all times, or it may disappear from the interface before it is * fixed. * - * The DMA buffer may stay user-space mapped in the guest at all times, + * The buffer object may stay user-space mapped in the guest at all times, * and is thus suitable for sub-allocation. * - * DMA buffers are mapped using the mmap() syscall on the drm device. 
+ * Buffer objects are mapped using the mmap() syscall on the drm device. */ /** - * struct drm_vmw_alloc_dmabuf_req + * struct drm_vmw_alloc_bo_req * * @size: Required minimum size of the buffer. * - * Input data to the DRM_VMW_ALLOC_DMABUF Ioctl. + * Input data to the DRM_VMW_ALLOC_BO Ioctl. */ -struct drm_vmw_alloc_dmabuf_req { +struct drm_vmw_alloc_bo_req { __u32 size; __u32 pad64; }; +#define drm_vmw_alloc_dmabuf_req drm_vmw_alloc_bo_req /** - * struct drm_vmw_dmabuf_rep + * struct drm_vmw_bo_rep * * @map_handle: Offset to use in the mmap() call used to map the buffer. * @handle: Handle unique to this buffer. Used for unreferencing. @@ -395,50 +404,32 @@ struct drm_vmw_alloc_dmabuf_req { * @cur_gmr_offset: Offset to use in the command stream when this buffer is * referenced. See note above. * - * Output data from the DRM_VMW_ALLOC_DMABUF Ioctl. + * Output data from the DRM_VMW_ALLOC_BO Ioctl. */ -struct drm_vmw_dmabuf_rep { +struct drm_vmw_bo_rep { __u64 map_handle; __u32 handle; __u32 cur_gmr_id; __u32 cur_gmr_offset; __u32 pad64; }; +#define drm_vmw_dmabuf_rep drm_vmw_bo_rep /** - * union drm_vmw_dmabuf_arg + * union drm_vmw_alloc_bo_arg * * @req: Input data as described above. * @rep: Output data as described above. * - * Argument to the DRM_VMW_ALLOC_DMABUF Ioctl. + * Argument to the DRM_VMW_ALLOC_BO Ioctl. */ -union drm_vmw_alloc_dmabuf_arg { - struct drm_vmw_alloc_dmabuf_req req; - struct drm_vmw_dmabuf_rep rep; -}; - -/*************************************************************************/ -/** - * DRM_VMW_UNREF_DMABUF - Free a DMA buffer. - * - */ - -/** - * struct drm_vmw_unref_dmabuf_arg - * - * @handle: Handle indicating what buffer to free. Obtained from the - * DRM_VMW_ALLOC_DMABUF Ioctl. - * - * Argument to the DRM_VMW_UNREF_DMABUF Ioctl. - */ - -struct drm_vmw_unref_dmabuf_arg { - __u32 handle; - __u32 pad64; +union drm_vmw_alloc_bo_arg { + struct drm_vmw_alloc_bo_req req; + struct drm_vmw_bo_rep rep; }; +#define drm_vmw_alloc_dmabuf_arg drm_vmw_alloc_bo_arg /*************************************************************************/ /** @@ -1103,9 +1094,8 @@ union drm_vmw_extended_context_arg { * DRM_VMW_HANDLE_CLOSE - Close a user-space handle and release its * underlying resource. * - * Note that this ioctl is overlaid on the DRM_VMW_UNREF_DMABUF Ioctl. - * The ioctl arguments therefore need to be identical in layout. - * + * Note that this ioctl is overlaid on the deprecated DRM_VMW_UNREF_DMABUF + * Ioctl. */ /** @@ -1119,7 +1109,107 @@ struct drm_vmw_handle_close_arg { __u32 handle; __u32 pad64; }; +#define drm_vmw_unref_dmabuf_arg drm_vmw_handle_close_arg + +/*************************************************************************/ +/** + * DRM_VMW_GB_SURFACE_CREATE_EXT - Create a host guest-backed surface. + * + * Allocates a surface handle and queues a create surface command + * for the host on the first use of the surface. The surface ID can + * be used as the surface ID in commands referencing the surface. + * + * This new command extends DRM_VMW_GB_SURFACE_CREATE by adding version + * parameter and 64 bit svga flag. + */ + +/** + * enum drm_vmw_surface_version + * + * @drm_vmw_surface_gb_v1: Corresponds to current gb surface format with + * svga3d surface flags split into 2, upper half and lower half. + */ +enum drm_vmw_surface_version { + drm_vmw_gb_surface_v1 +}; + +/** + * struct drm_vmw_gb_surface_create_ext_req + * + * @base: Surface create parameters. + * @version: Version of surface create ioctl. 
+ * @svga3d_flags_upper_32_bits: Upper 32 bits of svga3d flags. + * @multisample_pattern: Multisampling pattern when msaa is supported. + * @quality_level: Precision settings for each sample. + * @must_be_zero: Reserved for future usage. + * + * Input argument to the DRM_VMW_GB_SURFACE_CREATE_EXT Ioctl. + * Part of output argument for the DRM_VMW_GB_SURFACE_REF_EXT Ioctl. + */ +struct drm_vmw_gb_surface_create_ext_req { + struct drm_vmw_gb_surface_create_req base; + enum drm_vmw_surface_version version; + uint32_t svga3d_flags_upper_32_bits; + SVGA3dMSPattern multisample_pattern; + SVGA3dMSQualityLevel quality_level; + uint64_t must_be_zero; +}; + +/** + * union drm_vmw_gb_surface_create_ext_arg + * + * @req: Input argument as described above. + * @rep: Output argument as described above. + * + * Argument to the DRM_VMW_GB_SURFACE_CREATE_EXT ioctl. + */ +union drm_vmw_gb_surface_create_ext_arg { + struct drm_vmw_gb_surface_create_rep rep; + struct drm_vmw_gb_surface_create_ext_req req; +}; + +/*************************************************************************/ +/** + * DRM_VMW_GB_SURFACE_REF_EXT - Reference a host surface. + * + * Puts a reference on a host surface with a given handle, as previously + * returned by the DRM_VMW_GB_SURFACE_CREATE_EXT ioctl. + * A reference will make sure the surface isn't destroyed while we hold + * it and will allow the calling client to use the surface handle in + * the command stream. + * + * On successful return, the Ioctl returns the surface information given + * to and returned from the DRM_VMW_GB_SURFACE_CREATE_EXT ioctl. + */ +/** + * struct drm_vmw_gb_surface_ref_ext_rep + * + * @creq: The data used as input when the surface was created, as described + * above at "struct drm_vmw_gb_surface_create_ext_req" + * @crep: Additional data output when the surface was created, as described + * above at "struct drm_vmw_gb_surface_create_rep" + * + * Output Argument to the DRM_VMW_GB_SURFACE_REF_EXT ioctl. + */ +struct drm_vmw_gb_surface_ref_ext_rep { + struct drm_vmw_gb_surface_create_ext_req creq; + struct drm_vmw_gb_surface_create_rep crep; +}; + +/** + * union drm_vmw_gb_surface_reference_ext_arg + * + * @req: Input data as described above at "struct drm_vmw_surface_arg" + * @rep: Output data as described above at + * "struct drm_vmw_gb_surface_ref_ext_rep" + * + * Argument to the DRM_VMW_GB_SURFACE_REF Ioctl. 
+ */ +union drm_vmw_gb_surface_reference_ext_arg { + struct drm_vmw_gb_surface_ref_ext_rep rep; + struct drm_vmw_surface_arg req; +}; #if defined(__cplusplus) } From patchwork Wed Oct 3 12:07:14 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10624729 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5C35114BD for ; Wed, 3 Oct 2018 12:07:32 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4EDD428942 for ; Wed, 3 Oct 2018 12:07:32 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 433102896A; Wed, 3 Oct 2018 12:07:32 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id D9AC828942 for ; Wed, 3 Oct 2018 12:07:31 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 1CBF56E457; Wed, 3 Oct 2018 12:07:28 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm1-x341.google.com (mail-wm1-x341.google.com [IPv6:2a00:1450:4864:20::341]) by gabe.freedesktop.org (Postfix) with ESMTPS id 366F96E44B for ; Wed, 3 Oct 2018 12:07:27 +0000 (UTC) Received: by mail-wm1-x341.google.com with SMTP id o18-v6so5395255wmc.0 for ; Wed, 03 Oct 2018 05:07:27 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=jVCouqac4HASZ43tMJzWkI+GfgbSX7lwvDUnvEvefLI=; b=ichp/mgo5HVcOropcZEmd4GPkP/lAxHS4jI2ksGx6igqtaqSjSfuh+qg2Rt+r25Rua X+YYb0JblSA8xpF4bK7esybGcW3OVw7Vw22OTz2sVPD0mHX87lV7Ty2WJqm4Z/qQm3ku k+rUIpIJBI728mLeegESo7SzqqA6TW4QR8AfV096r3B4VOwATYZxdV4MqObfZTtM4+Ed /p3PwdGDM+ODTbrOj2A+GzZo+keVMfn2k4dX98nO+QCf7qhZufvMVZleKSqEahiggCyr Earf7ZSYiJKyChK0czguBi/Ur9V067uPwUo1thg2F0+q201slUdWG91u2uMHNWd1GWvh srtA== X-Gm-Message-State: ABuFfoiOfq/ITvehlbEYqBerrdexLdHdB6Bz9DO1qq6L1JNDre4hxC+J JDjWgB59C7jERDIPNtwuMbYUgA== X-Google-Smtp-Source: ACcGV61TT8bCDetS6XtNOSYg/seGVj00jwkce4kog78kPGBsjFUxwVNoFVWnPA2B69fgxCo4SOv1mg== X-Received: by 2002:a1c:e583:: with SMTP id c125-v6mr1310152wmh.134.1538568445818; Wed, 03 Oct 2018 05:07:25 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id u191-v6sm1627707wmd.31.2018.10.03.05.07.25 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Oct 2018 05:07:25 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: igt-dev@lists.freedesktop.org Date: Wed, 3 Oct 2018 13:07:14 +0100 Message-Id: <20181003120718.6898-3-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181003120718.6898-1-tvrtko.ursulin@linux.intel.com> References: <20181003120718.6898-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [RFC i-g-t 2/6] intel-gpu-overlay: Add engine queue stats X-BeenThere: 
intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Intel-gfx@lists.freedesktop.org MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin Use new PMU engine queue stats (queued, runnable and running) and display them per engine. v2: * Compact per engine stats. (Chris Wilson) Signed-off-by: Tvrtko Ursulin --- overlay/gpu-top.c | 42 ++++++++++++++++++++++++++++++++++++++++++ overlay/gpu-top.h | 11 +++++++++++ overlay/overlay.c | 7 +++++++ 3 files changed, 60 insertions(+) diff --git a/overlay/gpu-top.c b/overlay/gpu-top.c index 61b8f62fd78c..22e9badb22c1 100644 --- a/overlay/gpu-top.c +++ b/overlay/gpu-top.c @@ -72,6 +72,18 @@ static int perf_init(struct gpu_top *gt) gt->fd) >= 0) gt->have_sema = 1; + if (perf_i915_open_group(I915_PMU_ENGINE_QUEUED(d->class, d->inst), + gt->fd) >= 0) + gt->have_queued = 1; + + if (perf_i915_open_group(I915_PMU_ENGINE_RUNNABLE(d->class, d->inst), + gt->fd) >= 0) + gt->have_runnable = 1; + + if (perf_i915_open_group(I915_PMU_ENGINE_RUNNING(d->class, d->inst), + gt->fd) >= 0) + gt->have_running = 1; + gt->ring[0].name = d->name; gt->num_rings = 1; @@ -93,6 +105,24 @@ static int perf_init(struct gpu_top *gt) gt->fd) < 0) return -1; + if (gt->have_queued && + perf_i915_open_group(I915_PMU_ENGINE_QUEUED(d->class, + d->inst), + gt->fd) < 0) + return -1; + + if (gt->have_runnable && + perf_i915_open_group(I915_PMU_ENGINE_RUNNABLE(d->class, + d->inst), + gt->fd) < 0) + return -1; + + if (gt->have_running && + perf_i915_open_group(I915_PMU_ENGINE_RUNNING(d->class, + d->inst), + gt->fd) < 0) + return -1; + gt->ring[gt->num_rings++].name = d->name; } @@ -298,6 +328,12 @@ int gpu_top_update(struct gpu_top *gt) s->wait[n] = sample[m++]; if (gt->have_sema) s->sema[n] = sample[m++]; + if (gt->have_queued) + s->queued[n] = sample[m++]; + if (gt->have_runnable) + s->runnable[n] = sample[m++]; + if (gt->have_running) + s->running[n] = sample[m++]; } if (gt->count == 1) @@ -310,6 +346,12 @@ int gpu_top_update(struct gpu_top *gt) gt->ring[n].u.u.wait = (100 * (s->wait[n] - d->wait[n]) + d_time/2) / d_time; if (gt->have_sema) gt->ring[n].u.u.sema = (100 * (s->sema[n] - d->sema[n]) + d_time/2) / d_time; + if (gt->have_queued) + gt->ring[n].queued = (double)((s->queued[n] - d->queued[n])) / I915_SAMPLE_QUEUED_DIVISOR * 1e9 / d_time; + if (gt->have_runnable) + gt->ring[n].runnable = (double)((s->runnable[n] - d->runnable[n])) / I915_SAMPLE_RUNNABLE_DIVISOR * 1e9 / d_time; + if (gt->have_running) + gt->ring[n].running = (double)((s->running[n] - d->running[n])) / I915_SAMPLE_RUNNING_DIVISOR * 1e9 / d_time; /* in case of rounding + sampling errors, fudge */ if (gt->ring[n].u.u.busy > 100) diff --git a/overlay/gpu-top.h b/overlay/gpu-top.h index d3cdd779760f..cb4310c82a94 100644 --- a/overlay/gpu-top.h +++ b/overlay/gpu-top.h @@ -36,6 +36,9 @@ struct gpu_top { int num_rings; int have_wait; int have_sema; + int have_queued; + int have_runnable; + int have_running; struct gpu_top_ring { const char *name; @@ -47,6 +50,10 @@ struct gpu_top { } u; uint32_t payload; } u; + + double queued; + double runnable; + double running; } ring[MAX_RINGS]; struct gpu_top_stat { @@ -54,7 +61,11 @@ struct gpu_top { uint64_t busy[MAX_RINGS]; uint64_t wait[MAX_RINGS]; uint64_t sema[MAX_RINGS]; + uint64_t queued[MAX_RINGS]; + uint64_t 
runnable[MAX_RINGS]; + uint64_t running[MAX_RINGS]; } stat[2]; + int count; }; diff --git a/overlay/overlay.c b/overlay/overlay.c index eae5ddfa8823..7087d5e5b24e 100644 --- a/overlay/overlay.c +++ b/overlay/overlay.c @@ -256,6 +256,13 @@ static void show_gpu_top(struct overlay_context *ctx, struct overlay_gpu_top *gt len = sprintf(txt, "%s: %3d%% busy", gt->gpu_top.ring[n].name, gt->gpu_top.ring[n].u.u.busy); + if (gt->gpu_top.have_queued && + gt->gpu_top.have_runnable && + gt->gpu_top.have_running) + len += sprintf(txt + len, " (%.2f / %.2f / %.2f)", + gt->gpu_top.ring[n].queued, + gt->gpu_top.ring[n].runnable, + gt->gpu_top.ring[n].running); if (gt->gpu_top.ring[n].u.u.wait) len += sprintf(txt + len, ", %d%% wait", gt->gpu_top.ring[n].u.u.wait); From patchwork Wed Oct 3 12:07:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10624731 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0AE1013BB for ; Wed, 3 Oct 2018 12:07:36 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F15CA28984 for ; Wed, 3 Oct 2018 12:07:35 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E637628988; Wed, 3 Oct 2018 12:07:35 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id A7AD62896A for ; Wed, 3 Oct 2018 12:07:34 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CD2566E45B; Wed, 3 Oct 2018 12:07:30 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wr1-x429.google.com (mail-wr1-x429.google.com [IPv6:2a00:1450:4864:20::429]) by gabe.freedesktop.org (Postfix) with ESMTPS id AD0786E450 for ; Wed, 3 Oct 2018 12:07:28 +0000 (UTC) Received: by mail-wr1-x429.google.com with SMTP id 61-v6so5102895wrb.6 for ; Wed, 03 Oct 2018 05:07:28 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=nRJOsq3uYf9l+6doFb34v/eok00QMa5oZpx0AXbiC0E=; b=UO0vJwvoAUmEpDQWWRDV4MBM239mLM2WIoCdl9Imim3qPC05gW4f9xXexDV/rreT3E gcubNFu0mZnW9wzF6LvSs6hekmUsTKmF+Rt59cr/oAw0u83zMNvXotA6uJz07tL31Mif DtqIapbzFgeiJUos9vtk00egPMftYthDFk4e5AT6rQHFyq0A5J0I67nfDJrIQ+TyKfrT WRu0dZin7M6IV+ZkHSGyrUnPp4CuVRByiPpYx8vQVdQanNJWPReBI8GBpYpm3PesXoL/ si248ugR6XPkuR2moOruIxJo0JgeAvpqEjZDZ6kpToRfG0lOFIDLI67N1CfjiS4ISd0a 53iQ== X-Gm-Message-State: ABuFfojl0GnlgxJ3Lku4iqLOzglND5axhCIgfO7F7WCDt9mn6eDGzTDq pdI8MGZebzgyik/4OnQba91ABa06qxM= X-Google-Smtp-Source: ACcGV62bIPMQF9+Cd9FmLIbN7P3CQuPHYOGyVxRzuV7Q4OUS2wg3md+jVbkktmHjyvn5QABBP+2/mA== X-Received: by 2002:a5d:6902:: with SMTP id t2-v6mr1025402wru.323.1538568446919; Wed, 03 Oct 2018 05:07:26 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id 
u191-v6sm1627707wmd.31.2018.10.03.05.07.25 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Oct 2018 05:07:26 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: igt-dev@lists.freedesktop.org Date: Wed, 3 Oct 2018 13:07:15 +0100 Message-Id: <20181003120718.6898-4-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181003120718.6898-1-tvrtko.ursulin@linux.intel.com> References: <20181003120718.6898-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [RFC i-g-t 3/6] intel-gpu-overlay: Show 1s, 30s and 15m GPU load X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Intel-gfx@lists.freedesktop.org MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin Show total GPU loads in the window banner. Engine load is defined as total of runnable and running requests on an engine. Total, non-normalized, load is display. In other words if N engines are busy with exactly one request, the load will be shown as N. v2: * Different flavour of load avg. (Chris Wilson) * Simplify code. (Chris Wilson) Signed-off-by: Tvrtko Ursulin --- overlay/gpu-top.c | 39 ++++++++++++++++++++++++++++++++++++++- overlay/gpu-top.h | 11 ++++++++++- overlay/overlay.c | 28 ++++++++++++++++++++++------ 3 files changed, 70 insertions(+), 8 deletions(-) diff --git a/overlay/gpu-top.c b/overlay/gpu-top.c index 22e9badb22c1..501429b86379 100644 --- a/overlay/gpu-top.c +++ b/overlay/gpu-top.c @@ -28,6 +28,7 @@ #include #include #include +#include #include #include @@ -126,6 +127,10 @@ static int perf_init(struct gpu_top *gt) gt->ring[gt->num_rings++].name = d->name; } + gt->have_load_avg = gt->have_queued && + gt->have_runnable && + gt->have_running; + return 0; } @@ -290,17 +295,32 @@ static void mmio_init(struct gpu_top *gt) } } -void gpu_top_init(struct gpu_top *gt) +void gpu_top_init(struct gpu_top *gt, unsigned int period_us) { + const double period = (double)period_us / 1e6; + const double load_period[NUM_LOADS] = { 1.0, 30.0, 900.0 }; + const char *load_names[NUM_LOADS] = { "1s", "30s", "15m" }; + unsigned int i; + memset(gt, 0, sizeof(*gt)); gt->fd = -1; + for (i = 0; i < NUM_LOADS; i++) { + gt->load_name[i] = load_names[i]; + gt->exp[i] = exp(-period / load_period[i]); + } + if (perf_init(gt) == 0) return; mmio_init(gt); } +static double update_load(double load, double exp, double val) +{ + return val + exp * (load - val); +} + int gpu_top_update(struct gpu_top *gt) { uint32_t data[1024]; @@ -313,6 +333,8 @@ int gpu_top_update(struct gpu_top *gt) struct gpu_top_stat *s = >->stat[gt->count++&1]; struct gpu_top_stat *d = >->stat[gt->count&1]; uint64_t *sample, d_time; + double gpu_qd = 0.0; + unsigned int i; int n, m; len = read(gt->fd, data, sizeof(data)); @@ -341,6 +363,8 @@ int gpu_top_update(struct gpu_top *gt) d_time = s->time - d->time; for (n = 0; n < gt->num_rings; n++) { + double qd = 0.0; + gt->ring[n].u.u.busy = (100 * (s->busy[n] - d->busy[n]) + d_time/2) / d_time; if (gt->have_wait) gt->ring[n].u.u.wait = (100 * (s->wait[n] - d->wait[n]) + d_time/2) / d_time; @@ -353,6 +377,14 @@ int gpu_top_update(struct gpu_top *gt) if (gt->have_running) gt->ring[n].running = (double)((s->running[n] - d->running[n])) / I915_SAMPLE_RUNNING_DIVISOR * 1e9 / d_time; + qd = 
gt->ring[n].runnable + gt->ring[n].running; + gpu_qd += qd; + + for (i = 0; i < NUM_LOADS; i++) + gt->ring[n].load[i] = + update_load(gt->ring[n].load[i], + gt->exp[i], qd); + /* in case of rounding + sampling errors, fudge */ if (gt->ring[n].u.u.busy > 100) gt->ring[n].u.u.busy = 100; @@ -362,6 +394,11 @@ int gpu_top_update(struct gpu_top *gt) gt->ring[n].u.u.sema = 100; } + for (i = 0; i < NUM_LOADS; i++) { + gt->load[i] = update_load(gt->load[i], gt->exp[i], + gpu_qd); + gt->norm_load[i] = gt->load[i] / gt->num_rings; + } update = 1; } else { while ((len = read(gt->fd, data, sizeof(data))) > 0) { diff --git a/overlay/gpu-top.h b/overlay/gpu-top.h index cb4310c82a94..115ce8c482c1 100644 --- a/overlay/gpu-top.h +++ b/overlay/gpu-top.h @@ -26,6 +26,7 @@ #define GPU_TOP_H #define MAX_RINGS 16 +#define NUM_LOADS 3 #include @@ -39,6 +40,12 @@ struct gpu_top { int have_queued; int have_runnable; int have_running; + int have_load_avg; + + double exp[NUM_LOADS]; + double load[NUM_LOADS]; + double norm_load[NUM_LOADS]; + const char *load_name[NUM_LOADS]; struct gpu_top_ring { const char *name; @@ -54,6 +61,8 @@ struct gpu_top { double queued; double runnable; double running; + + double load[NUM_LOADS]; } ring[MAX_RINGS]; struct gpu_top_stat { @@ -69,7 +78,7 @@ struct gpu_top { int count; }; -void gpu_top_init(struct gpu_top *gt); +void gpu_top_init(struct gpu_top *gt, unsigned int period_us); int gpu_top_update(struct gpu_top *gt); #endif /* GPU_TOP_H */ diff --git a/overlay/overlay.c b/overlay/overlay.c index 7087d5e5b24e..1a1d01db082b 100644 --- a/overlay/overlay.c +++ b/overlay/overlay.c @@ -141,7 +141,8 @@ struct overlay_context { }; static void init_gpu_top(struct overlay_context *ctx, - struct overlay_gpu_top *gt) + struct overlay_gpu_top *gt, + unsigned int period_us) { const double rgba[][4] = { { 1, 0.25, 0.25, 1 }, @@ -153,7 +154,7 @@ static void init_gpu_top(struct overlay_context *ctx, int n; cpu_top_init(>->cpu_top); - gpu_top_init(>->gpu_top); + gpu_top_init(>->gpu_top, period_us); chart_init(>->cpu, "CPU", 120); chart_set_position(>->cpu, PAD, PAD); @@ -928,13 +929,13 @@ int main(int argc, char **argv) debugfs_init(); - init_gpu_top(&ctx, &ctx.gpu_top); + sample_period = get_sample_period(&config); + + init_gpu_top(&ctx, &ctx.gpu_top, sample_period); init_gpu_perf(&ctx, &ctx.gpu_perf); init_gpu_freq(&ctx, &ctx.gpu_freq); init_gem_objects(&ctx, &ctx.gem_objects); - sample_period = get_sample_period(&config); - i = 0; while (1) { ctx.time = time(NULL); @@ -950,9 +951,24 @@ int main(int argc, char **argv) show_gem_objects(&ctx, &ctx.gem_objects); { - char buf[80]; + struct gpu_top *gt = &ctx.gpu_top.gpu_top; cairo_text_extents_t extents; + char buf[256]; + gethostname(buf, sizeof(buf)); + + if (gt->have_load_avg) { + int len = strlen(buf); + + snprintf(buf + len, sizeof(buf) - len, + "%s; %u engines; load %s %.2f, %s %.2f, %s %.2f", + buf, + gt->num_rings, + gt->load_name[0], gt->load[0], + gt->load_name[1], gt->load[1], + gt->load_name[2], gt->load[2]); + } + cairo_set_source_rgb(ctx.cr, .5, .5, .5); cairo_set_font_size(ctx.cr, PAD-2); cairo_text_extents(ctx.cr, buf, &extents); From patchwork Wed Oct 3 12:07:16 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10624735 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6C6921926 for ; Wed, 3 Oct 2018 12:07:36 +0000 
(UTC) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: igt-dev@lists.freedesktop.org Date: Wed, 3 Oct 2018 13:07:16 +0100 Message-Id: <20181003120718.6898-5-tvrtko.ursulin@linux.intel.com> In-Reply-To: <20181003120718.6898-1-tvrtko.ursulin@linux.intel.com> References: <20181003120718.6898-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [RFC i-g-t 4/6] tests/perf_pmu: Add tests for engine queued/runnable/running stats Cc: Intel-gfx@lists.freedesktop.org MIME-Version: 1.0 From: Tvrtko Ursulin Simple tests to check reported queue depths are correct. v2: * Improvements similar to ones from i915_query.c. v3: * Rebase for __igt_spin_batch_new.
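A minimal standalone sketch of the arithmetic these subtests rely on, assuming (as the calc_queued() helper added below implies) that each queued/runnable/running PMU counter advances by the instantaneous queue depth times a fixed divisor per second of wall time; SAMPLE_DIVISOR is a stand-in for I915_SAMPLE_QUEUED_DIVISOR from the updated i915 uAPI headers and its value here is illustrative only:

#include <stdint.h>
#include <stdio.h>

#define SAMPLE_DIVISOR 1024 /* placeholder for I915_SAMPLE_QUEUED_DIVISOR */

/* Mirror of calc_queued(): scale a counter delta to an average queue depth. */
static double calc_depth(uint64_t d_val, uint64_t d_ns)
{
	return (double)d_val * 1e9 / SAMPLE_DIVISOR / d_ns;
}

int main(void)
{
	const uint64_t d_ns = 500000000ULL; /* 500ms sampling window */
	/* Four requests held behind an unsignaled fence for the whole window. */
	const uint64_t d_val = 4ULL * SAMPLE_DIVISOR * d_ns / 1000000000ULL;

	printf("average queue depth = %.2f\n", calc_depth(d_val, d_ns)); /* 4.00 */
	return 0;
}

With that scaling, queuing n batches behind an unsignaled fence and sampling the counter across a fixed sleep should report an average depth of roughly n, which is what the assert_within_epsilon() checks below expect.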
Signed-off-by: Tvrtko Ursulin --- tests/perf_pmu.c | 259 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 259 insertions(+) diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c index 6ab2595b114d..6f064b605a4c 100644 --- a/tests/perf_pmu.c +++ b/tests/perf_pmu.c @@ -169,6 +169,7 @@ static unsigned int e2ring(int gem_fd, const struct intel_execution_engine2 *e) #define TEST_RUNTIME_PM (8) #define FLAG_LONG (16) #define FLAG_HANG (32) +#define TEST_CONTEXTS (64) static igt_spin_t * __spin_poll(int fd, uint32_t ctx, unsigned long flags) { @@ -959,6 +960,224 @@ multi_client(int gem_fd, const struct intel_execution_engine2 *e) assert_within_epsilon(val[1], perf_slept[1], tolerance); } +static double calc_queued(uint64_t d_val, uint64_t d_ns) +{ + return (double)d_val * 1e9 / I915_SAMPLE_QUEUED_DIVISOR / d_ns; +} + +static void +queued(int gem_fd, const struct intel_execution_engine2 *e, unsigned int flags) +{ + const unsigned long engine = e2ring(gem_fd, e); + const uint32_t bbe = MI_BATCH_BUFFER_END; + const unsigned int max_rq = 10; + double queued[max_rq + 1]; + unsigned int n, i; + uint64_t val[2]; + uint64_t ts[2]; + uint32_t bo; + int fd; + + igt_require_sw_sync(); + if (flags & TEST_CONTEXTS) + gem_require_contexts(gem_fd); + + memset(queued, 0, sizeof(queued)); + + bo = gem_create(gem_fd, 4096); + gem_write(gem_fd, bo, 4092, &bbe, sizeof(bbe)); + + fd = open_pmu(I915_PMU_ENGINE_QUEUED(e->class, e->instance)); + + for (n = 0; n <= max_rq; n++) { + IGT_CORK_FENCE(cork); + int fence = -1; + + gem_quiescent_gpu(gem_fd); + + if (n) + fence = igt_cork_plug(&cork, -1); + + for (i = 0; i < n; i++) { + struct drm_i915_gem_exec_object2 obj = { }; + struct drm_i915_gem_execbuffer2 eb = { }; + + obj.handle = bo; + + eb.buffer_count = 1; + eb.buffers_ptr = to_user_pointer(&obj); + + eb.flags = engine | I915_EXEC_FENCE_IN; + if (flags & TEST_CONTEXTS) + eb.rsvd1 = gem_context_create(gem_fd); + eb.rsvd2 = fence; + + gem_execbuf(gem_fd, &eb); + + if (flags & TEST_CONTEXTS) + gem_context_destroy(gem_fd, eb.rsvd1); + } + + val[0] = __pmu_read_single(fd, &ts[0]); + usleep(batch_duration_ns / 1000); + val[1] = __pmu_read_single(fd, &ts[1]); + + queued[n] = calc_queued(val[1] - val[0], ts[1] - ts[0]); + igt_info("n=%u queued=%.2f\n", n, queued[n]); + + if (fence >= 0) + igt_cork_unplug(&cork); + + for (i = 0; i < n; i++) + gem_sync(gem_fd, bo); + } + + close(fd); + + gem_close(gem_fd, bo); + + for (i = 0; i <= max_rq; i++) + assert_within_epsilon(queued[i], i, tolerance); +} + +static unsigned long __query_wait(igt_spin_t *spin, unsigned int n) +{ + struct timespec ts = { }; + unsigned long t; + + igt_nsec_elapsed(&ts); + + if (spin->running) { + igt_spin_busywait_until_running(spin); + } else { + igt_debug("__spin_wait - usleep mode\n"); + usleep(500e3); /* Better than nothing! */ + } + + t = igt_nsec_elapsed(&ts); + + return spin->running ? 
t : 500e6 / n; +} + +static void +runnable(int gem_fd, const struct intel_execution_engine2 *e) +{ + const unsigned long engine = e2ring(gem_fd, e); + bool contexts = gem_has_contexts(gem_fd); + const unsigned int max_rq = 10; + igt_spin_t *spin[max_rq + 1]; + double runnable[max_rq + 1]; + uint32_t ctx[max_rq]; + unsigned int n, i; + uint64_t val[2]; + uint64_t ts[2]; + int fd; + + memset(runnable, 0, sizeof(runnable)); + + if (contexts) { + for (i = 0; i < max_rq; i++) + ctx[i] = gem_context_create(gem_fd); + } + + fd = open_pmu(I915_PMU_ENGINE_RUNNABLE(e->class, e->instance)); + + for (n = 0; n <= max_rq; n++) { + gem_quiescent_gpu(gem_fd); + + for (i = 0; i < n; i++) { + uint32_t ctx_ = contexts ? ctx[i] : 0; + + if (i == 0) + spin[i] = __spin_poll(gem_fd, ctx_, engine); + else + spin[i] = __igt_spin_batch_new(gem_fd, + .ctx = ctx_, + .engine = engine); + } + + if (n) + usleep(__query_wait(spin[0], n) * n); + + val[0] = __pmu_read_single(fd, &ts[0]); + usleep(batch_duration_ns / 1000); + val[1] = __pmu_read_single(fd, &ts[1]); + + runnable[n] = calc_queued(val[1] - val[0], ts[1] - ts[0]); + igt_info("n=%u runnable=%.2f\n", n, runnable[n]); + + for (i = 0; i < n; i++) { + end_spin(gem_fd, spin[i], FLAG_SYNC); + igt_spin_batch_free(gem_fd, spin[i]); + } + } + + if (contexts) { + for (i = 0; i < max_rq; i++) + gem_context_destroy(gem_fd, ctx[i]); + } + + close(fd); + + assert_within_epsilon(runnable[0], 0, tolerance); + igt_assert(runnable[max_rq] > 0.0); + + if (contexts) + assert_within_epsilon(runnable[max_rq] - runnable[max_rq - 1], + 1, tolerance); +} + +static void +running(int gem_fd, const struct intel_execution_engine2 *e) +{ + const unsigned long engine = e2ring(gem_fd, e); + const unsigned int max_rq = 10; + igt_spin_t *spin[max_rq + 1]; + double running[max_rq + 1]; + unsigned int n, i; + uint64_t val[2]; + uint64_t ts[2]; + int fd; + + memset(running, 0, sizeof(running)); + memset(spin, 0, sizeof(spin)); + + fd = open_pmu(I915_PMU_ENGINE_RUNNING(e->class, e->instance)); + + for (n = 0; n <= max_rq; n++) { + gem_quiescent_gpu(gem_fd); + + for (i = 0; i < n; i++) { + if (i == 0) + spin[i] = __spin_poll(gem_fd, 0, engine); + else + spin[i] = __igt_spin_batch_new(gem_fd, + .engine = engine); + } + + if (n) + usleep(__query_wait(spin[0], n) * n); + + val[0] = __pmu_read_single(fd, &ts[0]); + usleep(batch_duration_ns / 1000); + val[1] = __pmu_read_single(fd, &ts[1]); + + running[n] = calc_queued(val[1] - val[0], ts[1] - ts[0]); + igt_info("n=%u running=%.2f\n", n, running[n]); + + for (i = 0; i < n; i++) { + end_spin(gem_fd, spin[i], FLAG_SYNC); + igt_spin_batch_free(gem_fd, spin[i]); + } + } + + close(fd); + + assert_within_epsilon(running[0], 0, tolerance); + for (i = 1; i <= max_rq; i++) + igt_assert(running[i] > 0); +} + /** * Tests that i915 PMU corectly errors out in invalid initialization. * i915 PMU is uncore PMU, thus: @@ -1709,6 +1928,15 @@ igt_main igt_subtest_f("init-sema-%s", e->name) init(fd, e, I915_SAMPLE_SEMA); + igt_subtest_f("init-queued-%s", e->name) + init(fd, e, I915_SAMPLE_QUEUED); + + igt_subtest_f("init-runnable-%s", e->name) + init(fd, e, I915_SAMPLE_RUNNABLE); + + igt_subtest_f("init-running-%s", e->name) + init(fd, e, I915_SAMPLE_RUNNING); + igt_subtest_group { igt_fixture { gem_require_engine(fd, e->class, e->instance); @@ -1814,6 +2042,27 @@ igt_main igt_subtest_f("busy-hang-%s", e->name) single(fd, e, TEST_BUSY | FLAG_HANG); + + /** + * Test that queued metric works. 
+ */ + igt_subtest_f("queued-%s", e->name) + queued(fd, e, 0); + + igt_subtest_f("queued-contexts-%s", e->name) + queued(fd, e, TEST_CONTEXTS); + + /** + * Test that runnable metric works. + */ + igt_subtest_f("runnable-%s", e->name) + runnable(fd, e); + + /** + * Test that running metric works. + */ + igt_subtest_f("running-%s", e->name) + running(fd, e); } /** @@ -1906,6 +2155,16 @@ igt_main e->name) single(render_fd, e, TEST_BUSY | TEST_TRAILING_IDLE); + igt_subtest_f("render-node-queued-%s", e->name) + queued(render_fd, e, 0); + igt_subtest_f("render-node-queued-contexts-%s", + e->name) + queued(render_fd, e, TEST_CONTEXTS); + igt_subtest_f("render-node-runnable-%s", + e->name) + runnable(render_fd, e); + igt_subtest_f("render-node-running-%s", e->name) + running(render_fd, e); } } From patchwork Wed Oct 3 12:07:17 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10624739 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 97D0013BB for ; Wed, 3 Oct 2018 12:07:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 893AF28942 for ; Wed, 3 Oct 2018 12:07:38 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7DC1F2896A; Wed, 3 Oct 2018 12:07:38 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id DE66C28942 for ; Wed, 3 Oct 2018 12:07:37 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0B63F6E47D; Wed, 3 Oct 2018 12:07:36 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wr1-x442.google.com (mail-wr1-x442.google.com [IPv6:2a00:1450:4864:20::442]) by gabe.freedesktop.org (Postfix) with ESMTPS id ACC516E451 for ; Wed, 3 Oct 2018 12:07:30 +0000 (UTC) Received: by mail-wr1-x442.google.com with SMTP id z4-v6so5845843wrb.1 for ; Wed, 03 Oct 2018 05:07:30 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=T3nZ41zHoAWtmAdmKIADLMlZZDld4r4d4eDYA6qRIkc=; b=qvordAo1m4SbtJHYg6O9Y+ahiXxNKItxHjeUWd7eAD9DYlF4ZLgMtFZzwmFbd2QRXd Jvk89Nd2g34EA01erEh3iLmmiVwHMqAT/gTI1Rbju7Q7szunkVdeVE2z6GHZyFXk1OVh c/9IkW5cyhdkcgotU51HiIjl8BC/SyoePQC9aKkhGjYip/5+ZgfvDqTKcIqA7px0xxYf W6oJzqAej52Rkek86mnRo7cCM10yAoVip9vS9pJAz7E3zarL23RVbnKJI6Dppg5ilA9y VMEmeaeypmrIxjD8rO4E1tKnWAiWhdwnGYQAjXTGHMeJWNu6ZNBgJ58iOrEc/ZLb0kTp aLmA== X-Gm-Message-State: ABuFfohPHsytv5sFacK8/dJPGL7jl1+scJ7T9OeK6c9a7V10bOSTbpBp EKpwfmJJhHPWGrraMosf/UQ0CA9Kq6o= X-Google-Smtp-Source: ACcGV62tvZ/ETETzWvZQ3DaKDRqo7wqCwOQY9i+GeCPepXY2d9HzJk6eh1/kYomlxSIqgW5Hgyd/lA== X-Received: by 2002:adf:e348:: with SMTP id n8-v6mr1029019wrj.158.1538568449099; Wed, 03 Oct 2018 05:07:29 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) 
by smtp.gmail.com with ESMTPSA id u191-v6sm1627707wmd.31.2018.10.03.05.07.27 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Oct 2018 05:07:28 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: igt-dev@lists.freedesktop.org Date: Wed, 3 Oct 2018 13:07:17 +0100 Message-Id: <20181003120718.6898-6-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181003120718.6898-1-tvrtko.ursulin@linux.intel.com> References: <20181003120718.6898-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [RFC i-g-t 5/6] intel-gpu-top: Add queue depths and load average X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Intel-gfx@lists.freedesktop.org MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin With the driver now exporting various request queue depths for each engine, we can display this information here. Both as raw counters, and by adding a load average like metrics composed from number of runnable and running requests in a given time period. 1s, 30s and 5m periods are used for load average. Signed-off-by: Tvrtko Ursulin --- tools/Makefile.am | 2 +- tools/intel_gpu_top.c | 122 +++++++++++++++++++++++++++++++++++++----- tools/meson.build | 2 +- 3 files changed, 110 insertions(+), 16 deletions(-) diff --git a/tools/Makefile.am b/tools/Makefile.am index e7de4d90241c..e03842afce8d 100644 --- a/tools/Makefile.am +++ b/tools/Makefile.am @@ -29,7 +29,7 @@ intel_aubdump_la_LDFLAGS = -module -avoid-version -no-undefined intel_aubdump_la_SOURCES = aubdump.c intel_aubdump_la_LIBADD = $(top_builddir)/lib/libintel_tools.la -ldl -intel_gpu_top_LDADD = $(top_builddir)/lib/libigt_perf.la +intel_gpu_top_LDADD = $(top_builddir)/lib/libigt_perf.la -lm bin_SCRIPTS = intel_aubdump CLEANFILES = $(bin_SCRIPTS) diff --git a/tools/intel_gpu_top.c b/tools/intel_gpu_top.c index b923c3cfbe97..8990ef17b771 100644 --- a/tools/intel_gpu_top.c +++ b/tools/intel_gpu_top.c @@ -56,6 +56,10 @@ struct pmu_counter { struct pmu_pair val; }; +#define NUM_QUEUES (3) + +#define NUM_LOADS (3) + struct engine { const char *name; const char *display_name; @@ -65,9 +69,13 @@ struct engine { unsigned int num_counters; + double qd[3]; + double load_avg[NUM_LOADS]; + struct pmu_counter busy; struct pmu_counter wait; struct pmu_counter sema; + struct pmu_counter queue[NUM_QUEUES]; }; struct engines { @@ -95,6 +103,11 @@ struct engines { struct pmu_counter imc_reads; struct pmu_counter imc_writes; + double qd_scale; + + double load_exp[NUM_LOADS]; + double load_avg[NUM_LOADS]; + struct engine engine; }; @@ -320,6 +333,13 @@ static double filename_to_double(const char *filename) return v; } +#define I915_EVENT "/sys/devices/i915/events/" + +static double i915_qd_scale(void) +{ + return filename_to_double(I915_EVENT "rcs0-queued.scale"); +} + #define RAPL_ROOT "/sys/devices/power/" #define RAPL_EVENT "/sys/devices/power/events/" @@ -453,6 +473,8 @@ static int pmu_init(struct engines *engines) engines->rc6.config = I915_PMU_RC6_RESIDENCY; _open_pmu(engines->num_counters, &engines->rc6, engines->fd); + engines->qd_scale = i915_qd_scale(); + for (i = 0; i < engines->num_engines; i++) { struct engine *engine = engine_ptr(engines, i); struct { @@ -462,6 +484,9 @@ static int pmu_init(struct engines *engines) { 
.pmu = &engine->busy, .counter = "busy" }, { .pmu = &engine->wait, .counter = "wait" }, { .pmu = &engine->sema, .counter = "sema" }, + { .pmu = &engine->queue[0], .counter = "queued" }, + { .pmu = &engine->queue[1], .counter = "runnable" }, + { .pmu = &engine->queue[2], .counter = "running" }, { .pmu = NULL, .counter = NULL }, }; @@ -576,12 +601,11 @@ static void fill_str(char *buf, unsigned int bufsz, char c, unsigned int num) *buf = 0; } -static void pmu_calc(struct pmu_counter *cnt, - char *buf, unsigned int bufsz, - unsigned int width, unsigned width_dec, - double d, double t, double s) +static void _pmu_calc(struct pmu_counter *cnt, + char *buf, unsigned int bufsz, + unsigned int width, unsigned width_dec, + double val) { - double val; int len; assert(bufsz >= (width + width_dec + 1)); @@ -591,8 +615,6 @@ static void pmu_calc(struct pmu_counter *cnt, return; } - val = __pmu_calc(&cnt->val, d, t, s); - len = snprintf(buf, bufsz, "%*.*f", width + width_dec, width_dec, val); if (len < 0 || len == bufsz) { fill_str(buf, bufsz, 'X', width + width_dec); @@ -600,6 +622,16 @@ static void pmu_calc(struct pmu_counter *cnt, } } +static void pmu_calc(struct pmu_counter *cnt, + char *buf, unsigned int bufsz, + unsigned int width, unsigned width_dec, + double d, double t, double s) +{ + double val = __pmu_calc(&cnt->val, d, t, s); + + _pmu_calc(cnt, buf, bufsz, width, width_dec, val); +} + static uint64_t __pmu_read_single(int fd, uint64_t *ts) { uint64_t data[2] = { }; @@ -658,10 +690,14 @@ static void pmu_sample(struct engines *engines) for (i = 0; i < engines->num_engines; i++) { struct engine *engine = engine_ptr(engines, i); + unsigned int j; update_sample(&engine->busy, val); update_sample(&engine->sema, val); update_sample(&engine->wait, val); + + for (j = 0; j < NUM_QUEUES; j++) + update_sample(&engine->queue[j], val); } } @@ -702,12 +738,19 @@ usage(const char *appname) appname, DEFAULT_PERIOD_MS); } +static double update_load(double load, double exp, double val) +{ + return val + exp * (load - val); +} + int main(int argc, char **argv) { unsigned int period_us = DEFAULT_PERIOD_MS * 1000; + const double load_period[NUM_LOADS] = { 1.0, 30.0, 900.0 }; int con_w = -1, con_h = -1; struct engines *engines; unsigned int i; + double period; int ret, ch; /* Parse options */ @@ -741,10 +784,15 @@ int main(int argc, char **argv) return 1; } + /* Load average setup. */ + period = (double)period_us / 1e6; + for (i = 0; i < NUM_LOADS; i++) + engines->load_exp[i] = exp(-period / load_period[i]); + pmu_sample(engines); for (;;) { - double t; + double t, qd = 0; #define BUFSZ 16 char freq[BUFSZ]; char fact[BUFSZ]; @@ -754,6 +802,7 @@ int main(int argc, char **argv) char reads[BUFSZ]; char writes[BUFSZ]; struct winsize ws; + unsigned int j; int lines = 0; /* Update terminal size. 
*/ @@ -778,8 +827,44 @@ int main(int argc, char **argv) pmu_calc(&engines->imc_writes, writes, BUFSZ, 6, 0, 1.0, t, engines->imc_writes_scale); + for (i = 0; i < engines->num_engines; i++) { + struct engine *engine = engine_ptr(engines, i); + + if (!engine->num_counters) + continue; + + for (j = 0; j < NUM_QUEUES; j++) { + if (!engine->queue[j].present) + continue; + + engine->qd[j] = + __pmu_calc(&engine->queue[j].val, 1, t, + engines->qd_scale); + } + + qd += engine->qd[1] + engine->qd[2]; + + for (j = 0; j < NUM_LOADS; j++) { + engine->load_avg[j] = + update_load(engine->load_avg[j], + engines->load_exp[j], + engine->qd[1] + + engine->qd[2]); + } + } + + for (j = 0; j < NUM_LOADS; j++) { + engines->load_avg[j] = + update_load(engines->load_avg[j], + engines->load_exp[j], + qd); + } + if (lines++ < con_h) - printf("intel-gpu-top - %s/%s MHz; %s%% RC6; %s %s; %s irqs/s\n", + printf("intel-gpu-top - load avg %5.2f, %5.2f, %5.2f; %s/%s MHz; %s%% RC6; %s %s; %s irqs/s\n", + engines->load_avg[0], + engines->load_avg[1], + engines->load_avg[2], fact, freq, rc6, power, engines->rapl_unit, irq); if (lines++ < con_h) @@ -803,7 +888,7 @@ int main(int argc, char **argv) if (engine->num_counters && lines < con_h) { const char *a = " ENGINE BUSY "; - const char *b = " MI_SEMA MI_WAIT"; + const char *b = "Q r R MI_SEMA MI_WAIT"; printf("\033[7m%s%*s%s\033[0m\n", a, @@ -817,6 +902,7 @@ int main(int argc, char **argv) for (i = 0; i < engines->num_engines && lines < con_h; i++) { struct engine *engine = engine_ptr(engines, i); unsigned int max_w = con_w - 1; + char qdbuf[NUM_LOADS][BUFSZ]; unsigned int len; char sema[BUFSZ]; char wait[BUFSZ]; @@ -827,14 +913,22 @@ int main(int argc, char **argv) if (!engine->num_counters) continue; + for (j = 0; j < NUM_QUEUES; j++) + _pmu_calc(&engine->queue[j], qdbuf[j], BUFSZ, + 3, 0, engine->qd[j]); + pmu_calc(&engine->sema, sema, BUFSZ, 3, 0, 1e9, t, 100); pmu_calc(&engine->wait, wait, BUFSZ, 3, 0, 1e9, t, 100); - len = snprintf(buf, sizeof(buf), " %s%% %s%%", + + len = snprintf(buf, sizeof(buf), + " %s %s %s %s%% %s%%", + qdbuf[0], qdbuf[1], qdbuf[2], sema, wait); - pmu_calc(&engine->busy, busy, BUFSZ, 6, 2, 1e9, t, - 100); - len += printf("%16s %s%% ", engine->display_name, busy); + pmu_calc(&engine->busy, busy, BUFSZ, 6, 2, 1e9, t, 100); + + len += printf("%16s %s%% ", + engine->display_name, busy); val = __pmu_calc(&engine->busy.val, 1e9, t, 100); print_percentage_bar(val, max_w - len); diff --git a/tools/meson.build b/tools/meson.build index e4517d667299..c1dd71fada79 100644 --- a/tools/meson.build +++ b/tools/meson.build @@ -97,7 +97,7 @@ shared_library('intel_aubdump', 'aubdump.c', executable('intel_gpu_top', 'intel_gpu_top.c', install : true, install_rpath : bindir_rpathdir, - dependencies : tool_deps + [ lib_igt_perf ]) + dependencies : tool_deps + [ lib_igt_perf, math ]) conf_data = configuration_data() conf_data.set('prefix', prefix) From patchwork Wed Oct 3 12:07:18 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10624737 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9BE9D14BD for ; Wed, 3 Oct 2018 12:07:37 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8E56328942 for ; Wed, 3 Oct 2018 12:07:37 +0000 (UTC) Received: by 
mail.wl.linuxfoundation.org (Postfix, from userid 486) id 827682896A; Wed, 3 Oct 2018 12:07:37 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id DA49928942 for ; Wed, 3 Oct 2018 12:07:36 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DCDAF6E478; Wed, 3 Oct 2018 12:07:33 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wr1-x42f.google.com (mail-wr1-x42f.google.com [IPv6:2a00:1450:4864:20::42f]) by gabe.freedesktop.org (Postfix) with ESMTPS id B7E2C6E473 for ; Wed, 3 Oct 2018 12:07:31 +0000 (UTC) Received: by mail-wr1-x42f.google.com with SMTP id z4-v6so5845899wrb.1 for ; Wed, 03 Oct 2018 05:07:31 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=J1Pr4c87Wv1N/yH6uyipjV5Ih5OH4atmwT6EL9hJ6T0=; b=Ths4jtp7duAUE+645Qzqa4hPXGNL3YOYaA0CH6CVRAzzCPBIqm5qiG2GAGEolKOtez yVqsCXf7EN6ahYI/Zi69v87qa8LAtFLmHA52OBkkiZAYyQ6bPn2+HFGXOnPUKnfRBwbn fJRlBzJJBxKbxL3/NaowMPa03nUqXEcKnrmkrNXqEhCv0L+Q46QzsNjEAaOQJneTjMzW EFM/iVcMtJs4VJb8cL35j/l1gIi0DKXyCJG+rTQv0UZvjze+d3JESpTT2D2PDPqcXwtq uZVGlx7oqZj2ExeseDpm4mHueHe5ZP9qARubsDKoXDmpNfE4vN3R9qWd3bsz9snolLSW iL8w== X-Gm-Message-State: ABuFfogR2lmnwyt3oyjvNj5afozlRj8ASlIfMuHZ3G2v6M48b9ZxpEUG IIqTawLx1VrfnS1lc80oHyOwpA== X-Google-Smtp-Source: ACcGV63y4E5GOOlYDvj511Dle2rgZis80krRSgoDF55BgKAElPbjGoNaT+OkIheQZB1JLKLEyfMyDA== X-Received: by 2002:adf:f1ce:: with SMTP id z14-v6mr1066245wro.214.1538568450111; Wed, 03 Oct 2018 05:07:30 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id u191-v6sm1627707wmd.31.2018.10.03.05.07.29 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Oct 2018 05:07:29 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: igt-dev@lists.freedesktop.org Date: Wed, 3 Oct 2018 13:07:18 +0100 Message-Id: <20181003120718.6898-7-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181003120718.6898-1-tvrtko.ursulin@linux.intel.com> References: <20181003120718.6898-1-tvrtko.ursulin@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [RFC i-g-t 6/6] intel-gpu-top: Support for client stats X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Intel-gfx@lists.freedesktop.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin Signed-off-by: Tvrtko Ursulin --- tools/intel_gpu_top.c | 360 ++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 351 insertions(+), 9 deletions(-) diff --git a/tools/intel_gpu_top.c b/tools/intel_gpu_top.c index 8990ef17b771..02c5a7d77614 100644 --- a/tools/intel_gpu_top.c +++ b/tools/intel_gpu_top.c @@ -20,9 +20,6 @@ * FROM, 
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER * DEALINGS IN THE SOFTWARE. * - * Authors: - * Eric Anholt - * Eugeni Dodonov */ #include @@ -41,6 +38,8 @@ #include #include #include +#include +#include #include "igt_perf.h" @@ -701,8 +700,300 @@ static void pmu_sample(struct engines *engines) } } +enum client_status { + FREE = 0, /* mbz */ + ALIVE, + PROBE +}; + +struct client { + enum client_status status; + unsigned int id; + unsigned int pid; + char name[128]; + unsigned int samples; + unsigned long total; + struct engines *engines; + unsigned long *val; + uint64_t *last; +}; + +#define SYSFS_ENABLE "/sys/class/drm/card0/clients/enable_stats" + +bool __stats_enabled; + +static int __set_stats(bool val) +{ + int fd, ret; + + fd = open(SYSFS_ENABLE, O_WRONLY); + if (fd < 0) + return -errno; + + ret = write(fd, val ? "1" : "0", 2); + if (ret < 0) + return -errno; + else if (ret < 2) + return 1; + + close(fd); + + return 0; +} + +static void __restore_stats(void) +{ + int ret; + + if (__stats_enabled) + return; + + ret = __set_stats(false); + if (ret) + fprintf(stderr, "Failed to disable per-client stats! (%d)\n", + ret); +} + +static void __restore_stats_signal(int sig) +{ + exit(0); +} + +static void enable_stats(void) +{ + int fd, ret; + + fd = open(SYSFS_ENABLE, O_RDONLY); + if (fd < 0) + return; + + close(fd); + + __stats_enabled = filename_to_u64(SYSFS_ENABLE, 10); + if (__stats_enabled) + return; + + ret = __set_stats(true); + if (!ret) { + if (atexit(__restore_stats)) + fprintf(stderr, "Failed to register exit handler!"); + + if (signal(SIGINT, __restore_stats_signal)) + fprintf(stderr, "Failed to register signal handler!"); + } else { + fprintf(stderr, "Failed to enable per-client stats! (%d)\n", + ret); + } +} + +#define SYSFS_CLIENTS "/sys/class/drm/card0/clients/" + +static struct client *clients; +static unsigned int num_clients; + +#define for_each_client(c, tmp) \ + for (tmp = num_clients, c = clients; tmp > 0; tmp--, c++) + +static uint64_t read_client_busy(unsigned int id, const char *engine) +{ + char buf[256]; + ssize_t ret; + + ret = snprintf(buf, sizeof(buf), SYSFS_CLIENTS "/%u/busy/%s", + id, engine); + assert(ret > 0 && ret < sizeof(buf)); + if (ret <= 0 || ret == sizeof(buf)) + return 0; + + return filename_to_u64(buf, 10); +} + +static struct client *find_client(enum client_status status, unsigned int id) +{ + struct client *c; + unsigned int tmp; + + for_each_client(c, tmp) { + if ((status == FREE && c->status == FREE) || + (status == c->status && c->id == id)) + return c; + } + + return NULL; +} + +static void update_client(struct client *c, unsigned int pid, char *name) +{ + uint64_t val[c->engines->num_engines]; + unsigned int i; + + if (c->pid != pid) + c->pid = pid; + + if (strncmp(c->name, name, sizeof(c->name))) + strncpy(c->name, name, sizeof(c->name)); + + for (i = 0; i < c->engines->num_engines; i++) { + struct engine *engine = engine_ptr(c->engines, i); + + val[i] = read_client_busy(c->id, engine->name); + } + + c->total = 0; + + for (i = 0; i < c->engines->num_engines; i++) { + assert(val[i] >= c->last[i]); + c->val[i] = val[i] - c->last[i]; + c->total += c->val[i]; + c->last[i] = val[i]; + } + + c->samples++; + c->status = ALIVE; +} + +static void +add_client(unsigned int id, unsigned int pid, char *name, + struct engines *engines) +{ + struct client *c; + + assert(!find_client(ALIVE, id)); + + c = find_client(FREE, 0); + if (!c) { + unsigned int idx = num_clients; + + num_clients += (num_clients + 2) / 2; + clients = realloc(clients, 
num_clients * sizeof(*c)); + assert(clients); + c = &clients[idx]; + memset(c, 0, (num_clients - idx) * sizeof(*c)); + } + + c->id = id; + c->engines = engines; + c->val = calloc(engines->num_engines, sizeof(c->val)); + c->last = calloc(engines->num_engines, sizeof(c->last)); + assert(c->val && c->last); + + update_client(c, pid, name); +} + +static void free_client(struct client *c) +{ + free(c->val); + free(c->last); + memset(c, 0, sizeof(*c)); +} + +static char *read_client_sysfs(unsigned int id, const char *field) +{ + char buf[256]; + ssize_t ret; + + ret = snprintf(buf, sizeof(buf), SYSFS_CLIENTS "/%u/%s", id, field); + assert(ret > 0 && ret < sizeof(buf)); + if (ret <= 0 || ret == sizeof(buf)) + return NULL; + + ret = filename_to_buf(buf, buf, sizeof(buf)); + assert(ret == 0); + if (ret) + return NULL; + + return strdup(buf); +} + +static void scan(struct engines *engines) +{ + struct dirent *dent; + struct client *c; + char *pid, *name; + unsigned int tmp; + unsigned int id; + DIR *d; + + for_each_client(c, tmp) { + if (c->status == ALIVE) + c->status = PROBE; + } + + d = opendir(SYSFS_CLIENTS); + if (!d) + return; + + while ((dent = readdir(d)) != NULL) { + if (dent->d_type != DT_DIR) + continue; + if (!isdigit(dent->d_name[0])) + continue; + + id = atoi(dent->d_name); + + name = read_client_sysfs(id, "name"); + assert(name); + if (!name) + continue; + + pid = read_client_sysfs(id, "pid"); + assert(pid); + if (!pid) { + free(name); + continue; + } + + c = find_client(PROBE, id); + if (c) { + update_client(c, atoi(pid), name); + continue; + } + + add_client(id, atoi(pid), name, engines); + + free(name); + free(pid); + } + + closedir(d); + + for_each_client(c, tmp) { + if (c->status == PROBE) + free_client(c); + } +} + +static int cmp(const void *_a, const void *_b) +{ + const struct client *a = _a; + const struct client *b = _b; + long tot_a = a->total; + long tot_b = b->total; + + tot_a *= a->status == ALIVE && a->samples > 1; + tot_b *= b->status == ALIVE && b->samples > 1; + + tot_b -= tot_a; + + if (!tot_b) + return (int)b->id - a->id; + + while (tot_b > INT_MAX || tot_b < INT_MIN) + tot_b /= 2; + + return tot_b; +} + static const char *bars[] = { " ", "▏", "▎", "▍", "▌", "▋", "▊", "▉", "█" }; +static void n_spaces(const unsigned int n) +{ + unsigned int i; + + for (i = 0; i < n; i++) + putchar(' '); +} + static void print_percentage_bar(double percent, int max_len) { @@ -716,8 +1007,10 @@ print_percentage_bar(double percent, int max_len) if (i) printf("%s", bars[i]); - for (i = 0; i < (max_len - 2 - (bar_len + 7) / 8); i++) - putchar(' '); + bar_len = max_len - 2 - (bar_len + 7) / 8; + if (bar_len > max_len) + bar_len = max_len; + n_spaces(bar_len); putchar('|'); } @@ -784,12 +1077,15 @@ int main(int argc, char **argv) return 1; } + enable_stats(); + /* Load average setup. */ period = (double)period_us / 1e6; for (i = 0; i < NUM_LOADS; i++) engines->load_exp[i] = exp(-period / load_period[i]); pmu_sample(engines); + scan(engines); for (;;) { double t, qd = 0; @@ -802,7 +1098,8 @@ int main(int argc, char **argv) char reads[BUFSZ]; char writes[BUFSZ]; struct winsize ws; - unsigned int j; + unsigned int len, engine_w, j; + struct client *c; int lines = 0; /* Update terminal size. 
*/ @@ -812,9 +1109,10 @@ int main(int argc, char **argv) } pmu_sample(engines); - t = (double)(engines->ts.cur - engines->ts.prev) / 1e9; + scan(engines); + qsort(clients, num_clients, sizeof(*c), cmp); - printf("\033[H\033[J"); + t = (double)(engines->ts.cur - engines->ts.prev) / 1e9; pmu_calc(&engines->freq_req, freq, BUFSZ, 4, 0, 1.0, t, 1); pmu_calc(&engines->freq_act, fact, BUFSZ, 4, 0, 1.0, t, 1); @@ -860,6 +1158,8 @@ int main(int argc, char **argv) qd); } + printf("\033[H\033[J"); + if (lines++ < con_h) printf("intel-gpu-top - load avg %5.2f, %5.2f, %5.2f; %s/%s MHz; %s%% RC6; %s %s; %s irqs/s\n", engines->load_avg[0], @@ -903,7 +1203,6 @@ int main(int argc, char **argv) struct engine *engine = engine_ptr(engines, i); unsigned int max_w = con_w - 1; char qdbuf[NUM_LOADS][BUFSZ]; - unsigned int len; char sema[BUFSZ]; char wait[BUFSZ]; char busy[BUFSZ]; @@ -941,6 +1240,49 @@ int main(int argc, char **argv) if (lines++ < con_h) printf("\n"); + if (lines++ < con_h) { + printf("\033[7m"); + len = printf("%5s%16s", "PID", "NAME"); + + engine_w = (con_w - len) / engines->num_engines; + for (i = 0; i < engines->num_engines; i++) { + struct engine *engine = engine_ptr(engines, i); + unsigned int name_len = + strlen(engine->display_name); + unsigned int pad = (engine_w - name_len) / 2; + + + n_spaces(pad); + printf("%s", engine->display_name); + n_spaces(engine_w - pad - name_len); + len += pad + name_len + (engine_w - pad - + name_len); + } + n_spaces(con_w - len); + printf("\033[0m\n"); + } + + for_each_client(c, i) { + if (lines++ > con_h) + break; + + assert(c->status != PROBE); + if (c->status != ALIVE || c->samples < 2) + break; + + printf("%5u%16s ", c->pid, c->name); + + for (j = 0; j < engines->num_engines; j++) { + double pct; + + pct = (double)c->val[j] / period_us / 1e3 * 100; + + print_percentage_bar(pct, engine_w); + } + + putchar('\n'); + } + usleep(period_us); }
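As a reading aid for the per-client rows printed above: the busy files under /sys/class/drm/card0/clients/<id>/busy/ are treated as monotonically increasing nanoseconds of engine time (that is what the pct calculation implies), so each bar is simply the busy-time delta between two scans divided by the refresh period. A small self-contained sketch, with the function name and sample numbers being illustrative rather than part of the patch:

#include <stdint.h>
#include <stdio.h>

/*
 * Busyness of one client on one engine: delta of the per-client busy
 * counter (nanoseconds) over the intel-gpu-top refresh period.
 */
static double client_busy_pct(uint64_t prev_busy_ns, uint64_t cur_busy_ns,
			      unsigned int period_us)
{
	return (double)(cur_busy_ns - prev_busy_ns) /
	       ((double)period_us * 1e3) * 100.0;
}

int main(void)
{
	/* A client that used 250ms of engine time during a 1s refresh. */
	printf("%.1f%%\n", client_busy_pct(0, 250000000ULL, 1000000)); /* 25.0% */
	return 0;
}

A client that kept an engine fully busy for the whole period would therefore show a full bar (100%) in that engine's column.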