From patchwork Thu Oct 18 15:27:59 2018
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 10647493
From: Tvrtko Ursulin
To: igt-dev@lists.freedesktop.org
Cc: Intel-gfx@lists.freedesktop.org
Date: Thu, 18 Oct 2018 16:27:59 +0100
Message-Id: <20181018152815.31816-2-tvrtko.ursulin@linux.intel.com>
In-Reply-To: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com>
References: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com>
Subject: [Intel-gfx] [PATCH i-g-t 01/17] lib: Update uapi headers

From: Tvrtko Ursulin

Sync with latest DRM uapi changes.
--- include/drm-uapi/amdgpu_drm.h | 52 +++- include/drm-uapi/drm.h | 16 ++ include/drm-uapi/drm_fourcc.h | 224 +++++++++++++++ include/drm-uapi/drm_mode.h | 26 +- include/drm-uapi/etnaviv_drm.h | 6 + include/drm-uapi/exynos_drm.h | 240 ++++++++++++++++ include/drm-uapi/i915_drm.h | 239 +++++++++++++++- include/drm-uapi/msm_drm.h | 2 + include/drm-uapi/sync_file.h | 98 ------- include/drm-uapi/tegra_drm.h | 492 ++++++++++++++++++++++++++++++++- include/drm-uapi/v3d_drm.h | 194 +++++++++++++ include/drm-uapi/vc4_drm.h | 13 +- include/drm-uapi/virtgpu_drm.h | 1 + include/drm-uapi/vmwgfx_drm.h | 166 ++++++++--- 14 files changed, 1613 insertions(+), 156 deletions(-) delete mode 100644 include/drm-uapi/sync_file.h create mode 100644 include/drm-uapi/v3d_drm.h diff --git a/include/drm-uapi/amdgpu_drm.h b/include/drm-uapi/amdgpu_drm.h index 1816bd8200d1..370e9a5536ef 100644 --- a/include/drm-uapi/amdgpu_drm.h +++ b/include/drm-uapi/amdgpu_drm.h @@ -72,12 +72,41 @@ extern "C" { #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle) #define DRM_IOCTL_AMDGPU_SCHED DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched) +/** + * DOC: memory domains + * + * %AMDGPU_GEM_DOMAIN_CPU System memory that is not GPU accessible. + * Memory in this pool could be swapped out to disk if there is pressure. + * + * %AMDGPU_GEM_DOMAIN_GTT GPU accessible system memory, mapped into the + * GPU's virtual address space via gart. Gart memory linearizes non-contiguous + * pages of system memory, allows GPU access system memory in a linezrized + * fashion. + * + * %AMDGPU_GEM_DOMAIN_VRAM Local video memory. For APUs, it is memory + * carved out by the BIOS. + * + * %AMDGPU_GEM_DOMAIN_GDS Global on-chip data storage used to share data + * across shader threads. + * + * %AMDGPU_GEM_DOMAIN_GWS Global wave sync, used to synchronize the + * execution of all the waves on a device. 
+ * + * %AMDGPU_GEM_DOMAIN_OA Ordered append, used by 3D or Compute engines + * for appending data. + */ #define AMDGPU_GEM_DOMAIN_CPU 0x1 #define AMDGPU_GEM_DOMAIN_GTT 0x2 #define AMDGPU_GEM_DOMAIN_VRAM 0x4 #define AMDGPU_GEM_DOMAIN_GDS 0x8 #define AMDGPU_GEM_DOMAIN_GWS 0x10 #define AMDGPU_GEM_DOMAIN_OA 0x20 +#define AMDGPU_GEM_DOMAIN_MASK (AMDGPU_GEM_DOMAIN_CPU | \ + AMDGPU_GEM_DOMAIN_GTT | \ + AMDGPU_GEM_DOMAIN_VRAM | \ + AMDGPU_GEM_DOMAIN_GDS | \ + AMDGPU_GEM_DOMAIN_GWS | \ + AMDGPU_GEM_DOMAIN_OA) /* Flag that CPU access will be required for the case of VRAM domain */ #define AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED (1 << 0) @@ -95,6 +124,10 @@ extern "C" { #define AMDGPU_GEM_CREATE_VM_ALWAYS_VALID (1 << 6) /* Flag that BO sharing will be explicitly synchronized */ #define AMDGPU_GEM_CREATE_EXPLICIT_SYNC (1 << 7) +/* Flag that indicates allocating MQD gart on GFX9, where the mtype + * for the second page onward should be set to NC. + */ +#define AMDGPU_GEM_CREATE_MQD_GFX9 (1 << 8) struct drm_amdgpu_gem_create_in { /** the requested memory size */ @@ -473,7 +506,8 @@ struct drm_amdgpu_gem_va { #define AMDGPU_HW_IP_UVD_ENC 5 #define AMDGPU_HW_IP_VCN_DEC 6 #define AMDGPU_HW_IP_VCN_ENC 7 -#define AMDGPU_HW_IP_NUM 8 +#define AMDGPU_HW_IP_VCN_JPEG 8 +#define AMDGPU_HW_IP_NUM 9 #define AMDGPU_HW_IP_INSTANCE_MAX_COUNT 1 @@ -482,6 +516,7 @@ struct drm_amdgpu_gem_va { #define AMDGPU_CHUNK_ID_DEPENDENCIES 0x03 #define AMDGPU_CHUNK_ID_SYNCOBJ_IN 0x04 #define AMDGPU_CHUNK_ID_SYNCOBJ_OUT 0x05 +#define AMDGPU_CHUNK_ID_BO_HANDLES 0x06 struct drm_amdgpu_cs_chunk { __u32 chunk_id; @@ -520,6 +555,10 @@ union drm_amdgpu_cs { /* Preempt flag, IB should set Pre_enb bit if PREEMPT flag detected */ #define AMDGPU_IB_FLAG_PREEMPT (1<<2) +/* The IB fence should do the L2 writeback but not invalidate any shader + * caches (L2/vL1/sL1/I$). 
*/ +#define AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE (1 << 3) + struct drm_amdgpu_cs_chunk_ib { __u32 _pad; /** AMDGPU_IB_FLAG_* */ @@ -618,6 +657,16 @@ struct drm_amdgpu_cs_chunk_data { #define AMDGPU_INFO_FW_SOS 0x0c /* Subquery id: Query PSP ASD firmware version */ #define AMDGPU_INFO_FW_ASD 0x0d + /* Subquery id: Query VCN firmware version */ + #define AMDGPU_INFO_FW_VCN 0x0e + /* Subquery id: Query GFX RLC SRLC firmware version */ + #define AMDGPU_INFO_FW_GFX_RLC_RESTORE_LIST_CNTL 0x0f + /* Subquery id: Query GFX RLC SRLG firmware version */ + #define AMDGPU_INFO_FW_GFX_RLC_RESTORE_LIST_GPM_MEM 0x10 + /* Subquery id: Query GFX RLC SRLS firmware version */ + #define AMDGPU_INFO_FW_GFX_RLC_RESTORE_LIST_SRM_MEM 0x11 + /* Subquery id: Query DMCU firmware version */ + #define AMDGPU_INFO_FW_DMCU 0x12 /* number of bytes moved for TTM migration */ #define AMDGPU_INFO_NUM_BYTES_MOVED 0x0f /* the used VRAM size */ @@ -806,6 +855,7 @@ struct drm_amdgpu_info_firmware { #define AMDGPU_VRAM_TYPE_GDDR5 5 #define AMDGPU_VRAM_TYPE_HBM 6 #define AMDGPU_VRAM_TYPE_DDR3 7 +#define AMDGPU_VRAM_TYPE_DDR4 8 struct drm_amdgpu_info_device { /** PCI Device ID */ diff --git a/include/drm-uapi/drm.h b/include/drm-uapi/drm.h index f0bd91de0cf9..85c685a2075e 100644 --- a/include/drm-uapi/drm.h +++ b/include/drm-uapi/drm.h @@ -674,6 +674,22 @@ struct drm_get_cap { */ #define DRM_CLIENT_CAP_ATOMIC 3 +/** + * DRM_CLIENT_CAP_ASPECT_RATIO + * + * If set to 1, the DRM core will provide aspect ratio information in modes. + */ +#define DRM_CLIENT_CAP_ASPECT_RATIO 4 + +/** + * DRM_CLIENT_CAP_WRITEBACK_CONNECTORS + * + * If set to 1, the DRM core will expose special connectors to be used for + * writing back to memory the scene setup in the commit. 
Depends on client + * also supporting DRM_CLIENT_CAP_ATOMIC + */ +#define DRM_CLIENT_CAP_WRITEBACK_CONNECTORS 5 + /** DRM_IOCTL_SET_CLIENT_CAP ioctl argument type */ struct drm_set_client_cap { __u64 capability; diff --git a/include/drm-uapi/drm_fourcc.h b/include/drm-uapi/drm_fourcc.h index e04613d30a13..0cd40ebfa1b1 100644 --- a/include/drm-uapi/drm_fourcc.h +++ b/include/drm-uapi/drm_fourcc.h @@ -30,11 +30,50 @@ extern "C" { #endif +/** + * DOC: overview + * + * In the DRM subsystem, framebuffer pixel formats are described using the + * fourcc codes defined in `include/uapi/drm/drm_fourcc.h`. In addition to the + * fourcc code, a Format Modifier may optionally be provided, in order to + * further describe the buffer's format - for example tiling or compression. + * + * Format Modifiers + * ---------------- + * + * Format modifiers are used in conjunction with a fourcc code, forming a + * unique fourcc:modifier pair. This format:modifier pair must fully define the + * format and data layout of the buffer, and should be the only way to describe + * that particular buffer. + * + * Having multiple fourcc:modifier pairs which describe the same layout should + * be avoided, as such aliases run the risk of different drivers exposing + * different names for the same data format, forcing userspace to understand + * that they are aliases. + * + * Format modifiers may change any property of the buffer, including the number + * of planes and/or the required allocation size. Format modifiers are + * vendor-namespaced, and as such the relationship between a fourcc code and a + * modifier is specific to the modifer being used. For example, some modifiers + * may preserve meaning - such as number of planes - from the fourcc code, + * whereas others may not. + * + * Vendors should document their modifier usage in as much detail as + * possible, to ensure maximum compatibility across devices, drivers and + * applications. 
+ * + * The authoritative list of format modifier codes is found in + * `include/uapi/drm/drm_fourcc.h` + */ + #define fourcc_code(a, b, c, d) ((__u32)(a) | ((__u32)(b) << 8) | \ ((__u32)(c) << 16) | ((__u32)(d) << 24)) #define DRM_FORMAT_BIG_ENDIAN (1<<31) /* format is big endian instead of little endian */ +/* Reserve 0 for the invalid format specifier */ +#define DRM_FORMAT_INVALID 0 + /* color index */ #define DRM_FORMAT_C8 fourcc_code('C', '8', ' ', ' ') /* [7:0] C */ @@ -183,6 +222,7 @@ extern "C" { #define DRM_FORMAT_MOD_VENDOR_QCOM 0x05 #define DRM_FORMAT_MOD_VENDOR_VIVANTE 0x06 #define DRM_FORMAT_MOD_VENDOR_BROADCOM 0x07 +#define DRM_FORMAT_MOD_VENDOR_ARM 0x08 /* add more to the end as needed */ #define DRM_FORMAT_RESERVED ((1ULL << 56) - 1) @@ -298,6 +338,28 @@ extern "C" { */ #define DRM_FORMAT_MOD_SAMSUNG_64_32_TILE fourcc_mod_code(SAMSUNG, 1) +/* + * Tiled, 16 (pixels) x 16 (lines) - sized macroblocks + * + * This is a simple tiled layout using tiles of 16x16 pixels in a row-major + * layout. For YCbCr formats Cb/Cr components are taken in such a way that + * they correspond to their 16x16 luma block. + */ +#define DRM_FORMAT_MOD_SAMSUNG_16_16_TILE fourcc_mod_code(SAMSUNG, 2) + +/* + * Qualcomm Compressed Format + * + * Refers to a compressed variant of the base format that is compressed. + * Implementation may be platform and base-format specific. + * + * Each macrotile consists of m x n (mostly 4 x 4) tiles. + * Pixel data pitch/stride is aligned with macrotile width. + * Pixel data height is aligned with macrotile height. + * Entire pixel data buffer is aligned with 4k(bytes). + */ +#define DRM_FORMAT_MOD_QCOM_COMPRESSED fourcc_mod_code(QCOM, 1) + /* Vivante framebuffer modifiers */ /* @@ -384,6 +446,23 @@ extern "C" { #define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_THIRTYTWO_GOB \ fourcc_mod_code(NVIDIA, 0x15) +/* + * Some Broadcom modifiers take parameters, for example the number of + * vertical lines in the image. 
Reserve the lower 32 bits for modifier + * type, and the next 24 bits for parameters. Top 8 bits are the + * vendor code. + */ +#define __fourcc_mod_broadcom_param_shift 8 +#define __fourcc_mod_broadcom_param_bits 48 +#define fourcc_mod_broadcom_code(val, params) \ + fourcc_mod_code(BROADCOM, ((((__u64)params) << __fourcc_mod_broadcom_param_shift) | val)) +#define fourcc_mod_broadcom_param(m) \ + ((int)(((m) >> __fourcc_mod_broadcom_param_shift) & \ + ((1ULL << __fourcc_mod_broadcom_param_bits) - 1))) +#define fourcc_mod_broadcom_mod(m) \ + ((m) & ~(((1ULL << __fourcc_mod_broadcom_param_bits) - 1) << \ + __fourcc_mod_broadcom_param_shift)) + /* * Broadcom VC4 "T" format * @@ -405,6 +484,151 @@ extern "C" { */ #define DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED fourcc_mod_code(BROADCOM, 1) +/* + * Broadcom SAND format + * + * This is the native format that the H.264 codec block uses. For VC4 + * HVS, it is only valid for H.264 (NV12/21) and RGBA modes. + * + * The image can be considered to be split into columns, and the + * columns are placed consecutively into memory. The width of those + * columns can be either 32, 64, 128, or 256 pixels, but in practice + * only 128 pixel columns are used. + * + * The pitch between the start of each column is set to optimally + * switch between SDRAM banks. This is passed as the number of lines + * of column width in the modifier (we can't use the stride value due + * to various core checks that look at it , so you should set the + * stride to width*cpp). + * + * Note that the column height for this format modifier is the same + * for all of the planes, assuming that each column contains both Y + * and UV. Some SAND-using hardware stores UV in a separate tiled + * image from Y to reduce the column height, which is not supported + * with these modifiers. 
+ */ + +#define DRM_FORMAT_MOD_BROADCOM_SAND32_COL_HEIGHT(v) \ + fourcc_mod_broadcom_code(2, v) +#define DRM_FORMAT_MOD_BROADCOM_SAND64_COL_HEIGHT(v) \ + fourcc_mod_broadcom_code(3, v) +#define DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT(v) \ + fourcc_mod_broadcom_code(4, v) +#define DRM_FORMAT_MOD_BROADCOM_SAND256_COL_HEIGHT(v) \ + fourcc_mod_broadcom_code(5, v) + +#define DRM_FORMAT_MOD_BROADCOM_SAND32 \ + DRM_FORMAT_MOD_BROADCOM_SAND32_COL_HEIGHT(0) +#define DRM_FORMAT_MOD_BROADCOM_SAND64 \ + DRM_FORMAT_MOD_BROADCOM_SAND64_COL_HEIGHT(0) +#define DRM_FORMAT_MOD_BROADCOM_SAND128 \ + DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT(0) +#define DRM_FORMAT_MOD_BROADCOM_SAND256 \ + DRM_FORMAT_MOD_BROADCOM_SAND256_COL_HEIGHT(0) + +/* Broadcom UIF format + * + * This is the common format for the current Broadcom multimedia + * blocks, including V3D 3.x and newer, newer video codecs, and + * displays. + * + * The image consists of utiles (64b blocks), UIF blocks (2x2 utiles), + * and macroblocks (4x4 UIF blocks). Those 4x4 UIF block groups are + * stored in columns, with padding between the columns to ensure that + * moving from one column to the next doesn't hit the same SDRAM page + * bank. + * + * To calculate the padding, it is assumed that each hardware block + * and the software driving it knows the platform's SDRAM page size, + * number of banks, and XOR address, and that it's identical between + * all blocks using the format. This tiling modifier will use XOR as + * necessary to reduce the padding. If a hardware block can't do XOR, + * the assumption is that a no-XOR tiling modifier will be created. + */ +#define DRM_FORMAT_MOD_BROADCOM_UIF fourcc_mod_code(BROADCOM, 6) + +/* + * Arm Framebuffer Compression (AFBC) modifiers + * + * AFBC is a proprietary lossless image compression protocol and format. + * It provides fine-grained random access and minimizes the amount of data + * transferred between IP blocks. 
+ * + * AFBC has several features which may be supported and/or used, which are + * represented using bits in the modifier. Not all combinations are valid, + * and different devices or use-cases may support different combinations. + */ +#define DRM_FORMAT_MOD_ARM_AFBC(__afbc_mode) fourcc_mod_code(ARM, __afbc_mode) + +/* + * AFBC superblock size + * + * Indicates the superblock size(s) used for the AFBC buffer. The buffer + * size (in pixels) must be aligned to a multiple of the superblock size. + * Four lowest significant bits(LSBs) are reserved for block size. + */ +#define AFBC_FORMAT_MOD_BLOCK_SIZE_MASK 0xf +#define AFBC_FORMAT_MOD_BLOCK_SIZE_16x16 (1ULL) +#define AFBC_FORMAT_MOD_BLOCK_SIZE_32x8 (2ULL) + +/* + * AFBC lossless colorspace transform + * + * Indicates that the buffer makes use of the AFBC lossless colorspace + * transform. + */ +#define AFBC_FORMAT_MOD_YTR (1ULL << 4) + +/* + * AFBC block-split + * + * Indicates that the payload of each superblock is split. The second + * half of the payload is positioned at a predefined offset from the start + * of the superblock payload. + */ +#define AFBC_FORMAT_MOD_SPLIT (1ULL << 5) + +/* + * AFBC sparse layout + * + * This flag indicates that the payload of each superblock must be stored at a + * predefined position relative to the other superblocks in the same AFBC + * buffer. This order is the same order used by the header buffer. In this mode + * each superblock is given the same amount of space as an uncompressed + * superblock of the particular format would require, rounding up to the next + * multiple of 128 bytes in size. + */ +#define AFBC_FORMAT_MOD_SPARSE (1ULL << 6) + +/* + * AFBC copy-block restrict + * + * Buffers with this flag must obey the copy-block restriction. The restriction + * is such that there are no copy-blocks referring across the border of 8x8 + * blocks. For the subsampled data the 8x8 limitation is also subsampled. 
+ */ +#define AFBC_FORMAT_MOD_CBR (1ULL << 7) + +/* + * AFBC tiled layout + * + * The tiled layout groups superblocks in 8x8 or 4x4 tiles, where all + * superblocks inside a tile are stored together in memory. 8x8 tiles are used + * for pixel formats up to and including 32 bpp while 4x4 tiles are used for + * larger bpp formats. The order between the tiles is scan line. + * When the tiled layout is used, the buffer size (in pixels) must be aligned + * to the tile size. + */ +#define AFBC_FORMAT_MOD_TILED (1ULL << 8) + +/* + * AFBC solid color blocks + * + * Indicates that the buffer makes use of solid-color blocks, whereby bandwidth + * can be reduced if a whole superblock is a single color. + */ +#define AFBC_FORMAT_MOD_SC (1ULL << 9) + #if defined(__cplusplus) } #endif diff --git a/include/drm-uapi/drm_mode.h b/include/drm-uapi/drm_mode.h index 2c575794fb52..d3e0fe31efc5 100644 --- a/include/drm-uapi/drm_mode.h +++ b/include/drm-uapi/drm_mode.h @@ -93,6 +93,15 @@ extern "C" { #define DRM_MODE_PICTURE_ASPECT_NONE 0 #define DRM_MODE_PICTURE_ASPECT_4_3 1 #define DRM_MODE_PICTURE_ASPECT_16_9 2 +#define DRM_MODE_PICTURE_ASPECT_64_27 3 +#define DRM_MODE_PICTURE_ASPECT_256_135 4 + +/* Content type options */ +#define DRM_MODE_CONTENT_TYPE_NO_DATA 0 +#define DRM_MODE_CONTENT_TYPE_GRAPHICS 1 +#define DRM_MODE_CONTENT_TYPE_PHOTO 2 +#define DRM_MODE_CONTENT_TYPE_CINEMA 3 +#define DRM_MODE_CONTENT_TYPE_GAME 4 /* Aspect ratio flag bitmask (4 bits 22:19) */ #define DRM_MODE_FLAG_PIC_AR_MASK (0x0F<<19) @@ -102,6 +111,10 @@ extern "C" { (DRM_MODE_PICTURE_ASPECT_4_3<<19) #define DRM_MODE_FLAG_PIC_AR_16_9 \ (DRM_MODE_PICTURE_ASPECT_16_9<<19) +#define DRM_MODE_FLAG_PIC_AR_64_27 \ + (DRM_MODE_PICTURE_ASPECT_64_27<<19) +#define DRM_MODE_FLAG_PIC_AR_256_135 \ + (DRM_MODE_PICTURE_ASPECT_256_135<<19) #define DRM_MODE_FLAG_ALL (DRM_MODE_FLAG_PHSYNC | \ DRM_MODE_FLAG_NHSYNC | \ @@ -173,8 +186,9 @@ extern "C" { /* * DRM_MODE_REFLECT_ * - * Signals that the contents of a drm plane is 
reflected in the axis, + * Signals that the contents of a drm plane is reflected along the axis, * in the same way as mirroring. + * See kerneldoc chapter "Plane Composition Properties" for more details. * * This define is provided as a convenience, looking up the property id * using the name->prop id lookup is the preferred method. @@ -338,6 +352,7 @@ enum drm_mode_subconnector { #define DRM_MODE_CONNECTOR_VIRTUAL 15 #define DRM_MODE_CONNECTOR_DSI 16 #define DRM_MODE_CONNECTOR_DPI 17 +#define DRM_MODE_CONNECTOR_WRITEBACK 18 struct drm_mode_get_connector { @@ -363,7 +378,7 @@ struct drm_mode_get_connector { __u32 pad; }; -#define DRM_MODE_PROP_PENDING (1<<0) +#define DRM_MODE_PROP_PENDING (1<<0) /* deprecated, do not use */ #define DRM_MODE_PROP_RANGE (1<<1) #define DRM_MODE_PROP_IMMUTABLE (1<<2) #define DRM_MODE_PROP_ENUM (1<<3) /* enumerated type with text strings */ @@ -598,8 +613,11 @@ struct drm_mode_crtc_lut { }; struct drm_color_ctm { - /* Conversion matrix in S31.32 format. */ - __s64 matrix[9]; + /* + * Conversion matrix in S31.32 sign-magnitude + * (not two's complement!) format. 
+ */ + __u64 matrix[9]; }; struct drm_color_lut { diff --git a/include/drm-uapi/etnaviv_drm.h b/include/drm-uapi/etnaviv_drm.h index e9b997a0ef27..0d5c49dc478c 100644 --- a/include/drm-uapi/etnaviv_drm.h +++ b/include/drm-uapi/etnaviv_drm.h @@ -55,6 +55,12 @@ struct drm_etnaviv_timespec { #define ETNAVIV_PARAM_GPU_FEATURES_4 0x07 #define ETNAVIV_PARAM_GPU_FEATURES_5 0x08 #define ETNAVIV_PARAM_GPU_FEATURES_6 0x09 +#define ETNAVIV_PARAM_GPU_FEATURES_7 0x0a +#define ETNAVIV_PARAM_GPU_FEATURES_8 0x0b +#define ETNAVIV_PARAM_GPU_FEATURES_9 0x0c +#define ETNAVIV_PARAM_GPU_FEATURES_10 0x0d +#define ETNAVIV_PARAM_GPU_FEATURES_11 0x0e +#define ETNAVIV_PARAM_GPU_FEATURES_12 0x0f #define ETNAVIV_PARAM_GPU_STREAM_COUNT 0x10 #define ETNAVIV_PARAM_GPU_REGISTER_MAX 0x11 diff --git a/include/drm-uapi/exynos_drm.h b/include/drm-uapi/exynos_drm.h index a00116b5cc5c..7414cfd76419 100644 --- a/include/drm-uapi/exynos_drm.h +++ b/include/drm-uapi/exynos_drm.h @@ -135,6 +135,219 @@ struct drm_exynos_g2d_exec { __u64 async; }; +/* Exynos DRM IPP v2 API */ + +/** + * Enumerate available IPP hardware modules. + * + * @count_ipps: size of ipp_id array / number of ipp modules (set by driver) + * @reserved: padding + * @ipp_id_ptr: pointer to ipp_id array or NULL + */ +struct drm_exynos_ioctl_ipp_get_res { + __u32 count_ipps; + __u32 reserved; + __u64 ipp_id_ptr; +}; + +enum drm_exynos_ipp_format_type { + DRM_EXYNOS_IPP_FORMAT_SOURCE = 0x01, + DRM_EXYNOS_IPP_FORMAT_DESTINATION = 0x02, +}; + +struct drm_exynos_ipp_format { + __u32 fourcc; + __u32 type; + __u64 modifier; +}; + +enum drm_exynos_ipp_capability { + DRM_EXYNOS_IPP_CAP_CROP = 0x01, + DRM_EXYNOS_IPP_CAP_ROTATE = 0x02, + DRM_EXYNOS_IPP_CAP_SCALE = 0x04, + DRM_EXYNOS_IPP_CAP_CONVERT = 0x08, +}; + +/** + * Get IPP hardware capabilities and supported image formats. 
+ * + * @ipp_id: id of IPP module to query + * @capabilities: bitmask of drm_exynos_ipp_capability (set by driver) + * @reserved: padding + * @formats_count: size of formats array (in entries) / number of filled + * formats (set by driver) + * @formats_ptr: pointer to formats array or NULL + */ +struct drm_exynos_ioctl_ipp_get_caps { + __u32 ipp_id; + __u32 capabilities; + __u32 reserved; + __u32 formats_count; + __u64 formats_ptr; +}; + +enum drm_exynos_ipp_limit_type { + /* size (horizontal/vertial) limits, in pixels (min, max, alignment) */ + DRM_EXYNOS_IPP_LIMIT_TYPE_SIZE = 0x0001, + /* scale ratio (horizonta/vertial), 16.16 fixed point (min, max) */ + DRM_EXYNOS_IPP_LIMIT_TYPE_SCALE = 0x0002, + + /* image buffer area */ + DRM_EXYNOS_IPP_LIMIT_SIZE_BUFFER = 0x0001 << 16, + /* src/dst rectangle area */ + DRM_EXYNOS_IPP_LIMIT_SIZE_AREA = 0x0002 << 16, + /* src/dst rectangle area when rotation enabled */ + DRM_EXYNOS_IPP_LIMIT_SIZE_ROTATED = 0x0003 << 16, + + DRM_EXYNOS_IPP_LIMIT_TYPE_MASK = 0x000f, + DRM_EXYNOS_IPP_LIMIT_SIZE_MASK = 0x000f << 16, +}; + +struct drm_exynos_ipp_limit_val { + __u32 min; + __u32 max; + __u32 align; + __u32 reserved; +}; + +/** + * IPP module limitation. + * + * @type: limit type (see drm_exynos_ipp_limit_type enum) + * @reserved: padding + * @h: horizontal limits + * @v: vertical limits + */ +struct drm_exynos_ipp_limit { + __u32 type; + __u32 reserved; + struct drm_exynos_ipp_limit_val h; + struct drm_exynos_ipp_limit_val v; +}; + +/** + * Get IPP limits for given image format. 
+ * + * @ipp_id: id of IPP module to query + * @fourcc: image format code (see DRM_FORMAT_* in drm_fourcc.h) + * @modifier: image format modifier (see DRM_FORMAT_MOD_* in drm_fourcc.h) + * @type: source/destination identifier (drm_exynos_ipp_format_flag enum) + * @limits_count: size of limits array (in entries) / number of filled entries + * (set by driver) + * @limits_ptr: pointer to limits array or NULL + */ +struct drm_exynos_ioctl_ipp_get_limits { + __u32 ipp_id; + __u32 fourcc; + __u64 modifier; + __u32 type; + __u32 limits_count; + __u64 limits_ptr; +}; + +enum drm_exynos_ipp_task_id { + /* buffer described by struct drm_exynos_ipp_task_buffer */ + DRM_EXYNOS_IPP_TASK_BUFFER = 0x0001, + /* rectangle described by struct drm_exynos_ipp_task_rect */ + DRM_EXYNOS_IPP_TASK_RECTANGLE = 0x0002, + /* transformation described by struct drm_exynos_ipp_task_transform */ + DRM_EXYNOS_IPP_TASK_TRANSFORM = 0x0003, + /* alpha configuration described by struct drm_exynos_ipp_task_alpha */ + DRM_EXYNOS_IPP_TASK_ALPHA = 0x0004, + + /* source image data (for buffer and rectangle chunks) */ + DRM_EXYNOS_IPP_TASK_TYPE_SOURCE = 0x0001 << 16, + /* destination image data (for buffer and rectangle chunks) */ + DRM_EXYNOS_IPP_TASK_TYPE_DESTINATION = 0x0002 << 16, +}; + +/** + * Memory buffer with image data. + * + * @id: must be DRM_EXYNOS_IPP_TASK_BUFFER + * other parameters are same as for AddFB2 generic DRM ioctl + */ +struct drm_exynos_ipp_task_buffer { + __u32 id; + __u32 fourcc; + __u32 width, height; + __u32 gem_id[4]; + __u32 offset[4]; + __u32 pitch[4]; + __u64 modifier; +}; + +/** + * Rectangle for processing. + * + * @id: must be DRM_EXYNOS_IPP_TASK_RECTANGLE + * @reserved: padding + * @x,@y: left corner in pixels + * @w,@h: width/height in pixels + */ +struct drm_exynos_ipp_task_rect { + __u32 id; + __u32 reserved; + __u32 x; + __u32 y; + __u32 w; + __u32 h; +}; + +/** + * Image tranformation description. 
+ * + * @id: must be DRM_EXYNOS_IPP_TASK_TRANSFORM + * @rotation: DRM_MODE_ROTATE_* and DRM_MODE_REFLECT_* values + */ +struct drm_exynos_ipp_task_transform { + __u32 id; + __u32 rotation; +}; + +/** + * Image global alpha configuration for formats without alpha values. + * + * @id: must be DRM_EXYNOS_IPP_TASK_ALPHA + * @value: global alpha value (0-255) + */ +struct drm_exynos_ipp_task_alpha { + __u32 id; + __u32 value; +}; + +enum drm_exynos_ipp_flag { + /* generate DRM event after processing */ + DRM_EXYNOS_IPP_FLAG_EVENT = 0x01, + /* dry run, only check task parameters */ + DRM_EXYNOS_IPP_FLAG_TEST_ONLY = 0x02, + /* non-blocking processing */ + DRM_EXYNOS_IPP_FLAG_NONBLOCK = 0x04, +}; + +#define DRM_EXYNOS_IPP_FLAGS (DRM_EXYNOS_IPP_FLAG_EVENT |\ + DRM_EXYNOS_IPP_FLAG_TEST_ONLY | DRM_EXYNOS_IPP_FLAG_NONBLOCK) + +/** + * Perform image processing described by array of drm_exynos_ipp_task_* + * structures (parameters array). + * + * @ipp_id: id of IPP module to run the task + * @flags: bitmask of drm_exynos_ipp_flag values + * @reserved: padding + * @params_size: size of parameters array (in bytes) + * @params_ptr: pointer to parameters array or NULL + * @user_data: (optional) data for drm event + */ +struct drm_exynos_ioctl_ipp_commit { + __u32 ipp_id; + __u32 flags; + __u32 reserved; + __u32 params_size; + __u64 params_ptr; + __u64 user_data; +}; + #define DRM_EXYNOS_GEM_CREATE 0x00 #define DRM_EXYNOS_GEM_MAP 0x01 /* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */ @@ -147,6 +360,11 @@ struct drm_exynos_g2d_exec { #define DRM_EXYNOS_G2D_EXEC 0x22 /* Reserved 0x30 ~ 0x33 for obsolete Exynos IPP ioctls */ +/* IPP - Image Post Processing */ +#define DRM_EXYNOS_IPP_GET_RESOURCES 0x40 +#define DRM_EXYNOS_IPP_GET_CAPS 0x41 +#define DRM_EXYNOS_IPP_GET_LIMITS 0x42 +#define DRM_EXYNOS_IPP_COMMIT 0x43 #define DRM_IOCTL_EXYNOS_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + \ DRM_EXYNOS_GEM_CREATE, struct drm_exynos_gem_create) @@ -165,8 +383,20 @@ struct drm_exynos_g2d_exec { 
#define DRM_IOCTL_EXYNOS_G2D_EXEC DRM_IOWR(DRM_COMMAND_BASE + \ DRM_EXYNOS_G2D_EXEC, struct drm_exynos_g2d_exec) +#define DRM_IOCTL_EXYNOS_IPP_GET_RESOURCES DRM_IOWR(DRM_COMMAND_BASE + \ + DRM_EXYNOS_IPP_GET_RESOURCES, \ + struct drm_exynos_ioctl_ipp_get_res) +#define DRM_IOCTL_EXYNOS_IPP_GET_CAPS DRM_IOWR(DRM_COMMAND_BASE + \ + DRM_EXYNOS_IPP_GET_CAPS, struct drm_exynos_ioctl_ipp_get_caps) +#define DRM_IOCTL_EXYNOS_IPP_GET_LIMITS DRM_IOWR(DRM_COMMAND_BASE + \ + DRM_EXYNOS_IPP_GET_LIMITS, \ + struct drm_exynos_ioctl_ipp_get_limits) +#define DRM_IOCTL_EXYNOS_IPP_COMMIT DRM_IOWR(DRM_COMMAND_BASE + \ + DRM_EXYNOS_IPP_COMMIT, struct drm_exynos_ioctl_ipp_commit) + /* EXYNOS specific events */ #define DRM_EXYNOS_G2D_EVENT 0x80000000 +#define DRM_EXYNOS_IPP_EVENT 0x80000002 struct drm_exynos_g2d_event { struct drm_event base; @@ -177,6 +407,16 @@ struct drm_exynos_g2d_event { __u32 reserved; }; +struct drm_exynos_ipp_event { + struct drm_event base; + __u64 user_data; + __u32 tv_sec; + __u32 tv_usec; + __u32 ipp_id; + __u32 sequence; + __u64 reserved; +}; + #if defined(__cplusplus) } #endif diff --git a/include/drm-uapi/i915_drm.h b/include/drm-uapi/i915_drm.h index 16e452aa12d4..b14ca9695f1e 100644 --- a/include/drm-uapi/i915_drm.h +++ b/include/drm-uapi/i915_drm.h @@ -62,6 +62,26 @@ extern "C" { #define I915_ERROR_UEVENT "ERROR" #define I915_RESET_UEVENT "RESET" +/* + * i915_user_extension: Base class for defining a chain of extensions + * + * Many interfaces need to grow over time. In most cases we can simply + * extend the struct and have userspace pass in more data. Another option, + * as demonstrated by Vulkan's approach to providing extensions for forward + * and backward compatibility, is to use a list of optional structs to + * provide those extra details. 
+ * + * The key advantage to using an extension chain is that it allows us to + * redefine the interface more easily than an ever growing struct of + * increasing complexity, and for large parts of that interface to be + * entirely optional. The downside is more pointer chasing; chasing across + * the boundary with pointers encapsulated inside u64. + */ +struct i915_user_extension { + __u64 next_extension; + __u64 name; +}; + /* * MOCS indexes used for GPU surfaces, defining the cacheability of the * surface data and the coherency for this data wrt. CPU vs. GPU accesses. @@ -367,6 +387,7 @@ typedef struct _drm_i915_sarea { #define DRM_IOCTL_I915_GET_SPRITE_COLORKEY DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GET_SPRITE_COLORKEY, struct drm_intel_sprite_colorkey) #define DRM_IOCTL_I915_GEM_WAIT DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_WAIT, struct drm_i915_gem_wait) #define DRM_IOCTL_I915_GEM_CONTEXT_CREATE DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct drm_i915_gem_context_create) +#define DRM_IOCTL_I915_GEM_CONTEXT_CREATE_v2 DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct drm_i915_gem_context_create_v2) #define DRM_IOCTL_I915_GEM_CONTEXT_DESTROY DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_DESTROY, struct drm_i915_gem_context_destroy) #define DRM_IOCTL_I915_REG_READ DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_REG_READ, struct drm_i915_reg_read) #define DRM_IOCTL_I915_GET_RESET_STATS DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GET_RESET_STATS, struct drm_i915_reset_stats) @@ -412,6 +433,14 @@ typedef struct drm_i915_irq_wait { int irq_seq; } drm_i915_irq_wait_t; +/* + * Different modes of per-process Graphics Translation Table, + * see I915_PARAM_HAS_ALIASING_PPGTT + */ +#define I915_GEM_PPGTT_NONE 0 +#define I915_GEM_PPGTT_ALIASING 1 +#define I915_GEM_PPGTT_FULL 2 + /* Ioctl to query kernel params: */ #define I915_PARAM_IRQ_ACTIVE 1 @@ -529,6 +558,35 @@ typedef struct drm_i915_irq_wait { */ #define I915_PARAM_CS_TIMESTAMP_FREQUENCY 51 +/* + * 
Once upon a time we supposed that writes through the GGTT would be + * immediately in physical memory (once flushed out of the CPU path). However, + * on a few different processors and chipsets, this is not necessarily the case + * as the writes appear to be buffered internally. Thus a read of the backing + * storage (physical memory) via a different path (with different physical tags + * to the indirect write via the GGTT) will see stale values from before + * the GGTT write. Inside the kernel, we can for the most part keep track of + * the different read/write domains in use (e.g. set-domain), but the assumption + * of coherency is baked into the ABI, hence reporting its true state in this + * parameter. + * + * Reports true when writes via mmap_gtt are immediately visible following an + * lfence to flush the WCB. + * + * Reports false when writes via mmap_gtt are indeterminately delayed in an + * internal buffer and are _not_ immediately visible to third parties accessing + * directly via mmap_cpu/mmap_wc. Use of mmap_gtt as part of an IPC + * communications channel when reporting false is strongly discouraged. + */ +#define I915_PARAM_MMAP_GTT_COHERENT 52 + +/* + * Query whether DRM_I915_GEM_EXECBUFFER2 supports coordination of parallel + * execution through use of explicit fence support. + * See I915_EXEC_FENCE_OUT and I915_EXEC_FENCE_SUBMIT. + */ +#define I915_PARAM_HAS_EXEC_SUBMIT_FENCE 53 + typedef struct drm_i915_getparam { __s32 param; /* @@ -942,7 +1000,7 @@ struct drm_i915_gem_execbuffer2 { * struct drm_i915_gem_exec_fence *fences.
*/ __u64 cliprects_ptr; -#define I915_EXEC_RING_MASK (7<<0) +#define I915_EXEC_RING_MASK (0x3f) #define I915_EXEC_DEFAULT (0<<0) #define I915_EXEC_RENDER (1<<0) #define I915_EXEC_BSD (2<<0) @@ -1048,7 +1106,16 @@ struct drm_i915_gem_execbuffer2 { */ #define I915_EXEC_FENCE_ARRAY (1<<19) -#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_ARRAY<<1)) +/* + * Setting I915_EXEC_FENCE_SUBMIT implies that lower_32_bits(rsvd2) represent + * a sync_file fd to wait upon (in a nonblocking manner) prior to executing + * the batch. + * + * Returns -EINVAL if the sync_file fd cannot be found. + */ +#define I915_EXEC_FENCE_SUBMIT (1<<20) + +#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SUBMIT<<1)) #define I915_EXEC_CONTEXT_ID_MASK (0xffffffff) #define i915_execbuffer2_set_context_id(eb2, context) \ @@ -1387,6 +1454,16 @@ struct drm_i915_gem_context_create { __u32 pad; }; +struct drm_i915_gem_context_create_v2 { + /* output: id of new context*/ + __u32 ctx_id; + __u32 flags; +#define I915_GEM_CONTEXT_SHARE_GTT 0x1 +#define I915_GEM_CONTEXT_SINGLE_TIMELINE 0x2 + __u32 share_ctx; + __u32 pad; +}; + struct drm_i915_gem_context_destroy { __u32 ctx_id; __u32 pad; @@ -1456,9 +1533,122 @@ struct drm_i915_gem_context_param { #define I915_CONTEXT_MAX_USER_PRIORITY 1023 /* inclusive */ #define I915_CONTEXT_DEFAULT_PRIORITY 0 #define I915_CONTEXT_MIN_USER_PRIORITY -1023 /* inclusive */ + +/* + * I915_CONTEXT_PARAM_ENGINES: + * + * Bind this context to operate on this subset of available engines. Henceforth, + * the I915_EXEC_RING selector for DRM_IOCTL_I915_GEM_EXECBUFFER2 operates as + * an index into this array of engines; I915_EXEC_DEFAULT selecting engine[0] + * and upwards. The array created is offset by 1, such that by default + * I915_EXEC_DEFAULT is left empty, to be filled in as directed. Slots 1...N + * are then filled in using the specified (class, instance). + * + * Setting the number of engines bound to the context will revert back to + * default settings. 
+ * + * See struct i915_context_param_engines. + * + * Extensions: + * i915_context_engines_load_balance (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE) + * i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND) + */ +#define I915_CONTEXT_PARAM_ENGINES 0x7 + +/* + * When using the following param, value should be a pointer to + * drm_i915_gem_context_param_sseu. + */ +#define I915_CONTEXT_PARAM_SSEU 0x8 + __u64 value; }; +/* + * i915_context_engines_load_balance: + * + * Enable load balancing across this set of engines. + * + * Into the I915_EXEC_DEFAULT slot, a virtual engine is created that when + * used will proxy the execbuffer request onto one of the set of engines + * in such a way as to distribute the load evenly across the set. + * + * The set of engines must be compatible (e.g. the same HW class) as they + * will share the same logical GPU context and ring. + * + * The context must be defined to use a single timeline for all engines. + */ +struct i915_context_engines_load_balance { + struct i915_user_extension base; + + __u64 flags; /* all undefined flags must be zero */ + __u64 engines_mask; + + __u64 mbz[4]; /* reserved for future use; must be zero */ +}; + +/* + * i915_context_engines_bond: + * + */ +struct i915_context_engines_bond { + struct i915_user_extension base; + + __u16 master_class; + __u16 master_instance; + __u32 flags; /* all undefined flags must be zero */ + __u64 sibling_mask; +}; + +struct i915_context_param_engines { + __u64 extensions; +#define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 +#define I915_CONTEXT_ENGINES_EXT_BOND 1 + + struct { + __u16 class; /* see enum drm_i915_gem_engine_class */ + __u16 instance; + } class_instance[0]; +}; + +struct drm_i915_gem_context_param_sseu { + /* + * Engine class & instance to be configured or queried. + */ + __u16 class; + __u16 instance; + + /* + * Unused for now. Must be cleared to zero. + */ + __u32 rsvd1; + + /* + * Mask of slices to enable for the context. 
Valid values are a subset + * of the bitmask value returned for I915_PARAM_SLICE_MASK. + */ + __u64 slice_mask; + + /* + * Mask of subslices to enable for the context. Valid values are a + * subset of the bitmask value returned by I915_PARAM_SUBSLICE_MASK. + */ + __u64 subslice_mask; + + /* + * Minimum/Maximum number of EUs to enable per subslice for the + * context. min_eus_per_subslice must be less than or equal to + * max_eus_per_subslice. + */ + __u16 min_eus_per_subslice; + __u16 max_eus_per_subslice; + + /* + * Unused for now. Must be cleared to zero. + */ + __u32 rsvd2; +}; + enum drm_i915_oa_format { I915_OA_FORMAT_A13 = 1, /* HSW only */ I915_OA_FORMAT_A29, /* HSW only */ @@ -1620,6 +1810,7 @@ struct drm_i915_perf_oa_config { struct drm_i915_query_item { __u64 query_id; #define DRM_I915_QUERY_TOPOLOGY_INFO 1 +#define DRM_I915_QUERY_ENGINE_INFO 2 /* * When set to zero by userspace, this is filled with the size of the @@ -1717,6 +1908,50 @@ struct drm_i915_query_topology_info { __u8 data[]; }; +/** + * struct drm_i915_engine_info + * + * Describes one engine and its capabilities as known to the driver. + */ +struct drm_i915_engine_info { + /** Engine class as in enum drm_i915_gem_engine_class. */ + __u16 class; + + /** Engine instance number. */ + __u16 instance; + + /** Reserved field. */ + __u32 rsvd0; + + /** Engine flags. */ + __u64 flags; + + /** Capabilities of this engine. */ + __u64 capabilities; +#define I915_VIDEO_CLASS_CAPABILITY_HEVC (1 << 0) +#define I915_VIDEO_AND_ENHANCE_CLASS_CAPABILITY_SFC (1 << 1) + + /** Reserved fields. */ + __u64 rsvd1[4]; +}; + +/** + * struct drm_i915_query_engine_info + * + * Engine info query enumerates all engines known to the driver by filling in + * an array of struct drm_i915_engine_info structures. + */ +struct drm_i915_query_engine_info { + /** Number of struct drm_i915_engine_info structs following. */ + __u32 num_engines; + + /** MBZ */ + __u32 rsvd[3]; + + /** Marker for drm_i915_engine_info structures.
*/ + struct drm_i915_engine_info engines[]; +}; + #if defined(__cplusplus) } #endif diff --git a/include/drm-uapi/msm_drm.h b/include/drm-uapi/msm_drm.h index bbbaffad772d..c06d0a5bdd80 100644 --- a/include/drm-uapi/msm_drm.h +++ b/include/drm-uapi/msm_drm.h @@ -201,10 +201,12 @@ struct drm_msm_gem_submit_bo { #define MSM_SUBMIT_NO_IMPLICIT 0x80000000 /* disable implicit sync */ #define MSM_SUBMIT_FENCE_FD_IN 0x40000000 /* enable input fence_fd */ #define MSM_SUBMIT_FENCE_FD_OUT 0x20000000 /* enable output fence_fd */ +#define MSM_SUBMIT_SUDO 0x10000000 /* run submitted cmds from RB */ #define MSM_SUBMIT_FLAGS ( \ MSM_SUBMIT_NO_IMPLICIT | \ MSM_SUBMIT_FENCE_FD_IN | \ MSM_SUBMIT_FENCE_FD_OUT | \ + MSM_SUBMIT_SUDO | \ 0) /* Each cmdstream submit consists of a table of buffers involved, and diff --git a/include/drm-uapi/sync_file.h b/include/drm-uapi/sync_file.h deleted file mode 100644 index b4f2db009347..000000000000 --- a/include/drm-uapi/sync_file.h +++ /dev/null @@ -1,98 +0,0 @@ -/* SPDX-License-Identifier: GPL-1.0+ WITH Linux-syscall-note */ -/* - * Copyright (C) 2012 Google, Inc. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. 
- * - */ - -#ifndef _LINUX_SYNC_H -#define _LINUX_SYNC_H - -#include -#include - -/** - * struct sync_merge_data - data passed to merge ioctl - * @name: name of new fence - * @fd2: file descriptor of second fence - * @fence: returns the fd of the new fence to userspace - * @flags: merge_data flags - * @pad: padding for 64-bit alignment, should always be zero - */ -struct sync_merge_data { - char name[32]; - __s32 fd2; - __s32 fence; - __u32 flags; - __u32 pad; -}; - -/** - * struct sync_fence_info - detailed fence information - * @obj_name: name of parent sync_timeline -* @driver_name: name of driver implementing the parent -* @status: status of the fence 0:active 1:signaled <0:error - * @flags: fence_info flags - * @timestamp_ns: timestamp of status change in nanoseconds - */ -struct sync_fence_info { - char obj_name[32]; - char driver_name[32]; - __s32 status; - __u32 flags; - __u64 timestamp_ns; -}; - -/** - * struct sync_file_info - data returned from fence info ioctl - * @name: name of fence - * @status: status of fence. 1: signaled 0:active <0:error - * @flags: sync_file_info flags - * @num_fences number of fences in the sync_file - * @pad: padding for 64-bit alignment, should always be zero - * @sync_fence_info: pointer to array of structs sync_fence_info with all - * fences in the sync_file - */ -struct sync_file_info { - char name[32]; - __s32 status; - __u32 flags; - __u32 num_fences; - __u32 pad; - - __u64 sync_fence_info; -}; - -#define SYNC_IOC_MAGIC '>' - -/** - * Opcodes 0, 1 and 2 were burned during a API change to avoid users of the - * old API to get weird errors when trying to handling sync_files. The API - * change happened during the de-stage of the Sync Framework when there was - * no upstream users available. - */ - -/** - * DOC: SYNC_IOC_MERGE - merge two fences - * - * Takes a struct sync_merge_data. Creates a new fence containing copies of - * the sync_pts in both the calling fd and sync_merge_data.fd2. 
Returns the - * new fence's fd in sync_merge_data.fence - */ -#define SYNC_IOC_MERGE _IOWR(SYNC_IOC_MAGIC, 3, struct sync_merge_data) - -/** - * DOC: SYNC_IOC_FILE_INFO - get detailed information on a sync_file - * - * Takes a struct sync_file_info. If num_fences is 0, the field is updated - * with the actual number of fences. If num_fences is > 0, the system will - * use the pointer provided on sync_fence_info to return up to num_fences of - * struct sync_fence_info, with detailed fence information. - */ -#define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info) - -#endif /* _LINUX_SYNC_H */ diff --git a/include/drm-uapi/tegra_drm.h b/include/drm-uapi/tegra_drm.h index 12f9bf848db1..6c07919c04e9 100644 --- a/include/drm-uapi/tegra_drm.h +++ b/include/drm-uapi/tegra_drm.h @@ -32,143 +32,615 @@ extern "C" { #define DRM_TEGRA_GEM_CREATE_TILED (1 << 0) #define DRM_TEGRA_GEM_CREATE_BOTTOM_UP (1 << 1) +/** + * struct drm_tegra_gem_create - parameters for the GEM object creation IOCTL + */ struct drm_tegra_gem_create { + /** + * @size: + * + * The size, in bytes, of the buffer object to be created. + */ __u64 size; + + /** + * @flags: + * + * A bitmask of flags that influence the creation of GEM objects: + * + * DRM_TEGRA_GEM_CREATE_TILED + * Use the 16x16 tiling format for this buffer. + * + * DRM_TEGRA_GEM_CREATE_BOTTOM_UP + * The buffer has a bottom-up layout. + */ __u32 flags; + + /** + * @handle: + * + * The handle of the created GEM object. Set by the kernel upon + * successful completion of the IOCTL. + */ __u32 handle; }; +/** + * struct drm_tegra_gem_mmap - parameters for the GEM mmap IOCTL + */ struct drm_tegra_gem_mmap { + /** + * @handle: + * + * Handle of the GEM object to obtain an mmap offset for. + */ __u32 handle; + + /** + * @pad: + * + * Structure padding that may be used in the future. Must be 0. + */ __u32 pad; + + /** + * @offset: + * + * The mmap offset for the given GEM object. 
Set by the kernel upon + * successful completion of the IOCTL. + */ __u64 offset; }; +/** + * struct drm_tegra_syncpt_read - parameters for the read syncpoint IOCTL + */ struct drm_tegra_syncpt_read { + /** + * @id: + * + * ID of the syncpoint to read the current value from. + */ __u32 id; + + /** + * @value: + * + * The current syncpoint value. Set by the kernel upon successful + * completion of the IOCTL. + */ __u32 value; }; +/** + * struct drm_tegra_syncpt_incr - parameters for the increment syncpoint IOCTL + */ struct drm_tegra_syncpt_incr { + /** + * @id: + * + * ID of the syncpoint to increment. + */ __u32 id; + + /** + * @pad: + * + * Structure padding that may be used in the future. Must be 0. + */ __u32 pad; }; +/** + * struct drm_tegra_syncpt_wait - parameters for the wait syncpoint IOCTL + */ struct drm_tegra_syncpt_wait { + /** + * @id: + * + * ID of the syncpoint to wait on. + */ __u32 id; + + /** + * @thresh: + * + * Threshold value for which to wait. + */ __u32 thresh; + + /** + * @timeout: + * + * Timeout, in milliseconds, to wait. + */ __u32 timeout; + + /** + * @value: + * + * The new syncpoint value after the wait. Set by the kernel upon + * successful completion of the IOCTL. + */ __u32 value; }; #define DRM_TEGRA_NO_TIMEOUT (0xffffffff) +/** + * struct drm_tegra_open_channel - parameters for the open channel IOCTL + */ struct drm_tegra_open_channel { + /** + * @client: + * + * The client ID for this channel. + */ __u32 client; + + /** + * @pad: + * + * Structure padding that may be used in the future. Must be 0. + */ __u32 pad; + + /** + * @context: + * + * The application context of this channel. Set by the kernel upon + * successful completion of the IOCTL. This context needs to be passed + * to the DRM_TEGRA_CHANNEL_CLOSE or the DRM_TEGRA_SUBMIT IOCTLs. 
+ */ __u64 context; }; +/** + * struct drm_tegra_close_channel - parameters for the close channel IOCTL + */ struct drm_tegra_close_channel { + /** + * @context: + * + * The application context of this channel. This is obtained from the + * DRM_TEGRA_OPEN_CHANNEL IOCTL. + */ __u64 context; }; +/** + * struct drm_tegra_get_syncpt - parameters for the get syncpoint IOCTL + */ struct drm_tegra_get_syncpt { + /** + * @context: + * + * The application context identifying the channel for which to obtain + * the syncpoint ID. + */ __u64 context; + + /** + * @index: + * + * Index of the client syncpoint for which to obtain the ID. + */ __u32 index; + + /** + * @id: + * + * The ID of the given syncpoint. Set by the kernel upon successful + * completion of the IOCTL. + */ __u32 id; }; +/** + * struct drm_tegra_get_syncpt_base - parameters for the get wait base IOCTL + */ struct drm_tegra_get_syncpt_base { + /** + * @context: + * + * The application context identifying for which channel to obtain the + * wait base. + */ __u64 context; + + /** + * @syncpt: + * + * ID of the syncpoint for which to obtain the wait base. + */ __u32 syncpt; + + /** + * @id: + * + * The ID of the wait base corresponding to the client syncpoint. Set + * by the kernel upon successful completion of the IOCTL. + */ __u32 id; }; +/** + * struct drm_tegra_syncpt - syncpoint increment operation + */ struct drm_tegra_syncpt { + /** + * @id: + * + * ID of the syncpoint to operate on. + */ __u32 id; + + /** + * @incrs: + * + * Number of increments to perform for the syncpoint. + */ __u32 incrs; }; +/** + * struct drm_tegra_cmdbuf - structure describing a command buffer + */ struct drm_tegra_cmdbuf { + /** + * @handle: + * + * Handle to a GEM object containing the command buffer. + */ __u32 handle; + + /** + * @offset: + * + * Offset, in bytes, into the GEM object identified by @handle at + * which the command buffer starts. 
+ */ __u32 offset; + + /** + * @words: + * + * Number of 32-bit words in this command buffer. + */ __u32 words; + + /** + * @pad: + * + * Structure padding that may be used in the future. Must be 0. + */ __u32 pad; }; +/** + * struct drm_tegra_reloc - GEM object relocation structure + */ struct drm_tegra_reloc { struct { + /** + * @cmdbuf.handle: + * + * Handle to the GEM object containing the command buffer for + * which to perform this GEM object relocation. + */ __u32 handle; + + /** + * @cmdbuf.offset: + * + * Offset, in bytes, into the command buffer at which to + * insert the relocated address. + */ __u32 offset; } cmdbuf; struct { + /** + * @target.handle: + * + * Handle to the GEM object to be relocated. + */ __u32 handle; + + /** + * @target.offset: + * + * Offset, in bytes, into the target GEM object at which the + * relocated data starts. + */ __u32 offset; } target; + + /** + * @shift: + * + * The number of bits by which to shift relocated addresses. + */ __u32 shift; + + /** + * @pad: + * + * Structure padding that may be used in the future. Must be 0. + */ __u32 pad; }; +/** + * struct drm_tegra_waitchk - wait check structure + */ struct drm_tegra_waitchk { + /** + * @handle: + * + * Handle to the GEM object containing a command stream on which to + * perform the wait check. + */ __u32 handle; + + /** + * @offset: + * + * Offset, in bytes, of the location in the command stream to perform + * the wait check on. + */ __u32 offset; + + /** + * @syncpt: + * + * ID of the syncpoint to wait check. + */ __u32 syncpt; + + /** + * @thresh: + * + * Threshold value for which to check. + */ __u32 thresh; }; +/** + * struct drm_tegra_submit - job submission structure + */ struct drm_tegra_submit { + /** + * @context: + * + * The application context identifying the channel to use for the + * execution of this job. + */ __u64 context; + + /** + * @num_syncpts: + * + * The number of syncpoints operated on by this job. 
This defines the + * length of the array pointed to by @syncpts. + */ __u32 num_syncpts; + + /** + * @num_cmdbufs: + * + * The number of command buffers to execute as part of this job. This + * defines the length of the array pointed to by @cmdbufs. + */ __u32 num_cmdbufs; + + /** + * @num_relocs: + * + * The number of relocations to perform before executing this job. + * This defines the length of the array pointed to by @relocs. + */ __u32 num_relocs; + + /** + * @num_waitchks: + * + * The number of wait checks to perform as part of this job. This + * defines the length of the array pointed to by @waitchks. + */ __u32 num_waitchks; + + /** + * @waitchk_mask: + * + * Bitmask of valid wait checks. + */ __u32 waitchk_mask; + + /** + * @timeout: + * + * Timeout, in milliseconds, before this job is cancelled. + */ __u32 timeout; + + /** + * @syncpts: + * + * A pointer to an array of &struct drm_tegra_syncpt structures that + * specify the syncpoint operations performed as part of this job. + * The number of elements in the array must be equal to the value + * given by @num_syncpts. + */ __u64 syncpts; + + /** + * @cmdbufs: + * + * A pointer to an array of &struct drm_tegra_cmdbuf structures that + * define the command buffers to execute as part of this job. The + * number of elements in the array must be equal to the value given + * by @num_cmdbufs. + */ __u64 cmdbufs; + + /** + * @relocs: + * + * A pointer to an array of &struct drm_tegra_reloc structures that + * specify the relocations that need to be performed before executing + * this job. The number of elements in the array must be equal to the + * value given by @num_relocs. + */ __u64 relocs; + + /** + * @waitchks: + * + * A pointer to an array of &struct drm_tegra_waitchk structures that + * specify the wait checks to be performed while executing this job. + * The number of elements in the array must be equal to the value + * given by @num_waitchks.
+ */ __u64 waitchks; - __u32 fence; /* Return value */ - __u32 reserved[5]; /* future expansion */ + /** + * @fence: + * + * The threshold of the syncpoint associated with this job after it + * has been completed. Set by the kernel upon successful completion of + * the IOCTL. This can be used with the DRM_TEGRA_SYNCPT_WAIT IOCTL to + * wait for this job to be finished. + */ + __u32 fence; + + /** + * @reserved: + * + * This field is reserved for future use. Must be 0. + */ + __u32 reserved[5]; }; #define DRM_TEGRA_GEM_TILING_MODE_PITCH 0 #define DRM_TEGRA_GEM_TILING_MODE_TILED 1 #define DRM_TEGRA_GEM_TILING_MODE_BLOCK 2 +/** + * struct drm_tegra_gem_set_tiling - parameters for the set tiling IOCTL + */ struct drm_tegra_gem_set_tiling { - /* input */ + /** + * @handle: + * + * Handle to the GEM object for which to set the tiling parameters. + */ __u32 handle; + + /** + * @mode: + * + * The tiling mode to set. Must be one of: + * + * DRM_TEGRA_GEM_TILING_MODE_PITCH + * pitch linear format + * + * DRM_TEGRA_GEM_TILING_MODE_TILED + * 16x16 tiling format + * + * DRM_TEGRA_GEM_TILING_MODE_BLOCK + * 16Bx2 tiling format + */ __u32 mode; + + /** + * @value: + * + * The value to set for the tiling mode parameter. + */ __u32 value; + + /** + * @pad: + * + * Structure padding that may be used in the future. Must be 0. + */ __u32 pad; }; +/** + * struct drm_tegra_gem_get_tiling - parameters for the get tiling IOCTL + */ struct drm_tegra_gem_get_tiling { - /* input */ + /** + * @handle: + * + * Handle to the GEM object for which to query the tiling parameters. + */ __u32 handle; - /* output */ + + /** + * @mode: + * + * The tiling mode currently associated with the GEM object. Set by + * the kernel upon successful completion of the IOCTL. + */ __u32 mode; + + /** + * @value: + * + * The tiling mode parameter currently associated with the GEM object. + * Set by the kernel upon successful completion of the IOCTL. 
+ */ __u32 value; + + /** + * @pad: + * + * Structure padding that may be used in the future. Must be 0. + */ __u32 pad; }; #define DRM_TEGRA_GEM_BOTTOM_UP (1 << 0) #define DRM_TEGRA_GEM_FLAGS (DRM_TEGRA_GEM_BOTTOM_UP) +/** + * struct drm_tegra_gem_set_flags - parameters for the set flags IOCTL + */ struct drm_tegra_gem_set_flags { - /* input */ + /** + * @handle: + * + * Handle to the GEM object for which to set the flags. + */ __u32 handle; - /* output */ + + /** + * @flags: + * + * The flags to set for the GEM object. + */ __u32 flags; }; +/** + * struct drm_tegra_gem_get_flags - parameters for the get flags IOCTL + */ struct drm_tegra_gem_get_flags { - /* input */ + /** + * @handle: + * + * Handle to the GEM object for which to query the flags. + */ __u32 handle; - /* output */ + + /** + * @flags: + * + * The flags currently associated with the GEM object. Set by the + * kernel upon successful completion of the IOCTL. + */ __u32 flags; }; @@ -193,7 +665,7 @@ struct drm_tegra_gem_get_flags { #define DRM_IOCTL_TEGRA_SYNCPT_INCR DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_INCR, struct drm_tegra_syncpt_incr) #define DRM_IOCTL_TEGRA_SYNCPT_WAIT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_WAIT, struct drm_tegra_syncpt_wait) #define DRM_IOCTL_TEGRA_OPEN_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_OPEN_CHANNEL, struct drm_tegra_open_channel) -#define DRM_IOCTL_TEGRA_CLOSE_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_CLOSE_CHANNEL, struct drm_tegra_open_channel) +#define DRM_IOCTL_TEGRA_CLOSE_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_CLOSE_CHANNEL, struct drm_tegra_close_channel) #define DRM_IOCTL_TEGRA_GET_SYNCPT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GET_SYNCPT, struct drm_tegra_get_syncpt) #define DRM_IOCTL_TEGRA_SUBMIT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SUBMIT, struct drm_tegra_submit) #define DRM_IOCTL_TEGRA_GET_SYNCPT_BASE DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GET_SYNCPT_BASE, struct drm_tegra_get_syncpt_base) diff --git a/include/drm-uapi/v3d_drm.h 
b/include/drm-uapi/v3d_drm.h new file mode 100644 index 000000000000..7b6627783608 --- /dev/null +++ b/include/drm-uapi/v3d_drm.h @@ -0,0 +1,194 @@ +/* + * Copyright © 2014-2018 Broadcom + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. 
+ */ + +#ifndef _V3D_DRM_H_ +#define _V3D_DRM_H_ + +#include "drm.h" + +#if defined(__cplusplus) +extern "C" { +#endif + +#define DRM_V3D_SUBMIT_CL 0x00 +#define DRM_V3D_WAIT_BO 0x01 +#define DRM_V3D_CREATE_BO 0x02 +#define DRM_V3D_MMAP_BO 0x03 +#define DRM_V3D_GET_PARAM 0x04 +#define DRM_V3D_GET_BO_OFFSET 0x05 + +#define DRM_IOCTL_V3D_SUBMIT_CL DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_CL, struct drm_v3d_submit_cl) +#define DRM_IOCTL_V3D_WAIT_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_WAIT_BO, struct drm_v3d_wait_bo) +#define DRM_IOCTL_V3D_CREATE_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_CREATE_BO, struct drm_v3d_create_bo) +#define DRM_IOCTL_V3D_MMAP_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_MMAP_BO, struct drm_v3d_mmap_bo) +#define DRM_IOCTL_V3D_GET_PARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_PARAM, struct drm_v3d_get_param) +#define DRM_IOCTL_V3D_GET_BO_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_BO_OFFSET, struct drm_v3d_get_bo_offset) + +/** + * struct drm_v3d_submit_cl - ioctl argument for submitting commands to the 3D + * engine. + * + * This asks the kernel to have the GPU execute an optional binner + * command list, and a render command list. + */ +struct drm_v3d_submit_cl { + /* Pointer to the binner command list. + * + * This is the first set of commands executed, which runs the + * coordinate shader to determine where primitives land on the screen, + * then writes out the state updates and draw calls necessary per tile + * to the tile allocation BO. + */ + __u32 bcl_start; + + /** End address of the BCL (first byte after the BCL) */ + __u32 bcl_end; + + /* Offset of the render command list. + * + * This is the second set of commands executed, which will either + * execute the tiles that have been set up by the BCL, or a fixed set + * of tiles (in the case of RCL-only blits). + */ + __u32 rcl_start; + + /** End address of the RCL (first byte after the RCL) */ + __u32 rcl_end; + + /** An optional sync object to wait on before starting the BCL. 
*/ + __u32 in_sync_bcl; + /** An optional sync object to wait on before starting the RCL. */ + __u32 in_sync_rcl; + /** An optional sync object to place the completion fence in. */ + __u32 out_sync; + + /* Offset of the tile alloc memory + * + * This is optional on V3D 3.3 (where the CL can set the value) but + * required on V3D 4.1. + */ + __u32 qma; + + /** Size of the tile alloc memory. */ + __u32 qms; + + /** Offset of the tile state data array. */ + __u32 qts; + + /* Pointer to a u32 array of the BOs that are referenced by the job. + */ + __u64 bo_handles; + + /* Number of BO handles passed in (size is that times 4). */ + __u32 bo_handle_count; + + /* Pad, must be zero-filled. */ + __u32 pad; +}; + +/** + * struct drm_v3d_wait_bo - ioctl argument for waiting for + * completion of the last DRM_V3D_SUBMIT_CL on a BO. + * + * This is useful for cases where multiple processes might be + * rendering to a BO and you want to wait for all rendering to be + * completed. + */ +struct drm_v3d_wait_bo { + __u32 handle; + __u32 pad; + __u64 timeout_ns; +}; + +/** + * struct drm_v3d_create_bo - ioctl argument for creating V3D BOs. + * + * There are currently no values for the flags argument, but it may be + * used in a future extension. + */ +struct drm_v3d_create_bo { + __u32 size; + __u32 flags; + /** Returned GEM handle for the BO. */ + __u32 handle; + /** + * Returned offset for the BO in the V3D address space. This offset + * is private to the DRM fd and is valid for the lifetime of the GEM + * handle. + * + * This offset value will always be nonzero, since various HW + * units treat 0 specially. + */ + __u32 offset; +}; + +/** + * struct drm_v3d_mmap_bo - ioctl argument for mapping V3D BOs. + * + * This doesn't actually perform an mmap. Instead, it returns the + * offset you need to use in an mmap on the DRM device node. This + * means that tools like valgrind end up knowing about the mapped + * memory. 
+ * + * There are currently no values for the flags argument, but it may be + * used in a future extension. + */ +struct drm_v3d_mmap_bo { + /** Handle for the object being mapped. */ + __u32 handle; + __u32 flags; + /** offset into the drm node to use for subsequent mmap call. */ + __u64 offset; +}; + +enum drm_v3d_param { + DRM_V3D_PARAM_V3D_UIFCFG, + DRM_V3D_PARAM_V3D_HUB_IDENT1, + DRM_V3D_PARAM_V3D_HUB_IDENT2, + DRM_V3D_PARAM_V3D_HUB_IDENT3, + DRM_V3D_PARAM_V3D_CORE0_IDENT0, + DRM_V3D_PARAM_V3D_CORE0_IDENT1, + DRM_V3D_PARAM_V3D_CORE0_IDENT2, +}; + +struct drm_v3d_get_param { + __u32 param; + __u32 pad; + __u64 value; +}; + +/** + * Returns the offset for the BO in the V3D address space for this DRM fd. + * This is the same value returned by drm_v3d_create_bo, if that was called + * from this DRM fd. + */ +struct drm_v3d_get_bo_offset { + __u32 handle; + __u32 offset; +}; + +#if defined(__cplusplus) +} +#endif + +#endif /* _V3D_DRM_H_ */ diff --git a/include/drm-uapi/vc4_drm.h b/include/drm-uapi/vc4_drm.h index 4117117b4204..31f50de39acb 100644 --- a/include/drm-uapi/vc4_drm.h +++ b/include/drm-uapi/vc4_drm.h @@ -183,10 +183,17 @@ struct drm_vc4_submit_cl { /* ID of the perfmon to attach to this job. 0 means no perfmon. */ __u32 perfmonid; - /* Unused field to align this struct on 64 bits. Must be set to 0. - * If one ever needs to add an u32 field to this struct, this field - * can be used. + /* Syncobj handle to wait on. If set, processing of this render job + * will not start until the syncobj is signaled. 0 means ignore. */ + __u32 in_sync; + + /* Syncobj handle to export fence to. If set, the fence in the syncobj + * will be replaced with a fence that signals upon completion of this + * render job. 0 means ignore. 
+ */ + __u32 out_sync; + __u32 pad2; }; diff --git a/include/drm-uapi/virtgpu_drm.h b/include/drm-uapi/virtgpu_drm.h index 91a31ffed828..9a781f0611df 100644 --- a/include/drm-uapi/virtgpu_drm.h +++ b/include/drm-uapi/virtgpu_drm.h @@ -63,6 +63,7 @@ struct drm_virtgpu_execbuffer { }; #define VIRTGPU_PARAM_3D_FEATURES 1 /* do we have 3D features in the hw */ +#define VIRTGPU_PARAM_CAPSET_QUERY_FIX 2 /* do we have the capset fix */ struct drm_virtgpu_getparam { __u64 param; diff --git a/include/drm-uapi/vmwgfx_drm.h b/include/drm-uapi/vmwgfx_drm.h index 0bc784f5e0db..399f58317cff 100644 --- a/include/drm-uapi/vmwgfx_drm.h +++ b/include/drm-uapi/vmwgfx_drm.h @@ -40,6 +40,7 @@ extern "C" { #define DRM_VMW_GET_PARAM 0 #define DRM_VMW_ALLOC_DMABUF 1 +#define DRM_VMW_ALLOC_BO 1 #define DRM_VMW_UNREF_DMABUF 2 #define DRM_VMW_HANDLE_CLOSE 2 #define DRM_VMW_CURSOR_BYPASS 3 @@ -68,6 +69,8 @@ extern "C" { #define DRM_VMW_GB_SURFACE_REF 24 #define DRM_VMW_SYNCCPU 25 #define DRM_VMW_CREATE_EXTENDED_CONTEXT 26 +#define DRM_VMW_GB_SURFACE_CREATE_EXT 27 +#define DRM_VMW_GB_SURFACE_REF_EXT 28 /*************************************************************************/ /** @@ -79,6 +82,9 @@ extern "C" { * * DRM_VMW_PARAM_OVERLAY_IOCTL: * Does the driver support the overlay ioctl. + * + * DRM_VMW_PARAM_SM4_1 + * SM4_1 support is enabled. */ #define DRM_VMW_PARAM_NUM_STREAMS 0 @@ -94,6 +100,8 @@ extern "C" { #define DRM_VMW_PARAM_MAX_MOB_SIZE 10 #define DRM_VMW_PARAM_SCREEN_TARGET 11 #define DRM_VMW_PARAM_DX 12 +#define DRM_VMW_PARAM_HW_CAPS2 13 +#define DRM_VMW_PARAM_SM4_1 14 /** * enum drm_vmw_handle_type - handle type for ref ioctls @@ -356,9 +364,9 @@ struct drm_vmw_fence_rep { /*************************************************************************/ /** - * DRM_VMW_ALLOC_DMABUF + * DRM_VMW_ALLOC_BO * - * Allocate a DMA buffer that is visible also to the host. + * Allocate a buffer object that is visible also to the host. 
* NOTE: The buffer is * identified by a handle and an offset, which are private to the guest, but * useable in the command stream. The guest kernel may translate these @@ -366,27 +374,28 @@ struct drm_vmw_fence_rep { * be zero at all times, or it may disappear from the interface before it is * fixed. * - * The DMA buffer may stay user-space mapped in the guest at all times, + * The buffer object may stay user-space mapped in the guest at all times, * and is thus suitable for sub-allocation. * - * DMA buffers are mapped using the mmap() syscall on the drm device. + * Buffer objects are mapped using the mmap() syscall on the drm device. */ /** - * struct drm_vmw_alloc_dmabuf_req + * struct drm_vmw_alloc_bo_req * * @size: Required minimum size of the buffer. * - * Input data to the DRM_VMW_ALLOC_DMABUF Ioctl. + * Input data to the DRM_VMW_ALLOC_BO Ioctl. */ -struct drm_vmw_alloc_dmabuf_req { +struct drm_vmw_alloc_bo_req { __u32 size; __u32 pad64; }; +#define drm_vmw_alloc_dmabuf_req drm_vmw_alloc_bo_req /** - * struct drm_vmw_dmabuf_rep + * struct drm_vmw_bo_rep * * @map_handle: Offset to use in the mmap() call used to map the buffer. * @handle: Handle unique to this buffer. Used for unreferencing. @@ -395,50 +404,32 @@ struct drm_vmw_alloc_dmabuf_req { * @cur_gmr_offset: Offset to use in the command stream when this buffer is * referenced. See note above. * - * Output data from the DRM_VMW_ALLOC_DMABUF Ioctl. + * Output data from the DRM_VMW_ALLOC_BO Ioctl. */ -struct drm_vmw_dmabuf_rep { +struct drm_vmw_bo_rep { __u64 map_handle; __u32 handle; __u32 cur_gmr_id; __u32 cur_gmr_offset; __u32 pad64; }; +#define drm_vmw_dmabuf_rep drm_vmw_bo_rep /** - * union drm_vmw_dmabuf_arg + * union drm_vmw_alloc_bo_arg * * @req: Input data as described above. * @rep: Output data as described above. * - * Argument to the DRM_VMW_ALLOC_DMABUF Ioctl. + * Argument to the DRM_VMW_ALLOC_BO Ioctl. 
*/ -union drm_vmw_alloc_dmabuf_arg { - struct drm_vmw_alloc_dmabuf_req req; - struct drm_vmw_dmabuf_rep rep; -}; - -/*************************************************************************/ -/** - * DRM_VMW_UNREF_DMABUF - Free a DMA buffer. - * - */ - -/** - * struct drm_vmw_unref_dmabuf_arg - * - * @handle: Handle indicating what buffer to free. Obtained from the - * DRM_VMW_ALLOC_DMABUF Ioctl. - * - * Argument to the DRM_VMW_UNREF_DMABUF Ioctl. - */ - -struct drm_vmw_unref_dmabuf_arg { - __u32 handle; - __u32 pad64; +union drm_vmw_alloc_bo_arg { + struct drm_vmw_alloc_bo_req req; + struct drm_vmw_bo_rep rep; }; +#define drm_vmw_alloc_dmabuf_arg drm_vmw_alloc_bo_arg /*************************************************************************/ /** @@ -1103,9 +1094,8 @@ union drm_vmw_extended_context_arg { * DRM_VMW_HANDLE_CLOSE - Close a user-space handle and release its * underlying resource. * - * Note that this ioctl is overlaid on the DRM_VMW_UNREF_DMABUF Ioctl. - * The ioctl arguments therefore need to be identical in layout. - * + * Note that this ioctl is overlaid on the deprecated DRM_VMW_UNREF_DMABUF + * Ioctl. */ /** @@ -1119,7 +1109,107 @@ struct drm_vmw_handle_close_arg { __u32 handle; __u32 pad64; }; +#define drm_vmw_unref_dmabuf_arg drm_vmw_handle_close_arg + +/*************************************************************************/ +/** + * DRM_VMW_GB_SURFACE_CREATE_EXT - Create a host guest-backed surface. + * + * Allocates a surface handle and queues a create surface command + * for the host on the first use of the surface. The surface ID can + * be used as the surface ID in commands referencing the surface. + * + * This new command extends DRM_VMW_GB_SURFACE_CREATE by adding version + * parameter and 64 bit svga flag. + */ + +/** + * enum drm_vmw_surface_version + * + * @drm_vmw_surface_gb_v1: Corresponds to current gb surface format with + * svga3d surface flags split into 2, upper half and lower half. 
+ */ +enum drm_vmw_surface_version { + drm_vmw_gb_surface_v1 +}; + +/** + * struct drm_vmw_gb_surface_create_ext_req + * + * @base: Surface create parameters. + * @version: Version of surface create ioctl. + * @svga3d_flags_upper_32_bits: Upper 32 bits of svga3d flags. + * @multisample_pattern: Multisampling pattern when msaa is supported. + * @quality_level: Precision settings for each sample. + * @must_be_zero: Reserved for future usage. + * + * Input argument to the DRM_VMW_GB_SURFACE_CREATE_EXT Ioctl. + * Part of output argument for the DRM_VMW_GB_SURFACE_REF_EXT Ioctl. + */ +struct drm_vmw_gb_surface_create_ext_req { + struct drm_vmw_gb_surface_create_req base; + enum drm_vmw_surface_version version; + uint32_t svga3d_flags_upper_32_bits; + SVGA3dMSPattern multisample_pattern; + SVGA3dMSQualityLevel quality_level; + uint64_t must_be_zero; +}; + +/** + * union drm_vmw_gb_surface_create_ext_arg + * + * @req: Input argument as described above. + * @rep: Output argument as described above. + * + * Argument to the DRM_VMW_GB_SURFACE_CREATE_EXT ioctl. + */ +union drm_vmw_gb_surface_create_ext_arg { + struct drm_vmw_gb_surface_create_rep rep; + struct drm_vmw_gb_surface_create_ext_req req; +}; + +/*************************************************************************/ +/** + * DRM_VMW_GB_SURFACE_REF_EXT - Reference a host surface. + * + * Puts a reference on a host surface with a given handle, as previously + * returned by the DRM_VMW_GB_SURFACE_CREATE_EXT ioctl. + * A reference will make sure the surface isn't destroyed while we hold + * it and will allow the calling client to use the surface handle in + * the command stream. + * + * On successful return, the Ioctl returns the surface information given + * to and returned from the DRM_VMW_GB_SURFACE_CREATE_EXT ioctl. 
+ */ +/** + * struct drm_vmw_gb_surface_ref_ext_rep + * + * @creq: The data used as input when the surface was created, as described + * above at "struct drm_vmw_gb_surface_create_ext_req" + * @crep: Additional data output when the surface was created, as described + * above at "struct drm_vmw_gb_surface_create_rep" + * + * Output Argument to the DRM_VMW_GB_SURFACE_REF_EXT ioctl. + */ +struct drm_vmw_gb_surface_ref_ext_rep { + struct drm_vmw_gb_surface_create_ext_req creq; + struct drm_vmw_gb_surface_create_rep crep; +}; + +/** + * union drm_vmw_gb_surface_reference_ext_arg + * + * @req: Input data as described above at "struct drm_vmw_surface_arg" + * @rep: Output data as described above at + * "struct drm_vmw_gb_surface_ref_ext_rep" + * + * Argument to the DRM_VMW_GB_SURFACE_REF Ioctl. + */ +union drm_vmw_gb_surface_reference_ext_arg { + struct drm_vmw_gb_surface_ref_ext_rep rep; + struct drm_vmw_surface_arg req; +}; #if defined(__cplusplus) } From patchwork Thu Oct 18 15:28:00 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10647499 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id ECC7F112B for ; Thu, 18 Oct 2018 15:28:39 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DDEAA28C20 for ; Thu, 18 Oct 2018 15:28:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D205628D0A; Thu, 18 Oct 2018 15:28:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 
with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 0D99228C20 for ; Thu, 18 Oct 2018 15:28:39 +0000 (UTC) From: Tvrtko Ursulin To: igt-dev@lists.freedesktop.org Date: Thu,
18 Oct 2018 16:28:00 +0100 Message-Id: <20181018152815.31816-3-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> References: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH i-g-t 02/17] trace.pl: Virtual engine support X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development Cc: Intel-gfx@lists.freedesktop.org MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin Add virtual/queue timelines to both stdout and HTML output. A new timeline is created for each queue/virtual engine to display associated requests in queued and runnable states. Once requests are submitted to a real engine for execution they show up on the physical engine timeline.
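For reference, the queued and runnable states shown on a virtual engine timeline come from three timestamps per request: when it was added (i915_request_add), when it was submitted (i915_request_submit), and when a physical engine picked it up (i915_request_in). The script keeps these in Perl hashes. What follows is only a minimal C sketch of the same arithmetic. struct vreq_times and both helper names are hypothetical and are not part of trace.pl or of any i915 interface.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical record of the three timestamps (in microseconds) that
 * are tracked for one request on a virtual engine timeline.
 */
struct vreq_times {
	uint64_t queue;  /* time of i915_request_add */
	uint64_t submit; /* time of i915_request_submit */
	uint64_t start;  /* time of i915_request_in on a physical engine */
};

/* Queued time: from request add until submit. */
static uint64_t vreq_queued_us(const struct vreq_times *t)
{
	assert(t->queue <= t->submit);
	return t->submit - t->queue;
}

/* Runnable time: from submit until a physical engine starts it. */
static uint64_t vreq_runnable_us(const struct vreq_times *t)
{
	assert(t->submit <= t->start);
	return t->start - t->submit;
}
```

With queue=100, submit=130 and start=175 a request counts as queued for 30us and runnable for 45us. The script sums these values per context and divides by the elapsed trace time to get the percentages reported for each virtual engine.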
Signed-off-by: Tvrtko Ursulin Cc: Chris Wilson --- scripts/trace.pl | 230 ++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 200 insertions(+), 30 deletions(-) diff --git a/scripts/trace.pl b/scripts/trace.pl index 18f9f3b18396..72747b046202 100755 --- a/scripts/trace.pl +++ b/scripts/trace.pl @@ -27,10 +27,15 @@ use warnings; use 5.010; my $gid = 0; -my (%db, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait, %ctxtimelines); +my (%db, %vdb, %queue, %submit, %notify, %rings, %ctxdb, %ringmap, %reqwait, %ctxtimelines); +my (%cids, %ctxmap); +my $cid = 0; +my %queues; my @freqs; -my $max_items = 3000; +use constant VENG => '255:0'; + +my $max_requests = 1000; my $width_us = 32000; my $correct_durations = 0; my %ignore_ring; @@ -180,21 +185,21 @@ sub arg_trace return @_; } -sub arg_max_items +sub arg_max_requests { my $val; return unless scalar(@_); - if ($_[0] eq '--max-items' or $_[0] eq '-m') { + if ($_[0] eq '--max-requests' or $_[0] eq '-m') { shift @_; $val = shift @_; - } elsif ($_[0] =~ /--max-items=(\d+)/) { + } elsif ($_[0] =~ /--max-requests=(\d+)/) { shift @_; $val = $1; } - $max_items = int($val) if defined $val; + $max_requests = int($val) if defined $val; return @_; } @@ -291,7 +296,7 @@ while (@args) { @args = arg_avg_delay_stats(@args); @args = arg_gpu_timeline(@args); @args = arg_trace(@args); - @args = arg_max_items(@args); + @args = arg_max_requests(@args); @args = arg_zoom_width(@args); @args = arg_split_requests(@args); @args = arg_ignore_ring(@args); @@ -380,6 +385,7 @@ while (<>) { my %rw; next if exists $reqwait{$key}; + die if $ring eq VENG and not exists $queues{$ctx}; $rw{'key'} = $key; $rw{'ring'} = $ring; @@ -388,9 +394,19 @@ while (<>) { $rw{'start'} = $time; $reqwait{$key} = \%rw; } elsif ($tp_name eq 'i915:i915_request_wait_end:') { - next unless exists $reqwait{$key}; + die if $ring eq VENG and not exists $queues{$ctx}; + + if (exists $reqwait{$key}) { + $reqwait{$key}->{'end'} = $time; + } else { # Virtual 
engine + my $vkey = db_key(VENG, $ctx, $seqno); + + die unless exists $reqwait{$vkey}; - $reqwait{$key}->{'end'} = $time; + # If the wait started on the virtual engine, attribute + # it to it completely. + $reqwait{$vkey}->{'end'} = $time; + } } elsif ($tp_name eq 'i915:i915_request_add:') { if (exists $queue{$key}) { $ctxdb{$orig_ctx}++; @@ -401,19 +417,52 @@ while (<>) { } $queue{$key} = $time; + if ($ring eq VENG and not exists $queues{$ctx}) { + $queues{$ctx} = 1 ; + $cids{$ctx} = $cid++; + $ctxmap{$cids{$ctx}} = $ctx; + } } elsif ($tp_name eq 'i915:i915_request_submit:') { die if exists $submit{$key}; die unless exists $queue{$key}; + die if $ring eq VENG and not exists $queues{$ctx}; $submit{$key} = $time; } elsif ($tp_name eq 'i915:i915_request_in:') { + my ($q, $s); my %req; # preemption delete $db{$key} if exists $db{$key}; - die unless exists $queue{$key}; - die unless exists $submit{$key}; + unless (exists $queue{$key}) { + # Virtual engine + my $vkey = db_key(VENG, $ctx, $seqno); + my %req; + + die unless exists $queues{$ctx}; + die unless exists $queue{$vkey}; + die unless exists $submit{$vkey}; + + # Create separate request record on the queue timeline + $q = $queue{$vkey}; + $s = $submit{$vkey}; + $req{'queue'} = $q; + $req{'submit'} = $s; + $req{'start'} = $time; + $req{'end'} = $time; + $req{'ring'} = VENG; + $req{'seqno'} = $seqno; + $req{'ctx'} = $ctx; + $req{'name'} = $ctx . '/' . $seqno; + $req{'global'} = $tp{'global'}; + $req{'port'} = $tp{'port'}; + + $vdb{$vkey} = \%req; + } else { + $q = $queue{$key}; + $s = $submit{$key}; + } $req{'start'} = $time; $req{'ring'} = $ring; @@ -423,8 +472,9 @@ while (<>) { $req{'name'} = $ctx . '/' . 
$seqno; $req{'global'} = $tp{'global'}; $req{'port'} = $tp{'port'}; - $req{'queue'} = $queue{$key}; - $req{'submit'} = $submit{$key}; + $req{'queue'} = $q; + $req{'submit'} = $s; + $req{'virtual'} = 1 if exists $queues{$ctx}; $rings{$ring} = $gid++ unless exists $rings{$ring}; $ringmap{$rings{$ring}} = $ring; $db{$key} = \%req; @@ -720,8 +770,10 @@ foreach my $key (@sorted_keys) { $running{$ring} += $end - $start if $correct_durations or not exists $db{$key}->{'no-end'}; - $runnable{$ring} += $db{$key}->{'execute-delay'}; - $queued{$ring} += $start - $db{$key}->{'execute-delay'} - $db{$key}->{'queue'}; + unless (exists $db{$key}->{'virtual'}) { + $runnable{$ring} += $db{$key}->{'execute-delay'}; + $queued{$ring} += $start - $db{$key}->{'execute-delay'} - $db{$key}->{'queue'}; + } $batch_count{$ring}++; @@ -840,6 +892,12 @@ foreach my $key (keys %reqwait) { $reqw{$reqwait{$key}->{'ring'}} += $reqwait{$key}->{'end'} - $reqwait{$key}->{'start'}; } +# Add up all request waits per virtual engine +my %vreqw; +foreach my $key (keys %reqwait) { + $vreqw{$reqwait{$key}->{'ctx'}} += $reqwait{$key}->{'end'} - $reqwait{$key}->{'start'}; +} + say sprintf('GPU: %.2f%% idle, %.2f%% busy', $flat_busy{'gpu-idle'}, $flat_busy{'gpu-busy'}) unless $html; @@ -961,18 +1019,24 @@ ENDHTML sub html_stats { my ($stats, $group, $id) = @_; + my $veng = exists $stats->{'virtual'} ? 1 : 0; my $name; - $name = 'Ring' . $group; + $name = $veng ? 'Virtual' : 'Ring'; + $name .= $group; $name .= '

'; - $name .= sprintf('%.2f', $stats->{'idle'}) . '% idle<br><br>'; - $name .= sprintf('%.2f', $stats->{'busy'}) . '% busy<br>'; + unless ($veng) { + $name .= sprintf('%.2f', $stats->{'idle'}) . '% idle<br><br>'; + $name .= sprintf('%.2f', $stats->{'busy'}) . '% busy<br>'; + } $name .= sprintf('%.2f', $stats->{'runnable'}) . '% runnable<br>'; $name .= sprintf('%.2f', $stats->{'queued'}) . '% queued<br><br>'; $name .= sprintf('%.2f', $stats->{'wait'}) . '% wait<br><br>'; $name .= $stats->{'count'} . ' batches<br>'; - $name .= sprintf('%.2f', $stats->{'avg'}) . 'us avg batch<br>'; - $name .= sprintf('%.2f', $stats->{'total-avg'}) . 'us avg engine batch<br>'; + unless ($veng) { + $name .= sprintf('%.2f', $stats->{'avg'}) . 'us avg batch<br>'; + $name .= sprintf('%.2f', $stats->{'total-avg'}) . 'us avg engine batch<br>
'; + } $name .= '
'; print "\t{id: $id, content: '$name'},\n"; @@ -981,17 +1045,24 @@ sub html_stats sub stdio_stats { my ($stats, $group, $id) = @_; + my $veng = exists $stats->{'virtual'} ? 1 : 0; my $str; - $str = 'Ring' . $group . ': '; + $str = $veng ? 'Virtual' : 'Ring'; + $str .= $group . ': '; $str .= $stats->{'count'} . ' batches, '; - $str .= sprintf('%.2f (%.2f) avg batch us, ', $stats->{'avg'}, $stats->{'total-avg'}); - $str .= sprintf('%.2f', $stats->{'idle'}) . '% idle, '; - $str .= sprintf('%.2f', $stats->{'busy'}) . '% busy, '; + unless ($veng) { + $str .= sprintf('%.2f (%.2f) avg batch us, ', + $stats->{'avg'}, $stats->{'total-avg'}); + $str .= sprintf('%.2f', $stats->{'idle'}) . '% idle, '; + $str .= sprintf('%.2f', $stats->{'busy'}) . '% busy, '; + } + $str .= sprintf('%.2f', $stats->{'runnable'}) . '% runnable, '; $str .= sprintf('%.2f', $stats->{'queued'}) . '% queued, '; $str .= sprintf('%.2f', $stats->{'wait'}) . '% wait'; - if ($avg_delay_stats) { + + if ($avg_delay_stats and not $veng) { $str .= ', submit/execute/save-avg=('; $str .= sprintf('%.2f/%.2f/%.2f)', $stats->{'submit'}, $stats->{'execute'}, $stats->{'save'}); } @@ -1013,8 +1084,16 @@ foreach my $group (sort keys %rings) { $stats{'idle'} = (1.0 - $flat_busy{$ring} / $elapsed) * 100.0; $stats{'busy'} = $running{$ring} / $elapsed * 100.0; - $stats{'runnable'} = $runnable{$ring} / $elapsed * 100.0; - $stats{'queued'} = $queued{$ring} / $elapsed * 100.0; + if (exists $runnable{$ring}) { + $stats{'runnable'} = $runnable{$ring} / $elapsed * 100.0; + } else { + $stats{'runnable'} = 0; + } + if (exists $queued{$ring}) { + $stats{'queued'} = $queued{$ring} / $elapsed * 100.0; + } else { + $stats{'queued'} = 0; + } $reqw{$ring} = 0 unless exists $reqw{$ring}; $stats{'wait'} = $reqw{$ring} / $elapsed * 100.0; $stats{'count'} = $batch_count{$ring}; @@ -1031,6 +1110,59 @@ foreach my $group (sort keys %rings) { } } +sub sortVQueue { + my $as = $vdb{$a}->{'queue'}; + my $bs = $vdb{$b}->{'queue'}; + my $val; + + 
$val = $as <=> $bs; + $val = $a cmp $b if $val == 0; + + return $val; +} + +my @sorted_vkeys = sort sortVQueue keys %vdb; +my (%vqueued, %vrunnable); + +foreach my $key (@sorted_vkeys) { + my $ctx = $vdb{$key}->{'ctx'}; + + $vdb{$key}->{'submit-delay'} = $vdb{$key}->{'submit'} - $vdb{$key}->{'queue'}; + $vdb{$key}->{'execute-delay'} = $vdb{$key}->{'start'} - $vdb{$key}->{'submit'}; + + $vqueued{$ctx} += $vdb{$key}->{'submit-delay'}; + $vrunnable{$ctx} += $vdb{$key}->{'execute-delay'}; +} + +my $veng_id = $engine_start_id + scalar(keys %rings); + +foreach my $cid (sort keys %ctxmap) { + my $ctx = $ctxmap{$cid}; + my $elapsed = $last_ts - $first_ts; + my %stats; + + $stats{'virtual'} = 1; + if (exists $vrunnable{$ctx}) { + $stats{'runnable'} = $vrunnable{$ctx} / $elapsed * 100.0; + } else { + $stats{'runnable'} = 0; + } + if (exists $vqueued{$ctx}) { + $stats{'queued'} = $vqueued{$ctx} / $elapsed * 100.0; + } else { + $stats{'queued'} = 0; + } + $vreqw{$ctx} = 0 unless exists $vreqw{$ctx}; + $stats{'wait'} = $vreqw{$ctx} / $elapsed * 100.0; + $stats{'count'} = scalar(grep {$ctx == $vdb{$_}->{'ctx'}} keys %vdb); + + if ($html) { + html_stats(\%stats, $cid, $veng_id++); + } else { + stdio_stats(\%stats, $cid, $veng_id++); + } +} + exit 0 unless $html; print <{'name'}, $db{$key}->{'ctx'}, $db{$key}->{'seqno'}); my ($queue, $start, $notify, $end) = ($db{$key}->{'queue'}, $db{$key}->{'start'}, $db{$key}->{'notify'}, $db{$key}->{'end'}); @@ -1147,7 +1280,7 @@ foreach my $key (sort sortQueue keys %db) { my $skey; # submit to execute - unless (exists $skip_box{'queue'}) { + unless (exists $skip_box{'queue'} or exists $db{$key}->{'virtual'}) { $skey = 2 * $max_seqno * $ctx + 2 * $seqno; $style = box_style($ctx, 'queue'); $content = "$name
<br>$db{$key}->{'submit-delay'}us ($db{$key}->{'execute-delay'}us)"; @@ -1158,7 +1291,7 @@ foreach my $key (sort sortQueue keys %db) { # execute to start $engine_start = $db{$key}->{'start'} unless defined $engine_start; - unless (exists $skip_box{'ready'}) { + unless (exists $skip_box{'ready'} or exists $db{$key}->{'virtual'}) { $skey = 2 * $max_seqno * $ctx + 2 * $seqno + 1; $style = box_style($ctx, 'ready'); $content = "$name<br>$db{$key}->{'execute-delay'}us<br>
"; @@ -1199,7 +1332,7 @@ foreach my $key (sort sortQueue keys %db) { $last_ts = $end; - last if $i > $max_items; + last if ++$req > $max_requests; } push @freqs, [$prev_freq_ts, $last_ts, $prev_freq] if $prev_freq; @@ -1232,6 +1365,43 @@ if ($gpu_timeline) { } } +$req = 0; +$veng_id = $engine_start_id + scalar(keys %rings); +foreach my $key (@sorted_vkeys) { + my ($name, $ctx, $seqno) = ($vdb{$key}->{'name'}, $vdb{$key}->{'ctx'}, $vdb{$key}->{'seqno'}); + my $queue = $vdb{$key}->{'queue'}; + my $submit = $vdb{$key}->{'submit'}; + my $engine_start = $db{$key}->{'engine-start'}; + my ($content, $style, $startend, $skey); + my $group = $veng_id + $cids{$ctx}; + my $subgroup = $ctx - $min_ctx; + my $type = ' type: \'range\','; + my $duration; + + # submit to execute + unless (exists $skip_box{'queue'}) { + $skey = 2 * $max_seqno * $ctx + 2 * $seqno; + $style = box_style($ctx, 'queue'); + $content = "$name
<br>$vdb{$key}->{'submit-delay'}us ($vdb{$key}->{'execute-delay'}us)"; + $startend = 'start: ' . $queue . ', end: ' . $submit; + print "\t{id: $i, key: $skey, $type group: $group, subgroup: $subgroup, subgroupOrder: $subgroup, content: '$content', $startend, style: \'$style\'},\n"; + $i++; + } + + # execute to start + $engine_start = $vdb{$key}->{'start'} unless defined $engine_start; + unless (exists $skip_box{'ready'}) { + $skey = 2 * $max_seqno * $ctx + 2 * $seqno + 1; + $style = box_style($ctx, 'ready'); + $content = "$name<br>$vdb{$key}->{'execute-delay'}us<br>
"; + $startend = 'start: ' . $submit . ', end: ' . $engine_start; + print "\t{id: $i, key: $skey, $type group: $group, subgroup: $subgroup, subgroupOrder: $subgroup, content: '$content', $startend, style: \'$style\'},\n"; + $i++; + } + + last if ++$req > $max_requests; +} + my $end_ts = $first_ts + $width_us; $first_ts = $first_ts; From patchwork Thu Oct 18 15:28:01 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10647495 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 50F7615E2 for ; Thu, 18 Oct 2018 15:28:35 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4120528C20 for ; Thu, 18 Oct 2018 15:28:35 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 35A2028CCF; Thu, 18 Oct 2018 15:28:35 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 2876728D0A for ; Thu, 18 Oct 2018 15:28:34 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D179489F03; Thu, 18 Oct 2018 15:28:29 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wr1-x442.google.com (mail-wr1-x442.google.com [IPv6:2a00:1450:4864:20::442]) by gabe.freedesktop.org (Postfix) with ESMTPS id 484C389B46 for ; Thu, 18 Oct 
2018 15:28:27 +0000 (UTC) Received: by mail-wr1-x442.google.com with SMTP id 61-v6so34169850wrb.6 for ; Thu, 18 Oct 2018 08:28:27 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=G/o09CJEDF2kXcq32WI57yTpVmpC6fRDYCe1a0UzdCY=; b=fyi7kcREhj3HFVNwA5oVzXOfsvfCpBGXm+WfHkBlqAZx01MkPS84ICGd1n10B7hnXT WMzHamGfR1OsWsSk7X2tOuQcJxtM4Gg5Di3yHITH5td+QDAuJPNX3Aa0zDGe0rOy9I+h 4RTmivehvkKjWWwsMXtrdG+lnvpRVjUBlu0ChoN/TzUUyFVMY+SYOZmEXxWeBMNgUMp6 AsyiDRGImhrEt63Arab1lUXFU9wawQY1Hv41xYsIWAkACAgz3awbX0oFpLEvimWGzjS0 Xa8kPHmL5CzjZ8qqW214EOItMSlS0RSfMlaHyMwmb83wE3Ei6TUade1QOcHJjwrc7xaG MlNg== X-Gm-Message-State: ABuFfoibgUeek7I6ydpBXcpC4AbVXrULcLg4UD14ud4twNW+FYP3hSSK dr2+lPtVpAnLyT75yx1FBYlwGg== X-Google-Smtp-Source: ACcGV60Lkkz4eZysy4QNXAjuEceLpTQCBPUq+1aHSkV3HpmYzMkTuJSK5cYfPLVr5RTSpDuTBN1/Nw== X-Received: by 2002:a5d:4292:: with SMTP id k18-v6mr28590265wrq.225.1539876505817; Thu, 18 Oct 2018 08:28:25 -0700 (PDT) Received: from localhost.localdomain ([91.110.193.16]) by smtp.gmail.com with ESMTPSA id i6-v6sm19530387wrq.4.2018.10.18.08.28.25 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 18 Oct 2018 08:28:25 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: igt-dev@lists.freedesktop.org Date: Thu, 18 Oct 2018 16:28:01 +0100 Message-Id: <20181018152815.31816-4-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> References: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH i-g-t 03/17] trace.pl: Virtual engine preemption support X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Intel-gfx@lists.freedesktop.org 
MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin Use the 'completed?' tracepoint field to detect more robustly when a request has been preempted and remove it from the engine database if so. Otherwise the script can hit a scenario where the same global seqno will be mentioned multiple times (on an engine seqno) which aborts processing. Signed-off-by: Tvrtko Ursulin --- scripts/trace.pl | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/scripts/trace.pl b/scripts/trace.pl index 72747b046202..a55e4f39539a 100755 --- a/scripts/trace.pl +++ b/scripts/trace.pl @@ -481,12 +481,16 @@ while (<>) { } elsif ($tp_name eq 'i915:i915_request_out:') { my $gkey = global_key($ring, $tp{'global'}); - die unless exists $db{$key}; - die unless exists $db{$key}->{'start'}; - die if exists $db{$key}->{'end'}; + if ($tp{'completed?'}) { + die unless exists $db{$key}; + die unless exists $db{$key}->{'start'}; + die if exists $db{$key}->{'end'}; - $db{$key}->{'end'} = $time; - $db{$key}->{'notify'} = $notify{$gkey} if exists $notify{$gkey}; + $db{$key}->{'end'} = $time; + $db{$key}->{'notify'} = $notify{$gkey} if exists $notify{$gkey}; + } else { + delete $db{$key}; + } } elsif ($tp_name eq 'i915:intel_engine_notify:') { my $gkey = global_key($ring, $seqno); From patchwork Thu Oct 18 15:28:02 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10647491 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BF8AF112B for ; Thu, 18 Oct 2018 15:28:33 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B066F28CCF for ; Thu, 18 Oct 2018 15:28:33 +0000 (UTC) Received: by 
mail.wl.linuxfoundation.org (Postfix, from userid 486) id A54C228D0C; Thu, 18 Oct 2018 15:28:33 +0000 (UTC) X-Received: by
2002:a1c:3c4:: with SMTP id 187-v6mr812532wmd.90.1539876506669; Thu, 18 Oct 2018 08:28:26 -0700 (PDT) Received: from localhost.localdomain ([91.110.193.16]) by smtp.gmail.com with ESMTPSA id i6-v6sm19530387wrq.4.2018.10.18.08.28.25 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 18 Oct 2018 08:28:26 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: igt-dev@lists.freedesktop.org Date: Thu, 18 Oct 2018 16:28:02 +0100 Message-Id: <20181018152815.31816-5-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> References: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH i-g-t 04/17] wsim/media-bench: i915 balancing X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Intel-gfx@lists.freedesktop.org MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin Support i915 virtual engine from gem_wsim (-b i915) and media-bench.pl Signed-off-by: Tvrtko Ursulin --- benchmarks/gem_wsim.c | 289 ++++++++++++++++++++++++++++++++++------- scripts/media-bench.pl | 9 +- 2 files changed, 251 insertions(+), 47 deletions(-) diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c index e0709487897b..e1c73855150b 100644 --- a/benchmarks/gem_wsim.c +++ b/benchmarks/gem_wsim.c @@ -59,6 +59,20 @@ #define LOCAL_I915_EXEC_FENCE_IN (1<<16) #define LOCAL_I915_EXEC_FENCE_OUT (1<<17) +struct local_drm_i915_gem_context_create_v2 { + /* output: id of new context*/ + __u32 ctx_id; + __u32 flags; +#define LOCAL_I915_GEM_CONTEXT_SHARE_GTT 0x1 +#define LOCAL_I915_GEM_CONTEXT_SINGLE_TIMELINE 0x2 + __u32 share_ctx; + __u32 pad; +}; + +#define LOCAL_DRM_IOCTL_I915_GEM_CONTEXT_CREATE 
DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct local_drm_i915_gem_context_create_v2) + +#define LOCAL_I915_CONTEXT_PARAM_ENGINES 0x7 + enum intel_engine_id { RCS, BCS, @@ -143,6 +157,14 @@ struct w_step DECLARE_EWMA(uint64_t, rt, 4, 2) +struct ctx { + uint32_t id; + int priority; + bool targets_instance; + bool wants_balance; + unsigned int static_vcs; +}; + struct workload { unsigned int id; @@ -164,11 +186,7 @@ struct workload struct timespec repeat_start; unsigned int nr_ctxs; - struct { - uint32_t id; - int priority; - unsigned int static_vcs; - } *ctx_list; + struct ctx *ctx_list; int sync_timeline; uint32_t sync_seqno; @@ -225,6 +243,7 @@ static int fd; #define HEARTBEAT (1<<7) #define GLOBAL_BALANCE (1<<8) #define DEPSYNC (1<<9) +#define I915 (1<<10) #define SEQNO_IDX(engine) ((engine) * 16) #define SEQNO_OFFSET(engine) (SEQNO_IDX(engine) * sizeof(uint32_t)) @@ -836,7 +855,11 @@ eb_set_engine(struct drm_i915_gem_execbuffer2 *eb, if (engine == VCS2 && (flags & VCS2REMAP)) engine = BCS; - eb->flags = eb_engine_map[engine]; + if ((flags & I915) && engine == VCS) { + eb->flags = 0; + } else { + eb->flags = eb_engine_map[engine]; + } } static void @@ -862,6 +885,23 @@ get_status_objects(struct workload *wrk) return wrk->status_object; } +static struct ctx * +__get_ctx(struct workload *wrk, struct w_step *w) +{ + return &wrk->ctx_list[w->context * 2]; +} + +static uint32_t +get_ctxid(struct workload *wrk, struct w_step *w) +{ + struct ctx *ctx = __get_ctx(wrk, w); + + if (ctx->targets_instance && ctx->wants_balance && w->engine == VCS) + return wrk->ctx_list[w->context * 2 + 1].id; + else + return wrk->ctx_list[w->context * 2].id; +} + static void alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags) { @@ -914,7 +954,7 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags) w->eb.buffers_ptr = to_user_pointer(w->obj); w->eb.buffer_count = j + 1; - w->eb.rsvd1 = wrk->ctx_list[w->context].id; + 
w->eb.rsvd1 = get_ctxid(wrk, w); if (flags & SWAPVCS && engine == VCS1) engine = VCS2; @@ -927,17 +967,29 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags) printf("%x|", w->obj[i].handle); printf(" %10lu flags=%llx bb=%x[%u] ctx[%u]=%u\n", w->bb_sz, w->eb.flags, w->bb_handle, j, w->context, - wrk->ctx_list[w->context].id); + get_ctxid(wrk, w)); #endif } +static void __ctx_set_prio(uint32_t ctx_id, unsigned int prio) +{ + struct drm_i915_gem_context_param param = { + .ctx_id = ctx_id, + .param = I915_CONTEXT_PARAM_PRIORITY, + .value = prio, + }; + + if (prio) + gem_context_set_param(fd, &param); +} + static void prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) { unsigned int ctx_vcs = 0; int max_ctx = -1; struct w_step *w; - int i; + int i, j; wrk->id = id; wrk->prng = rand(); @@ -968,44 +1020,174 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) } } + /* + * Pre-scan workload steps to allocate context list storage. + */ for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) { - if ((int)w->context > max_ctx) { - int delta = w->context + 1 - wrk->nr_ctxs; + int ctx = w->context * 2 + 1; /* Odd slots are special. */ + int delta; + + if (ctx <= max_ctx) + continue; + + delta = ctx + 1 - wrk->nr_ctxs; + + wrk->nr_ctxs += delta; + wrk->ctx_list = realloc(wrk->ctx_list, + wrk->nr_ctxs * sizeof(*wrk->ctx_list)); + memset(&wrk->ctx_list[wrk->nr_ctxs - delta], 0, + delta * sizeof(*wrk->ctx_list)); + + max_ctx = ctx; + } + + /* + * Identify if contexts target specific engine instances and if they + * want to be balanced. 
+ */ + for (j = 0; j < wrk->nr_ctxs; j += 2) { + bool targets = false; + bool balance = false; + + for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) { + if (w->type != BATCH) + continue; + + if (w->context != (j / 2)) + continue; - wrk->nr_ctxs += delta; - wrk->ctx_list = realloc(wrk->ctx_list, - wrk->nr_ctxs * - sizeof(*wrk->ctx_list)); - memset(&wrk->ctx_list[wrk->nr_ctxs - delta], 0, - delta * sizeof(*wrk->ctx_list)); + if (w->engine == VCS) + balance = true; + else + targets = true; + } - max_ctx = w->context; + if (flags & I915) { + wrk->ctx_list[j].targets_instance = targets; + wrk->ctx_list[j].wants_balance = balance; } + } - if (!wrk->ctx_list[w->context].id) { - struct drm_i915_gem_context_create arg = {}; + /* + * Create and configure contexts. + */ + for (i = 0; i < wrk->nr_ctxs; i += 2) { + struct ctx *ctx = &wrk->ctx_list[i]; + uint32_t ctx_id, share_ctx = 0; - drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &arg); - igt_assert(arg.ctx_id); + if (ctx->id) + continue; - wrk->ctx_list[w->context].id = arg.ctx_id; + if (flags & I915) { + struct local_drm_i915_gem_context_create_v2 args = { }; - if (flags & GLOBAL_BALANCE) { - wrk->ctx_list[w->context].static_vcs = context_vcs_rr; - context_vcs_rr ^= 1; - } else { - wrk->ctx_list[w->context].static_vcs = ctx_vcs; - ctx_vcs ^= 1; - } + /* Find existing context to share ppgtt with. 
*/ + for (j = 0; j < wrk->nr_ctxs; j++) { + if (!wrk->ctx_list[j].id) + continue; - if (wrk->prio) { - struct drm_i915_gem_context_param param = { - .ctx_id = arg.ctx_id, - .param = I915_CONTEXT_PARAM_PRIORITY, - .value = wrk->prio, - }; - gem_context_set_param(fd, &param); + args.flags |= LOCAL_I915_GEM_CONTEXT_SHARE_GTT; + args.share_ctx = share_ctx = + wrk->ctx_list[j].id; + break; } + + if (!ctx->targets_instance) + args.flags |= LOCAL_I915_GEM_CONTEXT_SINGLE_TIMELINE; + + drmIoctl(fd, LOCAL_DRM_IOCTL_I915_GEM_CONTEXT_CREATE, + &args); + + ctx_id = args.ctx_id; + } else { + struct drm_i915_gem_context_create args = {}; + + drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &args); + ctx_id = args.ctx_id; + } + + igt_assert(ctx_id); + ctx->id = ctx_id; + + if (flags & GLOBAL_BALANCE) { + ctx->static_vcs = context_vcs_rr; + context_vcs_rr ^= 1; + } else { + ctx->static_vcs = ctx_vcs; + ctx_vcs ^= 1; + } + + __ctx_set_prio(ctx_id, wrk->prio); + + /* + * Do we need a separate context to satisfy this workloads which + * both want to target specific engines and be balanced by i915? 
+ */ + if ((flags & I915) && ctx->wants_balance && + ctx->targets_instance) { + struct local_drm_i915_gem_context_create_v2 args = {}; + + igt_assert(share_ctx); + + args.flags = LOCAL_I915_GEM_CONTEXT_SINGLE_TIMELINE | + LOCAL_I915_GEM_CONTEXT_SHARE_GTT; + args.share_ctx = share_ctx; + + drmIoctl(fd, LOCAL_DRM_IOCTL_I915_GEM_CONTEXT_CREATE, + &args); + + igt_assert(args.ctx_id); + ctx_id = args.ctx_id; + wrk->ctx_list[i + 1].id = args.ctx_id; + + __ctx_set_prio(ctx_id, wrk->prio); + } + + if (ctx->wants_balance) { + #define LOCAL_I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 + + struct local_i915_user_extension { + __u64 next_extension; + __u64 name; + }; + + struct local_i915_context_engines_load_balance { + struct local_i915_user_extension base; + + __u64 flags; /* all undefined flags must be zero */ + __u64 engines_mask; + + __u64 mbz[4]; /* reserved for future use; must be zero */ + } load_balance = { + .base.name = LOCAL_I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE, + .engines_mask = -1, + }; + + struct local_i915_context_param_engines { + __u64 extensions; + + struct { + __u16 class; /* see enum drm_i915_gem_engine_class */ + __u16 instance; + } engines[2]; + } __attribute__((packed)) set_engines = { + .extensions = to_user_pointer(&load_balance), + .engines = { + { .class = I915_ENGINE_CLASS_VIDEO, + .instance = 0 }, + { .class = I915_ENGINE_CLASS_VIDEO, + .instance = 1 }, + }, + }; + + struct drm_i915_gem_context_param param = { + .ctx_id = ctx_id, + .param = LOCAL_I915_CONTEXT_PARAM_ENGINES, + .size = sizeof(set_engines), + .value = to_user_pointer(&set_engines), + }; + + gem_context_set_param(fd, &param); } } @@ -1380,7 +1562,7 @@ static enum intel_engine_id context_balance(const struct workload_balancer *balancer, struct workload *wrk, struct w_step *w) { - return get_vcs_engine(wrk->ctx_list[w->context].static_vcs); + return get_vcs_engine(__get_ctx(wrk, w)->static_vcs); } static unsigned int @@ -1574,6 +1756,12 @@ static const struct workload_balancer all_balancers[] 
= { .get_qd = get_engine_busy, .balance = busy_avg_balance, }, + { + .id = 11, + .name = "i915", + .desc = "i915 balancing.", + .flags = I915, + }, }; static unsigned int @@ -1952,7 +2140,8 @@ static void *run_workload(void *data) last_sync = false; wrk->nr_bb[engine]++; - if (engine == VCS && wrk->balancer) { + if (engine == VCS && wrk->balancer && + wrk->balancer->balance) { engine = wrk->balancer->balance(wrk->balancer, wrk, w); wrk->nr_bb[engine]++; @@ -2379,6 +2568,12 @@ int main(int argc, char **argv) return 1; } + if ((flags & VCS2REMAP) && (flags & I915)) { + if (verbose) + fprintf(stderr, "VCS remapping not supported with i915 balancing!\n"); + return 1; + } + if (!nop_calibration) { if (verbose > 1) printf("Calibrating nop delay with %u%% tolerance...\n", @@ -2464,11 +2659,17 @@ int main(int argc, char **argv) printf("%u client%s.\n", clients, clients > 1 ? "s" : ""); if (flags & SWAPVCS) printf("Swapping VCS rings between clients.\n"); - if (flags & GLOBAL_BALANCE) - printf("Using %s balancer in global mode.\n", - balancer->name); - else if (balancer) + if (flags & GLOBAL_BALANCE) { + if (flags & I915) { + printf("Ignoring global balancing with i915!\n"); + flags &= ~GLOBAL_BALANCE; + } else { + printf("Using %s balancer in global mode.\n", + balancer->name); + } + } else if (balancer) { printf("Using %s balancer.\n", balancer->name); + } } if (master_workload >= 0 && clients == 1) @@ -2485,7 +2686,7 @@ int main(int argc, char **argv) if (flags & SWAPVCS && i & 1) flags_ &= ~SWAPVCS; - if (flags & GLOBAL_BALANCE) { + if ((flags & GLOBAL_BALANCE) && !(flags & I915)) { w[i]->balancer = &global_balancer; w[i]->global_wrk = w[0]; w[i]->global_balancer = balancer; diff --git a/scripts/media-bench.pl b/scripts/media-bench.pl index 066b542f95df..ddf9c0ec05c8 100755 --- a/scripts/media-bench.pl +++ b/scripts/media-bench.pl @@ -49,10 +49,11 @@ my $nop; my %opts; my @balancers = ( 'rr', 'rand', 'qd', 'qdr', 'qdavg', 'rt', 'rtr', 'rtavg', - 'context', 'busy', 
'busy-avg' ); + 'context', 'busy', 'busy-avg', 'i915' ); my %bal_skip_H = ( 'rr' => 1, 'rand' => 1, 'context' => 1, , 'busy' => 1, - 'busy-avg' => 1 ); -my %bal_skip_R = ( 'context' => 1 ); + 'busy-avg' => 1, 'i915' => 1 ); +my %bal_skip_R = ( 'context' => 1, 'i915' => 1 ); +my %bal_skip_G = ( 'i915' => 1 ); my @workloads = ( 'media_load_balance_17i7.wsim', @@ -498,6 +499,8 @@ foreach my $wrk (@saturation_workloads) { my $bid; if ($bal ne '') { + next GBAL if $G =~ '-G' and exists $bal_skip_G{$bal}; + push @xargs, "-b $bal"; push @xargs, '-R' unless exists $bal_skip_R{$bal}; push @xargs, $G if $G ne ''; From patchwork Thu Oct 18 15:28:03 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10647497 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EDB7615E2 for ; Thu, 18 Oct 2018 15:28:37 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DF29528C20 for ; Thu, 18 Oct 2018 15:28:37 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D3B0228D0A; Thu, 18 Oct 2018 15:28:37 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 6DA4F28C20 for ; Thu, 18 Oct 2018 15:28:37 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 71D886E028; Thu, 18 Oct 2018 15:28:36 
+0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wr1-x444.google.com (mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4BEAD89B46 for ; Thu, 18 Oct 2018 15:28:29 +0000 (UTC) Received: by mail-wr1-x444.google.com with SMTP id a13-v6so34181341wrt.5 for ; Thu, 18 Oct 2018 08:28:29 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=IMOZtAfKHgCLP3UGqfwGDGpyct1Bn+/Fe+cpSQ+behY=; b=i0+AAY4boqN8bdR/+RO8hZx4xfo8LaypFloZ91i6sZJt/gf3wybiCeQRvERd/33AFB NX/i0n+tOeF4lpwqbaZ8u8AM60IUTj3fSiCKPhQ7G7agRL+4f2owST2wHI9d00kZaD/n jtkupulqNa81LsH6snJT/5sQcxdjtk/W0GI2lk0/YVCHnXcoLRRk4W/JiYpNoRbIgFVT /VgNn9pXjd5mczGNaYPH7YHcGxEK2RAMidKFenWKGBBveVI7eie7OQ65SVdTZ6Exw+As e61d4fZm0nZmYia6B9lniH8NjkikIFTrYEwZy745mUjcRodm7ji+L3IbUXZa0ovowPZD ZxRw== X-Gm-Message-State: ABuFfohK1v4aX8XEhCCYC55m5NC/HA6Dzj/JAB4qCABeOKb7/xGZaoSc 9eTL7McyNp6jTPe2pLFYPHwpCw== X-Google-Smtp-Source: ACcGV60FD7e+BxILX+G2VQl4tfCuOdeEHVZHdWpfgQj726AbU0S3EwTdkO6PfDll8c6F2QWA3zbYPQ== X-Received: by 2002:adf:9206:: with SMTP id 6-v6mr29981838wrj.275.1539876507689; Thu, 18 Oct 2018 08:28:27 -0700 (PDT) Received: from localhost.localdomain ([91.110.193.16]) by smtp.gmail.com with ESMTPSA id i6-v6sm19530387wrq.4.2018.10.18.08.28.26 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 18 Oct 2018 08:28:27 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: igt-dev@lists.freedesktop.org Date: Thu, 18 Oct 2018 16:28:03 +0100 Message-Id: <20181018152815.31816-6-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> References: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH i-g-t 05/17] gem_wsim: Use IGT uapi headers 
X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Intel-gfx@lists.freedesktop.org MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin We are moving towards bumping the uAPI headers more often instead of using too many local struct/ioctl/param definitions, since the latter are more challenging to rebase and maintain. Signed-off-by: Tvrtko Ursulin --- benchmarks/gem_wsim.c | 68 +++++++++++-------------------------------- 1 file changed, 17 insertions(+), 51 deletions(-) diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c index e1c73855150b..adfc2b1bc819 100644 --- a/benchmarks/gem_wsim.c +++ b/benchmarks/gem_wsim.c @@ -41,7 +41,6 @@ #include #include - #include "intel_chipset.h" #include "intel_reg.h" #include "drm.h" @@ -56,23 +55,6 @@ #include "ewma.h" -#define LOCAL_I915_EXEC_FENCE_IN (1<<16) -#define LOCAL_I915_EXEC_FENCE_OUT (1<<17) - -struct local_drm_i915_gem_context_create_v2 { - /* output: id of new context*/ - __u32 ctx_id; - __u32 flags; -#define LOCAL_I915_GEM_CONTEXT_SHARE_GTT 0x1 -#define LOCAL_I915_GEM_CONTEXT_SINGLE_TIMELINE 0x2 - __u32 share_ctx; - __u32 pad; -}; - -#define LOCAL_DRM_IOCTL_I915_GEM_CONTEXT_CREATE DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct local_drm_i915_gem_context_create_v2) - -#define LOCAL_I915_CONTEXT_PARAM_ENGINES 0x7 - enum intel_engine_id { RCS, BCS, @@ -873,7 +855,7 @@ eb_update_flags(struct w_step *w, enum intel_engine_id engine, igt_assert(w->emit_fence <= 0); if (w->emit_fence) - w->eb.flags |= LOCAL_I915_EXEC_FENCE_OUT; + w->eb.flags |= I915_EXEC_FENCE_OUT; } static struct drm_i915_gem_exec_object2 * @@ -1079,24 +1061,23 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) continue; if (flags & 
I915) { - struct local_drm_i915_gem_context_create_v2 args = { }; + struct drm_i915_gem_context_create_v2 args = { }; /* Find existing context to share ppgtt with. */ for (j = 0; j < wrk->nr_ctxs; j++) { if (!wrk->ctx_list[j].id) continue; - args.flags |= LOCAL_I915_GEM_CONTEXT_SHARE_GTT; + args.flags |= I915_GEM_CONTEXT_SHARE_GTT; args.share_ctx = share_ctx = wrk->ctx_list[j].id; break; } if (!ctx->targets_instance) - args.flags |= LOCAL_I915_GEM_CONTEXT_SINGLE_TIMELINE; + args.flags |= I915_GEM_CONTEXT_SINGLE_TIMELINE; - drmIoctl(fd, LOCAL_DRM_IOCTL_I915_GEM_CONTEXT_CREATE, - &args); + drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &args); ctx_id = args.ctx_id; } else { @@ -1125,16 +1106,15 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) */ if ((flags & I915) && ctx->wants_balance && ctx->targets_instance) { - struct local_drm_i915_gem_context_create_v2 args = {}; + struct drm_i915_gem_context_create_v2 args = {}; igt_assert(share_ctx); - args.flags = LOCAL_I915_GEM_CONTEXT_SINGLE_TIMELINE | - LOCAL_I915_GEM_CONTEXT_SHARE_GTT; + args.flags = I915_GEM_CONTEXT_SINGLE_TIMELINE | + I915_GEM_CONTEXT_SHARE_GTT; args.share_ctx = share_ctx; - drmIoctl(fd, LOCAL_DRM_IOCTL_I915_GEM_CONTEXT_CREATE, - &args); + drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &args); igt_assert(args.ctx_id); ctx_id = args.ctx_id; @@ -1144,24 +1124,10 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) } if (ctx->wants_balance) { - #define LOCAL_I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 - - struct local_i915_user_extension { - __u64 next_extension; - __u64 name; - }; - - struct local_i915_context_engines_load_balance { - struct local_i915_user_extension base; - - __u64 flags; /* all undefined flags must be zero */ - __u64 engines_mask; - - __u64 mbz[4]; /* reserved for future use; must be zero */ - } load_balance = { - .base.name = LOCAL_I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE, - .engines_mask = -1, - }; + struct 
i915_context_engines_load_balance load_balance = + { .base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE, + .engines_mask = -1, + }; struct local_i915_context_param_engines { __u64 extensions; @@ -1182,7 +1148,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) struct drm_i915_gem_context_param param = { .ctx_id = ctx_id, - .param = LOCAL_I915_CONTEXT_PARAM_ENGINES, + .param = I915_CONTEXT_PARAM_ENGINES, .size = sizeof(set_engines), .value = to_user_pointer(&set_engines), }; @@ -1994,16 +1960,16 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine, igt_assert(tgt >= 0 && tgt < w->idx); igt_assert(wrk->steps[tgt].emit_fence > 0); - w->eb.flags |= LOCAL_I915_EXEC_FENCE_IN; + w->eb.flags |= I915_EXEC_FENCE_IN; w->eb.rsvd2 = wrk->steps[tgt].emit_fence; } - if (w->eb.flags & LOCAL_I915_EXEC_FENCE_OUT) + if (w->eb.flags & I915_EXEC_FENCE_OUT) gem_execbuf_wr(fd, &w->eb); else gem_execbuf(fd, &w->eb); - if (w->eb.flags & LOCAL_I915_EXEC_FENCE_OUT) { + if (w->eb.flags & I915_EXEC_FENCE_OUT) { w->emit_fence = w->eb.rsvd2 >> 32; igt_assert(w->emit_fence > 0); } From patchwork Thu Oct 18 15:28:04 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10647501 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 92B5B15E2 for ; Thu, 18 Oct 2018 15:28:40 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8559928C20 for ; Thu, 18 Oct 2018 15:28:40 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 799A228D0A; Thu, 18 Oct 2018 15:28:40 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 
tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 46F3928C20 for ; Thu, 18 Oct 2018 15:28:40 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 712066E027; Thu, 18 Oct 2018 15:28:36 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wr1-x442.google.com (mail-wr1-x442.google.com [IPv6:2a00:1450:4864:20::442]) by gabe.freedesktop.org (Postfix) with ESMTPS id E5C6D89F27 for ; Thu, 18 Oct 2018 15:28:29 +0000 (UTC) Received: by mail-wr1-x442.google.com with SMTP id 63-v6so34126203wra.11 for ; Thu, 18 Oct 2018 08:28:29 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=QUpPyk+E/7qBU13aSyZPfOMN8atg//VVh+aOeAGvn4M=; b=kpja8JWUXM1eKOKlnDFZ5O0vW9t9mPU3St3oJEhymYevVCTALosx2+X0oIqP3v+g+Z Z5qkFUkwTNebYtjf9EGOqOclLimQolv7f3QebFo1TU2JQOY4TBC3kt89jTz7AJ4VLGU8 2HqSKkKazr8Xj8J1ingOi8AE23F+gebhrscIpJ/Gs5lbbXL30FL9VzE2Zh4DCMM9MFzT q91HiL7NfKslXXaxTWa1dbGIsWVLnMzOSjd5+DaJUkjLBZboBM8BMKstwZyCw/seEnX2 ljLcKpPs4zI4rpYEbduf8kG2XDrAvOu2/zRU/R/LoB1YcHtv/ZoYcPNY2y3/ix2JSW6c UjnQ== X-Gm-Message-State: ABuFfohpKRKUAlDX7aZgX0p9B/fW0/+dWGJcKLaXduJEqVz/gNQ/MWlh lSP8ETQKi6LAk9dHxirSnkx6CQ== X-Google-Smtp-Source: ACcGV609GdTnjFbmNNz5tYnMtawqovbk4QO7Oz2JVUivti7HkwvCFdzK/yh779MWa64aV40pVABsuw== X-Received: by 2002:adf:e8d2:: with SMTP id k18-v6mr28479556wrn.104.1539876508428; Thu, 18 Oct 2018 08:28:28 -0700 (PDT) Received: from localhost.localdomain ([91.110.193.16]) by smtp.gmail.com with ESMTPSA id i6-v6sm19530387wrq.4.2018.10.18.08.28.27 (version=TLS1_2 
cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 18 Oct 2018 08:28:28 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: igt-dev@lists.freedesktop.org Date: Thu, 18 Oct 2018 16:28:04 +0100 Message-Id: <20181018152815.31816-7-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> References: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH i-g-t 06/17] gem_wsim: Fix shadowed local X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Intel-gfx@lists.freedesktop.org MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin Signed-off-by: Tvrtko Ursulin --- benchmarks/gem_wsim.c | 1 - 1 file changed, 1 deletion(-) diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c index adfc2b1bc819..2561817622f6 100644 --- a/benchmarks/gem_wsim.c +++ b/benchmarks/gem_wsim.c @@ -1170,7 +1170,6 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) */ for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) { struct w_step *w2; - int j; if (w->type != PREEMPTION) continue; From patchwork Thu Oct 18 15:28:05 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10647503 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3A274112B for ; Thu, 18 Oct 2018 15:28:43 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2B71428C20 for ; Thu, 18 Oct 2018 15:28:43 
+0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2009028D0A; Thu, 18 Oct 2018 15:28:43 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 803BC28C20 for ; Thu, 18 Oct 2018 15:28:42 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CB1976E087; Thu, 18 Oct 2018 15:28:40 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wr1-x443.google.com (mail-wr1-x443.google.com [IPv6:2a00:1450:4864:20::443]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2A51C6E022 for ; Thu, 18 Oct 2018 15:28:31 +0000 (UTC) Received: by mail-wr1-x443.google.com with SMTP id y16so34184003wrw.3 for ; Thu, 18 Oct 2018 08:28:31 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=vLZ9XFT7yyUTuFaOcpkgrZN3DwIalaOSZw4YJbSi6cc=; b=XPVG+DqA+dUyKRJLyMIMHuEt4FqNWRLw0TgBfT0Y/grlF51ovg4fZkKX2TBEzxtHh5 hZVlyvI03cuV8SAevES4VzfKuxRrFTOUHbn8CCXhhjBjvtPQu7AoW/Two9m1kScK1+qE 3IpDGrWy59f3I7NU2V/jKBiinRKAR0j/QbdR9aehZ0jDLdZfcsKe9Yu0FxmgPgIr0tBx WdEXisypl4ccwBT5IW0dzLMU335Kcz+tQ2lnzs1nvqwxUwuBGp+3Ooh5uxOn3TljDSMS CPRStYr5zJSCSXa9bEvVCPYOHQNB7Ss46HjIyJKNKbE3Ei0s2ilKFhAYJX8woIJeO/5G YJXw== X-Gm-Message-State: ABuFfohjMhJkMD3skZmnXVnGoP8tsDqT0AoucnMZCKtbWS24QFcUSMW1 9xuhmEI6GPmpDRFGrYK8mRUEiw== X-Google-Smtp-Source: ACcGV63vuLuMXFW3O4goQsodiOe1Reh4JMAV00PCniRZoaWpZjIDUs2PtEwpUB1JTVq1IqmZStURJw== 
X-Received: by 2002:adf:d181:: with SMTP id h1-v6mr7184274wri.138.1539876509575; Thu, 18 Oct 2018 08:28:29 -0700 (PDT) Received: from localhost.localdomain ([91.110.193.16]) by smtp.gmail.com with ESMTPSA id i6-v6sm19530387wrq.4.2018.10.18.08.28.28 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 18 Oct 2018 08:28:28 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: igt-dev@lists.freedesktop.org Date: Thu, 18 Oct 2018 16:28:05 +0100 Message-Id: <20181018152815.31816-8-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> References: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH i-g-t 07/17] gem_wsim: Factor out common error handling X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Intel-gfx@lists.freedesktop.org MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin There is a repeated error handling pattern which can be moved to a macro for better readability in the command parsing loop. Signed-off-by: Tvrtko Ursulin --- benchmarks/gem_wsim.c | 244 +++++++++++++++--------------------------- 1 file changed, 88 insertions(+), 156 deletions(-) diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c index 2561817622f6..a6ee6c493424 100644 --- a/benchmarks/gem_wsim.c +++ b/benchmarks/gem_wsim.c @@ -290,6 +290,27 @@ parse_dependencies(unsigned int nr_steps, struct w_step *w, char *_desc) return 0; } +static void __attribute__((format(printf, 1, 2))) +wsim_err(const char *fmt, ...) +{ + va_list ap; + + if (!verbose) + return; + + va_start(ap, fmt); + vfprintf(stderr, fmt, ap); + va_end(ap); +} + +#define check_arg(cond, fmt, ...) 
\ +{ \ + if (cond) { \ + wsim_err(fmt, __VA_ARGS__); \ + return NULL; \ + } \ +} + static struct workload * parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) { @@ -320,14 +341,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) if ((field = strtok_r(fstart, ".", &fctx)) != NULL) { tmp = atoi(field); - if (tmp <= 0) { - if (verbose) - fprintf(stderr, - "Invalid delay at step %u!\n", - nr_steps); - return NULL; - } - + check_arg(tmp <= 0, + "Invalid delay at step %u!\n", + nr_steps); step.type = DELAY; step.delay = tmp; goto add_step; @@ -336,14 +352,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) if ((field = strtok_r(fstart, ".", &fctx)) != NULL) { tmp = atoi(field); - if (tmp <= 0) { - if (verbose) - fprintf(stderr, - "Invalid period at step %u!\n", - nr_steps); - return NULL; - } - + check_arg(tmp <= 0, + "Invalid period at step %u!\n", + nr_steps); step.type = PERIOD; step.period = tmp; goto add_step; @@ -353,25 +364,17 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) while ((field = strtok_r(fstart, ".", &fctx)) != NULL) { tmp = atoi(field); - if (tmp <= 0 && nr == 0) { - if (verbose) - fprintf(stderr, - "Invalid context at step %u!\n", - nr_steps); - return NULL; - } - - if (nr == 0) { + check_arg(nr == 0 && tmp <= 0, + "Invalid context at step %u!\n", + nr_steps); + check_arg(nr > 1, + "Invalid priority format at step %u!\n", + nr_steps); + + if (nr == 0) step.context = tmp; - } else if (nr == 1) { + else step.priority = tmp; - } else { - if (verbose) - fprintf(stderr, - "Invalid priority format at step %u!\n", - nr_steps); - return NULL; - } nr++; } @@ -382,15 +385,10 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) if ((field = strtok_r(fstart, ".", &fctx)) != NULL) { tmp = atoi(field); - if (tmp >= 0 || - ((int)nr_steps + tmp) < 0) { - if (verbose) - fprintf(stderr, - "Invalid sync target at step 
%u!\n", - nr_steps); - return NULL; - } - + check_arg(tmp >= 0 || + ((int)nr_steps + tmp) < 0, + "Invalid sync target at step %u!\n", + nr_steps); step.type = SYNC; step.target = tmp; goto add_step; @@ -399,14 +397,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) if ((field = strtok_r(fstart, ".", &fctx)) != NULL) { tmp = atoi(field); - if (tmp < 0) { - if (verbose) - fprintf(stderr, - "Invalid throttle at step %u!\n", - nr_steps); - return NULL; - } - + check_arg(tmp < 0, + "Invalid throttle at step %u!\n", + nr_steps); step.type = THROTTLE; step.throttle = tmp; goto add_step; @@ -415,14 +408,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) if ((field = strtok_r(fstart, ".", &fctx)) != NULL) { tmp = atoi(field); - if (tmp < 0) { - if (verbose) - fprintf(stderr, - "Invalid qd throttle at step %u!\n", - nr_steps); - return NULL; - } - + check_arg(tmp < 0, + "Invalid qd throttle at step %u!\n", + nr_steps); step.type = QD_THROTTLE; step.throttle = tmp; goto add_step; @@ -431,14 +419,9 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) if ((field = strtok_r(fstart, ".", &fctx)) != NULL) { tmp = atoi(field); - if (tmp >= 0) { - if (verbose) - fprintf(stderr, - "Invalid sw fence signal at step %u!\n", - nr_steps); - return NULL; - } - + check_arg(tmp >= 0, + "Invalid sw fence signal at step %u!\n", + nr_steps); step.type = SW_FENCE_SIGNAL; step.target = tmp; goto add_step; @@ -451,31 +434,20 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) while ((field = strtok_r(fstart, ".", &fctx)) != NULL) { tmp = atoi(field); - if (tmp <= 0 && nr == 0) { - if (verbose) - fprintf(stderr, - "Invalid context at step %u!\n", - nr_steps); - return NULL; - } else if (tmp < 0 && nr == 1) { - if (verbose) - fprintf(stderr, - "Invalid preemption period at step %u!\n", - nr_steps); - return NULL; - } - - if (nr == 0) { + check_arg(nr == 0 && tmp <= 0, + "Invalid 
context at step %u!\n", + nr_steps); + check_arg(nr == 1 && tmp < 0, + "Invalid preemption period at step %u!\n", + nr_steps); + check_arg(nr > 1, + "Invalid preemption format at step %u!\n", + nr_steps); + + if (nr == 0) step.context = tmp; - } else if (nr == 1) { + else step.period = tmp; - } else { - if (verbose) - fprintf(stderr, - "Invalid preemption format at step %u!\n", - nr_steps); - return NULL; - } nr++; } @@ -485,13 +457,8 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) } tmp = atoi(field); - if (tmp < 0) { - if (verbose) - fprintf(stderr, - "Invalid ctx id at step %u!\n", - nr_steps); - return NULL; - } + check_arg(tmp < 0, "Invalid ctx id at step %u!\n", + nr_steps); step.context = tmp; valid++; @@ -512,13 +479,8 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) } } - if (old_valid == valid) { - if (verbose) - fprintf(stderr, - "Invalid engine id at step %u!\n", - nr_steps); - return NULL; - } + check_arg(old_valid == valid, + "Invalid engine id at step %u!\n", nr_steps); } if ((field = strtok_r(fstart, ".", &fctx)) != NULL) { @@ -528,25 +490,19 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) fstart = NULL; tmpl = strtol(field, &sep, 10); - if (tmpl <= 0 || tmpl == LONG_MIN || tmpl == LONG_MAX) { - if (verbose) - fprintf(stderr, - "Invalid duration at step %u!\n", - nr_steps); - return NULL; - } + check_arg(tmpl <= 0 || tmpl == LONG_MIN || + tmpl == LONG_MAX, + "Invalid duration at step %u!\n", nr_steps); step.duration.min = tmpl; if (sep && *sep == '-') { tmpl = strtol(sep + 1, NULL, 10); - if (tmpl <= 0 || tmpl <= step.duration.min || - tmpl == LONG_MIN || tmpl == LONG_MAX) { - if (verbose) - fprintf(stderr, - "Invalid duration range at step %u!\n", - nr_steps); - return NULL; - } + check_arg(tmpl <= 0 || + tmpl <= step.duration.min || + tmpl == LONG_MIN || + tmpl == LONG_MAX, + "Invalid duration range at step %u!\n", + nr_steps); step.duration.max = 
tmpl; } else { step.duration.max = step.duration.min; @@ -559,13 +515,8 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) fstart = NULL; tmp = parse_dependencies(nr_steps, &step, field); - if (tmp < 0) { - if (verbose) - fprintf(stderr, - "Invalid dependency at step %u!\n", - nr_steps); - return NULL; - } + check_arg(tmp < 0, + "Invalid dependency at step %u!\n", nr_steps); valid++; } @@ -573,25 +524,16 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) if ((field = strtok_r(fstart, ".", &fctx)) != NULL) { fstart = NULL; - if (strlen(field) != 1 || - (field[0] != '0' && field[0] != '1')) { - if (verbose) - fprintf(stderr, - "Invalid wait boolean at step %u!\n", - nr_steps); - return NULL; - } + check_arg(strlen(field) != 1 || + (field[0] != '0' && field[0] != '1'), + "Invalid wait boolean at step %u!\n", + nr_steps); step.sync = field[0] - '0'; valid++; } - if (valid != 5) { - if (verbose) - fprintf(stderr, "Invalid record at step %u!\n", - nr_steps); - return NULL; - } + check_arg(valid != 5, "Invalid record at step %u!\n", nr_steps); step.type = BATCH; @@ -636,15 +578,10 @@ add_step: for (i = 0; i < nr_steps; i++) { for (j = 0; j < steps[i].fence_deps.nr; j++) { tmp = steps[i].idx + steps[i].fence_deps.list[j]; - if (tmp < 0 || tmp >= i || - (steps[tmp].type != BATCH && - steps[tmp].type != SW_FENCE)) { - if (verbose) - fprintf(stderr, - "Invalid dependency target %u!\n", - i); - return NULL; - } + check_arg(tmp < 0 || tmp >= i || + (steps[tmp].type != BATCH && + steps[tmp].type != SW_FENCE), + "Invalid dependency target %u!\n", i); steps[tmp].emit_fence = -1; } } @@ -653,14 +590,9 @@ add_step: for (i = 0; i < nr_steps; i++) { if (steps[i].type == SW_FENCE_SIGNAL) { tmp = steps[i].idx + steps[i].target; - if (tmp < 0 || tmp >= i || - steps[tmp].type != SW_FENCE) { - if (verbose) - fprintf(stderr, - "Invalid sw fence target %u!\n", - i); - return NULL; - } + check_arg(tmp < 0 || tmp >= i || + 
steps[tmp].type != SW_FENCE, + "Invalid sw fence target %u!\n", i); } }
From patchwork Thu Oct 18 15:28:06 2018
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 10647511
From: Tvrtko Ursulin
To: igt-dev@lists.freedesktop.org
Date: Thu, 18 Oct 2018 16:28:06 +0100
Message-Id: <20181018152815.31816-9-tvrtko.ursulin@linux.intel.com>
In-Reply-To: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com>
References: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com>
Subject: [Intel-gfx] [PATCH i-g-t 08/17] gem_wsim: More wsim_err
Cc: Intel-gfx@lists.freedesktop.org

A few more opportunities to compact the code by using the error logging helper.
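
For reference, a minimal sketch of the pattern these conversions rely on. The wsim_err() prototype and the shape of check_arg() follow what is visible elsewhere in this series. The body of wsim_err(), the default value of verbose and the parse_delay() caller are assumptions made up for illustration and may differ from the real gem_wsim code:

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumption: stands in for the global verbosity switch the old code tested. */
static int verbose = 1;

/* Print to stderr only when verbose, like the open-coded blocks it replaces. */
static void wsim_err(const char *fmt, ...)
{
	va_list ap;

	if (!verbose)
		return;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
}

/* Same shape as check_arg() in this series: log, then bail out with NULL. */
#define check_arg(cond, fmt, ...) \
{ \
	if (cond) { \
		wsim_err(fmt, __VA_ARGS__); \
		return NULL; \
	} \
}

/* Hypothetical caller: returns its argument on success, NULL on a bad delay. */
static const char *parse_delay(const char *field, unsigned int step)
{
	int tmp = atoi(field);

	check_arg(tmp <= 0, "Invalid delay at step %u!\n", step);

	return field;
}
```

Call sites which return something other than NULL, such as the ones in main() touched by this patch, cannot use check_arg() and call wsim_err() directly.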
Signed-off-by: Tvrtko Ursulin --- benchmarks/gem_wsim.c | 54 ++++++++++++------------------------------- 1 file changed, 15 insertions(+), 39 deletions(-) diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c index a6ee6c493424..0010f46c357d 100644 --- a/benchmarks/gem_wsim.c +++ b/benchmarks/gem_wsim.c @@ -2366,9 +2366,7 @@ int main(int argc, char **argv) switch (c) { case 'W': if (master_workload >= 0) { - if (verbose) - fprintf(stderr, - "Only one master workload can be given!\n"); + wsim_err("Only one master workload can be given!\n"); return 1; } master_workload = nr_w_args; @@ -2381,9 +2379,7 @@ int main(int argc, char **argv) break; case 'a': if (append_workload_arg) { - if (verbose) - fprintf(stderr, - "Only one append workload can be given!\n"); + wsim_err("Only one append workload can be given!\n"); return 1; } append_workload_arg = optarg; @@ -2444,10 +2440,8 @@ int main(int argc, char **argv) } if (!balancer) { - if (verbose) - fprintf(stderr, - "Unknown balancing mode '%s'!\n", - optarg); + wsim_err("Unknown balancing mode '%s'!\n", + optarg); return 1; } break; @@ -2460,14 +2454,12 @@ int main(int argc, char **argv) } if ((flags & HEARTBEAT) && !(flags & SEQNO)) { - if (verbose) - fprintf(stderr, "Heartbeat needs a seqno based balancer!\n"); + wsim_err("Heartbeat needs a seqno based balancer!\n"); return 1; } if ((flags & VCS2REMAP) && (flags & I915)) { - if (verbose) - fprintf(stderr, "VCS remapping not supported with i915 balancing!\n"); + wsim_err("VCS remapping not supported with i915 balancing!\n"); return 1; } @@ -2484,31 +2476,24 @@ int main(int argc, char **argv) } if (!nr_w_args) { - if (verbose) - fprintf(stderr, "No workload descriptor(s)!\n"); + wsim_err("No workload descriptor(s)!\n"); return 1; } if (nr_w_args > 1 && clients > 1) { - if (verbose) - fprintf(stderr, - "Cloned clients cannot be combined with multiple workloads!\n"); + wsim_err("Cloned clients cannot be combined with multiple workloads!\n"); return 1; } if ((flags & 
GLOBAL_BALANCE) && !balancer) { - if (verbose) - fprintf(stderr, - "Balancer not specified in global balancing mode!\n"); + wsim_err("Balancer not specified in global balancing mode!\n"); return 1; } if (append_workload_arg) { append_workload_arg = load_workload_descriptor(append_workload_arg); if (!append_workload_arg) { - if (verbose) - fprintf(stderr, - "Failed to load append workload descriptor!\n"); + wsim_err("Failed to load append workload descriptor!\n"); return 1; } } @@ -2517,9 +2502,7 @@ int main(int argc, char **argv) struct w_arg arg = { NULL, append_workload_arg, 0 }; app_w = parse_workload(&arg, flags, NULL); if (!app_w) { - if (verbose) - fprintf(stderr, - "Failed to parse append workload!\n"); + wsim_err("Failed to parse append workload!\n"); return 1; } } @@ -2531,18 +2514,13 @@ int main(int argc, char **argv) w_args[i].desc = load_workload_descriptor(w_args[i].filename); if (!w_args[i].desc) { - if (verbose) - fprintf(stderr, - "Failed to load workload descriptor %u!\n", - i); + wsim_err("Failed to load workload descriptor %u!\n", i); return 1; } wrk[i] = parse_workload(&w_args[i], flags, app_w); if (!wrk[i]) { - if (verbose) - fprintf(stderr, - "Failed to parse workload %u!\n", i); + wsim_err("Failed to parse workload %u!\n", i); return 1; } } @@ -2602,10 +2580,8 @@ int main(int argc, char **argv) if (balancer && balancer->init) { int ret = balancer->init(balancer, w[i]); if (ret) { - if (verbose) - fprintf(stderr, - "Failed to initialize balancing! (%u=%d)\n", - i, ret); + wsim_err("Failed to initialize balancing! 
(%u=%d)\n", + i, ret); return 1; } }
From patchwork Thu Oct 18 15:28:07 2018
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 10647509
From: Tvrtko Ursulin
To: igt-dev@lists.freedesktop.org
Date: Thu, 18 Oct 2018 16:28:07 +0100
Message-Id: <20181018152815.31816-10-tvrtko.ursulin@linux.intel.com>
In-Reply-To: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com>
References: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com>
Subject: [Intel-gfx] [PATCH i-g-t 09/17] gem_wsim: Submit fence support
Cc: Intel-gfx@lists.freedesktop.org

Add support for submit fences in a way similar to how normal input fences are handled.
Eg: 1.RCS.500-1000.0.0 1.VCS1.3000.s-1.0 1.VCS2.3000.s-2.0 Submit fences are signalled when the originating request enters the submission backend. Signed-off-by: Tvrtko Ursulin --- benchmarks/gem_wsim.c | 20 ++++++++++++++++---- benchmarks/wsim/README | 17 +++++++++++++++++ 2 files changed, 33 insertions(+), 4 deletions(-) diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c index 0010f46c357d..a77a322ee309 100644 --- a/benchmarks/gem_wsim.c +++ b/benchmarks/gem_wsim.c @@ -86,6 +86,7 @@ enum w_type struct deps { int nr; + bool submit_fence; int *list; }; @@ -254,17 +255,23 @@ parse_dependencies(unsigned int nr_steps, struct w_step *w, char *_desc) w->data_deps.list == w->fence_deps.list); while ((token = strtok_r(tstart, "/", &tctx)) != NULL) { + bool submit_fence = false; char *str = token; struct deps *deps; int dep; tstart = NULL; - if (strlen(token) > 1 && token[0] == 'f') { + if (str[0] == '-' || (str[0] >= '0' && str[0] <= '9')) { + deps = &w->data_deps; + } else { + if (str[0] == 's') + submit_fence = true; + else if (str[0] != 'f') + return -1; + deps = &w->fence_deps; str++; - } else { - deps = &w->data_deps; } dep = atoi(str); @@ -282,6 +289,7 @@ parse_dependencies(unsigned int nr_steps, struct w_step *w, char *_desc) sizeof(*deps->list) * deps->nr); igt_assert(deps->list); deps->list[deps->nr - 1] = dep; + deps->submit_fence = submit_fence; } } @@ -1891,7 +1899,11 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine, igt_assert(tgt >= 0 && tgt < w->idx); igt_assert(wrk->steps[tgt].emit_fence > 0); - w->eb.flags |= I915_EXEC_FENCE_IN; + if (w->fence_deps.submit_fence) + w->eb.flags |= I915_EXEC_FENCE_SUBMIT; + else + w->eb.flags |= I915_EXEC_FENCE_IN; + w->eb.rsvd2 = wrk->steps[tgt].emit_fence; } diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README index 205cd6c93afb..4786f116b4ac 100644 --- a/benchmarks/wsim/README +++ b/benchmarks/wsim/README @@ -114,6 +114,23 @@ runnable. 
When the second RCS batch completes the standalone fence is signaled which allows the two VCS batches to be executed. Finally we wait until the both VCS batches have completed before starting the (optional) next iteration. +Submit fences +------------- + +Submit fences are a type of input fence which is signalled when the originating +batch buffer is submitted to the GPU. (In contrast to normal sync fences, which +are signalled when completed.) + +Submit fences use the same syntax as sync fences, with the lower-case 's' +being used to select them. Eg: + + 1.RCS.500-1000.0.0 + 1.VCS1.3000.s-1.0 + 1.VCS2.3000.s-2.0 + +Here the VCS1 and VCS2 batches will only be submitted for execution once the +RCS batch enters the GPU. + Context priority ----------------
From patchwork Thu Oct 18 15:28:08 2018
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 10647517
From: Tvrtko Ursulin
To: igt-dev@lists.freedesktop.org
Date: Thu, 18 Oct 2018 16:28:08 +0100
Message-Id: <20181018152815.31816-11-tvrtko.ursulin@linux.intel.com>
In-Reply-To:
<20181018152815.31816-1-tvrtko.ursulin@linux.intel.com>
References: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com>
Subject: [Intel-gfx] [PATCH i-g-t 10/17] gem_wsim: Extract str to engine lookup
Cc: Intel-gfx@lists.freedesktop.org

Signed-off-by: Tvrtko Ursulin --- benchmarks/gem_wsim.c | 34 +++++++++++++++++++++------------- 1 file changed, 21 insertions(+), 13 deletions(-) diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c index a77a322ee309..17325d2ceaf6 100644 --- a/benchmarks/gem_wsim.c +++ b/benchmarks/gem_wsim.c @@ -319,6 +319,18 @@ wsim_err(const char *fmt, ...) } \ } +static int str_to_engine(const char *str) +{ + unsigned int i; + + for (i = 0; i < ARRAY_SIZE(ring_str_map); i++) { + if (!strcasecmp(str, ring_str_map[i])) + return i; + } + + return -1; +} + static struct workload * parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) { @@ -473,22 +485,18 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) } if ((field = strtok_r(fstart, ".", &fctx)) != NULL) { - unsigned int old_valid = valid; - fstart = NULL; - for (i = 0; i < ARRAY_SIZE(ring_str_map); i++) { - if (!strcasecmp(field, ring_str_map[i])) { - step.engine = i; - if (step.engine == BCS) - bcs_used = true; - valid++; - break; - } - } - - check_arg(old_valid == valid, + i = str_to_engine(field); + check_arg(i < 0, "Invalid engine id at step %u!\n", nr_steps); + if (i >= 0) + valid++; + + step.engine = i; + + if (step.engine == BCS) + bcs_used = true; } if ((field = strtok_r(fstart, ".", &fctx)) != NULL) {
From patchwork Thu Oct 18 15:28:09 2018
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 10647505
From: Tvrtko Ursulin
To: igt-dev@lists.freedesktop.org
Date: Thu, 18 Oct 2018 16:28:09 +0100
Message-Id: <20181018152815.31816-12-tvrtko.ursulin@linux.intel.com>
In-Reply-To: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com>
References: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com>
Subject: [Intel-gfx] [PATCH i-g-t 11/17] gem_wsim: Engine map support
Cc: Intel-gfx@lists.freedesktop.org

Support new i915 uAPI for configuring contexts with engine maps. Please refer to the README file for more detailed explanation.
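
As a rough illustration of the engine map list parsing added here, the following standalone sketch splits a '|' separated list and, like parse_engine_map() in this patch, accepts only VCS1 and VCS2 for now. It is not the gem_wsim code: the enum ordering beyond DEFAULT/RCS/BCS/VCS, the engine_names table, token_to_engine() and parse_map() are assumptions for illustration. It also swaps strtok_r() and strcasecmp() for strcspn() and a case-sensitive compare, so the input string is left unmodified and empty tokens are rejected rather than skipped:

```c
#include <stddef.h>
#include <string.h>

/* Assumption: entries after VCS are illustrative, not taken from gem_wsim. */
enum engine_id { DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS, NUM_ENGINES };

static const char *engine_names[NUM_ENGINES] = {
	[DEFAULT] = "DEFAULT",
	[RCS] = "RCS",
	[BCS] = "BCS",
	[VCS] = "VCS",
	[VCS1] = "VCS1",
	[VCS2] = "VCS2",
	[VECS] = "VECS",
};

/* Case-sensitive lookup of a token of the given length, -1 if unknown. */
static int token_to_engine(const char *tok, size_t len)
{
	int i;

	for (i = 0; i < NUM_ENGINES; i++) {
		if (strlen(engine_names[i]) == len &&
		    !strncmp(tok, engine_names[i], len))
			return i;
	}

	return -1;
}

/* Hypothetical helper: fill map[] from "VCS1|VCS2", return count or -1. */
static int parse_map(const char *desc, enum engine_id *map, int max)
{
	int nr = 0;

	for (;;) {
		size_t len = strcspn(desc, "|");
		int engine = token_to_engine(desc, len);

		/* Only explicit video engine instances are accepted. */
		if (engine != VCS1 && engine != VCS2)
			return -1;

		if (nr == max)
			return -1;

		map[nr++] = (enum engine_id)engine;

		if (desc[len] == '\0')
			break;

		desc += len + 1; /* Skip past the '|' separator. */
	}

	return nr;
}
```

Going by the parser in this patch, a step such as M.1.VCS1|VCS2 would hand the "VCS1|VCS2" part to this kind of routine, with the preceding field selecting context 1.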
Signed-off-by: Tvrtko Ursulin --- benchmarks/gem_wsim.c | 212 ++++++++++++++++++++++++++++++++++------- benchmarks/wsim/README | 17 +++- 2 files changed, 192 insertions(+), 37 deletions(-) diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c index 17325d2ceaf6..fbec23ad1753 100644 --- a/benchmarks/gem_wsim.c +++ b/benchmarks/gem_wsim.c @@ -56,6 +56,7 @@ #include "ewma.h" enum intel_engine_id { + DEFAULT, RCS, BCS, VCS, @@ -80,7 +81,8 @@ enum w_type SW_FENCE, SW_FENCE_SIGNAL, CTX_PRIORITY, - PREEMPTION + PREEMPTION, + ENGINE_MAP }; struct deps @@ -114,6 +116,10 @@ struct w_step int throttle; int fence_signal; int priority; + struct { + unsigned int engine_map_count; + enum intel_engine_id *engine_map; + }; }; /* Implementation details */ @@ -143,6 +149,8 @@ DECLARE_EWMA(uint64_t, rt, 4, 2) struct ctx { uint32_t id; int priority; + unsigned int engine_map_count; + enum intel_engine_id *engine_map; bool targets_instance; bool wants_balance; unsigned int static_vcs; @@ -201,10 +209,10 @@ struct workload int fd; bool first; unsigned int num_engines; - unsigned int engine_map[5]; + unsigned int engine_map[NUM_ENGINES]; uint64_t t_prev; - uint64_t prev[5]; - double busy[5]; + uint64_t prev[NUM_ENGINES]; + double busy[NUM_ENGINES]; } busy_balancer; }; @@ -235,6 +243,7 @@ static int fd; #define REG(x) (volatile uint32_t *)((volatile char *)igt_global_mmio + x) static const char *ring_str_map[NUM_ENGINES] = { + [DEFAULT] = "DEFAULT", [RCS] = "RCS", [BCS] = "BCS", [VCS] = "VCS", @@ -331,6 +340,37 @@ static int str_to_engine(const char *str) return -1; } +static int parse_engine_map(struct w_step *step, const char *_str) +{ + char *token, *tctx = NULL, *tstart = (char *)_str; + + while ((token = strtok_r(tstart, "|", &tctx))) { + enum intel_engine_id engine; + + tstart = NULL; + + if (!strcmp(token, "DEFAULT")) + return -1; + else if (!strcmp(token, "VCS")) + return -1; + + engine = str_to_engine(token); + if ((int)engine < 0) + return -1; + + if (engine != VCS1 && 
engine != VCS2) + return -1; /* TODO */ + + step->engine_map_count++; + step->engine_map = realloc(step->engine_map, + step->engine_map_count * + sizeof(step->engine_map[0])); + step->engine_map[step->engine_map_count - 1] = engine; + } + + return 0; +} + static struct workload * parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) { @@ -449,6 +489,33 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) } else if (!strcmp(field, "f")) { step.type = SW_FENCE; goto add_step; + } else if (!strcmp(field, "M")) { + unsigned int nr = 0; + while ((field = strtok_r(fstart, ".", &fctx)) != + NULL) { + tmp = atoi(field); + check_arg(nr == 0 && tmp <= 0, + "Invalid context at step %u!\n", + nr_steps); + check_arg(nr > 1, + "Invalid engine map format at step %u!\n", + nr_steps); + + if (nr == 0) { + step.context = tmp; + } else { + tmp = parse_engine_map(&step, + field); + check_arg(tmp < 0, + "Invalid engine map list at step %u!\n", + nr_steps); + } + + nr++; + } + + step.type = ENGINE_MAP; + goto add_step; } else if (!strcmp(field, "X")) { unsigned int nr = 0; while ((field = strtok_r(fstart, ".", &fctx)) != @@ -490,9 +557,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) i = str_to_engine(field); check_arg(i < 0, "Invalid engine id at step %u!\n", nr_steps); - if (i >= 0) - valid++; - + valid++; step.engine = i; if (step.engine == BCS) @@ -769,6 +834,7 @@ terminate_bb(struct w_step *w, unsigned int flags) } static const unsigned int eb_engine_map[NUM_ENGINES] = { + [DEFAULT] = I915_EXEC_DEFAULT, [RCS] = I915_EXEC_RENDER, [BCS] = I915_EXEC_BLT, [VCS] = I915_EXEC_BSD, @@ -785,18 +851,42 @@ eb_set_engine(struct drm_i915_gem_execbuffer2 *eb, if (engine == VCS2 && (flags & VCS2REMAP)) engine = BCS; - if ((flags & I915) && engine == VCS) { + if ((flags & I915) && engine == VCS) eb->flags = 0; - } else { + else eb->flags = eb_engine_map[engine]; +} + +static unsigned int 
+find_engine_in_map(struct ctx *ctx, enum intel_engine_id engine) +{ + unsigned int i; + + for (i = 0; i < ctx->engine_map_count; i++) { + if (ctx->engine_map[i] == engine) + return i + 1; } + + igt_assert(0); + return 0; +} + +static struct ctx * +__get_ctx(struct workload *wrk, struct w_step *w) +{ + return &wrk->ctx_list[w->context * 2]; } static void -eb_update_flags(struct w_step *w, enum intel_engine_id engine, - unsigned int flags) +eb_update_flags(struct workload *wrk, struct w_step *w, + enum intel_engine_id engine, unsigned int flags) { - eb_set_engine(&w->eb, engine, flags); + struct ctx *ctx = __get_ctx(wrk, w); + + if (ctx->engine_map) + w->eb.flags = find_engine_in_map(ctx, engine); + else + eb_set_engine(&w->eb, engine, flags); w->eb.flags |= I915_EXEC_HANDLE_LUT; w->eb.flags |= I915_EXEC_NO_RELOC; @@ -815,12 +905,6 @@ get_status_objects(struct workload *wrk) return wrk->status_object; } -static struct ctx * -__get_ctx(struct workload *wrk, struct w_step *w) -{ - return &wrk->ctx_list[w->context * 2]; -} - static uint32_t get_ctxid(struct workload *wrk, struct w_step *w) { @@ -890,7 +974,7 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags) engine = VCS2; else if (flags & SWAPVCS && engine == VCS2) engine = VCS1; - eb_update_flags(w, engine, flags); + eb_update_flags(wrk, w, engine, flags); #ifdef DEBUG printf("%u: %u:|", w->idx, w->eb.buffer_count); for (i = 0; i <= j; i++) @@ -913,7 +997,7 @@ static void __ctx_set_prio(uint32_t ctx_id, unsigned int prio) gem_context_set_param(fd, &param); } -static void +static int prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) { unsigned int ctx_vcs = 0; @@ -974,30 +1058,53 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) /* * Identify if contexts target specific engine instances and if they * want to be balanced. + * + * Transfer over engine map configuration from the workload step.
*/ for (j = 0; j < wrk->nr_ctxs; j += 2) { bool targets = false; bool balance = false; for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) { - if (w->type != BATCH) - continue; - if (w->context != (j / 2)) continue; - if (w->engine == VCS) - balance = true; - else - targets = true; + if (w->type == BATCH) { + if (w->engine == VCS) + balance = true; + else + targets = true; + } else if (w->type == ENGINE_MAP) { + wrk->ctx_list[j].engine_map = w->engine_map; + wrk->ctx_list[j].engine_map_count = + w->engine_map_count; + } } - if (flags & I915) { - wrk->ctx_list[j].targets_instance = targets; + wrk->ctx_list[j].targets_instance = targets; + if (flags & I915) wrk->ctx_list[j].wants_balance = balance; + } + + /* + * Ensure VCS is not allowed with engine map contexts. + */ + for (j = 0; j < wrk->nr_ctxs; j += 2) { + for (i = 0, w = wrk->steps; i < wrk->nr_steps; i++, w++) { + if (w->context != (j / 2)) + continue; + + if (w->type != BATCH) + continue; + + if (wrk->ctx_list[j].engine_map && w->engine == VCS) { + wsim_err("Batches targetting engine maps must use explicit engines!\n"); + return -1; + } } } + /* * Create and configure contexts. */ @@ -1008,7 +1115,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) if (ctx->id) continue; - if (flags & I915) { + if ((flags & I915) || ctx->engine_map) { struct drm_i915_gem_context_create_v2 args = { }; /* Find existing context to share ppgtt with. */ @@ -1022,7 +1129,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) break; } - if (!ctx->targets_instance) + if ((!ctx->engine_map && !ctx->targets_instance)) args.flags |= I915_GEM_CONTEXT_SINGLE_TIMELINE; drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &args); @@ -1053,7 +1160,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) * both want to target specific engines and be balanced by i915? 
*/ if ((flags & I915) && ctx->wants_balance && - ctx->targets_instance) { + ctx->targets_instance && !ctx->engine_map) { struct drm_i915_gem_context_create_v2 args = {}; igt_assert(share_ctx); @@ -1071,7 +1178,33 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) __ctx_set_prio(ctx_id, wrk->prio); } - if (ctx->wants_balance) { + if (ctx->engine_map) { + struct local_i915_context_param_engines { + __u64 extensions; + + struct { + __u16 class; /* see enum drm_i915_gem_engine_class */ + __u16 instance; + } engines[ctx->engine_map_count]; + } __attribute__((packed)) set_engines; + struct drm_i915_gem_context_param param = { + .ctx_id = ctx_id, + .param = I915_CONTEXT_PARAM_ENGINES, + .size = sizeof(set_engines), + .value = to_user_pointer(&set_engines), + }; + + set_engines.extensions = 0; + + for (j = 0; j < ctx->engine_map_count; j++) { + set_engines.engines[j].class = + I915_ENGINE_CLASS_VIDEO; /* FIXME */ + set_engines.engines[j].instance = + ctx->engine_map[j] - VCS1; /* FIXME */ + } + + gem_context_set_param(fd, &param); + } else if (ctx->wants_balance) { struct i915_context_engines_load_balance load_balance = { .base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE, .engines_mask = -1, @@ -1151,6 +1284,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) alloc_step_batch(wrk, w, _flags); } + + return 0; } static double elapsed(const struct timespec *start, const struct timespec *end) @@ -1888,7 +2023,7 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine, uint32_t seqno = new_seqno(wrk, engine); unsigned int i; - eb_update_flags(w, engine, flags); + eb_update_flags(wrk, w, engine, flags); if (flags & SEQNO) update_bb_seqno(w, engine, seqno); @@ -2037,7 +2172,8 @@ static void *run_workload(void *data) w->priority; } continue; - } else if (w->type == PREEMPTION) { + } else if (w->type == PREEMPTION || + w->type == ENGINE_MAP) { continue; } @@ -2595,7 +2731,11 @@ int main(int argc, char **argv)
w[i]->print_stats = verbose > 1 || (verbose > 0 && master_workload == i); - prepare_workload(i, w[i], flags_); + if (prepare_workload(i, w[i], flags_)) { + wsim_err("Failed to prepare workload %u!\n", i); + return 1; + } + if (balancer && balancer->init) { int ret = balancer->init(balancer, w[i]); diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README index 4786f116b4ac..20e3e358cd2e 100644 --- a/benchmarks/wsim/README +++ b/benchmarks/wsim/README @@ -3,6 +3,7 @@ Workload descriptor format ctx.engine.duration_us.dependency.wait,... ..[-].[/][...].<0|1>,... +M..[|]... P|X.. d|p|s|t|q|a.,... f @@ -23,10 +24,11 @@ Additional workload steps are also supported: 'q' - Throttle to n max queue depth. 'f' - Create a sync fence. 'a' - Advance the previously created sync fence. + 'M' - Set up engine map. 'P' - Context priority. 'X' - Context preemption control. -Engine ids: RCS, BCS, VCS, VCS1, VCS2, VECS +Engine ids: DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS Example (leading spaces must not be present in the actual file): ---------------------------------------------------------------- @@ -161,3 +163,16 @@ The same context is then marked to have batches which can be preempted every Same as with context priority, context preemption commands are valid until optionally overriden by another preemption control change on the same context. + +Engine maps +----------- + +Engine maps are a per context feature which changes the way engine selection is +done in the driver. + +Example: + + M.1.VCS1|VCS2 + +This sets up context 1 with an engine map containing VCS1 and VCS2 engine. +Submission to this context can now only reference these two engines.
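The engine map handling in this patch can be sketched in isolation as follows. This is a minimal standalone illustration, not the gem_wsim code: `enum engine_id`, `struct engine_map` and `engine_map_index()` are hypothetical stand-ins, the engine name table is assumed from the README's engine id list, and the restriction to VCS1 and VCS2 is assumed from the TODO check in parse_engine_map(). The 1-based return value mirrors the `return i + 1` in find_engine_in_map().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for gem_wsim's enum intel_engine_id. */
enum engine_id { DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS, NUM_ENGINES };

/* Hypothetical container for the per-context engine map. */
struct engine_map {
	unsigned int count;
	enum engine_id *engines;
};

/* Translate an engine name into its id, or -1 when unknown. */
static int str_to_engine(const char *str)
{
	static const char *names[NUM_ENGINES] = {
		"DEFAULT", "RCS", "BCS", "VCS", "VCS1", "VCS2", "VECS",
	};
	int i;

	for (i = 0; i < NUM_ENGINES; i++) {
		if (!strcmp(str, names[i]))
			return i;
	}

	return -1;
}

/* Append every engine of a '|' separated list such as "VCS1|VCS2". */
static int parse_engine_map(struct engine_map *map, const char *str)
{
	char *copy, *tok, *next, *save = NULL;
	int ret = 0;

	assert(map && str);

	/* strtok_r modifies its input, so work on a private copy. */
	copy = strdup(str);
	if (!copy)
		return -1;

	for (next = copy; (tok = strtok_r(next, "|", &save)); next = NULL) {
		enum engine_id *tmp;
		int engine = str_to_engine(tok);

		/* Assumption: only VCS instances are accepted for now. */
		if (engine != VCS1 && engine != VCS2) {
			ret = -1;
			break;
		}

		tmp = realloc(map->engines, (map->count + 1) * sizeof(*tmp));
		if (!tmp) {
			ret = -1;
			break;
		}

		map->engines = tmp;
		map->engines[map->count++] = engine;
	}

	free(copy);

	return ret;
}

/* Return the 1-based position of an engine in the map, 0 if absent. */
static unsigned int
engine_map_index(const struct engine_map *map, enum engine_id engine)
{
	unsigned int i;

	for (i = 0; i < map->count; i++) {
		if (map->engines[i] == engine)
			return i + 1;
	}

	return 0;
}
```

With a map parsed from "VCS1|VCS2", a batch naming VCS2 would resolve to index 2, while an engine outside the map resolves to 0, which is the case the patch guards with an assert.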
From patchwork Thu Oct 18 15:28:10 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10647507 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0FE3B15E2 for ; Thu, 18 Oct 2018 15:28:47 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 01FC428D97 for ; Thu, 18 Oct 2018 15:28:47 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id EA95928DB0; Thu, 18 Oct 2018 15:28:46 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id B8EA128D97 for ; Thu, 18 Oct 2018 15:28:45 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 124876E08F; Thu, 18 Oct 2018 15:28:41 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by gabe.freedesktop.org (Postfix) with ESMTPS id CF1BD6E028 for ; Thu, 18 Oct 2018 15:28:35 +0000 (UTC) Received: by mail-wm1-x343.google.com with SMTP id y144-v6so734515wmd.4 for ; Thu, 18 Oct 2018 08:28:35 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; 
bh=W6EZhQuLWIkir+2S6UoA6LvCeFfGEynyUKj02a0muBU=; b=UYDbB6oRMsEXjDayn5Li9FborP5Xbm70KLHmFA2QoXjUCBKBelQExmZagCRwfksJXd tm7U7EVGeXwmA97YAmnqTWBv8WKnzLW1QV8y1cU1fr6h7TW2X8qaqMDthAI08CU6GWrZ Deq94VSKQNpDb2h96fQGGy6p7Fc9wilpmxYS7Art5UwcXfSCZ5Hq9KvixETWOy1DRVl3 grx/1vNN3UiES/u7BS3DL3qVnnIwZqOdgFrjRUPNWP224pkeAjqfpTs2qjF3X1aCNV+U eKC1DBENMbvyNALpelZGQ9mxoH/0eM2bLA/7cSPU1JMlYCW9OVpRVwR+TNW3pzAyazkA ITRg== X-Gm-Message-State: ABuFfoh6hcLdOi/etvxJWFKG/V9FoaAnuwzib+8cW71w142qnbQjKMju 4mXcyfhEs4pH8IGvCsN/tLaqtQ== X-Google-Smtp-Source: ACcGV62Cn7Mb9KeVeG5jg/o1zvgzoX1/9zy3Ihb41wvNMMpv6hKKTLtCgfWDlLs7LLXzuGpxnULIzg== X-Received: by 2002:a1c:118c:: with SMTP id 134-v6mr773407wmr.75.1539876514362; Thu, 18 Oct 2018 08:28:34 -0700 (PDT) Received: from localhost.localdomain ([91.110.193.16]) by smtp.gmail.com with ESMTPSA id i6-v6sm19530387wrq.4.2018.10.18.08.28.33 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 18 Oct 2018 08:28:33 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: igt-dev@lists.freedesktop.org Date: Thu, 18 Oct 2018 16:28:10 +0100 Message-Id: <20181018152815.31816-13-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> References: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH i-g-t 12/17] gem_wsim: Save some lines by changing to implicit NULL checking X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Intel-gfx@lists.freedesktop.org MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin We can improve the parsing loop readability a bit more by avoiding some line breaks caused by explicit NULL checks. 
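As a small standalone illustration of the idiom (a sketch only, with a hypothetical helper name `split_fields()` that is not part of gem_wsim), a strtok_r() loop using the implicit NULL check fits on a single line:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>

/*
 * Split desc in place on the given delimiters and store up to max field
 * pointers in out. Returns the number of fields stored.
 */
static unsigned int
split_fields(char *desc, const char *delim, char **out, unsigned int max)
{
	char *field, *next = desc, *save = NULL;
	unsigned int nr = 0;

	assert(desc && delim && out);

	/* The assignment result doubles as the loop condition. */
	while (nr < max && (field = strtok_r(next, delim, &save))) {
		next = NULL; /* Continue with the same string next time. */
		out[nr++] = field;
	}

	return nr;
}
```

The extra parentheses around the assignment keep compilers from warning about an assignment used as a truth value.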
Signed-off-by: Tvrtko Ursulin --- benchmarks/gem_wsim.c | 39 +++++++++++++++------------------------ 1 file changed, 15 insertions(+), 24 deletions(-) diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c index fbec23ad1753..59243af5cde8 100644 --- a/benchmarks/gem_wsim.c +++ b/benchmarks/gem_wsim.c @@ -386,7 +386,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) igt_assert(desc); - while ((_token = strtok_r(tstart, ",", &tctx)) != NULL) { + while ((_token = strtok_r(tstart, ",", &tctx))) { tstart = NULL; token = strdup(_token); igt_assert(token); @@ -394,12 +394,11 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) valid = 0; memset(&step, 0, sizeof(step)); - if ((field = strtok_r(fstart, ".", &fctx)) != NULL) { + if ((field = strtok_r(fstart, ".", &fctx))) { fstart = NULL; if (!strcmp(field, "d")) { - if ((field = strtok_r(fstart, ".", &fctx)) != - NULL) { + if ((field = strtok_r(fstart, ".", &fctx))) { tmp = atoi(field); check_arg(tmp <= 0, "Invalid delay at step %u!\n", @@ -409,8 +408,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) goto add_step; } } else if (!strcmp(field, "p")) { - if ((field = strtok_r(fstart, ".", &fctx)) != - NULL) { + if ((field = strtok_r(fstart, ".", &fctx))) { tmp = atoi(field); check_arg(tmp <= 0, "Invalid period at step %u!\n", @@ -421,8 +419,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) } } else if (!strcmp(field, "P")) { unsigned int nr = 0; - while ((field = strtok_r(fstart, ".", &fctx)) != - NULL) { + while ((field = strtok_r(fstart, ".", &fctx))) { tmp = atoi(field); check_arg(nr == 0 && tmp <= 0, "Invalid context at step %u!\n", @@ -442,8 +439,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) step.type = CTX_PRIORITY; goto add_step; } else if (!strcmp(field, "s")) { - if ((field = strtok_r(fstart, ".", &fctx)) != - NULL) { + if ((field = strtok_r(fstart, 
".", &fctx))) { tmp = atoi(field); check_arg(tmp >= 0 || ((int)nr_steps + tmp) < 0, @@ -454,8 +450,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) goto add_step; } } else if (!strcmp(field, "t")) { - if ((field = strtok_r(fstart, ".", &fctx)) != - NULL) { + if ((field = strtok_r(fstart, ".", &fctx))) { tmp = atoi(field); check_arg(tmp < 0, "Invalid throttle at step %u!\n", @@ -465,8 +460,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) goto add_step; } } else if (!strcmp(field, "q")) { - if ((field = strtok_r(fstart, ".", &fctx)) != - NULL) { + if ((field = strtok_r(fstart, ".", &fctx))) { tmp = atoi(field); check_arg(tmp < 0, "Invalid qd throttle at step %u!\n", @@ -476,8 +470,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) goto add_step; } } else if (!strcmp(field, "a")) { - if ((field = strtok_r(fstart, ".", &fctx)) != - NULL) { + if ((field = strtok_r(fstart, ".", &fctx))) { tmp = atoi(field); check_arg(tmp >= 0, "Invalid sw fence signal at step %u!\n", @@ -491,8 +484,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) goto add_step; } else if (!strcmp(field, "M")) { unsigned int nr = 0; - while ((field = strtok_r(fstart, ".", &fctx)) != - NULL) { + while ((field = strtok_r(fstart, ".", &fctx))) { tmp = atoi(field); check_arg(nr == 0 && tmp <= 0, "Invalid context at step %u!\n", @@ -518,8 +510,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) goto add_step; } else if (!strcmp(field, "X")) { unsigned int nr = 0; - while ((field = strtok_r(fstart, ".", &fctx)) != - NULL) { + while ((field = strtok_r(fstart, ".", &fctx))) { tmp = atoi(field); check_arg(nr == 0 && tmp <= 0, "Invalid context at step %u!\n", @@ -551,7 +542,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) valid++; } - if ((field = strtok_r(fstart, ".", &fctx)) != NULL) { + if ((field = strtok_r(fstart, 
".", &fctx))) { fstart = NULL; i = str_to_engine(field); @@ -564,7 +555,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) bcs_used = true; } - if ((field = strtok_r(fstart, ".", &fctx)) != NULL) { + if ((field = strtok_r(fstart, ".", &fctx))) { char *sep = NULL; long int tmpl; @@ -592,7 +583,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) valid++; } - if ((field = strtok_r(fstart, ".", &fctx)) != NULL) { + if ((field = strtok_r(fstart, ".", &fctx))) { fstart = NULL; tmp = parse_dependencies(nr_steps, &step, field); @@ -602,7 +593,7 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) valid++; } - if ((field = strtok_r(fstart, ".", &fctx)) != NULL) { + if ((field = strtok_r(fstart, ".", &fctx))) { fstart = NULL; check_arg(strlen(field) != 1 || From patchwork Thu Oct 18 15:28:11 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10647513 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 87020112B for ; Thu, 18 Oct 2018 15:28:52 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 791D128D97 for ; Thu, 18 Oct 2018 15:28:52 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6D4BE28DB0; Thu, 18 Oct 2018 15:28:52 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by 
mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 0202828D97 for ; Thu, 18 Oct 2018 15:28:52 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4B0766E0AB; Thu, 18 Oct 2018 15:28:42 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by gabe.freedesktop.org (Postfix) with ESMTPS id A76486E01C for ; Thu, 18 Oct 2018 15:28:36 +0000 (UTC) Received: by mail-wr1-x441.google.com with SMTP id a13-v6so34181769wrt.5 for ; Thu, 18 Oct 2018 08:28:36 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=LriBVgEfqVxeGpLncpjFF9w/c3vH91sw2Lq13Ztti0k=; b=Zb2hNxgMMPpKWFX3Blk/Vu2acOxpZuNC34ocYuHXuRUlMe1GKvZn+RkpBG1k0tfEsS ediQZ3pzvyJanM/CickEGbBrhFEQZQoxRJzVUdECtfn2D3iGdYZNXETACidDdTrrW6JI J5aK7s+g9Q6AmLc7E6XA8o4l7dVWxEKnzTP/Z7kBAbGybRr4OMyvCMesZs0olafAvcNs N/acvLZT84YayLi8tH4FKE6TlUz68vB+hb3QRWpJvNCqcxB/ml5Ue1mR2Lhm3PgJ9vB2 4Nz6UjoyZc8ZF3VCCO39n54OXnINdif7XxQhy4tm/nAJcM5JOoc6vLs1ZnyndAMQKrPH R9Rg== X-Gm-Message-State: ABuFfog3k2+xY+j4na6bQbwlnSgi+q9JYtBGKGT5hDAjPKfRGj68C06C /sGdkn80gOWYiYViCk0TJ6NSuw== X-Google-Smtp-Source: ACcGV62bqq04bXgznFZ665B1zXyqeX2t/O0w9rjWibJprM7E3qvcsfHA/vd3aPqZz6/2X+/lcACHdQ== X-Received: by 2002:a5d:6902:: with SMTP id t2-v6mr29124831wru.323.1539876515244; Thu, 18 Oct 2018 08:28:35 -0700 (PDT) Received: from localhost.localdomain ([91.110.193.16]) by smtp.gmail.com with ESMTPSA id i6-v6sm19530387wrq.4.2018.10.18.08.28.34 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 18 Oct 2018 08:28:34 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: igt-dev@lists.freedesktop.org Date: Thu, 18 Oct 2018 16:28:11 +0100 Message-Id: 
<20181018152815.31816-14-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> References: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH i-g-t 13/17] gem_wsim: Compact int command parsing with a macro X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Intel-gfx@lists.freedesktop.org MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin Parsing an integer workload descriptor field is a common pattern which we can extract to a helper macro and by doing so further improve the readability of the main parsing loop. Signed-off-by: Tvrtko Ursulin --- benchmarks/gem_wsim.c | 80 ++++++++++++++----------------------------- 1 file changed, 25 insertions(+), 55 deletions(-) diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c index 59243af5cde8..b805ecd9a680 100644 --- a/benchmarks/gem_wsim.c +++ b/benchmarks/gem_wsim.c @@ -371,6 +371,15 @@ static int parse_engine_map(struct w_step *step, const char *_str) return 0; } +#define int_field(_STEP_, _FIELD_, _COND_, _ERR_) \ + if ((field = strtok_r(fstart, ".", &fctx))) { \ + tmp = atoi(field); \ + check_arg(_COND_, _ERR_, nr_steps); \ + step.type = _STEP_; \ + step._FIELD_ = tmp; \ + goto add_step; \ + } \ + static struct workload * parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) { @@ -398,25 +407,11 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) fstart = NULL; if (!strcmp(field, "d")) { - if ((field = strtok_r(fstart, ".", &fctx))) { - tmp = atoi(field); - check_arg(tmp <= 0, - "Invalid delay at step %u!\n", - nr_steps); - step.type = DELAY; - step.delay = tmp; - goto add_step; 
- } + int_field(DELAY, delay, tmp <= 0, + "Invalid delay at step %u!\n"); } else if (!strcmp(field, "p")) { - if ((field = strtok_r(fstart, ".", &fctx))) { - tmp = atoi(field); - check_arg(tmp <= 0, - "Invalid period at step %u!\n", - nr_steps); - step.type = PERIOD; - step.period = tmp; - goto add_step; - } + int_field(PERIOD, period, tmp <= 0, + "Invalid period at step %u!\n"); } else if (!strcmp(field, "P")) { unsigned int nr = 0; while ((field = strtok_r(fstart, ".", &fctx))) { @@ -439,46 +434,21 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) step.type = CTX_PRIORITY; goto add_step; } else if (!strcmp(field, "s")) { - if ((field = strtok_r(fstart, ".", &fctx))) { - tmp = atoi(field); - check_arg(tmp >= 0 || - ((int)nr_steps + tmp) < 0, - "Invalid sync target at step %u!\n", - nr_steps); - step.type = SYNC; - step.target = tmp; - goto add_step; - } + int_field(SYNC, target, + tmp >= 0 || ((int)nr_steps + tmp) < 0, + "Invalid sync target at step %u!\n"); } else if (!strcmp(field, "t")) { - if ((field = strtok_r(fstart, ".", &fctx))) { - tmp = atoi(field); - check_arg(tmp < 0, - "Invalid throttle at step %u!\n", - nr_steps); - step.type = THROTTLE; - step.throttle = tmp; - goto add_step; - } + int_field(THROTTLE, throttle, + tmp < 0, + "Invalid throttle at step %u!\n"); } else if (!strcmp(field, "q")) { - if ((field = strtok_r(fstart, ".", &fctx))) { - tmp = atoi(field); - check_arg(tmp < 0, - "Invalid qd throttle at step %u!\n", - nr_steps); - step.type = QD_THROTTLE; - step.throttle = tmp; - goto add_step; - } + int_field(QD_THROTTLE, throttle, + tmp < 0, + "Invalid qd throttle at step %u!\n"); } else if (!strcmp(field, "a")) { - if ((field = strtok_r(fstart, ".", &fctx))) { - tmp = atoi(field); - check_arg(tmp >= 0, - "Invalid sw fence signal at step %u!\n", - nr_steps); - step.type = SW_FENCE_SIGNAL; - step.target = tmp; - goto add_step; - } + int_field(SW_FENCE_SIGNAL, target, + tmp >= 0, + "Invalid sw fence signal at step 
%u!\n"); } else if (!strcmp(field, "f")) { step.type = SW_FENCE; goto add_step; From patchwork Thu Oct 18 15:28:12 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10647521 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AC14115E2 for ; Thu, 18 Oct 2018 15:29:11 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9C85E28C76 for ; Thu, 18 Oct 2018 15:29:11 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8F67128D6A; Thu, 18 Oct 2018 15:29:11 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 71C4628D9D for ; Thu, 18 Oct 2018 15:29:10 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 056FF6E0C4; Thu, 18 Oct 2018 15:28:49 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm1-x341.google.com (mail-wm1-x341.google.com [IPv6:2a00:1450:4864:20::341]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0DD756E046 for ; Thu, 18 Oct 2018 15:28:38 +0000 (UTC) Received: by mail-wm1-x341.google.com with SMTP id z25-v6so2106596wmf.1 for ; Thu, 18 Oct 2018 08:28:37 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=gfeeCR0ktdW4ZiIsms24sfxaB4glyLIcPBU2i4H3DjQ=; b=uJGDQc8AqC/EuGh90pUBGHJxZZecXoK8xPwjY1fENEuSVc5Emz4ND6n1FvNbxDxU57 Bpni0ZTZyTkL9HrOHm3oxjR9KCXb/F04iA3nPm3noMCtujDX56+c6kQ5eSHrb4OUVkUv o9ySL1Fnn5DVB1fxT/Rs2nQU1U2RRCTAEBGWnNKginx9DqA4pisUxE3zCOYfTRg7NZnA I/Hb4+XqeCQuv2KC/83HCe28tmYKLvw5fixT/SY+OJhHvFiNPFVR9TQjsrkHVcRWCGY9 CfZFYBYms2MIz3quPF9VoCrw4PTM7rj6Pg74dm8m2CDLHjjydA7oRUTDxdK20uWF6N0o 5MZA== X-Gm-Message-State: AGRZ1gLKzGsIhSi7ch8gJJmkiTpitHk3+zCWYryPQhra+80WxLBt7Eqq dBcdJcyxfvcq/71BgPeZCVtCEg== X-Google-Smtp-Source: ACcGV61725/B3yizuS7lzEJPmvH6MKY5sezocu6wSjLOjya5p9iXWZ4S+R0z3i50VqoLCGBzLf8qlg== X-Received: by 2002:a1c:3c4:: with SMTP id 187-v6mr813120wmd.90.1539876516317; Thu, 18 Oct 2018 08:28:36 -0700 (PDT) Received: from localhost.localdomain ([91.110.193.16]) by smtp.gmail.com with ESMTPSA id i6-v6sm19530387wrq.4.2018.10.18.08.28.35 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 18 Oct 2018 08:28:35 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: igt-dev@lists.freedesktop.org Date: Thu, 18 Oct 2018 16:28:12 +0100 Message-Id: <20181018152815.31816-15-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> References: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH i-g-t 14/17] gem_wsim: Engine map load balance command X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Intel-gfx@lists.freedesktop.org MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin A new workload command for enabling a load balanced context map (aka Virtual Engine). 
Example usage: B.1 This turns on load balancing for context one, assuming it has already been configured with an engine map. Only the DEFAULT engine specifier can be used with load balanced engine maps. Signed-off-by: Tvrtko Ursulin --- benchmarks/gem_wsim.c | 62 +++++++++++++++++++++++++++++++++++++----- benchmarks/wsim/README | 18 ++++++++++++ 2 files changed, 73 insertions(+), 7 deletions(-) diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c index b805ecd9a680..a772e2c588b5 100644 --- a/benchmarks/gem_wsim.c +++ b/benchmarks/gem_wsim.c @@ -82,7 +82,8 @@ enum w_type SW_FENCE_SIGNAL, CTX_PRIORITY, PREEMPTION, - ENGINE_MAP + ENGINE_MAP, + LOAD_BALANCE, }; struct deps @@ -120,6 +121,7 @@ struct w_step unsigned int engine_map_count; enum intel_engine_id *engine_map; }; + bool load_balance; }; /* Implementation details */ @@ -502,6 +504,25 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) step.type = PREEMPTION; goto add_step; + } else if (!strcmp(field, "B")) { + unsigned int nr = 0; + while ((field = strtok_r(fstart, ".", &fctx))) { + tmp = atoi(field); + check_arg(nr == 0 && tmp <= 0, + "Invalid context at step %u!\n", + nr_steps); + check_arg(nr > 0, + "Invalid load balance format at step %u!\n", + nr_steps); + + step.context = tmp; + step.load_balance = true; + + nr++; + } + + step.type = LOAD_BALANCE; + goto add_step; } tmp = atoi(field); @@ -828,7 +849,7 @@ find_engine_in_map(struct ctx *ctx, enum intel_engine_id engine) return i + 1; } - igt_assert(0); + igt_assert(ctx->wants_balance); return 0; } @@ -1039,12 +1060,19 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) wrk->ctx_list[j].engine_map = w->engine_map; wrk->ctx_list[j].engine_map_count = w->engine_map_count; + } else if (w->type == LOAD_BALANCE) { + if (!wrk->ctx_list[j].engine_map) { + wsim_err("Load balancing needs an engine map!\n"); + return 1; + } + wrk->ctx_list[j].wants_balance = + w->load_balance; } }
wrk->ctx_list[j].targets_instance = targets; if (flags & I915) - wrk->ctx_list[j].wants_balance = balance; + wrk->ctx_list[j].wants_balance |= balance; } /* @@ -1058,10 +1086,19 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) if (w->type != BATCH) continue; - if (wrk->ctx_list[j].engine_map && w->engine == VCS) { + if (wrk->ctx_list[j].engine_map && + !wrk->ctx_list[j].wants_balance && + (w->engine == VCS || w->engine == DEFAULT)) { wsim_err("Batches targetting engine maps must use explicit engines!\n"); return -1; } + + if (wrk->ctx_list[j].engine_map && + wrk->ctx_list[j].wants_balance && + w->engine != DEFAULT) { + wsim_err("Batches targetting load balanced maps must not use explicit engines!\n"); + return -1; + } } } @@ -1090,7 +1127,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) break; } - if ((!ctx->engine_map && !ctx->targets_instance)) + if ((!ctx->engine_map && !ctx->targets_instance) || + (ctx->engine_map && ctx->wants_balance)) args.flags |= I915_GEM_CONTEXT_SINGLE_TIMELINE; drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &args); @@ -1154,8 +1192,17 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) .size = sizeof(set_engines), .value = to_user_pointer(&set_engines), }; + struct i915_context_engines_load_balance load_balance = + { .base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE, + .engines_mask = -1, + }; - set_engines.extensions = 0; + if (ctx->wants_balance) { + set_engines.extensions = + to_user_pointer(&load_balance); + } else { + set_engines.extensions = 0; + } for (j = 0; j < ctx->engine_map_count; j++) { set_engines.engines[j].class = @@ -2134,7 +2181,8 @@ static void *run_workload(void *data) } continue; } else if (w->type == PREEMPTION || - w->type == ENGINE_MAP) { + w->type == ENGINE_MAP || + w->type == LOAD_BALANCE) { continue; } diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README index 20e3e358cd2e..58dada675357 100644 --- 
a/benchmarks/wsim/README +++ b/benchmarks/wsim/README @@ -3,6 +3,7 @@ Workload descriptor format ctx.engine.duration_us.dependency.wait,... ..[-].[/][...].<0|1>,... +B. M..[|]... P|X.. d|p|s|t|q|a.,... @@ -24,6 +25,7 @@ Additional workload steps are also supported: 'q' - Throttle to n max queue depth. 'f' - Create a sync fence. 'a' - Advance the previously created sync fence. + 'B' - Turn on context load balancing. 'M' - Set up engine map. 'P' - Context priority. 'X' - Context preemption control. @@ -176,3 +178,19 @@ Example: This sets up context 1 with an engine map containing VCS1 and VCS2 engine. Submission to this context can now only reference these two engines. + +Context load balancing +---------------------- + +Context load balancing (aka Virtual Engine) is an i915 feature where the driver +will pick the best engine (most idle) to submit to given a previously configured +engine map. + +Example: + + B.1 + +This enables load balancing for context number one. + +Submissions to load balanced contexts are only allowed to use the DEFAULT engine +specifier.
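The engine specifier rules this patch enforces in prepare_workload() can be condensed into one predicate. The following is a sketch under stated assumptions rather than the gem_wsim implementation: `struct ctx_cfg`, `enum engine_id` and `check_batch_engine()` are hypothetical names, and only the two error conditions visible in the diff are modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for gem_wsim's enum intel_engine_id. */
enum engine_id { DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS };

/* Hypothetical subset of the per-context state relevant to the check. */
struct ctx_cfg {
	bool has_engine_map;
	bool wants_balance;
};

/*
 * Return 0 when a batch on this context may name the given engine and
 * -1 otherwise:
 *  - engine map without load balancing: VCS and DEFAULT are rejected,
 *    the batch has to name an explicit engine;
 *  - engine map with load balancing: only DEFAULT is accepted;
 *  - no engine map: these rules do not apply.
 */
static int
check_batch_engine(const struct ctx_cfg *ctx, enum engine_id engine)
{
	assert(ctx);

	if (!ctx->has_engine_map)
		return 0;

	if (!ctx->wants_balance && (engine == VCS || engine == DEFAULT))
		return -1;

	if (ctx->wants_balance && engine != DEFAULT)
		return -1;

	return 0;
}
```

Keeping the two rejection cases separate matches the patch, which reports a different error message for each.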
From patchwork Thu Oct 18 15:28:13 2018
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 10647523
From: Tvrtko Ursulin
To: igt-dev@lists.freedesktop.org
Cc: Intel-gfx@lists.freedesktop.org
Date: Thu, 18 Oct 2018 16:28:13 +0100
Message-Id: <20181018152815.31816-16-tvrtko.ursulin@linux.intel.com>
In-Reply-To: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com>
References: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com>
Subject: [Intel-gfx] [PATCH i-g-t 15/17] gem_wsim: Engine bond command

From: Tvrtko Ursulin

Engine bonds are an i915 uAPI applicable to load balanced contexts with engine map.
They allow expressing rules of engine selection between two contexts when submissions are also tied with submit fences. Please refer to the README for a more detailed description. Signed-off-by: Tvrtko Ursulin --- benchmarks/gem_wsim.c | 100 ++++++++++++++++++++++++++++++++++++++--- benchmarks/wsim/README | 50 +++++++++++++++++++++ 2 files changed, 143 insertions(+), 7 deletions(-) diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c index a772e2c588b5..b5ade7b33883 100644 --- a/benchmarks/gem_wsim.c +++ b/benchmarks/gem_wsim.c @@ -84,6 +84,7 @@ enum w_type PREEMPTION, ENGINE_MAP, LOAD_BALANCE, + BOND, }; struct deps @@ -99,6 +100,11 @@ struct w_arg { int prio; }; +struct bond { + uint64_t mask; + enum intel_engine_id master; +}; + struct w_step { /* Workload step metadata */ @@ -122,6 +128,10 @@ struct w_step enum intel_engine_id *engine_map; }; bool load_balance; + struct { + uint64_t bond_mask; + enum intel_engine_id bond_master; + }; }; /* Implementation details */ @@ -153,6 +163,8 @@ struct ctx { int priority; unsigned int engine_map_count; enum intel_engine_id *engine_map; + unsigned int bond_count; + struct bond *bonds; bool targets_instance; bool wants_balance; unsigned int static_vcs; @@ -523,6 +535,40 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) step.type = LOAD_BALANCE; goto add_step; + } else if (!strcmp(field, "b")) { + unsigned int nr = 0; + while ((field = strtok_r(fstart, ".", &fctx))) { + tmp = atoi(field); + check_arg(nr == 0 && tmp <= 0, + "Invalid context at step %u!\n", + nr_steps); + check_arg(nr == 1 && + (tmp < -1 || tmp == 0), + "Invalid siblings mask at step %u!\n", + nr_steps); + check_arg(nr > 2, + "Invalid bond format at step %u!\n", + nr_steps); + + if (nr == 0) { + step.context = tmp; + } else if (nr == 1) { + step.bond_mask = tmp; + } else if (nr == 2) { + tmp = str_to_engine(field); + check_arg(tmp <= 0 || + tmp == VCS || + tmp == DEFAULT, + "Invalid master engine at step %u!\n", +
nr_steps); + step.bond_master = tmp; + } + + nr++; + } + + step.type = BOND; + goto add_step; } tmp = atoi(field); @@ -1044,6 +1090,8 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) * Transfer over engine map configuration from the workload step. */ for (j = 0; j < wrk->nr_ctxs; j += 2) { + struct ctx *ctx = &wrk->ctx_list[j]; + bool targets = false; bool balance = false; @@ -1057,16 +1105,28 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) else targets = true; } else if (w->type == ENGINE_MAP) { - wrk->ctx_list[j].engine_map = w->engine_map; - wrk->ctx_list[j].engine_map_count = - w->engine_map_count; + ctx->engine_map = w->engine_map; + ctx->engine_map_count = w->engine_map_count; } else if (w->type == LOAD_BALANCE) { - if (!wrk->ctx_list[j].engine_map) { + if (!ctx->engine_map) { wsim_err("Load balancing needs an engine map!\n"); return 1; } - wrk->ctx_list[j].wants_balance = - w->load_balance; + ctx->wants_balance = w->load_balance; + } else if (w->type == BOND) { + if (!ctx->wants_balance) { + wsim_err("Engine bonds need load balancing engine map!\n"); + return 1; + } + ctx->bond_count++; + ctx->bonds = realloc(ctx->bonds, + ctx->bond_count * + sizeof(struct bond)); + igt_assert(ctx->bonds); + ctx->bonds[ctx->bond_count - 1].mask = + w->bond_mask; + ctx->bonds[ctx->bond_count - 1].master = + w->bond_master; } } @@ -1196,6 +1256,7 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) { .base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE, .engines_mask = -1, }; + struct i915_context_engines_bond *bonds = NULL; if (ctx->wants_balance) { set_engines.extensions = @@ -1211,7 +1272,31 @@ prepare_workload(unsigned int id, struct workload *wrk, unsigned int flags) ctx->engine_map[j] - VCS1; /* FIXME */ } + if (ctx->bond_count) { + bonds = calloc(ctx->bond_count, sizeof(*bonds)); + load_balance.base.next_extension = + to_user_pointer(&bonds[0]); + } + + for (j = 0; j < ctx->bond_count; 
j++) { + struct i915_context_engines_bond *bond = + &bonds[j]; + + if (j < (ctx->bond_count - 1)) + bond->base.next_extension = + to_user_pointer(bond + 1); + + bond->base.name = I915_CONTEXT_ENGINES_EXT_BOND; + bond->master_class = I915_ENGINE_CLASS_VIDEO; + bond->master_instance = + ctx->bonds[j].master - VCS1; + bond->sibling_mask = ctx->bonds[j].mask; + } + gem_context_set_param(fd, &param); + + if (bonds) + free(bonds); } else if (ctx->wants_balance) { struct i915_context_engines_load_balance load_balance = { .base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE, @@ -2182,7 +2267,8 @@ static void *run_workload(void *data) continue; } else if (w->type == PREEMPTION || w->type == ENGINE_MAP || - w->type == LOAD_BALANCE) { + w->type == LOAD_BALANCE || + w->type == BOND) { continue; } diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README index 58dada675357..f2974992ab68 100644 --- a/benchmarks/wsim/README +++ b/benchmarks/wsim/README @@ -7,6 +7,7 @@ B. M..[|]... P|X.. d|p|s|t|q|a.,... +b... f For duration a range can be given from which a random value will be picked @@ -26,6 +27,7 @@ Additional workload steps are also supported: 'f' - Create a sync fence. 'a' - Advance the previously created sync fence. 'B' - Turn on context load balancing. + 'b' - Set up engine bonds. 'M' - Set up engine map. 'P' - Context priority. 'X' - Context preemption control. @@ -194,3 +196,51 @@ This enables load balancing for context number one. Submissions to load balanced contexts are only allowed to use the DEFAULT engine specifier. + +Engine bonds +------------ + +Engine bonds are extensions on load balanced contexts. They allow expressing +rules of engine selection between two co-operating contexts tied with submit +fences. In other words, the rule expression is telling the driver: "If you pick +this engine for context one, then you have to pick that engine for context two". + +Syntax is: + b...
+ +Engine mask is a bitmask representing engines in the engine map configured for +the same context. + +There can be multiple bonds tied to the same context. + +Example: + + M.1.RCS|VECS + B.1 + M.2.VCS1|VCS2 + B.2 + b.2.1.RCS + b.2.2.VECS + +This tells the driver that if it picked RCS for context one, it has to pick VCS1 +for context two. And if it picked VECS for context one, it has to pick VCS2 for +context two. + +If we extend the above example with more workload directives: + + 1.DEFAULT.1000.0.0 + 2.DEFAULT.1000.s-1.0 + +We get to a fully functional example where two batch buffers are submitted in a +load balanced fashion, telling the driver they should run simultaneously and +that valid engine pairs are either RCS + VCS1 (for two contexts respectively), +or VECS + VCS2. + +This can also be extended using sync fences to improve chances of the first +submission not getting on the hardware after the second one. Second block would +then look like: + + f + 1.DEFAULT.1000.f-1.0 + 2.DEFAULT.1000.s-1.0 + a.-3
From patchwork Thu Oct 18 15:28:14 2018
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 10647515
From: Tvrtko Ursulin
To: igt-dev@lists.freedesktop.org
Cc: Intel-gfx@lists.freedesktop.org
Date: Thu, 18 Oct 2018 16:28:14 +0100
Message-Id: <20181018152815.31816-17-tvrtko.ursulin@linux.intel.com>
In-Reply-To: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com>
References: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com>
Subject: [Intel-gfx] [PATCH i-g-t 16/17] gem_wsim: Some more example workloads

From: Tvrtko Ursulin

A few additional workloads useful for experimenting with scheduling. Signed-off-by: Tvrtko Ursulin --- benchmarks/wsim/frame-split-60fps.wsim | 16 ++++++++++++++++ benchmarks/wsim/high-composited-game.wsim | 11 +++++++++++ benchmarks/wsim/media-1080p-player.wsim | 5 +++++ benchmarks/wsim/medium-composited-game.wsim | 9 +++++++++ 4 files changed, 41 insertions(+) create mode 100644 benchmarks/wsim/frame-split-60fps.wsim create mode 100644 benchmarks/wsim/high-composited-game.wsim create mode 100644 benchmarks/wsim/media-1080p-player.wsim create mode 100644 benchmarks/wsim/medium-composited-game.wsim diff --git a/benchmarks/wsim/frame-split-60fps.wsim b/benchmarks/wsim/frame-split-60fps.wsim new file mode 100644 index 000000000000..cfbfcd39be7d --- /dev/null +++ b/benchmarks/wsim/frame-split-60fps.wsim @@ -0,0 +1,16 @@ +X.1.0 +M.1.VCS1 +B.1 +X.2.0 +M.2.VCS2 +B.2 +b.2.1.VCS1 +f +1.DEFAULT.4000-6000.f-1.0 +2.DEFAULT.4000-6000.s-1.0 +a.-3 +3.RCS.2000-4000.-3/-2.0 +3.VECS.2000.-1.0 +4.BCS.1000.-1.0 +s.-2 +p.16667 diff --git a/benchmarks/wsim/high-composited-game.wsim b/benchmarks/wsim/high-composited-game.wsim new file mode 100644 index
000000000000..a90a2b2be95b --- /dev/null +++ b/benchmarks/wsim/high-composited-game.wsim @@ -0,0 +1,11 @@ +1.RCS.500.0.0 +1.RCS.2000.0.0 +1.RCS.2000.0.0 +1.RCS.2000.0.0 +1.RCS.2000.0.0 +1.RCS.2000.0.0 +1.RCS.2000.0.0 +P.2.1 +2.BCS.1000.-2.0 +2.RCS.2000.-1.1 +p.16667 diff --git a/benchmarks/wsim/media-1080p-player.wsim b/benchmarks/wsim/media-1080p-player.wsim new file mode 100644 index 000000000000..bcbb0cfd2ad3 --- /dev/null +++ b/benchmarks/wsim/media-1080p-player.wsim @@ -0,0 +1,5 @@ +1.VCS.5000-10000.0.0 +2.RCS.1000-2000.-1.0 +P.3.1 +3.BCS.1000.-2.0 +p.16667 diff --git a/benchmarks/wsim/medium-composited-game.wsim b/benchmarks/wsim/medium-composited-game.wsim new file mode 100644 index 000000000000..580883516168 --- /dev/null +++ b/benchmarks/wsim/medium-composited-game.wsim @@ -0,0 +1,9 @@ +1.RCS.1000-2000.0.0 +1.RCS.1000-2000.0.0 +1.RCS.1000-2000.0.0 +1.RCS.1000-2000.0.0 +1.RCS.1000-2000.0.0 +P.2.1 +2.BCS.1000.-2.0 +2.RCS.2000.-1.1 +p.16667
From patchwork Thu Oct 18 15:28:15 2018
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 10647519
From: Tvrtko Ursulin
To: igt-dev@lists.freedesktop.org
Cc: Intel-gfx@lists.freedesktop.org
Date: Thu, 18 Oct 2018 16:28:15 +0100
Message-Id: <20181018152815.31816-18-tvrtko.ursulin@linux.intel.com>
In-Reply-To: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com>
References: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com>
Subject: [Intel-gfx] [PATCH i-g-t 17/17] gem_wsim: Infinite batch support

From: Tvrtko Ursulin

For simulating frame split workloads it is useful to express a batch which ends at the same time as the parallel submission on the respective bonded engine. For this we add support for infinite batch durations and the batch terminate command ('T'). Syntax looks like this: 1.RCS.*.0.0 T.-1 First step starts an infinite batch, and second command terminates the infinite batch with the usual relative workload step addressing.
Signed-off-by: Tvrtko Ursulin --- benchmarks/gem_wsim.c | 119 +++++++++++++++++++------ benchmarks/wsim/README | 9 +- benchmarks/wsim/frame-split-60fps.wsim | 6 +- 3 files changed, 102 insertions(+), 32 deletions(-) diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c index b5ade7b33883..3669c1f7f1c9 100644 --- a/benchmarks/gem_wsim.c +++ b/benchmarks/gem_wsim.c @@ -85,6 +85,7 @@ enum w_type ENGINE_MAP, LOAD_BALANCE, BOND, + TERMINATE, }; struct deps @@ -112,6 +113,7 @@ struct w_step unsigned int context; unsigned int engine; struct duration duration; + bool unbound_duration; struct deps data_deps; struct deps fence_deps; int emit_fence; @@ -142,7 +144,7 @@ struct w_step struct drm_i915_gem_execbuffer2 eb; struct drm_i915_gem_exec_object2 *obj; - struct drm_i915_gem_relocation_entry reloc[4]; + struct drm_i915_gem_relocation_entry reloc[5]; unsigned long bb_sz; uint32_t bb_handle; uint32_t *mapped_batch; @@ -153,6 +155,7 @@ struct w_step uint32_t *rt1_address; uint32_t *latch_value; uint32_t *latch_address; + uint32_t *recursive_bb_start; unsigned int mapped_len; }; @@ -492,6 +495,10 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) step.type = ENGINE_MAP; goto add_step; + } else if (!strcmp(field, "T")) { + int_field(TERMINATE, target, + tmp >= 0 || ((int)nr_steps + tmp) < 0, + "Invalid terminate target at step %u!\n"); } else if (!strcmp(field, "X")) { unsigned int nr = 0; while ((field = strtok_r(fstart, ".", &fctx))) { @@ -598,23 +605,28 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) fstart = NULL; - tmpl = strtol(field, &sep, 10); - check_arg(tmpl <= 0 || tmpl == LONG_MIN || - tmpl == LONG_MAX, - "Invalid duration at step %u!\n", nr_steps); - step.duration.min = tmpl; - - if (sep && *sep == '-') { - tmpl = strtol(sep + 1, NULL, 10); - check_arg(tmpl <= 0 || - tmpl <= step.duration.min || - tmpl == LONG_MIN || + if (field[0] == '*') { + step.unbound_duration = true; + } else { + tmpl = 
strtol(field, &sep, 10); + check_arg(tmpl <= 0 || tmpl == LONG_MIN || tmpl == LONG_MAX, - "Invalid duration range at step %u!\n", + "Invalid duration at step %u!\n", nr_steps); - step.duration.max = tmpl; - } else { - step.duration.max = step.duration.min; + step.duration.min = tmpl; + + if (sep && *sep == '-') { + tmpl = strtol(sep + 1, NULL, 10); + check_arg(tmpl <= 0 || + tmpl <= step.duration.min || + tmpl == LONG_MIN || + tmpl == LONG_MAX, + "Invalid duration range at step %u!\n", + nr_steps); + step.duration.max = tmpl; + } else { + step.duration.max = step.duration.min; + } } valid++; @@ -773,7 +785,7 @@ init_bb(struct w_step *w, unsigned int flags) unsigned int i; uint32_t *ptr; - if (!arb_period) + if (w->unbound_duration || !arb_period) return; gem_set_domain(fd, w->bb_handle, @@ -793,6 +805,7 @@ terminate_bb(struct w_step *w, unsigned int flags) const uint32_t bbe = 0xa << 23; unsigned long mmap_start, mmap_len; unsigned long batch_start = w->bb_sz; + unsigned int r = 0; uint32_t *ptr, *cs; igt_assert(((flags & RT) && (flags & SEQNO)) || !(flags & RT)); @@ -803,6 +816,9 @@ terminate_bb(struct w_step *w, unsigned int flags) if (flags & RT) batch_start -= 12 * sizeof(uint32_t); + if (w->unbound_duration) + batch_start -= 4 * sizeof(uint32_t); /* MI_ARB_CHK + MI_BATCH_BUFFER_START */ + mmap_start = rounddown(batch_start, PAGE_SIZE); mmap_len = w->bb_sz - mmap_start; @@ -812,8 +828,19 @@ terminate_bb(struct w_step *w, unsigned int flags) ptr = gem_mmap__wc(fd, w->bb_handle, mmap_start, mmap_len, PROT_WRITE); cs = (uint32_t *)((char *)ptr + batch_start - mmap_start); + if (w->unbound_duration) { + w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t); + batch_start += 4 * sizeof(uint32_t); + + *cs++ = w->preempt_us ? 
0x5 << 23 /* MI_ARB_CHK; */ : MI_NOOP; + w->recursive_bb_start = cs; + *cs++ = MI_BATCH_BUFFER_START | 1 << 8 | 1; + *cs++ = 0; + *cs++ = 0; + } + if (flags & SEQNO) { - w->reloc[0].offset = batch_start + sizeof(uint32_t); + w->reloc[r++].offset = batch_start + sizeof(uint32_t); batch_start += 4 * sizeof(uint32_t); *cs++ = MI_STORE_DWORD_IMM; @@ -825,7 +852,7 @@ terminate_bb(struct w_step *w, unsigned int flags) } if (flags & RT) { - w->reloc[1].offset = batch_start + sizeof(uint32_t); + w->reloc[r++].offset = batch_start + sizeof(uint32_t); batch_start += 4 * sizeof(uint32_t); *cs++ = MI_STORE_DWORD_IMM; @@ -835,7 +862,7 @@ terminate_bb(struct w_step *w, unsigned int flags) w->rt0_value = cs; *cs++ = 0; - w->reloc[2].offset = batch_start + 2 * sizeof(uint32_t); + w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t); batch_start += 4 * sizeof(uint32_t); *cs++ = 0x24 << 23 | 2; /* MI_STORE_REG_MEM */ @@ -844,7 +871,7 @@ terminate_bb(struct w_step *w, unsigned int flags) *cs++ = 0; *cs++ = 0; - w->reloc[3].offset = batch_start + sizeof(uint32_t); + w->reloc[r++].offset = batch_start + sizeof(uint32_t); batch_start += 4 * sizeof(uint32_t); *cs++ = MI_STORE_DWORD_IMM; @@ -979,19 +1006,28 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags) } } - w->bb_sz = get_bb_sz(w->duration.max); - w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz); + if (w->unbound_duration) + /* nops + MI_ARB_CHK + MI_BATCH_BUFFER_START */ + w->bb_sz = max(64, get_bb_sz(w->preempt_us)) + + (1 + 3) * sizeof(uint32_t); + else + w->bb_sz = get_bb_sz(w->duration.max); + w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz + (w->unbound_duration ? 
4096 : 0)); init_bb(w, flags); terminate_bb(w, flags); - if (flags & SEQNO) { + if ((flags & SEQNO) || w->unbound_duration) { w->obj[j].relocs_ptr = to_user_pointer(&w->reloc); + if (flags & SEQNO) + w->obj[j].relocation_count++; if (flags & RT) - w->obj[j].relocation_count = 4; - else - w->obj[j].relocation_count = 1; + w->obj[j].relocation_count += 3; + if (w->unbound_duration) + w->obj[j].relocation_count++; for (i = 0; i < w->obj[j].relocation_count; i++) w->reloc[i].target_handle = 1; + if (w->unbound_duration) + w->reloc[0].target_handle = j; } w->eb.buffers_ptr = to_user_pointer(w->obj); @@ -1988,6 +2024,18 @@ update_bb_rt(struct w_step *w, enum intel_engine_id engine, uint32_t seqno) } } +static void +update_bb_start(struct w_step *w) +{ + if (!w->unbound_duration) + return; + + gem_set_domain(fd, w->bb_handle, + I915_GEM_DOMAIN_WC, I915_GEM_DOMAIN_WC); + + *w->recursive_bb_start = MI_BATCH_BUFFER_START | (1 << 8) | 1; +} + static void w_sync_to(struct workload *wrk, struct w_step *w, int target) { if (target < 0) @@ -2123,9 +2171,13 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine, if (flags & RT) update_bb_rt(w, engine, seqno); + update_bb_start(w); + w->eb.batch_start_offset = + w->unbound_duration ? 
+ 0 : ALIGN(w->bb_sz - get_bb_sz(get_duration(w)), - 2 * sizeof(uint32_t)); + 2 * sizeof(uint32_t)); for (i = 0; i < w->fence_deps.nr; i++) { int tgt = w->idx + w->fence_deps.list[i]; @@ -2265,6 +2317,17 @@ static void *run_workload(void *data) w->priority; } continue; + } else if (w->type == TERMINATE) { + unsigned int t_idx = i + w->target; + + igt_assert(t_idx >= 0 && t_idx < i); + igt_assert(wrk->steps[t_idx].type == BATCH); + igt_assert(wrk->steps[t_idx].unbound_duration); + + *wrk->steps[t_idx].recursive_bb_start = + MI_BATCH_BUFFER_END; + __sync_synchronize(); + continue; } else if (w->type == PREEMPTION || w->type == ENGINE_MAP || w->type == LOAD_BALANCE || diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README index f2974992ab68..439ea3650e3d 100644 --- a/benchmarks/wsim/README +++ b/benchmarks/wsim/README @@ -2,11 +2,11 @@ Workload descriptor format ========================== ctx.engine.duration_us.dependency.wait,... -..[-].[/][...].<0|1>,... +..[-]|*.[/][...].<0|1>,... B. M..[|]... P|X.. -d|p|s|t|q|a.,... +d|p|s|t|q|a|T.,... b... f @@ -30,6 +30,7 @@ Additional workload steps are also supported: 'b' - Set up engine bonds. 'M' - Set up engine map. 'P' - Context priority. + 'T' - Terminate an infinite batch. 'X' - Context preemption control. Engine ids: DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS @@ -77,6 +78,10 @@ Example: I this case the last step has a data dependency on both first and second steps. +Batch durations can also be specified as infinite by using the '*' in the +duration field. Such batches must be ended by the terminate command ('T') +otherwise they will cause a GPU hang to be reported. 
+ Sync (fd) fences ---------------- diff --git a/benchmarks/wsim/frame-split-60fps.wsim b/benchmarks/wsim/frame-split-60fps.wsim index cfbfcd39be7d..ea89da3add48 100644 --- a/benchmarks/wsim/frame-split-60fps.wsim +++ b/benchmarks/wsim/frame-split-60fps.wsim @@ -6,10 +6,12 @@ M.2.VCS2 B.2 b.2.1.VCS1 f -1.DEFAULT.4000-6000.f-1.0 +1.DEFAULT.*.f-1.0 2.DEFAULT.4000-6000.s-1.0 a.-3 -3.RCS.2000-4000.-3/-2.0 +s.-2 +T.-4 +3.RCS.2000-4000.-5/-4.0 3.VECS.2000.-1.0 4.BCS.1000.-1.0 s.-2
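
For readers who want to try the new duration syntax outside of gem_wsim, the parsing rules from the patch above ('*' for an infinite batch, otherwise a positive value or an increasing min-max range, and a strictly negative 'T' target that stays within the steps parsed so far) can be sketched as stand-alone C. This only approximates the logic in parse_workload(); the `demo_*` names and the return conventions are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical container for a parsed duration field. */
struct demo_duration {
	long min;
	long max;
	bool unbound;	/* true when the field was '*' */
};

/* Parse "<us>", "<min>-<max>" or "*". Returns 0 on success, -1 on error. */
int demo_parse_duration(const char *field, struct demo_duration *d)
{
	char *sep = NULL;
	long tmpl;

	assert(field != NULL && d != NULL);

	d->min = d->max = 0;
	d->unbound = false;

	if (field[0] == '*') {
		d->unbound = true;
		return 0;
	}

	tmpl = strtol(field, &sep, 10);
	if (tmpl <= 0 || tmpl == LONG_MAX)
		return -1;
	d->min = tmpl;

	if (*sep == '-') {
		/* Upper bound must be strictly larger than the lower one. */
		tmpl = strtol(sep + 1, NULL, 10);
		if (tmpl <= d->min || tmpl == LONG_MAX)
			return -1;
		d->max = tmpl;
	} else {
		d->max = d->min;
	}

	return 0;
}

/* 'T' target: relative, strictly negative, and not before the first step. */
bool demo_terminate_target_valid(int nr_steps, int target)
{
	return target < 0 && nr_steps + target >= 0;
}
```

For instance, "4000-6000" yields min 4000 and max 6000, "*" sets only the unbound flag, and a 'T.-1' parsed as the very first step is rejected because there is no earlier step to terminate.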