From patchwork Tue Jan 16 13:56:43 2018
X-Patchwork-Submitter: Thomas Hellstrom
X-Patchwork-Id: 10167079
From: Thomas Hellstrom
To: dri-devel@lists.freedesktop.org
Subject: [PATCH 4/5] drm/vmwgfx: Add a cpu blit utility that can be used for page-backed bos
Date: Tue, 16 Jan 2018 14:56:43 +0100
Message-Id: <1516111004-10247-5-git-send-email-thellstrom@vmware.com>
In-Reply-To: <1516111004-10247-1-git-send-email-thellstrom@vmware.com>
References: <1516111004-10247-1-git-send-email-thellstrom@vmware.com>
MIME-Version: 1.0
Cc: Thomas Hellstrom
List-Id: Direct Rendering Infrastructure - Development
Sender: "dri-devel"

The utility uses kmap_atomic() instead of vmapping the whole buffer
object. As a result there is more book-keeping, but on some
architectures this helps avoid exhausting vmalloc space and also avoids
expensive TLB flushes.

The blit utility also adds a provision to compute a bounding box of
changed content, which is very useful for optimizing the presentation
speed of ill-behaved applications that don't supply proper damage
regions, and for page-flips. Computing the bounding box is not that
expensive when done in a cpu-blit utility like this.
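The bounding-box idea described above can be illustrated with a minimal userspace sketch (hypothetical helper names, single-byte scanning only; the driver's versions in the patch below additionally scan with the widest aligned integer type and round offsets to @cpp granularity):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bounding box in pixels: x1/y1 inclusive, x2/y2 exclusive. */
struct rect {
	int x1, y1, x2, y2;
};

/* Return the offset of the first differing byte, or @size if none differ. */
static size_t find_first_diff(const unsigned char *dst,
			      const unsigned char *src, size_t size)
{
	size_t i;

	for (i = 0; i < size; ++i)
		if (dst[i] != src[i])
			break;
	return i;
}

/* Return the offset of the last differing byte, or -1 if none differ. */
static ptrdiff_t find_last_diff(const unsigned char *dst,
				const unsigned char *src, size_t size)
{
	ptrdiff_t i;

	for (i = (ptrdiff_t)size - 1; i >= 0; --i)
		if (dst[i] != src[i])
			break;
	return i;
}

/*
 * Copy one line from @src to @dst, touching only the span that actually
 * changed, and grow @r to cover the changed pixels. @cpp is bytes per pixel.
 */
static void diff_copy_line(struct rect *r, unsigned char *dst,
			   const unsigned char *src, size_t len,
			   int cpp, int line)
{
	size_t first = find_first_diff(dst, src, len);
	ptrdiff_t last;

	if (first == len)
		return;	/* Line unchanged: nothing to copy or track. */

	last = find_last_diff(dst, src, len);
	memcpy(dst + first, src + first, (size_t)(last - (ptrdiff_t)first + 1));

	/* Convert byte offsets to pixel coordinates, rounding outward. */
	if ((int)(first / cpp) < r->x1)
		r->x1 = (int)(first / cpp);
	if ((int)(last / cpp) + 1 > r->x2)
		r->x2 = (int)(last / cpp) + 1;
	if (line < r->y1)
		r->y1 = line;
	if (line + 1 > r->y2)
		r->y2 = line + 1;
}
```

Run per line over a copy region, this yields the smallest rectangle enclosing all modified pixels, which is what the driver forwards as the damage region.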
Signed-off-by: Thomas Hellstrom
Reviewed-by: Brian Paul
---
 drivers/gpu/drm/vmwgfx/Makefile      |   2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_blit.c | 506 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.h  |  48 ++++
 3 files changed, 555 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/vmwgfx/vmwgfx_blit.c

diff --git a/drivers/gpu/drm/vmwgfx/Makefile b/drivers/gpu/drm/vmwgfx/Makefile
index ad80211..794cc9d 100644
--- a/drivers/gpu/drm/vmwgfx/Makefile
+++ b/drivers/gpu/drm/vmwgfx/Makefile
@@ -7,6 +7,6 @@ vmwgfx-y := vmwgfx_execbuf.o vmwgfx_gmr.o vmwgfx_kms.o vmwgfx_drv.o \
 	    vmwgfx_surface.o vmwgfx_prime.o vmwgfx_mob.o vmwgfx_shader.o \
 	    vmwgfx_cmdbuf_res.o vmwgfx_cmdbuf.o vmwgfx_stdu.o \
 	    vmwgfx_cotable.o vmwgfx_so.o vmwgfx_binding.o vmwgfx_msg.o \
-	    vmwgfx_simple_resource.o vmwgfx_va.o
+	    vmwgfx_simple_resource.o vmwgfx_va.o vmwgfx_blit.o
 
 obj-$(CONFIG_DRM_VMWGFX) := vmwgfx.o
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
new file mode 100644
index 0000000..2730403
--- /dev/null
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
@@ -0,0 +1,506 @@
+/**************************************************************************
+ *
+ * Copyright © 2017 VMware, Inc., Palo Alto, CA., USA
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **************************************************************************/
+
+#include "vmwgfx_drv.h"
+
+/*
+ * Template that implements find_first_diff() for a generic
+ * unsigned integer type. @size and return value are in bytes.
+ */
+#define VMW_FIND_FIRST_DIFF(_type)				\
+static size_t vmw_find_first_diff_ ## _type			\
+	(const _type * dst, const _type * src, size_t size)	\
+{								\
+	size_t i;						\
+								\
+	for (i = 0; i < size; i += sizeof(_type)) {		\
+		if (*dst++ != *src++)				\
+			break;					\
+	}							\
+								\
+	return i;						\
+}
+
+
+/*
+ * Template that implements find_last_diff() for a generic
+ * unsigned integer type. Pointers point to the item following the
+ * *end* of the area to be examined. @size and return value are in
+ * bytes.
+ */
+#define VMW_FIND_LAST_DIFF(_type)				\
+static ssize_t vmw_find_last_diff_ ## _type(			\
+	const _type * dst, const _type * src, size_t size)	\
+{								\
+	while (size) {						\
+		if (*--dst != *--src)				\
+			break;					\
+								\
+		size -= sizeof(_type);				\
+	}							\
+	return size;						\
+}
+
+
+/*
+ * Instantiate find diff functions for relevant unsigned integer sizes,
+ * assuming that wider integers are faster (including aligning) up to the
+ * architecture native width, which is assumed to be 32 bit unless
+ * CONFIG_64BIT is defined.
+ */
+VMW_FIND_FIRST_DIFF(u8);
+VMW_FIND_LAST_DIFF(u8);
+
+VMW_FIND_FIRST_DIFF(u16);
+VMW_FIND_LAST_DIFF(u16);
+
+VMW_FIND_FIRST_DIFF(u32);
+VMW_FIND_LAST_DIFF(u32);
+
+#ifdef CONFIG_64BIT
+VMW_FIND_FIRST_DIFF(u64);
+VMW_FIND_LAST_DIFF(u64);
+#endif
+
+
+/* We use size aligned copies. This computes (addr - align(addr)) */
+#define SPILL(_var, _type) ((unsigned long) _var & (sizeof(_type) - 1))
+
+
+/*
+ * Template to compute find_first_diff() for a certain integer type
+ * including a head copy for alignment, and adjustment of parameters
+ * for tail find or increased resolution find using an unsigned integer find
+ * of smaller width. If finding is complete, and resolution is sufficient,
+ * the macro executes a return statement. Otherwise it falls through.
+ */
+#define VMW_TRY_FIND_FIRST_DIFF(_type)					\
+do {									\
+	unsigned int spill = SPILL(dst, _type);				\
+	size_t diff_offs;						\
+									\
+	if (spill && spill == SPILL(src, _type) &&			\
+	    sizeof(_type) - spill <= size) {				\
+		spill = sizeof(_type) - spill;				\
+		diff_offs = vmw_find_first_diff_u8(dst, src, spill);	\
+		if (diff_offs < spill)					\
+			return round_down(offset + diff_offs, granularity); \
+									\
+		dst += spill;						\
+		src += spill;						\
+		size -= spill;						\
+		offset += spill;					\
+		spill = 0;						\
+	}								\
+	if (!spill && !SPILL(src, _type)) {				\
+		size_t to_copy = size & ~(sizeof(_type) - 1);		\
+									\
+		diff_offs = vmw_find_first_diff_ ## _type		\
+			((_type *) dst, (_type *) src, to_copy);	\
+		if (diff_offs >= size || granularity == sizeof(_type))	\
+			return (offset + diff_offs);			\
+									\
+		dst += diff_offs;					\
+		src += diff_offs;					\
+		size -= diff_offs;					\
+		offset += diff_offs;					\
+	}								\
+} while (0)
+
+
+/**
+ * vmw_find_first_diff - find the first difference between dst and src
+ *
+ * @dst: The destination address
+ * @src: The source address
+ * @size: Number of bytes to compare
+ * @granularity: The granularity needed for the return value in bytes.
+ * return: The offset from find start where the first difference was
+ * encountered in bytes. If no difference was found, the function returns
+ * a value >= @size.
+ */
+static size_t vmw_find_first_diff(const u8 *dst, const u8 *src, size_t size,
+				  size_t granularity)
+{
+	size_t offset = 0;
+
+	/*
+	 * Try finding with large integers if alignment allows, or we can
+	 * fix it. Fall through if we need better resolution or alignment
+	 * was bad.
+	 */
+#ifdef CONFIG_64BIT
+	VMW_TRY_FIND_FIRST_DIFF(u64);
+#endif
+	VMW_TRY_FIND_FIRST_DIFF(u32);
+	VMW_TRY_FIND_FIRST_DIFF(u16);
+
+	return round_down(offset + vmw_find_first_diff_u8(dst, src, size),
+			  granularity);
+}
+
+
+/*
+ * Template to compute find_last_diff() for a certain integer type
+ * including a tail copy for alignment, and adjustment of parameters
+ * for head find or increased resolution find using an unsigned integer find
+ * of smaller width. If finding is complete, and resolution is sufficient,
+ * the macro executes a return statement. Otherwise it falls through.
+ */
+#define VMW_TRY_FIND_LAST_DIFF(_type)					\
+do {									\
+	unsigned int spill = SPILL(dst, _type);				\
+	ssize_t location;						\
+	ssize_t diff_offs;						\
+									\
+	if (spill && spill <= size && spill == SPILL(src, _type)) {	\
+		diff_offs = vmw_find_last_diff_u8(dst, src, spill);	\
+		if (diff_offs) {					\
+			location = size - spill + diff_offs - 1;	\
+			return round_down(location, granularity);	\
+		}							\
+									\
+		dst -= spill;						\
+		src -= spill;						\
+		size -= spill;						\
+		spill = 0;						\
+	}								\
+	if (!spill && !SPILL(src, _type)) {				\
+		size_t to_copy = round_down(size, sizeof(_type));	\
+									\
+		diff_offs = vmw_find_last_diff_ ## _type		\
+			((_type *) dst, (_type *) src, to_copy);	\
+		location = size - to_copy + diff_offs - sizeof(_type);	\
+		if (location < 0 || granularity == sizeof(_type))	\
+			return location;				\
+									\
+		dst -= to_copy - diff_offs;				\
+		src -= to_copy - diff_offs;				\
+		size -= to_copy - diff_offs;				\
+	}								\
+} while (0)
+
+
+/**
+ * vmw_find_last_diff - find the last difference between dst and src
+ *
+ * @dst: The destination address
+ * @src: The source address
+ * @size: Number of bytes to compare
+ * @granularity: The granularity needed for the return value in bytes.
+ * return: The offset from find start where the last difference was
+ * encountered in bytes, or a negative value if no difference was found.
+ */
+static ssize_t vmw_find_last_diff(const u8 *dst, const u8 *src, size_t size,
+				  size_t granularity)
+{
+	dst += size;
+	src += size;
+
+#ifdef CONFIG_64BIT
+	VMW_TRY_FIND_LAST_DIFF(u64);
+#endif
+	VMW_TRY_FIND_LAST_DIFF(u32);
+	VMW_TRY_FIND_LAST_DIFF(u16);
+
+	return round_down(vmw_find_last_diff_u8(dst, src, size) - 1,
+			  granularity);
+}
+
+
+/**
+ * vmw_memcpy - A wrapper around kernel memcpy that allows it to be plugged
+ * into a struct vmw_diff_cpy.
+ *
+ * @diff: The struct vmw_diff_cpy closure argument (unused).
+ * @dest: The copy destination.
+ * @src: The copy source.
+ * @n: Number of bytes to copy.
+ */
+void vmw_memcpy(struct vmw_diff_cpy *diff, u8 *dest, const u8 *src, size_t n)
+{
+	memcpy(dest, src, n);
+}
+
+
+/**
+ * vmw_adjust_rect - Adjust rectangle coordinates for newly found difference
+ *
+ * @diff: The struct vmw_diff_cpy used to track the modified bounding box.
+ * @diff_offs: The offset from @diff->line_offset where the difference was
+ * found.
+ */
+static void vmw_adjust_rect(struct vmw_diff_cpy *diff, size_t diff_offs)
+{
+	size_t offs = (diff_offs + diff->line_offset) / diff->cpp;
+	struct drm_rect *rect = &diff->rect;
+
+	rect->x1 = min_t(int, rect->x1, offs);
+	rect->x2 = max_t(int, rect->x2, offs + 1);
+	rect->y1 = min_t(int, rect->y1, diff->line);
+	rect->y2 = max_t(int, rect->y2, diff->line + 1);
+}
+
+/**
+ * vmw_diff_memcpy - memcpy that creates a bounding box of modified content.
+ *
+ * @diff: The struct vmw_diff_cpy used to track the modified bounding box.
+ * @dest: The copy destination.
+ * @src: The copy source.
+ * @n: Number of bytes to copy.
+ *
+ * In order to correctly track the modified content, the field @diff->line
+ * must be pre-loaded with the current line number, the field
+ * @diff->line_offset must be pre-loaded with the line offset in bytes where
+ * the copy starts, and finally the field @diff->cpp needs to be pre-loaded
+ * with the number of bytes per unit in the horizontal direction of the area
+ * we're examining. Typically bytes per pixel.
+ * This determines the granularity of the difference computing operations.
+ * A higher cpp generally leads to faster execution at the cost of bounding
+ * box width precision.
+ */
+void vmw_diff_memcpy(struct vmw_diff_cpy *diff, u8 *dest, const u8 *src,
+		     size_t n)
+{
+	ssize_t csize, byte_len;
+
+	if (WARN_ON_ONCE(round_down(n, diff->cpp) != n))
+		return;
+
+	/* TODO: Possibly use a single vmw_find_first_diff per line?
+	 */
+	csize = vmw_find_first_diff(dest, src, n, diff->cpp);
+	if (csize < n) {
+		vmw_adjust_rect(diff, csize);
+		byte_len = diff->cpp;
+
+		/*
+		 * Starting from where the first difference was found, find
+		 * the location of the last difference, and then copy.
+		 */
+		diff->line_offset += csize;
+		dest += csize;
+		src += csize;
+		n -= csize;
+		csize = vmw_find_last_diff(dest, src, n, diff->cpp);
+		if (csize >= 0) {
+			byte_len += csize;
+			vmw_adjust_rect(diff, csize);
+		}
+		memcpy(dest, src, byte_len);
+	}
+	diff->line_offset += n;
+}
+
+/**
+ * struct vmw_bo_blit_line_data - Convenience argument to vmw_bo_cpu_blit_line
+ *
+ * @mapped_dst: Already mapped destination page index in @dst_pages.
+ * @dst_addr: Kernel virtual address of the mapped destination page.
+ * @dst_pages: Array of destination bo pages.
+ * @dst_num_pages: Number of destination bo pages.
+ * @dst_prot: Destination bo page protection.
+ * @mapped_src: Already mapped source page index in @src_pages.
+ * @src_addr: Kernel virtual address of the mapped source page.
+ * @src_pages: Array of source bo pages.
+ * @src_num_pages: Number of source bo pages.
+ * @src_prot: Source bo page protection.
+ * @diff: Struct vmw_diff_cpy, in the end forwarded to the memcpy routine.
+ */
+struct vmw_bo_blit_line_data {
+	u32 mapped_dst;
+	u8 *dst_addr;
+	struct page **dst_pages;
+	u32 dst_num_pages;
+	pgprot_t dst_prot;
+	u32 mapped_src;
+	u8 *src_addr;
+	struct page **src_pages;
+	u32 src_num_pages;
+	pgprot_t src_prot;
+	struct vmw_diff_cpy *diff;
+};
+
+/**
+ * vmw_bo_cpu_blit_line - Blit part of a line from one bo to another.
+ *
+ * @d: Blit data as described above.
+ * @dst_offset: Destination copy start offset from start of bo.
+ * @src_offset: Source copy start offset from start of bo.
+ * @bytes_to_copy: Number of bytes to copy in this line.
+ */
+static int vmw_bo_cpu_blit_line(struct vmw_bo_blit_line_data *d,
+				u32 dst_offset,
+				u32 src_offset,
+				u32 bytes_to_copy)
+{
+	struct vmw_diff_cpy *diff = d->diff;
+
+	while (bytes_to_copy) {
+		u32 copy_size = bytes_to_copy;
+		u32 dst_page = dst_offset >> PAGE_SHIFT;
+		u32 src_page = src_offset >> PAGE_SHIFT;
+		u32 dst_page_offset = dst_offset & ~PAGE_MASK;
+		u32 src_page_offset = src_offset & ~PAGE_MASK;
+		bool unmap_dst = d->dst_addr && dst_page != d->mapped_dst;
+		bool unmap_src = d->src_addr && (src_page != d->mapped_src ||
+						 unmap_dst);
+
+		copy_size = min_t(u32, copy_size, PAGE_SIZE - dst_page_offset);
+		copy_size = min_t(u32, copy_size, PAGE_SIZE - src_page_offset);
+
+		if (unmap_src) {
+			ttm_kunmap_atomic_prot(d->src_addr, d->src_prot);
+			d->src_addr = NULL;
+		}
+
+		if (unmap_dst) {
+			ttm_kunmap_atomic_prot(d->dst_addr, d->dst_prot);
+			d->dst_addr = NULL;
+		}
+
+		if (!d->dst_addr) {
+			if (WARN_ON_ONCE(dst_page >= d->dst_num_pages))
+				return -EINVAL;
+
+			d->dst_addr =
+				ttm_kmap_atomic_prot(d->dst_pages[dst_page],
+						     d->dst_prot);
+			if (!d->dst_addr)
+				return -ENOMEM;
+
+			d->mapped_dst = dst_page;
+		}
+
+		if (!d->src_addr) {
+			if (WARN_ON_ONCE(src_page >= d->src_num_pages))
+				return -EINVAL;
+
+			d->src_addr =
+				ttm_kmap_atomic_prot(d->src_pages[src_page],
+						     d->src_prot);
+			if (!d->src_addr)
+				return -ENOMEM;
+
+			d->mapped_src = src_page;
+		}
+		diff->memcpy(diff, d->dst_addr + dst_page_offset,
+			     d->src_addr + src_page_offset, copy_size);
+
+		bytes_to_copy -= copy_size;
+		dst_offset += copy_size;
+		src_offset += copy_size;
+	}
+
+	return 0;
+}
+
+/**
+ * vmw_bo_cpu_blit - in-kernel cpu blit.
+ *
+ * @dst: Destination buffer object.
+ * @dst_offset: Destination offset of blit start in bytes.
+ * @dst_stride: Destination stride in bytes.
+ * @src: Source buffer object.
+ * @src_offset: Source offset of blit start in bytes.
+ * @src_stride: Source stride in bytes.
+ * @w: Width of blit.
+ * @h: Height of blit.
+ * @diff: The struct vmw_diff_cpy, in the end forwarded to the memcpy routine.
+ * return: Zero on success.
+ * Negative error value on failure. Will print out kernel warnings on
+ * caller bugs.
+ *
+ * Performs a CPU blit from one buffer object to another avoiding a full
+ * bo vmap which may exhaust or fragment vmalloc space.
+ * On supported architectures (x86), we're using kmap_atomic which avoids
+ * cross-processor TLB and cache flushes and may, on non-HIGHMEM systems,
+ * reference already set-up mappings.
+ *
+ * Neither of the buffer objects may be placed in PCI memory
+ * (Fixed memory in TTM terminology) when using this function.
+ */
+int vmw_bo_cpu_blit(struct ttm_buffer_object *dst,
+		    u32 dst_offset, u32 dst_stride,
+		    struct ttm_buffer_object *src,
+		    u32 src_offset, u32 src_stride,
+		    u32 w, u32 h,
+		    struct vmw_diff_cpy *diff)
+{
+	struct ttm_operation_ctx ctx = {
+		.interruptible = false,
+		.no_wait_gpu = false
+	};
+	u32 j, initial_line = dst_offset / dst_stride;
+	struct vmw_bo_blit_line_data d;
+	int ret = 0;
+
+	/* Buffer objects need to be either pinned or reserved: */
+	if (!(dst->mem.placement & TTM_PL_FLAG_NO_EVICT))
+		lockdep_assert_held(&dst->resv->lock.base);
+	if (!(src->mem.placement & TTM_PL_FLAG_NO_EVICT))
+		lockdep_assert_held(&src->resv->lock.base);
+
+	if (dst->ttm->state == tt_unpopulated) {
+		ret = dst->ttm->bdev->driver->ttm_tt_populate(dst->ttm, &ctx);
+		if (ret)
+			return ret;
+	}
+
+	if (src->ttm->state == tt_unpopulated) {
+		ret = src->ttm->bdev->driver->ttm_tt_populate(src->ttm, &ctx);
+		if (ret)
+			return ret;
+	}
+
+	d.mapped_dst = 0;
+	d.mapped_src = 0;
+	d.dst_addr = NULL;
+	d.src_addr = NULL;
+	d.dst_pages = dst->ttm->pages;
+	d.src_pages = src->ttm->pages;
+	d.dst_num_pages = dst->num_pages;
+	d.src_num_pages = src->num_pages;
+	d.dst_prot = ttm_io_prot(dst->mem.placement, PAGE_KERNEL);
+	d.src_prot = ttm_io_prot(src->mem.placement, PAGE_KERNEL);
+	d.diff = diff;
+
+	for (j = 0; j < h; ++j) {
+		diff->line = j + initial_line;
+		diff->line_offset = dst_offset % dst_stride;
+		ret = vmw_bo_cpu_blit_line(&d, dst_offset, src_offset,
+					   w);
+		if (ret)
+			goto out;
+
+		dst_offset += dst_stride;
+		src_offset += src_stride;
+	}
+out:
+	if (d.src_addr)
+		ttm_kunmap_atomic_prot(d.src_addr, d.src_prot);
+	if (d.dst_addr)
+		ttm_kunmap_atomic_prot(d.dst_addr, d.dst_prot);
+
+	return ret;
+}
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
index 7e5f30e..99e7e426 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
@@ -678,6 +678,7 @@ extern void vmw_fence_single_bo(struct ttm_buffer_object *bo,
 				struct vmw_fence_obj *fence);
 extern void vmw_resource_evict_all(struct vmw_private *dev_priv);
 
+
 /**
  * DMA buffer helper routines - vmwgfx_dmabuf.c
  */
@@ -1165,6 +1166,53 @@ extern int vmw_cmdbuf_cur_flush(struct vmw_cmdbuf_man *man,
 				bool interruptible);
 extern void vmw_cmdbuf_irqthread(struct vmw_cmdbuf_man *man);
 
+/* CPU blit utilities - vmwgfx_blit.c */
+
+/**
+ * struct vmw_diff_cpy - CPU blit information structure
+ *
+ * @rect: The output bounding box rectangle.
+ * @line: The current line of the blit.
+ * @line_offset: Offset of the current line segment.
+ * @cpp: Bytes per pixel (granularity information).
+ * @memcpy: Which memcpy function to use.
+ */
+struct vmw_diff_cpy {
+	struct drm_rect rect;
+	size_t line;
+	size_t line_offset;
+	int cpp;
+	void (*memcpy)(struct vmw_diff_cpy *diff, u8 *dest, const u8 *src,
+		       size_t n);
+};
+
+#define VMW_CPU_BLIT_INITIALIZER {	\
+	.memcpy = vmw_memcpy,		\
+}
+
+#define VMW_CPU_BLIT_DIFF_INITIALIZER(_cpp) {	\
+	.line = 0,				\
+	.line_offset = 0,			\
+	.rect = { .x1 = INT_MAX/2,		\
+		  .y1 = INT_MAX/2,		\
+		  .x2 = INT_MIN/2,		\
+		  .y2 = INT_MIN/2		\
+	},					\
+	.cpp = _cpp,				\
+	.memcpy = vmw_diff_memcpy,		\
+}
+
+void vmw_diff_memcpy(struct vmw_diff_cpy *diff, u8 *dest, const u8 *src,
+		     size_t n);
+
+void vmw_memcpy(struct vmw_diff_cpy *diff, u8 *dest, const u8 *src, size_t n);
+
+int vmw_bo_cpu_blit(struct ttm_buffer_object *dst,
+		    u32 dst_offset, u32 dst_stride,
+		    struct ttm_buffer_object *src,
+		    u32 src_offset, u32 src_stride,
+		    u32 w, u32 h,
+		    struct vmw_diff_cpy *diff);
 
 /**
  * Inline helper functions