From patchwork Thu Mar 22 11:07:10 2018
X-Patchwork-Submitter: Thomas Hellstrom
X-Patchwork-Id: 10301249
From: Thomas Hellstrom
To: dri-devel@lists.freedesktop.org, linux-graphics-maintainer@vmware.com
Subject: [PATCH v2 -next 01/11] drm/vmwgfx: Add a cpu blit utility that can be used for page-backed bos
Date: Thu, 22 Mar 2018 12:07:10 +0100
Message-Id: <20180322110710.8040-1-thellstrom@vmware.com>
The utility uses kmap_atomic() instead of vmapping the whole buffer
object. As a result there will be more book-keeping, but on some
architectures this helps avoid exhausting vmalloc space and also avoids
expensive TLB flushes.

The blit utility also adds a provision to compute a bounding box of
changed content, which is very useful to optimize presentation speed of
ill-behaved applications that don't supply proper damage regions, and
for page-flips. Computing the bounding box is comparatively cheap when
done in a cpu blit utility like this.

Signed-off-by: Thomas Hellstrom
Reviewed-by: Brian Paul
---
v2: Fix a compilation failure by replacing a struct vmw_diff_cpy member
called "memcpy" with "do_cpy".
---
 drivers/gpu/drm/vmwgfx/Makefile      |   2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_blit.c | 506 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.h  |  48 ++++
 3 files changed, 555 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/vmwgfx/vmwgfx_blit.c

diff --git a/drivers/gpu/drm/vmwgfx/Makefile b/drivers/gpu/drm/vmwgfx/Makefile
index ad80211e1098..794cc9d5c9b0 100644
--- a/drivers/gpu/drm/vmwgfx/Makefile
+++ b/drivers/gpu/drm/vmwgfx/Makefile
@@ -7,6 +7,6 @@ vmwgfx-y := vmwgfx_execbuf.o vmwgfx_gmr.o vmwgfx_kms.o vmwgfx_drv.o \
             vmwgfx_surface.o vmwgfx_prime.o vmwgfx_mob.o vmwgfx_shader.o \
             vmwgfx_cmdbuf_res.o vmwgfx_cmdbuf.o vmwgfx_stdu.o \
             vmwgfx_cotable.o vmwgfx_so.o vmwgfx_binding.o vmwgfx_msg.o \
-            vmwgfx_simple_resource.o vmwgfx_va.o
+            vmwgfx_simple_resource.o vmwgfx_va.o vmwgfx_blit.o

 obj-$(CONFIG_DRM_VMWGFX) := vmwgfx.o
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
new file mode 100644
index 000000000000..e8c94b19db7b
--- /dev/null
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
@@ -0,0 +1,506 @@
+/**************************************************************************
+ *
+ * Copyright © 2017 VMware, Inc., Palo Alto, CA., USA
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **************************************************************************/
+
+#include "vmwgfx_drv.h"
+
+/*
+ * Template that implements find_first_diff() for a generic
+ * unsigned integer type. @size and return value are in bytes.
+ */
+#define VMW_FIND_FIRST_DIFF(_type)                                      \
+static size_t vmw_find_first_diff_ ## _type                             \
+        (const _type * dst, const _type * src, size_t size)             \
+{                                                                       \
+        size_t i;                                                       \
+                                                                        \
+        for (i = 0; i < size; i += sizeof(_type)) {                     \
+                if (*dst++ != *src++)                                   \
+                        break;                                          \
+        }                                                               \
+                                                                        \
+        return i;                                                       \
+}
+
+
+/*
+ * Template that implements find_last_diff() for a generic
+ * unsigned integer type. Pointers point to the item following the
+ * *end* of the area to be examined. @size and return value are in
+ * bytes.
+ */
+#define VMW_FIND_LAST_DIFF(_type)                                       \
+static ssize_t vmw_find_last_diff_ ## _type(                            \
+        const _type * dst, const _type * src, size_t size)              \
+{                                                                       \
+        while (size) {                                                  \
+                if (*--dst != *--src)                                   \
+                        break;                                          \
+                                                                        \
+                size -= sizeof(_type);                                  \
+        }                                                               \
+        return size;                                                    \
+}
+
+
+/*
+ * Instantiate find diff functions for relevant unsigned integer sizes,
+ * assuming that wider integers are faster (including aligning) up to the
+ * architecture native width, which is assumed to be 32 bit unless
+ * CONFIG_64BIT is defined.
+ */
+VMW_FIND_FIRST_DIFF(u8);
+VMW_FIND_LAST_DIFF(u8);
+
+VMW_FIND_FIRST_DIFF(u16);
+VMW_FIND_LAST_DIFF(u16);
+
+VMW_FIND_FIRST_DIFF(u32);
+VMW_FIND_LAST_DIFF(u32);
+
+#ifdef CONFIG_64BIT
+VMW_FIND_FIRST_DIFF(u64);
+VMW_FIND_LAST_DIFF(u64);
+#endif
+
+
+/* We use size aligned copies. This computes (addr - align(addr)) */
+#define SPILL(_var, _type) ((unsigned long) _var & (sizeof(_type) - 1))
+
+
+/*
+ * Template to compute find_first_diff() for a certain integer type
+ * including a head copy for alignment, and adjustment of parameters
+ * for tail find or increased resolution find using an unsigned integer find
+ * of smaller width. If finding is complete, and resolution is sufficient,
+ * the macro executes a return statement. Otherwise it falls through.
+ */
+#define VMW_TRY_FIND_FIRST_DIFF(_type)                                  \
+do {                                                                    \
+        unsigned int spill = SPILL(dst, _type);                         \
+        size_t diff_offs;                                               \
+                                                                        \
+        if (spill && spill == SPILL(src, _type) &&                      \
+            sizeof(_type) - spill <= size) {                            \
+                spill = sizeof(_type) - spill;                          \
+                diff_offs = vmw_find_first_diff_u8(dst, src, spill);    \
+                if (diff_offs < spill)                                  \
+                        return round_down(offset + diff_offs, granularity); \
+                                                                        \
+                dst += spill;                                           \
+                src += spill;                                           \
+                size -= spill;                                          \
+                offset += spill;                                        \
+                spill = 0;                                              \
+        }                                                               \
+        if (!spill && !SPILL(src, _type)) {                             \
+                size_t to_copy = size & ~(sizeof(_type) - 1);           \
+                                                                        \
+                diff_offs = vmw_find_first_diff_ ## _type               \
+                        ((_type *) dst, (_type *) src, to_copy);        \
+                if (diff_offs >= size || granularity == sizeof(_type))  \
+                        return (offset + diff_offs);                    \
+                                                                        \
+                dst += diff_offs;                                       \
+                src += diff_offs;                                       \
+                size -= diff_offs;                                      \
+                offset += diff_offs;                                    \
+        }                                                               \
+} while (0)                                                             \
+
+
+/**
+ * vmw_find_first_diff - find the first difference between dst and src
+ *
+ * @dst: The destination address
+ * @src: The source address
+ * @size: Number of bytes to compare
+ * @granularity: The granularity needed for the return value in bytes.
+ * return: The offset from find start where the first difference was
+ * encountered in bytes. If no difference was found, the function returns
+ * a value >= @size.
+ */
+static size_t vmw_find_first_diff(const u8 *dst, const u8 *src, size_t size,
+                                  size_t granularity)
+{
+        size_t offset = 0;
+
+        /*
+         * Try finding with large integers if alignment allows, or we can
+         * fix it. Fall through if we need better resolution or alignment
+         * was bad.
+         */
+#ifdef CONFIG_64BIT
+        VMW_TRY_FIND_FIRST_DIFF(u64);
+#endif
+        VMW_TRY_FIND_FIRST_DIFF(u32);
+        VMW_TRY_FIND_FIRST_DIFF(u16);
+
+        return round_down(offset + vmw_find_first_diff_u8(dst, src, size),
+                          granularity);
+}
+
+
+/*
+ * Template to compute find_last_diff() for a certain integer type
+ * including a tail copy for alignment, and adjustment of parameters
+ * for head find or increased resolution find using an unsigned integer find
+ * of smaller width. If finding is complete, and resolution is sufficient,
+ * the macro executes a return statement. Otherwise it falls through.
+ */
+#define VMW_TRY_FIND_LAST_DIFF(_type)                                   \
+do {                                                                    \
+        unsigned int spill = SPILL(dst, _type);                         \
+        ssize_t location;                                               \
+        ssize_t diff_offs;                                              \
+                                                                        \
+        if (spill && spill <= size && spill == SPILL(src, _type)) {     \
+                diff_offs = vmw_find_last_diff_u8(dst, src, spill);     \
+                if (diff_offs) {                                        \
+                        location = size - spill + diff_offs - 1;        \
+                        return round_down(location, granularity);       \
+                }                                                       \
+                                                                        \
+                dst -= spill;                                           \
+                src -= spill;                                           \
+                size -= spill;                                          \
+                spill = 0;                                              \
+        }                                                               \
+        if (!spill && !SPILL(src, _type)) {                             \
+                size_t to_copy = round_down(size, sizeof(_type));       \
+                                                                        \
+                diff_offs = vmw_find_last_diff_ ## _type                \
+                        ((_type *) dst, (_type *) src, to_copy);        \
+                location = size - to_copy + diff_offs - sizeof(_type);  \
+                if (location < 0 || granularity == sizeof(_type))       \
+                        return location;                                \
+                                                                        \
+                dst -= to_copy - diff_offs;                             \
+                src -= to_copy - diff_offs;                             \
+                size -= to_copy - diff_offs;                            \
+        }                                                               \
+} while (0)
+
+
+/**
+ * vmw_find_last_diff - find the last difference between dst and src
+ *
+ * @dst: The destination address
+ * @src: The source address
+ * @size: Number of bytes to compare
+ * @granularity: The granularity needed for the return value in bytes.
+ * return: The offset from find start where the last difference was
+ * encountered in bytes, or a negative value if no difference was found.
+ */
+static ssize_t vmw_find_last_diff(const u8 *dst, const u8 *src, size_t size,
+                                  size_t granularity)
+{
+        dst += size;
+        src += size;
+
+#ifdef CONFIG_64BIT
+        VMW_TRY_FIND_LAST_DIFF(u64);
+#endif
+        VMW_TRY_FIND_LAST_DIFF(u32);
+        VMW_TRY_FIND_LAST_DIFF(u16);
+
+        return round_down(vmw_find_last_diff_u8(dst, src, size) - 1,
+                          granularity);
+}
+
+
+/**
+ * vmw_memcpy - A wrapper around kernel memcpy that can be plugged into a
+ * struct vmw_diff_cpy.
+ *
+ * @diff: The struct vmw_diff_cpy closure argument (unused).
+ * @dest: The copy destination.
+ * @src: The copy source.
+ * @n: Number of bytes to copy.
+ */
+void vmw_memcpy(struct vmw_diff_cpy *diff, u8 *dest, const u8 *src, size_t n)
+{
+        memcpy(dest, src, n);
+}
+
+
+/**
+ * vmw_adjust_rect - Adjust rectangle coordinates for newly found difference
+ *
+ * @diff: The struct vmw_diff_cpy used to track the modified bounding box.
+ * @diff_offs: The offset from @diff->line_offset where the difference was
+ * found.
+ */
+static void vmw_adjust_rect(struct vmw_diff_cpy *diff, size_t diff_offs)
+{
+        size_t offs = (diff_offs + diff->line_offset) / diff->cpp;
+        struct drm_rect *rect = &diff->rect;
+
+        rect->x1 = min_t(int, rect->x1, offs);
+        rect->x2 = max_t(int, rect->x2, offs + 1);
+        rect->y1 = min_t(int, rect->y1, diff->line);
+        rect->y2 = max_t(int, rect->y2, diff->line + 1);
+}
+
+/**
+ * vmw_diff_memcpy - memcpy that creates a bounding box of modified content.
+ *
+ * @diff: The struct vmw_diff_cpy used to track the modified bounding box.
+ * @dest: The copy destination.
+ * @src: The copy source.
+ * @n: Number of bytes to copy.
+ *
+ * In order to correctly track the modified content, the field @diff->line must
+ * be pre-loaded with the current line number, the field @diff->line_offset must
+ * be pre-loaded with the line offset in bytes where the copy starts, and
+ * finally the field @diff->cpp needs to be pre-loaded with the number of bytes
+ * per unit in the horizontal direction of the area we're examining.
+ * Typically bytes per pixel.
+ * This is needed to know the granularity of the difference computing
+ * operations. A higher cpp generally leads to faster execution at the cost of
+ * bounding box width precision.
+ */
+void vmw_diff_memcpy(struct vmw_diff_cpy *diff, u8 *dest, const u8 *src,
+                     size_t n)
+{
+        ssize_t csize, byte_len;
+
+        if (WARN_ON_ONCE(round_down(n, diff->cpp) != n))
+                return;
+
+        /* TODO: Possibly use a single vmw_find_first_diff per line? */
+        csize = vmw_find_first_diff(dest, src, n, diff->cpp);
+        if (csize < n) {
+                vmw_adjust_rect(diff, csize);
+                byte_len = diff->cpp;
+
+                /*
+                 * Starting from where first difference was found, find
+                 * location of last difference, and then copy.
+                 */
+                diff->line_offset += csize;
+                dest += csize;
+                src += csize;
+                n -= csize;
+                csize = vmw_find_last_diff(dest, src, n, diff->cpp);
+                if (csize >= 0) {
+                        byte_len += csize;
+                        vmw_adjust_rect(diff, csize);
+                }
+                memcpy(dest, src, byte_len);
+        }
+        diff->line_offset += n;
+}
+
+/**
+ * struct vmw_bo_blit_line_data - Convenience argument to vmw_bo_cpu_blit_line
+ *
+ * @mapped_dst: Already mapped destination page index in @dst_pages.
+ * @dst_addr: Kernel virtual address of mapped destination page.
+ * @dst_pages: Array of destination bo pages.
+ * @dst_num_pages: Number of destination bo pages.
+ * @dst_prot: Destination bo page protection.
+ * @mapped_src: Already mapped source page index in @src_pages.
+ * @src_addr: Kernel virtual address of mapped source page.
+ * @src_pages: Array of source bo pages.
+ * @src_num_pages: Number of source bo pages.
+ * @src_prot: Source bo page protection.
+ * @diff: Struct vmw_diff_cpy, in the end forwarded to the memcpy routine.
+ */
+struct vmw_bo_blit_line_data {
+        u32 mapped_dst;
+        u8 *dst_addr;
+        struct page **dst_pages;
+        u32 dst_num_pages;
+        pgprot_t dst_prot;
+        u32 mapped_src;
+        u8 *src_addr;
+        struct page **src_pages;
+        u32 src_num_pages;
+        pgprot_t src_prot;
+        struct vmw_diff_cpy *diff;
+};
+
+/**
+ * vmw_bo_cpu_blit_line - Blit part of a line from one bo to another.
+ *
+ * @d: Blit data as described above.
+ * @dst_offset: Destination copy start offset from start of bo.
+ * @src_offset: Source copy start offset from start of bo.
+ * @bytes_to_copy: Number of bytes to copy in this line.
+ */
+static int vmw_bo_cpu_blit_line(struct vmw_bo_blit_line_data *d,
+                                u32 dst_offset,
+                                u32 src_offset,
+                                u32 bytes_to_copy)
+{
+        struct vmw_diff_cpy *diff = d->diff;
+
+        while (bytes_to_copy) {
+                u32 copy_size = bytes_to_copy;
+                u32 dst_page = dst_offset >> PAGE_SHIFT;
+                u32 src_page = src_offset >> PAGE_SHIFT;
+                u32 dst_page_offset = dst_offset & ~PAGE_MASK;
+                u32 src_page_offset = src_offset & ~PAGE_MASK;
+                bool unmap_dst = d->dst_addr && dst_page != d->mapped_dst;
+                bool unmap_src = d->src_addr && (src_page != d->mapped_src ||
+                                                 unmap_dst);
+
+                copy_size = min_t(u32, copy_size, PAGE_SIZE - dst_page_offset);
+                copy_size = min_t(u32, copy_size, PAGE_SIZE - src_page_offset);
+
+                if (unmap_src) {
+                        ttm_kunmap_atomic_prot(d->src_addr, d->src_prot);
+                        d->src_addr = NULL;
+                }
+
+                if (unmap_dst) {
+                        ttm_kunmap_atomic_prot(d->dst_addr, d->dst_prot);
+                        d->dst_addr = NULL;
+                }
+
+                if (!d->dst_addr) {
+                        if (WARN_ON_ONCE(dst_page >= d->dst_num_pages))
+                                return -EINVAL;
+
+                        d->dst_addr =
+                                ttm_kmap_atomic_prot(d->dst_pages[dst_page],
+                                                     d->dst_prot);
+                        if (!d->dst_addr)
+                                return -ENOMEM;
+
+                        d->mapped_dst = dst_page;
+                }
+
+                if (!d->src_addr) {
+                        if (WARN_ON_ONCE(src_page >= d->src_num_pages))
+                                return -EINVAL;
+
+                        d->src_addr =
+                                ttm_kmap_atomic_prot(d->src_pages[src_page],
+                                                     d->src_prot);
+                        if (!d->src_addr)
+                                return -ENOMEM;
+
+                        d->mapped_src = src_page;
+                }
+                diff->do_cpy(diff, d->dst_addr + dst_page_offset,
+                             d->src_addr + src_page_offset, copy_size);
+
+                bytes_to_copy -= copy_size;
+                dst_offset += copy_size;
+                src_offset += copy_size;
+        }
+
+        return 0;
+}
+
+/**
+ * vmw_bo_cpu_blit - in-kernel cpu blit.
+ *
+ * @dst: Destination buffer object.
+ * @dst_offset: Destination offset of blit start in bytes.
+ * @dst_stride: Destination stride in bytes.
+ * @src: Source buffer object.
+ * @src_offset: Source offset of blit start in bytes.
+ * @src_stride: Source stride in bytes.
+ * @w: Width of blit.
+ * @h: Height of blit.
+ * @diff: The struct vmw_diff_cpy used to track modified content and
+ * forward the copies.
+ * return: Zero on success. Negative error value on failure. Will print out
+ * kernel warnings on caller bugs.
+ *
+ * Performs a CPU blit from one buffer object to another avoiding a full
+ * bo vmap which may exhaust or fragment vmalloc space.
+ * On supported architectures (x86), we're using kmap_atomic which avoids
+ * cross-processor TLB- and cache flushes and may, on non-HIGHMEM systems,
+ * reference already set-up mappings.
+ *
+ * Neither of the buffer objects may be placed in PCI memory
+ * (Fixed memory in TTM terminology) when using this function.
+ */
+int vmw_bo_cpu_blit(struct ttm_buffer_object *dst,
+                    u32 dst_offset, u32 dst_stride,
+                    struct ttm_buffer_object *src,
+                    u32 src_offset, u32 src_stride,
+                    u32 w, u32 h,
+                    struct vmw_diff_cpy *diff)
+{
+        struct ttm_operation_ctx ctx = {
+                .interruptible = false,
+                .no_wait_gpu = false
+        };
+        u32 j, initial_line = dst_offset / dst_stride;
+        struct vmw_bo_blit_line_data d;
+        int ret = 0;
+
+        /* Buffer objects need to be either pinned or reserved: */
+        if (!(dst->mem.placement & TTM_PL_FLAG_NO_EVICT))
+                lockdep_assert_held(&dst->resv->lock.base);
+        if (!(src->mem.placement & TTM_PL_FLAG_NO_EVICT))
+                lockdep_assert_held(&src->resv->lock.base);
+
+        if (dst->ttm->state == tt_unpopulated) {
+                ret = dst->ttm->bdev->driver->ttm_tt_populate(dst->ttm, &ctx);
+                if (ret)
+                        return ret;
+        }
+
+        if (src->ttm->state == tt_unpopulated) {
+                ret = src->ttm->bdev->driver->ttm_tt_populate(src->ttm, &ctx);
+                if (ret)
+                        return ret;
+        }
+
+        d.mapped_dst = 0;
+        d.mapped_src = 0;
+        d.dst_addr = NULL;
+        d.src_addr = NULL;
+        d.dst_pages = dst->ttm->pages;
+        d.src_pages = src->ttm->pages;
+        d.dst_num_pages = dst->num_pages;
+        d.src_num_pages = src->num_pages;
+        d.dst_prot = ttm_io_prot(dst->mem.placement, PAGE_KERNEL);
+        d.src_prot = ttm_io_prot(src->mem.placement, PAGE_KERNEL);
+        d.diff = diff;
+
+        for (j = 0; j < h; ++j) {
+                diff->line = j + initial_line;
+                diff->line_offset = dst_offset % dst_stride;
+                ret = vmw_bo_cpu_blit_line(&d, dst_offset, src_offset, w);
+                if (ret)
+                        goto out;
+
+                dst_offset += dst_stride;
+                src_offset += src_stride;
+        }
+out:
+        if (d.src_addr)
+                ttm_kunmap_atomic_prot(d.src_addr, d.src_prot);
+        if (d.dst_addr)
+                ttm_kunmap_atomic_prot(d.dst_addr, d.dst_prot);
+
+        return ret;
+}
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
index d08753e8fd94..053418adf6a0 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
@@ -678,6 +678,7 @@ extern void vmw_fence_single_bo(struct ttm_buffer_object *bo,
                                 struct vmw_fence_obj *fence);
 extern void vmw_resource_evict_all(struct vmw_private *dev_priv);

+
 /**
  * DMA buffer helper routines - vmwgfx_dmabuf.c
  */
@@ -1165,6 +1166,53 @@ extern int vmw_cmdbuf_cur_flush(struct vmw_cmdbuf_man *man,
                                 bool interruptible);
 extern void vmw_cmdbuf_irqthread(struct vmw_cmdbuf_man *man);

+/* CPU blit utilities - vmwgfx_blit.c */
+
+/**
+ * struct vmw_diff_cpy - CPU blit information structure
+ *
+ * @rect: The output bounding box rectangle.
+ * @line: The current line of the blit.
+ * @line_offset: Offset of the current line segment.
+ * @cpp: Bytes per pixel (granularity information).
+ * @do_cpy: Which memcpy function to use.
+ */
+struct vmw_diff_cpy {
+        struct drm_rect rect;
+        size_t line;
+        size_t line_offset;
+        int cpp;
+        void (*do_cpy)(struct vmw_diff_cpy *diff, u8 *dest, const u8 *src,
+                       size_t n);
+};
+
+#define VMW_CPU_BLIT_INITIALIZER {      \
+        .do_cpy = vmw_memcpy,           \
+}
+
+#define VMW_CPU_BLIT_DIFF_INITIALIZER(_cpp) {   \
+        .line = 0,                              \
+        .line_offset = 0,                       \
+        .rect = { .x1 = INT_MAX/2,              \
+                  .y1 = INT_MAX/2,              \
+                  .x2 = INT_MIN/2,              \
+                  .y2 = INT_MIN/2               \
+        },                                      \
+        .cpp = _cpp,                            \
+        .do_cpy = vmw_diff_memcpy,              \
+}
+
+void vmw_diff_memcpy(struct vmw_diff_cpy *diff, u8 *dest, const u8 *src,
+                     size_t n);
+
+void vmw_memcpy(struct vmw_diff_cpy *diff, u8 *dest, const u8 *src, size_t n);
+
+int vmw_bo_cpu_blit(struct ttm_buffer_object *dst,
+                    u32 dst_offset, u32 dst_stride,
+                    struct ttm_buffer_object *src,
+                    u32 src_offset, u32 src_stride,
+                    u32 w, u32 h,
+                    struct vmw_diff_cpy *diff);

 /**
  * Inline helper functions