From: Thomas Hellström (VMware)
To: dri-devel@lists.freedesktop.org
Subject: [PATCH v7 7/8] drm/vmwgfx: Implement an infrastructure for read-coherent resources
Date: Thu, 7 Nov 2019 09:41:15 +0100
Message-Id: <20191107084116.33032-8-thomas_os@shipmail.org>
In-Reply-To: <20191107084116.33032-1-thomas_os@shipmail.org>
References: <20191107084116.33032-1-thomas_os@shipmail.org>
X-Patchwork-Id: 11232263

From: Thomas Hellstrom

Similar to write-coherent resources, make sure that from the user-space
point of view, GPU rendered contents is automatically available for reading
by the CPU.

Cc: Andrew Morton
Cc: Matthew Wilcox
Cc: Will Deacon
Cc: Peter Zijlstra
Cc: Rik van Riel
Cc: Minchan Kim
Cc: Michal Hocko
Cc: Huang Ying
Cc: Jérôme Glisse
Cc: Kirill A. Shutemov
Signed-off-by: Thomas Hellstrom
Reviewed-by: Deepak Rawat
---
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.h           |   7 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c    |  77 ++++++++++++-
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c      | 103 +++++++++++++++++-
 drivers/gpu/drm/vmwgfx/vmwgfx_resource_priv.h |   2 +
 drivers/gpu/drm/vmwgfx/vmwgfx_validation.c    |   3 +-
 5 files changed, 181 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
index d5fa9b72c8ff..7773952f81f8 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
@@ -684,7 +684,8 @@ extern void vmw_resource_unreference(struct vmw_resource **p_res);
 extern struct vmw_resource *vmw_resource_reference(struct vmw_resource *res);
 extern struct vmw_resource *
 vmw_resource_reference_unless_doomed(struct vmw_resource *res);
-extern int vmw_resource_validate(struct vmw_resource *res, bool intr);
+extern int vmw_resource_validate(struct vmw_resource *res, bool intr,
+				 bool dirtying);
 extern int vmw_resource_reserve(struct vmw_resource *res, bool interruptible,
 				bool no_backup);
 extern bool vmw_resource_needs_backup(const struct vmw_resource *res);
@@ -728,6 +729,8 @@ void vmw_resource_mob_attach(struct vmw_resource *res);
 void vmw_resource_mob_detach(struct vmw_resource *res);
 void vmw_resource_dirty_update(struct vmw_resource *res, pgoff_t start,
 			       pgoff_t end);
+int vmw_resources_clean(struct vmw_buffer_object *vbo, pgoff_t start,
+			pgoff_t end, pgoff_t *num_prefault);
 
 /**
  * vmw_resource_mob_attached - Whether a resource currently has a mob attached
@@ -1421,6 +1424,8 @@ int vmw_bo_dirty_add(struct vmw_buffer_object *vbo);
 void vmw_bo_dirty_transfer_to_res(struct vmw_resource *res);
 void vmw_bo_dirty_clear_res(struct vmw_resource *res);
 void vmw_bo_dirty_release(struct vmw_buffer_object *vbo);
+void vmw_bo_dirty_unmap(struct vmw_buffer_object *vbo,
+			pgoff_t start, pgoff_t end);
 vm_fault_t vmw_bo_vm_fault(struct vm_fault *vmf);
 vm_fault_t vmw_bo_vm_mkwrite(struct vm_fault *vmf);
 
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c b/drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c
index 060c1e492f25..f07aa857587c 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c
@@ -155,7 +155,6 @@ static void vmw_bo_dirty_scan_mkwrite(struct vmw_buffer_object *vbo)
 	}
 }
 
-
 /**
  * vmw_bo_dirty_scan - Scan for dirty pages and add them to the dirty
  * tracking structure
@@ -173,6 +172,53 @@ void vmw_bo_dirty_scan(struct vmw_buffer_object *vbo)
 		vmw_bo_dirty_scan_mkwrite(vbo);
 }
 
+/**
+ * vmw_bo_dirty_pre_unmap - write-protect and pick up dirty pages before
+ * an unmap_mapping_range operation.
+ * @vbo: The buffer object,
+ * @start: First page of the range within the buffer object.
+ * @end: Last page of the range within the buffer object + 1.
+ *
+ * If we're using the _PAGETABLE scan method, we may leak dirty pages
+ * when calling unmap_mapping_range(). This function makes sure we pick
+ * up all dirty pages.
+ */
+static void vmw_bo_dirty_pre_unmap(struct vmw_buffer_object *vbo,
+				   pgoff_t start, pgoff_t end)
+{
+	struct vmw_bo_dirty *dirty = vbo->dirty;
+	unsigned long offset = drm_vma_node_start(&vbo->base.base.vma_node);
+	struct address_space *mapping = vbo->base.bdev->dev_mapping;
+
+	if (dirty->method != VMW_BO_DIRTY_PAGETABLE || start >= end)
+		return;
+
+	wp_shared_mapping_range(mapping, start + offset, end - start);
+	clean_record_shared_mapping_range(mapping, start + offset,
+					  end - start, offset,
+					  &dirty->bitmap[0], &dirty->start,
+					  &dirty->end);
+}
+
+/**
+ * vmw_bo_dirty_unmap - Clear all ptes pointing to a range within a bo
+ * @vbo: The buffer object,
+ * @start: First page of the range within the buffer object.
+ * @end: Last page of the range within the buffer object + 1.
+ *
+ * This is similar to ttm_bo_unmap_virtual_locked() except it takes a subrange.
+ */
+void vmw_bo_dirty_unmap(struct vmw_buffer_object *vbo,
+			pgoff_t start, pgoff_t end)
+{
+	unsigned long offset = drm_vma_node_start(&vbo->base.base.vma_node);
+	struct address_space *mapping = vbo->base.bdev->dev_mapping;
+
+	vmw_bo_dirty_pre_unmap(vbo, start, end);
+	unmap_shared_mapping_range(mapping, (offset + start) << PAGE_SHIFT,
+				   (loff_t) (end - start) << PAGE_SHIFT);
+}
+
 /**
  * vmw_bo_dirty_add - Add a dirty-tracking user to a buffer object
  * @vbo: The buffer object
@@ -401,21 +447,42 @@ vm_fault_t vmw_bo_vm_fault(struct vm_fault *vmf)
 	if (ret)
 		return ret;
 
+	num_prefault = (vma->vm_flags & VM_RAND_READ) ? 1 :
+		TTM_BO_VM_NUM_PREFAULT;
+
+	if (vbo->dirty) {
+		pgoff_t allowed_prefault;
+		unsigned long page_offset;
+
+		page_offset = vmf->pgoff -
+			drm_vma_node_start(&bo->base.vma_node);
+		if (page_offset >= bo->num_pages ||
+		    vmw_resources_clean(vbo, page_offset,
+					page_offset + PAGE_SIZE,
+					&allowed_prefault)) {
+			ret = VM_FAULT_SIGBUS;
+			goto out_unlock;
+		}
+
+		num_prefault = min(num_prefault, allowed_prefault);
+	}
+
 	/*
-	 * This will cause mkwrite() to be called for each pte on
-	 * write-enable vmas.
+	 * If we don't track dirty using the MKWRITE method, make sure
+	 * the page protection is write-enabled so we don't get
+	 * a lot of unnecessary write faults.
 	 */
 	if (vbo->dirty && vbo->dirty->method == VMW_BO_DIRTY_MKWRITE)
 		prot = vma->vm_page_prot;
 	else
 		prot = vm_get_page_prot(vma->vm_flags);
 
-	num_prefault = (vma->vm_flags & VM_RAND_READ) ? 0 :
-		TTM_BO_VM_NUM_PREFAULT;
 	ret = ttm_bo_vm_fault_reserved(vmf, prot, num_prefault);
 	if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
 		return ret;
 
+out_unlock:
 	dma_resv_unlock(bo->base.resv);
+
 	return ret;
 }
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
index a18831e1d353..c8441030637a 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
@@ -393,7 +393,8 @@ static int vmw_resource_buf_alloc(struct vmw_resource *res,
  * should be retried once resources have been freed up.
  */
 static int vmw_resource_do_validate(struct vmw_resource *res,
-				    struct ttm_validate_buffer *val_buf)
+				    struct ttm_validate_buffer *val_buf,
+				    bool dirtying)
 {
 	int ret = 0;
 	const struct vmw_res_func *func = res->func;
@@ -435,6 +436,15 @@ static int vmw_resource_do_validate(struct vmw_resource *res,
 	 * the resource.
 	 */
 	if (res->dirty) {
+		if (dirtying && !res->res_dirty) {
+			pgoff_t start = res->backup_offset >> PAGE_SHIFT;
+			pgoff_t end = __KERNEL_DIV_ROUND_UP
+				(res->backup_offset + res->backup_size,
+				 PAGE_SIZE);
+
+			vmw_bo_dirty_unmap(res->backup, start, end);
+		}
+
 		vmw_bo_dirty_transfer_to_res(res);
 		return func->dirty_sync(res);
 	}
@@ -678,6 +688,7 @@ static int vmw_resource_do_evict(struct ww_acquire_ctx *ticket,
  * to the device.
  * @res: The resource to make visible to the device.
  * @intr: Perform waits interruptible if possible.
+ * @dirtying: Pending GPU operation will dirty the resource
  *
  * On succesful return, any backup DMA buffer pointed to by @res->backup will
  * be reserved and validated.
@@ -687,7 +698,8 @@ static int vmw_resource_do_evict(struct ww_acquire_ctx *ticket,
  * Return: Zero on success, -ERESTARTSYS if interrupted, negative error code
  * on failure.
  */
-int vmw_resource_validate(struct vmw_resource *res, bool intr)
+int vmw_resource_validate(struct vmw_resource *res, bool intr,
+			  bool dirtying)
 {
 	int ret;
 	struct vmw_resource *evict_res;
@@ -704,7 +716,7 @@ int vmw_resource_validate(struct vmw_resource *res, bool intr)
 	if (res->backup)
 		val_buf.bo = &res->backup->base;
 	do {
-		ret = vmw_resource_do_validate(res, &val_buf);
+		ret = vmw_resource_do_validate(res, &val_buf, dirtying);
 		if (likely(ret != -EBUSY))
 			break;
 
@@ -1004,7 +1016,7 @@ int vmw_resource_pin(struct vmw_resource *res, bool interruptible)
 			/* Do we really need to pin the MOB as well? */
 			vmw_bo_pin_reserved(vbo, true);
 		}
-		ret = vmw_resource_validate(res, interruptible);
+		ret = vmw_resource_validate(res, interruptible, true);
 		if (vbo)
 			ttm_bo_unreserve(&vbo->base);
 		if (ret)
@@ -1079,3 +1091,86 @@ void vmw_resource_dirty_update(struct vmw_resource *res, pgoff_t start,
 		res->func->dirty_range_add(res, start << PAGE_SHIFT,
 					   end << PAGE_SHIFT);
 }
+
+/**
+ * vmw_resources_clean - Clean resources intersecting a mob range
+ * @vbo: The mob buffer object
+ * @start: The mob page offset starting the range
+ * @end: The mob page offset ending the range
+ * @num_prefault: Returns how many pages including the first have been
+ * cleaned and are ok to prefault
+ */
+int vmw_resources_clean(struct vmw_buffer_object *vbo, pgoff_t start,
+			pgoff_t end, pgoff_t *num_prefault)
+{
+	struct rb_node *cur = vbo->res_tree.rb_node;
+	struct vmw_resource *found = NULL;
+	unsigned long res_start = start << PAGE_SHIFT;
+	unsigned long res_end = end << PAGE_SHIFT;
+	unsigned long last_cleaned = 0;
+
+	/*
+	 * Find the resource with lowest backup_offset that intersects the
+	 * range.
+	 */
+	while (cur) {
+		struct vmw_resource *cur_res =
+			container_of(cur, struct vmw_resource, mob_node);
+
+		if (cur_res->backup_offset >= res_end) {
+			cur = cur->rb_left;
+		} else if (cur_res->backup_offset + cur_res->backup_size <=
+			   res_start) {
+			cur = cur->rb_right;
+		} else {
+			found = cur_res;
+			cur = cur->rb_left;
+			/* Continue to look for resources with lower offsets */
+		}
+	}
+
+	/*
+	 * In order of increasing backup_offset, clean dirty resources
+	 * intersecting the range.
+	 */
+	while (found) {
+		if (found->res_dirty) {
+			int ret;
+
+			if (!found->func->clean)
+				return -EINVAL;
+
+			ret = found->func->clean(found);
+			if (ret)
+				return ret;
+
+			found->res_dirty = false;
+		}
+		last_cleaned = found->backup_offset + found->backup_size;
+		cur = rb_next(&found->mob_node);
+		if (!cur)
+			break;
+
+		found = container_of(cur, struct vmw_resource, mob_node);
+		if (found->backup_offset >= res_end)
+			break;
+	}
+
+	/*
+	 * Set number of pages allowed prefaulting and fence the buffer object
+	 */
+	*num_prefault = 1;
+	if (last_cleaned > res_start) {
+		struct ttm_buffer_object *bo = &vbo->base;
+
+		*num_prefault = __KERNEL_DIV_ROUND_UP(last_cleaned - res_start,
+						      PAGE_SIZE);
+		vmw_bo_fence_single(bo, NULL);
+		if (bo->moving)
+			dma_fence_put(bo->moving);
+		bo->moving = dma_fence_get
+			(dma_resv_get_excl(bo->base.resv));
+	}
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource_priv.h b/drivers/gpu/drm/vmwgfx/vmwgfx_resource_priv.h
index c85144286cfe..3b7438b2d289 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource_priv.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource_priv.h
@@ -77,6 +77,7 @@ struct vmw_user_resource_conv {
  * @dirty_sync: Upload the dirty mob contents to the resource.
  * @dirty_add_range: Add a sequential dirty range to the resource
  * dirty tracker.
+ * @clean: Clean the resource.
  */
 struct vmw_res_func {
 	enum vmw_res_type res_type;
@@ -101,6 +102,7 @@ struct vmw_res_func {
 	int (*dirty_sync)(struct vmw_resource *res);
 	void (*dirty_range_add)(struct vmw_resource *res, size_t start,
 				 size_t end);
+	int (*clean)(struct vmw_resource *res);
 };
 
 /**
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
index 384a11730a92..e69bc373ae2e 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
@@ -644,7 +644,8 @@ int vmw_validation_res_validate(struct vmw_validation_context *ctx, bool intr)
 		struct vmw_resource *res = val->res;
 		struct vmw_buffer_object *backup = res->backup;
 
-		ret = vmw_resource_validate(res, intr);
+		ret = vmw_resource_validate(res, intr, val->dirty_set &&
+					    val->dirty);
 		if (ret) {
 			if (ret != -ERESTARTSYS)
 				DRM_ERROR("Failed to validate resource.\n");
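
Illustration only, not part of the patch: a minimal user-space sketch of
the lookup and prefault arithmetic that vmw_resources_clean() performs.
It assumes a sorted, non-overlapping array in place of the driver's
rb-tree, a fixed 4 KiB page size, and a plain flag flip in place of the
func->clean() callback. Unlike the patch, it clears the flag on every
entry in the range without checking it first. The names sketch_res,
sketch_find_first, sketch_clean and SKETCH_PAGE_SIZE are hypothetical
and do not exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for a resource backed by part of a buffer. */
struct sketch_res {
	unsigned long backup_offset;	/* byte offset into the buffer */
	unsigned long backup_size;	/* size in bytes */
	bool dirty;			/* GPU wrote it, CPU view is stale */
};

/*
 * res[] must be sorted by backup_offset and non-overlapping.
 * Returns the index of the lowest entry intersecting the byte range
 * [start, end), or -1 if none does. The three-way branch mirrors the
 * tree descent in the patch, with "left"/"right" mapped to array halves.
 */
static long sketch_find_first(const struct sketch_res *res, size_t n,
			      unsigned long start, unsigned long end)
{
	long lo = 0, hi = (long)n - 1, found = -1;

	while (lo <= hi) {
		long mid = lo + (hi - lo) / 2;

		if (res[mid].backup_offset >= end) {
			hi = mid - 1;	/* entirely above the range */
		} else if (res[mid].backup_offset + res[mid].backup_size <=
			   start) {
			lo = mid + 1;	/* entirely below the range */
		} else {
			found = mid;	/* intersects, look for lower ones */
			hi = mid - 1;
		}
	}
	return found;
}

/*
 * Clean every entry intersecting pages [start_page, end_page) and report
 * how many pages, counted from start_page, are covered by cleaned entries.
 */
static int sketch_clean(struct sketch_res *res, size_t n,
			unsigned long start_page, unsigned long end_page,
			unsigned long *num_prefault)
{
	unsigned long start = start_page * SKETCH_PAGE_SIZE;
	unsigned long end = end_page * SKETCH_PAGE_SIZE;
	unsigned long last_cleaned = 0;
	long i;

	assert(num_prefault != NULL);

	for (i = sketch_find_first(res, n, start, end);
	     i >= 0 && (size_t)i < n && res[i].backup_offset < end; i++) {
		res[i].dirty = false;	/* stands in for func->clean() */
		last_cleaned = res[i].backup_offset + res[i].backup_size;
	}

	/* Always allow the faulting page, more if cleaned data extends. */
	*num_prefault = 1;
	if (last_cleaned > start)
		*num_prefault = (last_cleaned - start + SKETCH_PAGE_SIZE - 1) /
				SKETCH_PAGE_SIZE;
	return 0;
}
```

With entries at byte ranges [0, 8192), [8192, 12288) and [16384, 24576),
a fault on page 4 would clean only the third entry and allow two pages
of prefault, since that entry covers pages 4 and 5.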