From patchwork Mon Mar 27 22:17:58 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sinclair Yeh <syeh@vmware.com>
X-Patchwork-Id: 9647609
Return-Path: <dri-devel-bounces@lists.freedesktop.org>
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org
	[172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id
	271FE601D7 for <patchwork-dri-devel@patchwork.kernel.org>;
	Mon, 27 Mar 2017 22:19:44 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 17E1425D9E
	for <patchwork-dri-devel@patchwork.kernel.org>;
	Mon, 27 Mar 2017 22:19:44 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id 0C85327D0E; Mon, 27 Mar 2017 22:19:44 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_MED
	autolearn=ham version=3.3.1
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
	(using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256
	bits)) (No client certificate requested)
	by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 4C97725D9E
	for <patchwork-dri-devel@patchwork.kernel.org>;
	Mon, 27 Mar 2017 22:19:43 +0000 (UTC)
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
	by gabe.freedesktop.org (Postfix) with ESMTP id AFE526E37F;
	Mon, 27 Mar 2017 22:19:42 +0000 (UTC)
X-Original-To: dri-devel@lists.freedesktop.org
Delivered-To: dri-devel@lists.freedesktop.org
Received: from EX13-EDG-OU-001.vmware.com (ex13-edg-ou-001.vmware.com
	[208.91.0.189])
	by gabe.freedesktop.org (Postfix) with ESMTPS id 43FA46E37F
	for <dri-devel@lists.freedesktop.org>;
	Mon, 27 Mar 2017 22:19:41 +0000 (UTC)
Received: from sc9-mailhost3.vmware.com (10.113.161.73) by
	EX13-EDG-OU-001.vmware.com (10.113.208.155) with Microsoft SMTP
	Server id 15.0.1156.6; Mon, 27 Mar 2017 15:18:23 -0700
Received: from vmware.com (unknown [10.16.254.220])
	by sc9-mailhost3.vmware.com (Postfix) with SMTP id 969994043D;
	Mon, 27 Mar 2017 15:19:28 -0700 (PDT)
Received: by vmware.com (sSMTP sendmail emulation);
	Mon, 27 Mar 2017 15:19:26 -0700
From: Sinclair Yeh <syeh@vmware.com>
To: <dri-devel@lists.freedesktop.org>
Subject: [PATCH 8/9] drm/vmwgfx: Support topology greater than texture size
Date: Mon, 27 Mar 2017 15:17:58 -0700
Message-ID: <1490653079-45042-9-git-send-email-syeh@vmware.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1490653079-45042-1-git-send-email-syeh@vmware.com>
References: <1490653079-45042-1-git-send-email-syeh@vmware.com>
MIME-Version: 1.0
Received-SPF: None (EX13-EDG-OU-001.vmware.com: syeh@vmware.com does not
	designate permitted sender hosts)
Cc: thellstrom@vmware.com
X-BeenThere: dri-devel@lists.freedesktop.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: Direct Rendering Infrastructure - Development
	<dri-devel.lists.freedesktop.org>
List-Unsubscribe: <https://lists.freedesktop.org/mailman/options/dri-devel>,
	<mailto:dri-devel-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <https://lists.freedesktop.org/archives/dri-devel>
List-Post: <mailto:dri-devel@lists.freedesktop.org>
List-Help: <mailto:dri-devel-request@lists.freedesktop.org?subject=help>
List-Subscribe: <https://lists.freedesktop.org/mailman/listinfo/dri-devel>,
	<mailto:dri-devel-request@lists.freedesktop.org?subject=subscribe>
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>
X-Virus-Scanned: ClamAV using ClamSMTP

Most of the display servers today use a single surface to represent the
entire desktop even if it's stretched across multiple screens.

For vmwgfx with STDU, the maximum surface size is limited to the
maximum texture size on the host.  On a 2D VM, this limits our
ability to support configurations with more than one 4K monitor.

To get past this limitation, we will now allow using a large DMA buf
as the framebuffer, and take care of blitting contents from this DMA buf
to the display buffer.

Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 drivers/gpu/drm/vmwgfx/vmwgfx_kms.c  |  64 ++++++++++-
 drivers/gpu/drm/vmwgfx/vmwgfx_kms.h  |   4 +
 drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c | 202 +++++++++++++++++++++++++++++++++++
 3 files changed, 269 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
index 7104796..881ffd4 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
@@ -803,6 +803,11 @@ vmw_du_plane_duplicate_state(struct drm_plane *plane)
 
 	vps->pinned = 0;
 
+	/* Mapping is managed by prepare_fb/cleanup_fb */
+	memset(&vps->guest_map, 0, sizeof(vps->guest_map));
+	memset(&vps->host_map, 0, sizeof(vps->host_map));
+	vps->cpp = 0;
+
 	/* Each ref counted resource needs to be acquired again */
 	if (vps->surf)
 		(void) vmw_surface_reference(vps->surf);
@@ -861,6 +866,17 @@ vmw_du_plane_destroy_state(struct drm_plane *plane,
 	struct vmw_plane_state *vps = vmw_plane_state_to_vps(state);
 
 
+	/* Should have been freed by cleanup_fb */
+	if (vps->guest_map.virtual) {
+		DRM_ERROR("Guest mapping not freed\n");
+		ttm_bo_kunmap(&vps->guest_map);
+	}
+
+	if (vps->host_map.virtual) {
+		DRM_ERROR("Host mapping not freed\n");
+		ttm_bo_kunmap(&vps->host_map);
+	}
+
 	if (vps->surf)
 		vmw_surface_unreference(&vps->surf);
 
@@ -1435,6 +1451,25 @@ static int vmw_kms_new_framebuffer_dmabuf(struct vmw_private *dev_priv,
 	return ret;
 }
 
+
+/**
+ * vmw_kms_srf_ok - check if a surface can be created
+ *
+ * @width: requested width
+ * @height: requested height
+ *
+ * Surfaces need to be less than texture size
+ */
+static bool
+vmw_kms_srf_ok(struct vmw_private *dev_priv, uint32_t width, uint32_t height)
+{
+	if (width  > dev_priv->texture_max_width ||
+	    height > dev_priv->texture_max_height)
+		return false;
+
+	return true;
+}
+
 /**
  * vmw_kms_new_framebuffer - Create a new framebuffer.
  *
@@ -1463,7 +1498,8 @@ vmw_kms_new_framebuffer(struct vmw_private *dev_priv,
 	 * therefore, wrap the DMA buf in a surface so we can use the
 	 * SurfaceCopy command.
 	 */
-	if (dmabuf && only_2d &&
+	if (vmw_kms_srf_ok(dev_priv, mode_cmd->width, mode_cmd->height)  &&
+	    dmabuf && only_2d &&
 	    dev_priv->active_display_unit == vmw_du_screen_target) {
 		ret = vmw_create_dmabuf_proxy(dev_priv->dev, mode_cmd,
 					      dmabuf, &surface);
@@ -1556,6 +1592,16 @@ static struct drm_framebuffer *vmw_kms_fb_create(struct drm_device *dev,
 	if (ret)
 		goto err_out;
 
+
+	if (!bo &&
+	    !vmw_kms_srf_ok(dev_priv, mode_cmd->width, mode_cmd->height)) {
+		DRM_ERROR("Surface size cannot exceed %dx%d",
+			dev_priv->texture_max_width,
+			dev_priv->texture_max_height);
+		goto err_out;
+	}
+
+
 	vfb = vmw_kms_new_framebuffer(dev_priv, bo, surface,
 				      !(dev_priv->capabilities & SVGA_CAP_3D),
 				      mode_cmd);
@@ -1602,10 +1648,13 @@ vmw_kms_atomic_check_modeset(struct drm_device *dev,
 {
 	struct drm_crtc_state *crtc_state;
 	struct drm_crtc *crtc;
+	struct vmw_private *dev_priv = vmw_priv(dev);
 	int i, ret;
 
 
 	for_each_crtc_in_state(state, crtc, crtc_state, i) {
+		unsigned long requested_bb_mem = 0;
+
 		if (crtc_state->mode.crtc_clock == 0) {
 			/*
 			 * Our virtual device does not have a dot clock,
@@ -1613,6 +1662,19 @@ vmw_kms_atomic_check_modeset(struct drm_device *dev,
 			 */
 			crtc_state->mode.crtc_clock = crtc_state->mode.clock;
 		}
+
+		if (dev_priv->active_display_unit == vmw_du_screen_target) {
+			if (crtc->primary->fb) {
+				int cpp = crtc->primary->fb->pitches[0] /
+					  crtc->primary->fb->width;
+
+				requested_bb_mem += crtc->mode.hdisplay * cpp *
+						    crtc->mode.vdisplay;
+			}
+
+			if (requested_bb_mem > dev_priv->prim_bb_mem)
+				return -EINVAL;
+		}
 	}
 
 	ret = drm_atomic_helper_check_modeset(dev, state);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
index 0016f07..0c4a566 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
@@ -173,6 +173,10 @@ struct vmw_plane_state {
 	unsigned long dmabuf_size;
 
 	int pinned;
+
+	/* For CPU Blit */
+	struct ttm_bo_kmap_obj host_map, guest_map;
+	unsigned int cpp;
 };
 
 
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
index 9e40138..ce1191a 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
@@ -109,6 +109,10 @@ struct vmw_screen_target_display_unit {
 	s32 display_width, display_height;
 
 	bool defined;
+
+	/* For CPU Blit */
+	struct ttm_bo_kmap_obj host_map, guest_map;
+	unsigned int cpp;
 };
 
 
@@ -636,6 +640,129 @@ static void vmw_stdu_dmabuf_fifo_commit(struct vmw_kms_dirty *dirty)
 	ddirty->right = ddirty->bottom = S32_MIN;
 }
 
+
+/**
+ * vmw_stdu_dmabuf_cpu_clip - Callback to encode a CPU blit
+ *
+ * @dirty: The closure structure.
+ *
+ * This function calculates the bounding box for all the incoming clips
+ */
+static void vmw_stdu_dmabuf_cpu_clip(struct vmw_kms_dirty *dirty)
+{
+	struct vmw_stdu_dirty *ddirty =
+		container_of(dirty, struct vmw_stdu_dirty, base);
+
+	dirty->num_hits = 1;
+
+	/* Calculate bounding box */
+	ddirty->left = min_t(s32, ddirty->left, dirty->unit_x1);
+	ddirty->top = min_t(s32, ddirty->top, dirty->unit_y1);
+	ddirty->right = max_t(s32, ddirty->right, dirty->unit_x2);
+	ddirty->bottom = max_t(s32, ddirty->bottom, dirty->unit_y2);
+}
+
+
+/**
+ * vmw_stdu_dmabuf_cpu_commit - Callback to do a CPU blit from DMAbuf
+ *
+ * @dirty: The closure structure.
+ *
+ * For the special case when we cannot create a proxy surface in a
+ * 2D VM, we have to do a CPU blit ourselves.
+ */
+static void vmw_stdu_dmabuf_cpu_commit(struct vmw_kms_dirty *dirty)
+{
+	struct vmw_stdu_dirty *ddirty =
+		container_of(dirty, struct vmw_stdu_dirty, base);
+	struct vmw_screen_target_display_unit *stdu =
+		container_of(dirty->unit, typeof(*stdu), base);
+	s32 width, height;
+	s32 src_pitch, dst_pitch;
+	u8 *src, *dst;
+	bool not_used;
+
+
+	if (!dirty->num_hits)
+		return;
+
+	width = ddirty->right - ddirty->left;
+	height = ddirty->bottom - ddirty->top;
+
+	if (width == 0 || height == 0)
+		return;
+
+
+	/* Assume we are blitting from Host (display_srf) to Guest (dmabuf) */
+	src_pitch = stdu->display_srf->base_size.width * stdu->cpp;
+	src = ttm_kmap_obj_virtual(&stdu->host_map, &not_used);
+	src += dirty->unit_y1 * src_pitch + dirty->unit_x1 * stdu->cpp;
+
+	dst_pitch = ddirty->pitch;
+	dst = ttm_kmap_obj_virtual(&stdu->guest_map, &not_used);
+	dst += dirty->fb_y * dst_pitch + dirty->fb_x * stdu->cpp;
+
+
+	/* Figure out the real direction */
+	if (ddirty->transfer == SVGA3D_WRITE_HOST_VRAM) {
+		u8 *tmp;
+		s32 tmp_pitch;
+
+		tmp = src;
+		tmp_pitch = src_pitch;
+
+		src = dst;
+		src_pitch = dst_pitch;
+
+		dst = tmp;
+		dst_pitch = tmp_pitch;
+	}
+
+	/* CPU Blit */
+	while (height-- > 0) {
+		memcpy(dst, src, width * stdu->cpp);
+		dst += dst_pitch;
+		src += src_pitch;
+	}
+
+	if (ddirty->transfer == SVGA3D_WRITE_HOST_VRAM) {
+		struct vmw_private *dev_priv;
+		struct vmw_stdu_update *cmd;
+		struct drm_clip_rect region;
+		int ret;
+
+		/* We are updating the actual surface, not a proxy */
+		region.x1 = ddirty->left;
+		region.x2 = ddirty->right;
+		region.y1 = ddirty->top;
+		region.y2 = ddirty->bottom;
+		ret = vmw_kms_update_proxy(
+			(struct vmw_resource *) &stdu->display_srf->res,
+			(const struct drm_clip_rect *) &region, 1, 1);
+		if (ret)
+			goto out_cleanup;
+
+
+		dev_priv = vmw_priv(stdu->base.crtc.dev);
+		cmd = vmw_fifo_reserve(dev_priv, sizeof(*cmd));
+
+		if (!cmd) {
+			DRM_ERROR("Cannot reserve FIFO space to update STDU");
+			goto out_cleanup;
+		}
+
+		vmw_stdu_populate_update(cmd, stdu->base.unit,
+					 ddirty->left, ddirty->right,
+					 ddirty->top, ddirty->bottom);
+
+		vmw_fifo_commit(dev_priv, sizeof(*cmd));
+	}
+
+out_cleanup:
+	ddirty->left = ddirty->top = S32_MAX;
+	ddirty->right = ddirty->bottom = S32_MIN;
+}
+
 /**
  * vmw_kms_stdu_dma - Perform a DMA transfer between a dma-buffer backed
  * framebuffer and the screen target system.
@@ -694,6 +821,13 @@ int vmw_kms_stdu_dma(struct vmw_private *dev_priv,
 	if (to_surface)
 		ddirty.base.fifo_reserve_size += sizeof(struct vmw_stdu_update);
 
+	/* 2D VMs cannot use SVGA_3D_CMD_SURFACE_DMA so do CPU blit instead */
+	if (!(dev_priv->capabilities & SVGA_CAP_3D)) {
+		ddirty.base.fifo_commit = vmw_stdu_dmabuf_cpu_commit;
+		ddirty.base.clip = vmw_stdu_dmabuf_cpu_clip;
+		ddirty.base.fifo_reserve_size = 0;
+	}
+
 	ret = vmw_kms_helper_dirty(dev_priv, vfb, clips, vclips,
 				   0, 0, num_clips, increment, &ddirty.base);
 	vmw_kms_helper_buffer_finish(dev_priv, file_priv, buf, NULL,
@@ -960,12 +1094,19 @@ vmw_stdu_primary_plane_cleanup_fb(struct drm_plane *plane,
 {
 	struct vmw_plane_state *vps = vmw_plane_state_to_vps(old_state);
 
+	if (vps->guest_map.virtual)
+		ttm_bo_kunmap(&vps->guest_map);
+
+	if (vps->host_map.virtual)
+		ttm_bo_kunmap(&vps->host_map);
+
 	if (vps->surf)
 		WARN_ON(!vps->pinned);
 
 	vmw_du_plane_cleanup_fb(plane, old_state);
 
 	vps->content_fb_type = SAME_AS_DISPLAY;
+	vps->cpp = 0;
 }
 
 
@@ -986,6 +1127,7 @@ static int
 vmw_stdu_primary_plane_prepare_fb(struct drm_plane *plane,
 				  struct drm_plane_state *new_state)
 {
+	struct vmw_private *dev_priv = vmw_priv(plane->dev);
 	struct drm_framebuffer *new_fb = new_state->fb;
 	struct vmw_framebuffer *vfb;
 	struct vmw_plane_state *vps = vmw_plane_state_to_vps(new_state);
@@ -1111,8 +1253,54 @@ vmw_stdu_primary_plane_prepare_fb(struct drm_plane *plane,
 	}
 
 	vps->content_fb_type = new_content_type;
+
+	/*
+	 * This should only happen if the DMA buf is too large to create a
+	 * proxy surface for.
+	 * If we are a 2D VM with a DMA buffer then we have to use CPU blit
+	 * so cache these mappings
+	 */
+	if (vps->content_fb_type == SEPARATE_DMA &&
+	    !(dev_priv->capabilities & SVGA_CAP_3D)) {
+
+		struct vmw_framebuffer_dmabuf *new_vfbd;
+
+		new_vfbd = vmw_framebuffer_to_vfbd(new_fb);
+
+		ret = ttm_bo_reserve(&new_vfbd->buffer->base, false, false,
+				     NULL);
+		if (ret)
+			goto out_srf_unpin;
+
+		ret = ttm_bo_kmap(&new_vfbd->buffer->base, 0,
+				  new_vfbd->buffer->base.num_pages,
+				  &vps->guest_map);
+
+		ttm_bo_unreserve(&new_vfbd->buffer->base);
+
+		if (ret) {
+			DRM_ERROR("Failed to map content buffer to CPU\n");
+			goto out_srf_unpin;
+		}
+
+		ret = ttm_bo_kmap(&vps->surf->res.backup->base, 0,
+				  vps->surf->res.backup->base.num_pages,
+				  &vps->host_map);
+		if (ret) {
+			DRM_ERROR("Failed to map display buffer to CPU\n");
+			ttm_bo_kunmap(&vps->guest_map);
+			goto out_srf_unpin;
+		}
+
+		vps->cpp = new_fb->pitches[0] / new_fb->width;
+	}
+
 	return 0;
 
+out_srf_unpin:
+	vmw_resource_unpin(&vps->surf->res);
+	vps->pinned--;
+
 out_srf_unref:
 	vmw_surface_unreference(&vps->surf);
 	return ret;
@@ -1146,6 +1334,9 @@ vmw_stdu_primary_plane_atomic_update(struct drm_plane *plane,
 
 	stdu->display_srf = vps->surf;
 	stdu->content_fb_type = vps->content_fb_type;
+	stdu->cpp = vps->cpp;
+	memcpy(&stdu->guest_map, &vps->guest_map, sizeof(vps->guest_map));
+	memcpy(&stdu->host_map, &vps->host_map, sizeof(vps->host_map));
 
 	if (!stdu->defined)
 		return;
@@ -1411,6 +1602,17 @@ int vmw_kms_stdu_init_display(struct vmw_private *dev_priv)
 
 	dev_priv->active_display_unit = vmw_du_screen_target;
 
+	if (!(dev_priv->capabilities & SVGA_CAP_3D)) {
+		/*
+		 * Given various display aspect ratios, there's no way to
+		 * estimate these using prim_bb_mem.  So just set these to
+		 * something arbitrarily large and we will reject any layout
+		 * that doesn't fit prim_bb_mem later
+		 */
+		dev->mode_config.max_width = 16384;
+		dev->mode_config.max_height = 16384;
+	}
+
 	vmw_kms_create_implicit_placement_property(dev_priv, false);
 
 	for (i = 0; i < VMWGFX_NUM_DISPLAY_UNITS; ++i) {