
[09/27] drm/etnaviv: don't flush workqueue in etnaviv_gpu_wait_obj_inactive

Message ID 20171201103624.6565-10-l.stach@pengutronix.de (mailing list archive)
State New, archived

Commit Message

Lucas Stach Dec. 1, 2017, 10:36 a.m. UTC
There is no need to synchronize with outstanding retire jobs if the object
has gone idle. Retire jobs only ever change the object state from active to
idle, never the other way around.

The IOVA put race is benign, as the GEM_WAIT ioctl itself holds a
reference to the GEM object, so the retire worker will not pull the
object into the CPU domain, which is what we are trying to guard
against with etnaviv_gpu_wait_obj_inactive.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
---
 drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

Comments

Philipp Zabel Dec. 1, 2017, 4:39 p.m. UTC | #1
On Fri, 2017-12-01 at 11:36 +0100, Lucas Stach wrote:
> There is no need to synchronize with outstanding retire jobs if the object
> has gone idle. Retire jobs only ever change the object state from active to
> idle, never the other way around.
> 
> The IOVA put race is benign, as the GEM_WAIT ioctl itself holds a
> reference to the GEM object, so the retire worker will not pull the
> object into the CPU domain, which is what we are trying to guard
> against with etnaviv_gpu_wait_obj_inactive.
> 
> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>

Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>

regards
Philipp
Russell King (Oracle) Dec. 1, 2017, 4:59 p.m. UTC | #2
On Fri, Dec 01, 2017 at 11:36:06AM +0100, Lucas Stach wrote:
> There is no need to synchronize with outstanding retire jobs if the object
> has gone idle. Retire jobs only ever change the object state from active to
> idle, never the other way around.
> 
> The IOVA put race is benign, as the GEM_WAIT ioctl itself holds a
> reference to the GEM object, so the retire worker will not pull the
> object into the CPU domain, which is what we are trying to guard
> against with etnaviv_gpu_wait_obj_inactive.

I don't think this makes sense.

The point of GEM_WAIT is to ensure that the driver has completely
finished with the object, including making sure that the memory
(particularly usermem) is not going to get invalidated unexpectedly.

This is a very important property of GEM_WAIT, which, if violated,
leads to malloc() stack corruption in the Xorg DDX driver.

The explanation above seems to suggest that this condition has been
violated already by the etnaviv driver, which will lead to Xorg
segfaults.

The point of this is to ensure that, at this point in the DDX:

        if (bo->is_usermem)
                etna_bo_gem_wait(bo, VIV_WAIT_INDEFINITE);

        drmIoctl(conn->fd, DRM_IOCTL_GEM_CLOSE, &req);

once we reach here, the usermem object is completely free of any
meddling by the etnaviv driver.  This must *not* happen "at some
unknown point in the future".
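
For context, etna_bo_gem_wait() above is essentially a thin wrapper
around the GEM_WAIT ioctl. A minimal sketch -- assuming the
drm_etnaviv_gem_wait layout from the etnaviv UAPI header, with a
far-future absolute timeout standing in for VIV_WAIT_INDEFINITE:

        #include <limits.h>
        #include <stdint.h>
        #include <xf86drm.h>
        #include <etnaviv_drm.h>  /* struct drm_etnaviv_gem_wait */

        /* Block until the kernel reports the BO inactive (sketch). */
        static int bo_wait_inactive(int fd, uint32_t pipe, uint32_t handle)
        {
                struct drm_etnaviv_gem_wait req = {
                        .pipe = pipe,
                        .handle = handle,
                        /* absolute CLOCK_MONOTONIC time in the etnaviv
                         * UAPI; a far-future value approximates an
                         * indefinite wait */
                        .timeout = { .tv_sec = LLONG_MAX },
                };

                /* returns 0 on success, -1 with errno set on failure */
                return drmIoctl(fd, DRM_IOCTL_ETNAVIV_GEM_WAIT, &req);
        }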
Lucas Stach Dec. 1, 2017, 5:12 p.m. UTC | #3
Hi Russell,

On Friday, 2017-12-01 at 16:59 +0000, Russell King - ARM Linux wrote:
> On Fri, Dec 01, 2017 at 11:36:06AM +0100, Lucas Stach wrote:
> > There is no need to synchronize with outstanding retire jobs if the object
> > has gone idle. Retire jobs only ever change the object state from active to
> > idle, never the other way around.
> > 
> > The IOVA put race is benign, as the GEM_WAIT ioctl itself holds a
> > reference to the GEM object, so the retire worker will not pull the
> > object into the CPU domain, which is what we are trying to guard
> > against with etnaviv_gpu_wait_obj_inactive.
> 
> I don't think this makes sense.
> 
> The point of GEM_WAIT is to ensure that the driver has completely
> finished with the object, including making sure that the memory
> (particularly usermem) is not going to get invalidated unexpectedly.
> 
> This is a very important property of GEM_WAIT, which, if violated,
> leads to malloc() stack corruption in the Xorg DDX driver.

I'm aware of this property of GEM_WAIT. In fact it's the only reason this
ioctl exists at all.

> The explanation above seems to suggest that this condition has been
> violated already by the etnaviv driver, which will lead to Xorg
> segfaults.

The comment in the commit message refers to the fact that the active
count decrement and the IOVA put are not atomic. Still, execution of
the etnaviv_gpu_wait_obj_inactive function is synchronized through the
fence event, which gives the ioctl exactly the semantics you require
for the X.Org driver to work correctly.

The user-visible behavior is the same before and after this patch. I
should make this clearer in the commit message.

Regards,
Lucas

> The point of this is to ensure that, at this point in the DDX:
> 
>         if (bo->is_usermem)
>                 etna_bo_gem_wait(bo, VIV_WAIT_INDEFINITE);
> 
>         drmIoctl(conn->fd, DRM_IOCTL_GEM_CLOSE, &req);
> 
> once we reach here, the usermem object is completely free of any
> meddling by the etnaviv driver.  This must *not* happen "at some
> unknown point in the future".
>

Patch

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
index 3dffccfcefce..ae152bb78f18 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
@@ -1295,17 +1295,12 @@  int etnaviv_gpu_wait_obj_inactive(struct etnaviv_gpu *gpu,
 	ret = wait_event_interruptible_timeout(gpu->fence_event,
 					       !is_active(etnaviv_obj),
 					       remaining);
-	if (ret > 0) {
-		struct etnaviv_drm_private *priv = gpu->drm->dev_private;
-
-		/* Synchronise with the retire worker */
-		flush_workqueue(priv->wq);
+	if (ret > 0)
 		return 0;
-	} else if (ret == -ERESTARTSYS) {
+	else if (ret == -ERESTARTSYS)
 		return -ERESTARTSYS;
-	} else {
+	else
 		return -ETIMEDOUT;
-	}
 }
 
 int etnaviv_gpu_pm_get_sync(struct etnaviv_gpu *gpu)
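
With the hunk applied, the wait path reduces to the following (a sketch
assembled from the diff above; the prologue converting the ioctl timeout
to jiffies is assumed and not part of this patch):

        int etnaviv_gpu_wait_obj_inactive(struct etnaviv_gpu *gpu,
                struct etnaviv_gem_object *etnaviv_obj, struct timespec *timeout)
        {
                unsigned long remaining;
                long ret;

                /* assumed prologue: convert the ioctl timeout to jiffies */
                remaining = etnaviv_timeout_to_jiffies(timeout);

                /*
                 * The retire worker only ever transitions the object from
                 * active to idle, so waiting on fence_event is sufficient;
                 * no flush_workqueue() is needed.
                 */
                ret = wait_event_interruptible_timeout(gpu->fence_event,
                                                       !is_active(etnaviv_obj),
                                                       remaining);
                if (ret > 0)
                        return 0;
                else if (ret == -ERESTARTSYS)
                        return -ERESTARTSYS;
                else
                        return -ETIMEDOUT;
        }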