Message ID: 20171201103624.6565-10-l.stach@pengutronix.de (mailing list archive)
State: New, archived
On Fri, 2017-12-01 at 11:36 +0100, Lucas Stach wrote:
> There is no need to synchronize with outstanding retire jobs if the object
> has gone idle. Retire jobs only ever change the object state from active to
> idle, not the other way around.
>
> The IOVA put race is uncritical, as the GEM_WAIT ioctl itself is holding
> a reference to the GEM object, so the retire worker will not pull the
> object into the CPU domain, which is the thing we are trying to guard
> against with etnaviv_gpu_wait_obj_inactive.
>
> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>

Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>

regards
Philipp
On Fri, Dec 01, 2017 at 11:36:06AM +0100, Lucas Stach wrote:
> There is no need to synchronize with outstanding retire jobs if the object
> has gone idle. Retire jobs only ever change the object state from active to
> idle, not the other way around.
>
> The IOVA put race is uncritical, as the GEM_WAIT ioctl itself is holding
> a reference to the GEM object, so the retire worker will not pull the
> object into the CPU domain, which is the thing we are trying to guard
> against with etnaviv_gpu_wait_obj_inactive.

I don't think this makes sense.

The point of GEM_WAIT is to ensure that the driver has completely
finished with the object, including making sure that the memory
(particularly usermem) is not going to get invalidated unexpectedly.

This is a very important property of GEM_WAIT, which, if violated,
leads to malloc() heap corruption in the Xorg DDX driver.

The explanation above seems to suggest that this condition has been
violated already by the etnaviv driver, which will lead to Xorg
segfaults.

The point of this is to ensure that, at this point in the DDX:

	if (bo->is_usermem)
		etna_bo_gem_wait(bo, VIV_WAIT_INDEFINITE);

	drmIoctl(conn->fd, DRM_IOCTL_GEM_CLOSE, &req);

once we reach here, the usermem object is completely free of any
meddling by the etnaviv driver. This must *not* happen "at some
unknown point in the future".
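For context, a minimal sketch of the usermem lifecycle this argument is
about, built around the snippet above. The etna_bo_map_usermem() helper,
the bo->handle field, and the malloc()/free() pairing are illustrative
assumptions, not the actual DDX code; only bo->is_usermem,
etna_bo_gem_wait() and the GEM_CLOSE ioctl are taken from the quote:

	/* Sketch only: the pages backing a usermem BO come straight from
	 * the malloc heap, so the kernel must be completely done with
	 * them by the time GEM_WAIT returns. */
	void *pixels = malloc(size);		/* ordinary heap memory */
	struct etna_bo *bo = etna_bo_map_usermem(conn, pixels, size); /* assumed helper */

	/* ... submit GPU work that reads from or writes to bo ... */

	if (bo->is_usermem)
		etna_bo_gem_wait(bo, VIV_WAIT_INDEFINITE);

	struct drm_gem_close req = { .handle = bo->handle };	/* assumed field */
	drmIoctl(conn->fd, DRM_IOCTL_GEM_CLOSE, &req);

	free(pixels);	/* heap corruption if the kernel still touches the pages */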
Hi Russell,

On Friday, 2017-12-01 at 16:59 +0000, Russell King - ARM Linux wrote:
> On Fri, Dec 01, 2017 at 11:36:06AM +0100, Lucas Stach wrote:
> > There is no need to synchronize with outstanding retire jobs if the object
> > has gone idle. Retire jobs only ever change the object state from active to
> > idle, not the other way around.
> >
> > The IOVA put race is uncritical, as the GEM_WAIT ioctl itself is holding
> > a reference to the GEM object, so the retire worker will not pull the
> > object into the CPU domain, which is the thing we are trying to guard
> > against with etnaviv_gpu_wait_obj_inactive.
>
> I don't think this makes sense.
>
> The point of GEM_WAIT is to ensure that the driver has completely
> finished with the object, including making sure that the memory
> (particularly usermem) is not going to get invalidated unexpectedly.
>
> This is a very important property of GEM_WAIT, which, if violated,
> leads to malloc() heap corruption in the Xorg DDX driver.

I'm aware of this property of GEM_WAIT. In fact, it's the only reason
this ioctl exists at all.

> The explanation above seems to suggest that this condition has been
> violated already by the etnaviv driver, which will lead to Xorg
> segfaults.

The comment in the commit message refers to the fact that the active
count decrement and the IOVA put are not atomic. Still, execution of the
etnaviv_gpu_wait_obj_inactive function is synchronized with the retire
worker through the fence event. This gives the ioctl the exact semantics
you require for the X.Org driver to work correctly. The user-visible
behavior is unchanged before and after this patch.

I should make this more clear in the commit message.

Regards,
Lucas

> The point of this is to ensure that, at this point in the DDX:
>
> 	if (bo->is_usermem)
> 		etna_bo_gem_wait(bo, VIV_WAIT_INDEFINITE);
>
> 	drmIoctl(conn->fd, DRM_IOCTL_GEM_CLOSE, &req);
>
> once we reach here, the usermem object is completely free of any
> meddling by the etnaviv driver. This must *not* happen "at some
> unknown point in the future".
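To make the ordering argument concrete, here is a minimal sketch of the
two sides of the handshake Lucas describes. The wait side is taken from
the patch below; the retire side, in particular the wake_up_all() call
site, is an assumption modeled on the etnaviv driver of that era rather
than a quote of its code:

	/* Retire side (sketch): the worker only ever moves an object from
	 * active to idle, then wakes any GEM_WAIT sleepers. The exact
	 * wake-up call site is an assumption. */
	static void retire_sketch(struct etnaviv_gpu *gpu,
				  struct etnaviv_gem_object *obj)
	{
		/* ... drop the object's active reference, put its IOVA ... */
		wake_up_all(&gpu->fence_event);
	}

	/* Wait side (from the patch): once !is_active() is observed, no
	 * retire job can make the object active again, so there is
	 * nothing left to flush afterwards. */
	ret = wait_event_interruptible_timeout(gpu->fence_event,
					       !is_active(etnaviv_obj),
					       remaining);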
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
index 3dffccfcefce..ae152bb78f18 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
@@ -1295,17 +1295,12 @@ int etnaviv_gpu_wait_obj_inactive(struct etnaviv_gpu *gpu,
 	ret = wait_event_interruptible_timeout(gpu->fence_event,
 					       !is_active(etnaviv_obj),
 					       remaining);
-	if (ret > 0) {
-		struct etnaviv_drm_private *priv = gpu->drm->dev_private;
-
-		/* Synchronise with the retire worker */
-		flush_workqueue(priv->wq);
+	if (ret > 0)
 		return 0;
-	} else if (ret == -ERESTARTSYS) {
+	else if (ret == -ERESTARTSYS)
 		return -ERESTARTSYS;
-	} else {
+	else
 		return -ETIMEDOUT;
-	}
 }
 
 int etnaviv_gpu_pm_get_sync(struct etnaviv_gpu *gpu)
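For readability, this is how the function reads with the patch applied,
reconstructed from the hunk above. The full parameter list beyond the
first argument and the derivation of remaining are not part of the hunk,
so they are sketched here; etnaviv_timeout_to_jiffies is, to the best of
my knowledge, the helper the driver uses elsewhere for this conversion:

	int etnaviv_gpu_wait_obj_inactive(struct etnaviv_gpu *gpu,
		struct etnaviv_gem_object *etnaviv_obj, struct timespec *timeout)
	{
		/* sketched: convert the user-supplied timeout to jiffies */
		unsigned long remaining = etnaviv_timeout_to_jiffies(timeout);
		long ret;

		/* sleep until the retire worker marks the object idle */
		ret = wait_event_interruptible_timeout(gpu->fence_event,
						       !is_active(etnaviv_obj),
						       remaining);
		if (ret > 0)
			return 0;
		else if (ret == -ERESTARTSYS)
			return -ERESTARTSYS;
		else
			return -ETIMEDOUT;
	}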
There is no need to synchronize with outstanding retire jobs if the object
has gone idle. Retire jobs only ever change the object state from active to
idle, not the other way around.

The IOVA put race is uncritical, as the GEM_WAIT ioctl itself is holding
a reference to the GEM object, so the retire worker will not pull the
object into the CPU domain, which is the thing we are trying to guard
against with etnaviv_gpu_wait_obj_inactive.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
---
 drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)