drm/amdgpu: check vm bo eviction valuable at last

Message ID	20220217090440.4468-1-qiang.yu@amd.com (mailing list archive)
State	New, archived
Headers
Return-Path: <dri-devel-bounces@lists.freedesktop.org>
Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com;
From: Qiang Yu <qiang.yu@amd.com>
To: Alex Deucher <alexander.deucher@amd.com>, Christian König <christian.koenig@amd.com>, "Pan, Xinhui" <Xinhui.Pan@amd.com>, David Airlie <airlied@linux.ie>, Daniel Vetter <daniel@ffwll.ch>, Sumit Semwal <sumit.semwal@linaro.org>
Subject: [PATCH] drm/amdgpu: check vm bo eviction valuable at last
Date: Thu, 17 Feb 2022 17:04:40 +0800
Message-ID: <20220217090440.4468-1-qiang.yu@amd.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain
Precedence: list
Cc: linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org, Qiang Yu <qiang.yu@amd.com>, amd-gfx@lists.freedesktop.org, linux-media@vger.kernel.org
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>

Qiang Yu Feb. 17, 2022, 9:04 a.m. UTC

The workstation application ANSA/META triggers this error in dmesg:
[drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)

This is caused by the following sequence:
1. A 256MB buffer is created in invisible VRAM.
2. The CPU maps the buffer and accesses it, which causes a vm_fault
   that tries to move the buffer to visible VRAM.
3. Space is forced in visible VRAM, and all VRAM BOs are traversed to
   check whether evicting each BO is valuable.
4. When a VM BO (in invisible VRAM) is checked, amdgpu_vm_evictable()
   sets amdgpu_vm->evicting. Later, because the BO is not in visible
   VRAM, it is not actually evicted and so is not added to
   amdgpu_vm->evicted.
5. Before the next CS clears amdgpu_vm->evicting, a user VM ops
   ioctl passes amdgpu_vm_ready() (which checks amdgpu_vm->evicted)
   but fails in amdgpu_vm_bo_update_mapping() (which checks
   amdgpu_vm->evicting), producing this error log.

This error does not affect functionality, as the next CS will finish the
pending VM ops. But it is better to make amdgpu_vm->evicting
correctly reflect the VM status and get rid of the error log.

Signed-off-by: Qiang Yu <qiang.yu@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 85 ++++++++++++++-----------
 1 file changed, 47 insertions(+), 38 deletions(-)
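
The mismatch described in steps 4 and 5 can be modelled in a few lines of user-space C. This is only a minimal sketch, not the driver code: struct toy_vm and the toy_* helpers are hypothetical stand-ins, the evicted list is reduced to a counter, and the locking and idle checks of the real functions are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two pieces of VM state named above. */
struct toy_vm {
	bool evicting;         /* set when a page-table BO is considered for eviction */
	unsigned int evicted;  /* models the length of the evicted list */
};

/* Models step 4: the check marks the VM whether or not eviction follows. */
static bool toy_vm_evictable(struct toy_vm *vm)
{
	assert(vm != NULL);
	vm->evicting = true;
	return true;
}

/* Models the readiness test in step 5: it looks only at the list. */
static bool toy_vm_ready(const struct toy_vm *vm)
{
	return vm->evicted == 0;
}

/* Models the mapping update in step 5: it looks only at the flag. */
static int toy_vm_update_mapping(const struct toy_vm *vm)
{
	return vm->evicting ? -EBUSY : 0;
}

/* Models a VA ioctl: skip the update if not ready, otherwise try it. */
static int toy_va_ioctl(const struct toy_vm *vm)
{
	if (!toy_vm_ready(vm))
		return 0;  /* deferred: left for the next CS */
	return toy_vm_update_mapping(vm);
}
```

Starting from a zero-initialised toy_vm, calling toy_vm_evictable() and then toy_va_ioctl() yields -EBUSY (-16 on Linux), the same code as in the log, although nothing was evicted.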

Christian König Feb. 17, 2022, 9:15 a.m. UTC | #1

On 17.02.22 at 10:04, Qiang Yu wrote:
> The workstation application ANSA/META triggers this error in dmesg:
> [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)
>
> This is caused by the following sequence:
> 1. A 256MB buffer is created in invisible VRAM.
> 2. The CPU maps the buffer and accesses it, which causes a vm_fault
>     that tries to move the buffer to visible VRAM.
> 3. Space is forced in visible VRAM, and all VRAM BOs are traversed to
>     check whether evicting each BO is valuable.
> 4. When a VM BO (in invisible VRAM) is checked, amdgpu_vm_evictable()
>     sets amdgpu_vm->evicting. Later, because the BO is not in visible
>     VRAM, it is not actually evicted and so is not added to
>     amdgpu_vm->evicted.
> 5. Before the next CS clears amdgpu_vm->evicting, a user VM ops
>     ioctl passes amdgpu_vm_ready() (which checks amdgpu_vm->evicted)
>     but fails in amdgpu_vm_bo_update_mapping() (which checks
>     amdgpu_vm->evicting), producing this error log.
>
> This error does not affect functionality, as the next CS will finish the
> pending VM ops. But it is better to make amdgpu_vm->evicting
> correctly reflect the VM status and get rid of the error log.

Well NAK, that is intentional behavior.

The VM page tables were considered for eviction, so setting the flag is
correct even when the page tables are not actually evicted later on.

What we should rather do is fix amdgpu_vm_ready() to look at
the flag instead of the linked list.

Regards,
Christian.
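
The suggestion above, keying the readiness test on the flag rather than on the list, can be sketched as a user-space toy. This is an assumption-laden illustration and not the driver change itself: struct toy_vm_state and the toy_* functions are hypothetical names, and the locking that protects the flag in the driver is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the VM state discussed in this thread. */
struct toy_vm_state {
	bool evicting;         /* VM was considered for eviction */
	unsigned int evicted;  /* models the length of the evicted list */
};

/* Readiness keyed on the flag rather than on the list. */
static bool toy_vm_ready_by_flag(const struct toy_vm_state *vm)
{
	assert(vm != NULL);
	return !vm->evicting;
}

/* Same flag check as the mapping update named in the report. */
static int toy_update_mapping(const struct toy_vm_state *vm)
{
	return vm->evicting ? -EBUSY : 0;
}

/* Returns 0 when the update ran or was deferred, -EBUSY on failure. */
static int toy_va_update(const struct toy_vm_state *vm)
{
	if (!toy_vm_ready_by_flag(vm))
		return 0;  /* deferred: left for the next CS */
	return toy_update_mapping(vm);
}
```

Because both tests now read the same flag, a VM that was considered for eviction but not evicted is simply reported as not ready, and the update path that would return -EBUSY is never reached.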

>
> Signed-off-by: Qiang Yu <qiang.yu@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 85 ++++++++++++++-----------
>   1 file changed, 47 insertions(+), 38 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 5a32ee66d8c8..88a27911054f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1306,45 +1306,11 @@ uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device *adev, struct ttm_tt *ttm,
>   	return flags;
>   }
>   
> -/*
> - * amdgpu_ttm_bo_eviction_valuable - Check to see if we can evict a buffer
> - * object.
> - *
> - * Return true if eviction is sensible. Called by ttm_mem_evict_first() on
> - * behalf of ttm_bo_mem_force_space() which tries to evict buffer objects until
> - * it can find space for a new object and by ttm_bo_force_list_clean() which is
> - * used to clean out a memory space.
> - */
> -static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
> -					    const struct ttm_place *place)
> +static bool amdgpu_ttm_mem_eviction_valuable(struct ttm_buffer_object *bo,
> +					     const struct ttm_place *place)
>   {
>   	unsigned long num_pages = bo->resource->num_pages;
>   	struct amdgpu_res_cursor cursor;
> -	struct dma_resv_list *flist;
> -	struct dma_fence *f;
> -	int i;
> -
> -	/* Swapout? */
> -	if (bo->resource->mem_type == TTM_PL_SYSTEM)
> -		return true;
> -
> -	if (bo->type == ttm_bo_type_kernel &&
> -	    !amdgpu_vm_evictable(ttm_to_amdgpu_bo(bo)))
> -		return false;
> -
> -	/* If bo is a KFD BO, check if the bo belongs to the current process.
> -	 * If true, then return false as any KFD process needs all its BOs to
> -	 * be resident to run successfully
> -	 */
> -	flist = dma_resv_shared_list(bo->base.resv);
> -	if (flist) {
> -		for (i = 0; i < flist->shared_count; ++i) {
> -			f = rcu_dereference_protected(flist->shared[i],
> -				dma_resv_held(bo->base.resv));
> -			if (amdkfd_fence_check_mm(f, current->mm))
> -				return false;
> -		}
> -	}
>   
>   	switch (bo->resource->mem_type) {
>   	case AMDGPU_PL_PREEMPT:
> @@ -1377,10 +1343,53 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
>   		return false;
>   
>   	default:
> -		break;
> +		return ttm_bo_eviction_valuable(bo, place);
>   	}
> +}
>   
> -	return ttm_bo_eviction_valuable(bo, place);
> +/*
> + * amdgpu_ttm_bo_eviction_valuable - Check to see if we can evict a buffer
> + * object.
> + *
> + * Return true if eviction is sensible. Called by ttm_mem_evict_first() on
> + * behalf of ttm_bo_mem_force_space() which tries to evict buffer objects until
> + * it can find space for a new object and by ttm_bo_force_list_clean() which is
> + * used to clean out a memory space.
> + */
> +static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
> +					    const struct ttm_place *place)
> +{
> +	struct dma_resv_list *flist;
> +	struct dma_fence *f;
> +	int i;
> +
> +	/* Swapout? */
> +	if (bo->resource->mem_type == TTM_PL_SYSTEM)
> +		return true;
> +
> +	/* If bo is a KFD BO, check if the bo belongs to the current process.
> +	 * If true, then return false as any KFD process needs all its BOs to
> +	 * be resident to run successfully
> +	 */
> +	flist = dma_resv_shared_list(bo->base.resv);
> +	if (flist) {
> +		for (i = 0; i < flist->shared_count; ++i) {
> +			f = rcu_dereference_protected(flist->shared[i],
> +				dma_resv_held(bo->base.resv));
> +			if (amdkfd_fence_check_mm(f, current->mm))
> +				return false;
> +		}
> +	}
> +
> +	/* Check by different mem type. */
> +	if (!amdgpu_ttm_mem_eviction_valuable(bo, place))
> +		return false;
> +
> +	/* VM bo should be checked at last because it will mark VM evicting. */
> +	if (bo->type == ttm_bo_type_kernel)
> +		return amdgpu_vm_evictable(ttm_to_amdgpu_bo(bo));
> +
> +	return true;
>   }
>   
>   static void amdgpu_ttm_vram_mm_access(struct amdgpu_device *adev, loff_t pos,

Qiang Yu Feb. 17, 2022, 9:40 a.m. UTC | #2

On Thu, Feb 17, 2022 at 5:15 PM Christian König
<christian.koenig@amd.com> wrote:
> Well NAK, that is intentional behavior.
>
> The VM page tables were considered for eviction, so setting the flag is
> correct even when the page tables are not actually evicted later on.
>
But this will unnecessarily block later user VM ops in the ioctl before the
next CS, even when the VM BOs are not evicted.
Won't this have a negative effect when we could do better?

Regards,
Qiang

Christian König Feb. 17, 2022, 9:46 a.m. UTC | #3

On 17.02.22 at 10:40, Qiang Yu wrote:
> On Thu, Feb 17, 2022 at 5:15 PM Christian König
> <christian.koenig@amd.com> wrote:
>> Well NAK, that is intentional behavior.
>>
>> The VM page tables were considered for eviction, so setting the flag is
>> correct even when the page tables are not actually evicted later on.
>>
> But this will unnecessarily block later user VM ops in the ioctl before the
> next CS, even when the VM BOs are not evicted.
> Won't this have a negative effect when we could do better?

No, this will have a positive effect. The VM was already considered
for eviction because it is idle.

Updating it immediately doesn't necessarily make sense; we should wait
with that until its next use.

In addition, this patch doesn't really fix the problem, it just
mitigates it.

Eviction can fail later on for a couple of reasons, and we absolutely
need to check the flag instead of the list in amdgpu_vm_ready().

Regards,
Christian.

Qiang Yu Feb. 17, 2022, 10:13 a.m. UTC | #4

On Thu, Feb 17, 2022 at 5:46 PM Christian König
<christian.koenig@amd.com> wrote:
> No, this will have a positive effect. The VM was already considered
> for eviction because it is idle.
>
> Updating it immediately doesn't necessarily make sense; we should wait
> with that until its next use.
>
> In addition, this patch doesn't really fix the problem, it just
> mitigates it.
>
> Eviction can fail later on for a couple of reasons, and we absolutely
> need to check the flag instead of the list in amdgpu_vm_ready().
The flag only, or both the flag and the list? It looks like it should be
both, as the list indicates that some VM page tables need to be updated,
so the user update could be delayed by the same logic you described above.

Regards,
Qiang

Christian König Feb. 17, 2022, 10:39 a.m. UTC | #5

On 17.02.22 at 11:13, Qiang Yu wrote:
> On Thu, Feb 17, 2022 at 5:46 PM Christian König
> <christian.koenig@amd.com> wrote:
>> No, this will have a positive effect. The VM was already considered
>> for eviction because it is idle.
>>
>> Updating it immediately doesn't necessarily make sense; we should wait
>> with that until its next use.
>>
>> In addition, this patch doesn't really fix the problem, it just
>> mitigates it.
>>
>> Eviction can fail later on for a couple of reasons, and we absolutely
>> need to check the flag instead of the list in amdgpu_vm_ready().
> The flag only, or both the flag and the list? It looks like it should be
> both, as the list indicates that some VM page tables need to be updated,
> so the user update could be delayed by the same logic you described above.

I think checking the flag should be enough. The issue is that the list 
was there initially, but to avoid race conditions we added the flag with 
separate lock protection later on.

Regards,
Christian.


Qiang Yu Feb. 17, 2022, 10:58 a.m. UTC | #6

On Thu, Feb 17, 2022 at 6:39 PM Christian König
<christian.koenig@amd.com> wrote:
>
>
>
> Am 17.02.22 um 11:13 schrieb Qiang Yu:
> > The flag only, or both flag and list? It looks like it should be both, as
> > the list indicates that some vm page tables need to be updated, which could
> > delay the user update by the same logic as you described above.
>
> I think checking the flag should be enough. The issue is that the list
> was there initially, but to avoid race conditions we added the flag with
> separate lock protection later on.
>
But the list and the flag do not always align; there are cases like
list-empty/flag-set (this problem) and list-non-empty/flag-unset (non-vm bo
eviction). If we only check the flag, the list-non-empty/flag-unset case
changes behavior.

Regards,
Qiang
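
The four flag/list combinations can be compared under the three readiness policies mentioned in the thread. This is only a sketch: `enum ready_policy` and `vm_ready()` are hypothetical names, and the two boolean parameters stand in for `amdgpu_vm->evicting` and for whether `amdgpu_vm->evicted` is empty.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three candidate readiness checks. */
enum ready_policy {
	READY_LIST_ONLY,     /* behaviour described in the patch message */
	READY_FLAG_ONLY,     /* suggestion from the review */
	READY_FLAG_AND_LIST, /* the "both" option asked about above */
};

/* Returns whether a user VM update would be allowed under the policy. */
bool vm_ready(enum ready_policy policy, bool evicting, bool list_empty)
{
	switch (policy) {
	case READY_LIST_ONLY:
		return list_empty;
	case READY_FLAG_ONLY:
		return !evicting;
	case READY_FLAG_AND_LIST:
		return !evicting && list_empty;
	}
	assert(!"unknown policy");
	return false;
}
```

For list-empty/flag-set (this problem), only the list-only policy reports ready. For list-non-empty/flag-unset (non-vm bo eviction), only the flag-only policy reports ready, which is the behavior change raised above.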


Christian König Feb. 17, 2022, 12:22 p.m. UTC | #7

Am 17.02.22 um 11:58 schrieb Qiang Yu:
> On Thu, Feb 17, 2022 at 6:39 PM Christian König
> <christian.koenig@amd.com> wrote:
>> I think checking the flag should be enough. The issue is that the list
>> was there initially, but to avoid race conditions we added the flag with
>> separate lock protection later on.
>>
> But the list and the flag do not always align; there are cases like
> list-empty/flag-set (this problem) and list-non-empty/flag-unset (non-vm bo
> eviction). If we only check the flag, the list-non-empty/flag-unset case
> changes behavior.

Yeah, but I think that the flag unset list-non-empty case would be 
correctly handled if we only test the flag.

In other words we can update the page tables as long as they are not 
partially or fully evicted and that's not the case when non-vm BOs are 
evicted.

Regards,
Christian.
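
The eviction side of this argument can be modelled in user space as follows. Everything here is an assumption made for illustration: `enum bo_kind`, `struct vm_evict_model` and the three functions are hypothetical, `BO_PAGE_TABLE` plays the role of `ttm_bo_type_kernel`, and the real `amdgpu_vm_evictable()` does more than this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical BO kinds; BO_PAGE_TABLE plays the role of ttm_bo_type_kernel. */
enum bo_kind { BO_PAGE_TABLE, BO_OTHER };

struct vm_evict_model {
	bool busy;          /* VM still has GPU work in flight */
	bool evicting;      /* stands in for amdgpu_vm->evicting */
	size_t num_evicted; /* stands in for the length of amdgpu_vm->evicted */
};

/* Candidate check: only page-table BOs of an idle VM set the flag. */
bool consider_for_eviction(struct vm_evict_model *vm, enum bo_kind kind)
{
	assert(vm != NULL);
	if (kind != BO_PAGE_TABLE)
		return true;  /* non-vm BO: flag untouched */
	if (vm->busy)
		return false; /* busy page tables are not candidates */
	vm->evicting = true;
	return true;
}

/* Called only when a BO was really moved out; the list then grows. */
void record_evicted(struct vm_evict_model *vm)
{
	assert(vm != NULL);
	vm->num_evicted++;
}

/* Flag-only test: page tables may be updated unless they were marked. */
bool page_tables_updatable(const struct vm_evict_model *vm)
{
	assert(vm != NULL);
	return !vm->evicting;
}
```

In this model, evicting a non-vm BO grows the list but leaves `page_tables_updatable()` true, while merely considering a page-table BO of an idle VM makes it false even though nothing was added to the list.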


Qiang Yu Feb. 18, 2022, 3:08 a.m. UTC | #8

On Thu, Feb 17, 2022 at 8:22 PM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 17.02.22 um 11:58 schrieb Qiang Yu:
> > But the list and the flag do not always align; there are cases like
> > list-empty/flag-set (this problem) and list-non-empty/flag-unset (non-vm bo
> > eviction). If we only check the flag, the list-non-empty/flag-unset case
> > changes behavior.
>
> Yeah, but I think that the flag unset list-non-empty case would be
> correctly handled if we only test the flag.
>
> In other words we can update the page tables as long as they are not
> partially or fully evicted and that's not the case when non-vm BOs are
> evicted.
>
This sounds like two standards for the same thing, because this problem's
case does not evict page tables either. But I see your point:
the difference is that in this problem's case we can be sure the vm is idle,
and we prefer to delay vm updates when the vm is idle.

If so, why not just stop the user vm update by checking whether the vm is
busy in amdgpu_gem_va_ioctl() to skip amdgpu_gem_va_update_vm()?

Then we can keep the evicting flag accurate (after addressing your
concern about this patch, that eviction may fail later, by further delaying
the flag update until after eviction succeeds).

Regards,
Qiang
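
One possible reading of this proposal, written as a user-space model rather than driver code: the flag is set only once an eviction has succeeded, and the ioctl path defers the update whenever the VM is idle. `struct vm_idle_model`, `finish_eviction()` and `va_ioctl_updates_now()` are hypothetical names, and the `busy` field is an assumed stand-in for whatever busy check the driver would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct vm_idle_model {
	bool busy;                /* assumed: VM has unfinished GPU work */
	bool evicting;            /* set only after a page-table eviction succeeded */
	unsigned pending_updates; /* mappings left for the next CS to apply */
};

/* Proposed flag handling: mark the VM only when eviction really happened. */
void finish_eviction(struct vm_idle_model *vm, bool eviction_succeeded)
{
	assert(vm != NULL);
	if (eviction_succeeded)
		vm->evicting = true;
}

/*
 * Proposed ioctl behaviour: update immediately only for a busy VM whose
 * page tables are not evicted; otherwise leave the work for the next CS.
 */
bool va_ioctl_updates_now(struct vm_idle_model *vm)
{
	assert(vm != NULL);
	if (!vm->busy || vm->evicting) {
		vm->pending_updates++;
		return false;
	}
	return true;
}
```

Under this model a failed or skipped eviction never sets the flag, so the flag stays accurate, and the delay for idle VMs comes from the busy check instead of from the flag.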



Christian König Feb. 18, 2022, 7:46 a.m. UTC | #9

Am 18.02.22 um 04:08 schrieb Qiang Yu:
> On Thu, Feb 17, 2022 at 8:22 PM Christian König
> <christian.koenig@amd.com> wrote:
>> Am 17.02.22 um 11:58 schrieb Qiang Yu:
>>> On Thu, Feb 17, 2022 at 6:39 PM Christian König
>>> <christian.koenig@amd.com> wrote:
>>>>
>>>> Am 17.02.22 um 11:13 schrieb Qiang Yu:
>>>>> On Thu, Feb 17, 2022 at 5:46 PM Christian König
>>>>> <christian.koenig@amd.com> wrote:
>>>>>> Am 17.02.22 um 10:40 schrieb Qiang Yu:
>>>>>>> On Thu, Feb 17, 2022 at 5:15 PM Christian König
>>>>>>> <christian.koenig@amd.com> wrote:
>>>>>>>> Am 17.02.22 um 10:04 schrieb Qiang Yu:
>>>>>>>>> Workstation application ANSA/META get this error dmesg:
>>>>>>>>> [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)
>>>>>>>>>
>>>>>>>>> This is caused by:
>>>>>>>>> 1. create a 256MB buffer in invisible VRAM
>>>>>>>>> 2. CPU map the buffer and access it causes vm_fault and try to move
>>>>>>>>>         it to visible VRAM
>>>>>>>>> 3. force visible VRAM space and traverse all VRAM bos to check if
>>>>>>>>>         evicting this bo is valuable
>>>>>>>>> 4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable()
>>>>>>>>>         will set amdgpu_vm->evicting, but latter due to not in visible
>>>>>>>>>         VRAM, won't really evict it so not add it to amdgpu_vm->evicted
>>>>>>>>> 5. before next CS to clear the amdgpu_vm->evicting, user VM ops
>>>>>>>>>         ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted)
>>>>>>>>>         but fail in amdgpu_vm_bo_update_mapping() (check
>>>>>>>>>         amdgpu_vm->evicting) and get this error log
>>>>>>>>>
>>>>>>>>> This error won't affect functionality as next CS will finish the
>>>>>>>>> waiting VM ops. But we'd better make the amdgpu_vm->evicting
>>>>>>>>> correctly reflact the vm status and clear the error log.
>>>>>>>> Well NAK, that is intentional behavior.
>>>>>>>>
>>>>>>>> The VM page tables where considered for eviction, so setting the flag is
>>>>>>>> correct even when the page tables later on are not actually evicted.
>>>>>>>>
>>>>>>> But this will unnecessarily stop latter user VM ops in ioctl before CS
>>>>>>> even when the VM bos are not evicted.
>>>>>>> Won't this have any negative effect when could do better?
>>>>>> No, this will have a positive effect. See the VM was already considered
>>>>>> for eviction because it is idle.
>>>>>>
>>>>>> Updating it immediately doesn't necessarily make sense, we should wait
>>>>>> with that until its next usage.
>>>>>>
>>>>>> Additional to that this patch doesn't really fix the problem, it just
>>>>>> mitigates it.
>>>>>>
>>>>>> Eviction can fail later on for a couple of reasons and we absolutely
>>>>>> need to check the flag instead of the list in amdgpu_vm_ready().
>>>>> The flag only for both flag and list? Looks like should be both as
>>>>> the list indicate some vm page table need to be updated and could
>>>>> delay the user update with the same logic as you described above.
>>>> I think checking the flag should be enough. The issue is that the list
>>>> was there initially, but to avoid race conditions we added the flag with
>>>> separate lock protection later on.
>>>>
>>> But list and flag does not align always, there are cases like
>>> list-empty/flag-set (this problem) and list-non-empty/flag-unset (non-vm bo
>>> eviction). If only check flag list-non-empty/flag-unset change behavior.
>> Yeah, but I think that the flag unset list-non-empty case would be
>> correctly handled if we only test the flag.
>>
>> In other words we can update the page tables as long as they are not
>> partially or fully evicted and that's not the case when non-vm BOs are
>> evicted.
>>
> This sounds like two standard for the same thing, because this problem
> does not evict page tables too. But I see your point is:
> There's a difference that this problem's case can make sure vm is idle,
> and we prefer to delay vm updates when vm is idle.
>
> If so, why not just stop user vm update by checking vm busy in
> amdgpu_gem_va_ioctl() to skip amdgpu_gem_va_update_vm()?

That's exactly what amdgpu_gem_va_update_vm() is doing by calling 
amdgpu_vm_ready(). The problem is that amdgpu_vm_ready() looks at the 
wrong thing.
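
To make the mismatch concrete, here is a minimal user-space sketch, not the driver code. struct toy_vm and every toy_* function are hypothetical stand-ins that I am assuming only for illustration: one readiness check looks at the evicted list, the other at the evicting flag, and the mapping update refuses to run while the flag is set.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of the VM state; not the real struct amdgpu_vm. */
struct toy_vm {
    size_t evicted_count; /* page table BOs sitting on the "evicted" list */
    bool evicting;        /* set once the VM was considered for eviction */
};

/* Readiness check that only consults the list. */
static bool toy_vm_ready_by_list(const struct toy_vm *vm)
{
    return vm->evicted_count == 0;
}

/* Readiness check that only consults the flag. */
static bool toy_vm_ready_by_flag(const struct toy_vm *vm)
{
    return !vm->evicting;
}

/* Mapping update: fails with -EBUSY while the flag is set. */
static int toy_vm_update_mapping(const struct toy_vm *vm)
{
    return vm->evicting ? -EBUSY : 0;
}

/* ioctl path: defer quietly when not ready, otherwise try the update. */
static int toy_va_ioctl(const struct toy_vm *vm,
                        bool (*ready)(const struct toy_vm *))
{
    if (!ready(vm))
        return 0; /* update is left for the next CS, no error */
    return toy_vm_update_mapping(vm);
}

/* The state from the bug report: list empty, flag set. */
static void toy_demo(void)
{
    struct toy_vm vm = { .evicted_count = 0, .evicting = true };

    /* List check passes, so the update runs and hits -EBUSY. */
    assert(toy_va_ioctl(&vm, toy_vm_ready_by_list) == -EBUSY);
    /* Flag check defers the update instead, so no error is reported. */
    assert(toy_va_ioctl(&vm, toy_vm_ready_by_flag) == 0);
}
```

In this model the list-based check is what lets the request through to the failing update, while the flag-based check and the update agree with each other.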

> Then we can keep the evicting flag accurate (after solving your
> concern for this patch that eviction may fail latter by further delay
> the flag update after eviction success).

That won't work. We need to mark the VM as evicted before we actually
evict its page tables, because otherwise somebody could use the VM in
parallel and add another fence to it.
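
The ordering argument can be sketched with a simplified, single-threaded model. This is an assumption-laden illustration only: struct toy_vm, toy_vm_evictable() and toy_vm_add_fence() are hypothetical names, and the locking the real driver needs around these steps is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; not the real struct amdgpu_vm. */
struct toy_vm {
    unsigned int fences; /* outstanding work that still uses the VM */
    bool evicting;       /* VM is marked as being considered for eviction */
};

/*
 * Eviction check: only an idle VM qualifies, and it is marked
 * before any page table is actually moved.
 */
static bool toy_vm_evictable(struct toy_vm *vm)
{
    if (vm->fences != 0)
        return false; /* busy, leave the flag alone */
    vm->evicting = true;
    return true;
}

/* User path: new work is refused once the VM is marked. */
static bool toy_vm_add_fence(struct toy_vm *vm)
{
    if (vm->evicting)
        return false;
    vm->fences++;
    return true;
}

static void toy_demo(void)
{
    struct toy_vm idle = { .fences = 0, .evicting = false };
    struct toy_vm busy = { .fences = 1, .evicting = false };

    /* Idle VM: marked first, so a parallel user cannot add a fence. */
    assert(toy_vm_evictable(&idle));
    assert(!toy_vm_add_fence(&idle));
    assert(idle.fences == 0);

    /* Busy VM: not evictable and not marked, so it stays usable. */
    assert(!toy_vm_evictable(&busy));
    assert(toy_vm_add_fence(&busy));
}
```

If the flag were set only after a successful eviction, toy_vm_add_fence() would still succeed in the window between the idle check and the move, which is the race described above.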

Regards,
Christian.

>
> Regards,
> Qiang
>
>
>> Regards,
>> Christian.
>>
>>> Regards,
>>> Qiang
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> Regards,
>>>>> Qiang
>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>> Regards,
>>>>>>> Qiang
>>>>>>>
>>>>>>>> What we should rather do is to fix amdgpu_vm_ready() to take a look at
>>>>>>>> the flag instead of the linked list.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>>> Signed-off-by: Qiang Yu <qiang.yu@amd.com>
>>>>>>>>> ---
>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 85 ++++++++++++++-----------
>>>>>>>>>       1 file changed, 47 insertions(+), 38 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>>>>>>> index 5a32ee66d8c8..88a27911054f 100644
>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>>>>>>> @@ -1306,45 +1306,11 @@ uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device *adev, struct ttm_tt *ttm,
>>>>>>>>>           return flags;
>>>>>>>>>       }
>>>>>>>>>
>>>>>>>>> -/*
>>>>>>>>> - * amdgpu_ttm_bo_eviction_valuable - Check to see if we can evict a buffer
>>>>>>>>> - * object.
>>>>>>>>> - *
>>>>>>>>> - * Return true if eviction is sensible. Called by ttm_mem_evict_first() on
>>>>>>>>> - * behalf of ttm_bo_mem_force_space() which tries to evict buffer objects until
>>>>>>>>> - * it can find space for a new object and by ttm_bo_force_list_clean() which is
>>>>>>>>> - * used to clean out a memory space.
>>>>>>>>> - */
>>>>>>>>> -static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
>>>>>>>>> -                                         const struct ttm_place *place)
>>>>>>>>> +static bool amdgpu_ttm_mem_eviction_valuable(struct ttm_buffer_object *bo,
>>>>>>>>> +                                          const struct ttm_place *place)
>>>>>>>>>       {
>>>>>>>>>           unsigned long num_pages = bo->resource->num_pages;
>>>>>>>>>           struct amdgpu_res_cursor cursor;
>>>>>>>>> -     struct dma_resv_list *flist;
>>>>>>>>> -     struct dma_fence *f;
>>>>>>>>> -     int i;
>>>>>>>>> -
>>>>>>>>> -     /* Swapout? */
>>>>>>>>> -     if (bo->resource->mem_type == TTM_PL_SYSTEM)
>>>>>>>>> -             return true;
>>>>>>>>> -
>>>>>>>>> -     if (bo->type == ttm_bo_type_kernel &&
>>>>>>>>> -         !amdgpu_vm_evictable(ttm_to_amdgpu_bo(bo)))
>>>>>>>>> -             return false;
>>>>>>>>> -
>>>>>>>>> -     /* If bo is a KFD BO, check if the bo belongs to the current process.
>>>>>>>>> -      * If true, then return false as any KFD process needs all its BOs to
>>>>>>>>> -      * be resident to run successfully
>>>>>>>>> -      */
>>>>>>>>> -     flist = dma_resv_shared_list(bo->base.resv);
>>>>>>>>> -     if (flist) {
>>>>>>>>> -             for (i = 0; i < flist->shared_count; ++i) {
>>>>>>>>> -                     f = rcu_dereference_protected(flist->shared[i],
>>>>>>>>> -                             dma_resv_held(bo->base.resv));
>>>>>>>>> -                     if (amdkfd_fence_check_mm(f, current->mm))
>>>>>>>>> -                             return false;
>>>>>>>>> -             }
>>>>>>>>> -     }
>>>>>>>>>
>>>>>>>>>           switch (bo->resource->mem_type) {
>>>>>>>>>           case AMDGPU_PL_PREEMPT:
>>>>>>>>> @@ -1377,10 +1343,53 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
>>>>>>>>>                   return false;
>>>>>>>>>
>>>>>>>>>           default:
>>>>>>>>> -             break;
>>>>>>>>> +             return ttm_bo_eviction_valuable(bo, place);
>>>>>>>>>           }
>>>>>>>>> +}
>>>>>>>>>
>>>>>>>>> -     return ttm_bo_eviction_valuable(bo, place);
>>>>>>>>> +/*
>>>>>>>>> + * amdgpu_ttm_bo_eviction_valuable - Check to see if we can evict a buffer
>>>>>>>>> + * object.
>>>>>>>>> + *
>>>>>>>>> + * Return true if eviction is sensible. Called by ttm_mem_evict_first() on
>>>>>>>>> + * behalf of ttm_bo_mem_force_space() which tries to evict buffer objects until
>>>>>>>>> + * it can find space for a new object and by ttm_bo_force_list_clean() which is
>>>>>>>>> + * used to clean out a memory space.
>>>>>>>>> + */
>>>>>>>>> +static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
>>>>>>>>> +                                         const struct ttm_place *place)
>>>>>>>>> +{
>>>>>>>>> +     struct dma_resv_list *flist;
>>>>>>>>> +     struct dma_fence *f;
>>>>>>>>> +     int i;
>>>>>>>>> +
>>>>>>>>> +     /* Swapout? */
>>>>>>>>> +     if (bo->resource->mem_type == TTM_PL_SYSTEM)
>>>>>>>>> +             return true;
>>>>>>>>> +
>>>>>>>>> +     /* If bo is a KFD BO, check if the bo belongs to the current process.
>>>>>>>>> +      * If true, then return false as any KFD process needs all its BOs to
>>>>>>>>> +      * be resident to run successfully
>>>>>>>>> +      */
>>>>>>>>> +     flist = dma_resv_shared_list(bo->base.resv);
>>>>>>>>> +     if (flist) {
>>>>>>>>> +             for (i = 0; i < flist->shared_count; ++i) {
>>>>>>>>> +                     f = rcu_dereference_protected(flist->shared[i],
>>>>>>>>> +                             dma_resv_held(bo->base.resv));
>>>>>>>>> +                     if (amdkfd_fence_check_mm(f, current->mm))
>>>>>>>>> +                             return false;
>>>>>>>>> +             }
>>>>>>>>> +     }
>>>>>>>>> +
>>>>>>>>> +     /* Check by different mem type. */
>>>>>>>>> +     if (!amdgpu_ttm_mem_eviction_valuable(bo, place))
>>>>>>>>> +             return false;
>>>>>>>>> +
>>>>>>>>> +     /* VM bo should be checked at last because it will mark VM evicting. */
>>>>>>>>> +     if (bo->type == ttm_bo_type_kernel)
>>>>>>>>> +             return amdgpu_vm_evictable(ttm_to_amdgpu_bo(bo));
>>>>>>>>> +
>>>>>>>>> +     return true;
>>>>>>>>>       }
>>>>>>>>>
>>>>>>>>>       static void amdgpu_ttm_vram_mm_access(struct amdgpu_device *adev, loff_t pos,

Qiang Yu Feb. 18, 2022, 8:58 a.m. UTC | #10

On Fri, Feb 18, 2022 at 3:46 PM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Am 18.02.22 um 04:08 schrieb Qiang Yu:
> > On Thu, Feb 17, 2022 at 8:22 PM Christian König
> > <christian.koenig@amd.com> wrote:
> >> Am 17.02.22 um 11:58 schrieb Qiang Yu:
> >>> On Thu, Feb 17, 2022 at 6:39 PM Christian König
> >>> <christian.koenig@amd.com> wrote:
> >>>>
> >>>> Am 17.02.22 um 11:13 schrieb Qiang Yu:
> >>>>> On Thu, Feb 17, 2022 at 5:46 PM Christian König
> >>>>> <christian.koenig@amd.com> wrote:
> >>>>>> Am 17.02.22 um 10:40 schrieb Qiang Yu:
> >>>>>>> On Thu, Feb 17, 2022 at 5:15 PM Christian König
> >>>>>>> <christian.koenig@amd.com> wrote:
> >>>>>>>> Am 17.02.22 um 10:04 schrieb Qiang Yu:
> >>>>>>>>> Workstation application ANSA/META get this error dmesg:
> >>>>>>>>> [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)
> >>>>>>>>>
> >>>>>>>>> This is caused by:
> >>>>>>>>> 1. create a 256MB buffer in invisible VRAM
> >>>>>>>>> 2. CPU map the buffer and access it causes vm_fault and try to move
> >>>>>>>>>         it to visible VRAM
> >>>>>>>>> 3. force visible VRAM space and traverse all VRAM bos to check if
> >>>>>>>>>         evicting this bo is valuable
> >>>>>>>>> 4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable()
> >>>>>>>>>         will set amdgpu_vm->evicting, but latter due to not in visible
> >>>>>>>>>         VRAM, won't really evict it so not add it to amdgpu_vm->evicted
> >>>>>>>>> 5. before next CS to clear the amdgpu_vm->evicting, user VM ops
> >>>>>>>>>         ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted)
> >>>>>>>>>         but fail in amdgpu_vm_bo_update_mapping() (check
> >>>>>>>>>         amdgpu_vm->evicting) and get this error log
> >>>>>>>>>
> >>>>>>>>> This error won't affect functionality as next CS will finish the
> >>>>>>>>> waiting VM ops. But we'd better make the amdgpu_vm->evicting
> >>>>>>>>> correctly reflact the vm status and clear the error log.
> >>>>>>>> Well NAK, that is intentional behavior.
> >>>>>>>>
> >>>>>>>> The VM page tables where considered for eviction, so setting the flag is
> >>>>>>>> correct even when the page tables later on are not actually evicted.
> >>>>>>>>
> >>>>>>> But this will unnecessarily stop latter user VM ops in ioctl before CS
> >>>>>>> even when the VM bos are not evicted.
> >>>>>>> Won't this have any negative effect when could do better?
> >>>>>> No, this will have a positive effect. See the VM was already considered
> >>>>>> for eviction because it is idle.
> >>>>>>
> >>>>>> Updating it immediately doesn't necessarily make sense, we should wait
> >>>>>> with that until its next usage.
> >>>>>>
> >>>>>> Additional to that this patch doesn't really fix the problem, it just
> >>>>>> mitigates it.
> >>>>>>
> >>>>>> Eviction can fail later on for a couple of reasons and we absolutely
> >>>>>> need to check the flag instead of the list in amdgpu_vm_ready().
> >>>>> The flag only for both flag and list? Looks like should be both as
> >>>>> the list indicate some vm page table need to be updated and could
> >>>>> delay the user update with the same logic as you described above.
> >>>> I think checking the flag should be enough. The issue is that the list
> >>>> was there initially, but to avoid race conditions we added the flag with
> >>>> separate lock protection later on.
> >>>>
> >>> But list and flag does not align always, there are cases like
> >>> list-empty/flag-set (this problem) and list-non-empty/flag-unset (non-vm bo
> >>> eviction). If only check flag list-non-empty/flag-unset change behavior.
> >> Yeah, but I think that the flag unset list-non-empty case would be
> >> correctly handled if we only test the flag.
> >>
> >> In other words we can update the page tables as long as they are not
> >> partially or fully evicted and that's not the case when non-vm BOs are
> >> evicted.
> >>
> > This sounds like two standard for the same thing, because this problem
> > does not evict page tables too. But I see your point is:
> > There's a difference that this problem's case can make sure vm is idle,
> > and we prefer to delay vm updates when vm is idle.
> >
> > If so, why not just stop user vm update by checking vm busy in
> > amdgpu_gem_va_ioctl() to skip amdgpu_gem_va_update_vm()?
>
> That's exactly what amdgpu_gem_va_update_vm() is doing by calling
> amdgpu_vm_ready(). The problem is that amdgpu_vm_ready() looks at the
> wrong thing.
>
If amdgpu_vm_ready() uses the evicting flag, that is still not equivalent
to checking whether the VM is idle: evicting == true -> VM is idle,
evicting == false -> VM may be idle or busy.

> > Then we can keep the evicting flag accurate (after solving your
> > concern for this patch that eviction may fail latter by further delay
> > the flag update after eviction success).
>
> That won't work. See we need to mark the VM as evicted before we
> actually evict them because otherwise somebody could use the VM in
> parallel and add another fence to it.
>
I see. Making this fully accurate would cost too much, for example by
holding the eviction_lock during eviction. But just delaying the flag
update in amdgpu_ttm_bo_eviction_valuable() could avoid most
false-positive cases.

Regards,
Qiang

> Regards,
> Christian.
>
> >
> > Regards,
> > Qiang
> >
> >
> >> Regards,
> >> Christian.
> >>
> >>> Regards,
> >>> Qiang
> >>>
> >>>> Regards,
> >>>> Christian.
> >>>>
> >>>>> Regards,
> >>>>> Qiang
> >>>>>
> >>>>>> Regards,
> >>>>>> Christian.
> >>>>>>
> >>>>>>> Regards,
> >>>>>>> Qiang
> >>>>>>>
> >>>>>>>> What we should rather do is to fix amdgpu_vm_ready() to take a look at
> >>>>>>>> the flag instead of the linked list.
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Christian.
> >>>>>>>>
> >>>>>>>>> Signed-off-by: Qiang Yu <qiang.yu@amd.com>
> >>>>>>>>> ---
> >>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 85 ++++++++++++++-----------
> >>>>>>>>>       1 file changed, 47 insertions(+), 38 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>>>>>>>> index 5a32ee66d8c8..88a27911054f 100644
> >>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>>>>>>>> @@ -1306,45 +1306,11 @@ uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device *adev, struct ttm_tt *ttm,
> >>>>>>>>>           return flags;
> >>>>>>>>>       }
> >>>>>>>>>
> >>>>>>>>> -/*
> >>>>>>>>> - * amdgpu_ttm_bo_eviction_valuable - Check to see if we can evict a buffer
> >>>>>>>>> - * object.
> >>>>>>>>> - *
> >>>>>>>>> - * Return true if eviction is sensible. Called by ttm_mem_evict_first() on
> >>>>>>>>> - * behalf of ttm_bo_mem_force_space() which tries to evict buffer objects until
> >>>>>>>>> - * it can find space for a new object and by ttm_bo_force_list_clean() which is
> >>>>>>>>> - * used to clean out a memory space.
> >>>>>>>>> - */
> >>>>>>>>> -static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
> >>>>>>>>> -                                         const struct ttm_place *place)
> >>>>>>>>> +static bool amdgpu_ttm_mem_eviction_valuable(struct ttm_buffer_object *bo,
> >>>>>>>>> +                                          const struct ttm_place *place)
> >>>>>>>>>       {
> >>>>>>>>>           unsigned long num_pages = bo->resource->num_pages;
> >>>>>>>>>           struct amdgpu_res_cursor cursor;
> >>>>>>>>> -     struct dma_resv_list *flist;
> >>>>>>>>> -     struct dma_fence *f;
> >>>>>>>>> -     int i;
> >>>>>>>>> -
> >>>>>>>>> -     /* Swapout? */
> >>>>>>>>> -     if (bo->resource->mem_type == TTM_PL_SYSTEM)
> >>>>>>>>> -             return true;
> >>>>>>>>> -
> >>>>>>>>> -     if (bo->type == ttm_bo_type_kernel &&
> >>>>>>>>> -         !amdgpu_vm_evictable(ttm_to_amdgpu_bo(bo)))
> >>>>>>>>> -             return false;
> >>>>>>>>> -
> >>>>>>>>> -     /* If bo is a KFD BO, check if the bo belongs to the current process.
> >>>>>>>>> -      * If true, then return false as any KFD process needs all its BOs to
> >>>>>>>>> -      * be resident to run successfully
> >>>>>>>>> -      */
> >>>>>>>>> -     flist = dma_resv_shared_list(bo->base.resv);
> >>>>>>>>> -     if (flist) {
> >>>>>>>>> -             for (i = 0; i < flist->shared_count; ++i) {
> >>>>>>>>> -                     f = rcu_dereference_protected(flist->shared[i],
> >>>>>>>>> -                             dma_resv_held(bo->base.resv));
> >>>>>>>>> -                     if (amdkfd_fence_check_mm(f, current->mm))
> >>>>>>>>> -                             return false;
> >>>>>>>>> -             }
> >>>>>>>>> -     }
> >>>>>>>>>
> >>>>>>>>>           switch (bo->resource->mem_type) {
> >>>>>>>>>           case AMDGPU_PL_PREEMPT:
> >>>>>>>>> @@ -1377,10 +1343,53 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
> >>>>>>>>>                   return false;
> >>>>>>>>>
> >>>>>>>>>           default:
> >>>>>>>>> -             break;
> >>>>>>>>> +             return ttm_bo_eviction_valuable(bo, place);
> >>>>>>>>>           }
> >>>>>>>>> +}
> >>>>>>>>>
> >>>>>>>>> -     return ttm_bo_eviction_valuable(bo, place);
> >>>>>>>>> +/*
> >>>>>>>>> + * amdgpu_ttm_bo_eviction_valuable - Check to see if we can evict a buffer
> >>>>>>>>> + * object.
> >>>>>>>>> + *
> >>>>>>>>> + * Return true if eviction is sensible. Called by ttm_mem_evict_first() on
> >>>>>>>>> + * behalf of ttm_bo_mem_force_space() which tries to evict buffer objects until
> >>>>>>>>> + * it can find space for a new object and by ttm_bo_force_list_clean() which is
> >>>>>>>>> + * used to clean out a memory space.
> >>>>>>>>> + */
> >>>>>>>>> +static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
> >>>>>>>>> +                                         const struct ttm_place *place)
> >>>>>>>>> +{
> >>>>>>>>> +     struct dma_resv_list *flist;
> >>>>>>>>> +     struct dma_fence *f;
> >>>>>>>>> +     int i;
> >>>>>>>>> +
> >>>>>>>>> +     /* Swapout? */
> >>>>>>>>> +     if (bo->resource->mem_type == TTM_PL_SYSTEM)
> >>>>>>>>> +             return true;
> >>>>>>>>> +
> >>>>>>>>> +     /* If bo is a KFD BO, check if the bo belongs to the current process.
> >>>>>>>>> +      * If true, then return false as any KFD process needs all its BOs to
> >>>>>>>>> +      * be resident to run successfully
> >>>>>>>>> +      */
> >>>>>>>>> +     flist = dma_resv_shared_list(bo->base.resv);
> >>>>>>>>> +     if (flist) {
> >>>>>>>>> +             for (i = 0; i < flist->shared_count; ++i) {
> >>>>>>>>> +                     f = rcu_dereference_protected(flist->shared[i],
> >>>>>>>>> +                             dma_resv_held(bo->base.resv));
> >>>>>>>>> +                     if (amdkfd_fence_check_mm(f, current->mm))
> >>>>>>>>> +                             return false;
> >>>>>>>>> +             }
> >>>>>>>>> +     }
> >>>>>>>>> +
> >>>>>>>>> +     /* Check by different mem type. */
> >>>>>>>>> +     if (!amdgpu_ttm_mem_eviction_valuable(bo, place))
> >>>>>>>>> +             return false;
> >>>>>>>>> +
> >>>>>>>>> +     /* VM bo should be checked at last because it will mark VM evicting. */
> >>>>>>>>> +     if (bo->type == ttm_bo_type_kernel)
> >>>>>>>>> +             return amdgpu_vm_evictable(ttm_to_amdgpu_bo(bo));
> >>>>>>>>> +
> >>>>>>>>> +     return true;
> >>>>>>>>>       }
> >>>>>>>>>
> >>>>>>>>>       static void amdgpu_ttm_vram_mm_access(struct amdgpu_device *adev, loff_t pos,
>

Christian König Feb. 18, 2022, 9:27 a.m. UTC | #11

Am 18.02.22 um 09:58 schrieb Qiang Yu:
> On Fri, Feb 18, 2022 at 3:46 PM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> Am 18.02.22 um 04:08 schrieb Qiang Yu:
>>> On Thu, Feb 17, 2022 at 8:22 PM Christian König
>>> <christian.koenig@amd.com> wrote:
>>>> Am 17.02.22 um 11:58 schrieb Qiang Yu:
>>>>> On Thu, Feb 17, 2022 at 6:39 PM Christian König
>>>>> <christian.koenig@amd.com> wrote:
>>>>>> Am 17.02.22 um 11:13 schrieb Qiang Yu:
>>>>>>> On Thu, Feb 17, 2022 at 5:46 PM Christian König
>>>>>>> <christian.koenig@amd.com> wrote:
>>>>>>>> Am 17.02.22 um 10:40 schrieb Qiang Yu:
>>>>>>>>> On Thu, Feb 17, 2022 at 5:15 PM Christian König
>>>>>>>>> <christian.koenig@amd.com> wrote:
>>>>>>>>>> Am 17.02.22 um 10:04 schrieb Qiang Yu:
>>>>>>>>>>> Workstation application ANSA/META get this error dmesg:
>>>>>>>>>>> [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)
>>>>>>>>>>>
>>>>>>>>>>> This is caused by:
>>>>>>>>>>> 1. create a 256MB buffer in invisible VRAM
>>>>>>>>>>> 2. CPU map the buffer and access it causes vm_fault and try to move
>>>>>>>>>>>          it to visible VRAM
>>>>>>>>>>> 3. force visible VRAM space and traverse all VRAM bos to check if
>>>>>>>>>>>          evicting this bo is valuable
>>>>>>>>>>> 4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable()
>>>>>>>>>>>          will set amdgpu_vm->evicting, but latter due to not in visible
>>>>>>>>>>>          VRAM, won't really evict it so not add it to amdgpu_vm->evicted
>>>>>>>>>>> 5. before next CS to clear the amdgpu_vm->evicting, user VM ops
>>>>>>>>>>>          ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted)
>>>>>>>>>>>          but fail in amdgpu_vm_bo_update_mapping() (check
>>>>>>>>>>>          amdgpu_vm->evicting) and get this error log
>>>>>>>>>>>
>>>>>>>>>>> This error won't affect functionality as next CS will finish the
>>>>>>>>>>> waiting VM ops. But we'd better make the amdgpu_vm->evicting
>>>>>>>>>>> correctly reflact the vm status and clear the error log.
>>>>>>>>>> Well NAK, that is intentional behavior.
>>>>>>>>>>
>>>>>>>>>> The VM page tables where considered for eviction, so setting the flag is
>>>>>>>>>> correct even when the page tables later on are not actually evicted.
>>>>>>>>>>
>>>>>>>>> But this will unnecessarily stop latter user VM ops in ioctl before CS
>>>>>>>>> even when the VM bos are not evicted.
>>>>>>>>> Won't this have any negative effect when could do better?
>>>>>>>> No, this will have a positive effect. See the VM was already considered
>>>>>>>> for eviction because it is idle.
>>>>>>>>
>>>>>>>> Updating it immediately doesn't necessarily make sense, we should wait
>>>>>>>> with that until its next usage.
>>>>>>>>
>>>>>>>> Additional to that this patch doesn't really fix the problem, it just
>>>>>>>> mitigates it.
>>>>>>>>
>>>>>>>> Eviction can fail later on for a couple of reasons and we absolutely
>>>>>>>> need to check the flag instead of the list in amdgpu_vm_ready().
>>>>>>> The flag only for both flag and list? Looks like should be both as
>>>>>>> the list indicate some vm page table need to be updated and could
>>>>>>> delay the user update with the same logic as you described above.
>>>>>> I think checking the flag should be enough. The issue is that the list
>>>>>> was there initially, but to avoid race conditions we added the flag with
>>>>>> separate lock protection later on.
>>>>>>
>>>>> But list and flag does not align always, there are cases like
>>>>> list-empty/flag-set (this problem) and list-non-empty/flag-unset (non-vm bo
>>>>> eviction). If only check flag list-non-empty/flag-unset change behavior.
>>>> Yeah, but I think that the flag unset list-non-empty case would be
>>>> correctly handled if we only test the flag.
>>>>
>>>> In other words we can update the page tables as long as they are not
>>>> partially or fully evicted and that's not the case when non-vm BOs are
>>>> evicted.
>>>>
>>> This sounds like two standard for the same thing, because this problem
>>> does not evict page tables too. But I see your point is:
>>> There's a difference that this problem's case can make sure vm is idle,
>>> and we prefer to delay vm updates when vm is idle.
>>>
>>> If so, why not just stop user vm update by checking vm busy in
>>> amdgpu_gem_va_ioctl() to skip amdgpu_gem_va_update_vm()?
>> That's exactly what amdgpu_gem_va_update_vm() is doing by calling
>> amdgpu_vm_ready(). The problem is that amdgpu_vm_ready() looks at the
>> wrong thing.
>>
> If amdgpu_vm_ready() use evicting flag, it's still not equivalent to check
> vm idle: true -> vm idle, false -> vm may be idle or busy.

Yeah, but why should that be relevant?

amdgpu_vm_ready() returns whether we can do page table updates or not.
Whether the VM is idle or not is only relevant for eviction.

In other words, any CS or page table update makes the VM busy, but that
only affects whether the VM can be evicted or not.
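
As a rough sketch of that separation, assuming a hypothetical toy model (struct toy_vm and the toy_* helpers are made up here and are not driver functions): readiness depends only on the eviction flag, while idleness depends only on outstanding work.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; not the real struct amdgpu_vm. */
struct toy_vm {
    unsigned int pending; /* outstanding CS or page table updates */
    bool evicting;        /* VM was considered for eviction */
};

/* Can page table updates go ahead? Only the flag matters. */
static bool toy_vm_ready(const struct toy_vm *vm)
{
    return !vm->evicting;
}

/* May eviction consider this VM? Only outstanding work matters. */
static bool toy_vm_idle(const struct toy_vm *vm)
{
    return vm->pending == 0;
}

/* A CS or page table update makes the VM busy. */
static void toy_vm_submit(struct toy_vm *vm)
{
    vm->pending++;
}

static void toy_demo(void)
{
    struct toy_vm vm = { .pending = 0, .evicting = false };

    assert(toy_vm_ready(&vm) && toy_vm_idle(&vm));

    /* Becoming busy changes evictability, not readiness. */
    toy_vm_submit(&vm);
    assert(toy_vm_ready(&vm));
    assert(!toy_vm_idle(&vm));
}
```

So a "false" from the flag check saying nothing about busy versus idle is not a gap in this model: the two questions are answered by different state.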

>>> Then we can keep the evicting flag accurate (after solving your
>>> concern for this patch that eviction may fail latter by further delay
>>> the flag update after eviction success).
>> That won't work. See we need to mark the VM as evicted before we
>> actually evict them because otherwise somebody could use the VM in
>> parallel and add another fence to it.
>>
> I see, make this too accurate should cost too much like holding the
> eviction_lock when eviction. But just delay it in
> amdgpu_ttm_bo_eviction_valuable()
> could avoid most false positive case.

Partially correct. Another fundamental problem is that we can't hold the
eviction lock because that would result in lock inversion and a potential
deadlock.

We could set the flag later on, but as I said before, setting the
evicted flag when the VM is already idle is a desired effect.

Regards,
Christian.

>
> Regards,
> Qiang
>
>> Regards,
>> Christian.
>>
>>> Regards,
>>> Qiang
>>>
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> Regards,
>>>>> Qiang
>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>> Regards,
>>>>>>> Qiang
>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Qiang
>>>>>>>>>
>>>>>>>>>> What we should rather do is to fix amdgpu_vm_ready() to take a look at
>>>>>>>>>> the flag instead of the linked list.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Christian.
>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Qiang Yu <qiang.yu@amd.com>
>>>>>>>>>>> ---
>>>>>>>>>>>        drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 85 ++++++++++++++-----------
>>>>>>>>>>>        1 file changed, 47 insertions(+), 38 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>>>>>>>>> index 5a32ee66d8c8..88a27911054f 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>>>>>>>>> @@ -1306,45 +1306,11 @@ uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device *adev, struct ttm_tt *ttm,
>>>>>>>>>>>            return flags;
>>>>>>>>>>>        }
>>>>>>>>>>>
>>>>>>>>>>> -/*
>>>>>>>>>>> - * amdgpu_ttm_bo_eviction_valuable - Check to see if we can evict a buffer
>>>>>>>>>>> - * object.
>>>>>>>>>>> - *
>>>>>>>>>>> - * Return true if eviction is sensible. Called by ttm_mem_evict_first() on
>>>>>>>>>>> - * behalf of ttm_bo_mem_force_space() which tries to evict buffer objects until
>>>>>>>>>>> - * it can find space for a new object and by ttm_bo_force_list_clean() which is
>>>>>>>>>>> - * used to clean out a memory space.
>>>>>>>>>>> - */
>>>>>>>>>>> -static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
>>>>>>>>>>> -                                         const struct ttm_place *place)
>>>>>>>>>>> +static bool amdgpu_ttm_mem_eviction_valuable(struct ttm_buffer_object *bo,
>>>>>>>>>>> +                                          const struct ttm_place *place)
>>>>>>>>>>>        {
>>>>>>>>>>>            unsigned long num_pages = bo->resource->num_pages;
>>>>>>>>>>>            struct amdgpu_res_cursor cursor;
>>>>>>>>>>> -     struct dma_resv_list *flist;
>>>>>>>>>>> -     struct dma_fence *f;
>>>>>>>>>>> -     int i;
>>>>>>>>>>> -
>>>>>>>>>>> -     /* Swapout? */
>>>>>>>>>>> -     if (bo->resource->mem_type == TTM_PL_SYSTEM)
>>>>>>>>>>> -             return true;
>>>>>>>>>>> -
>>>>>>>>>>> -     if (bo->type == ttm_bo_type_kernel &&
>>>>>>>>>>> -         !amdgpu_vm_evictable(ttm_to_amdgpu_bo(bo)))
>>>>>>>>>>> -             return false;
>>>>>>>>>>> -
>>>>>>>>>>> -     /* If bo is a KFD BO, check if the bo belongs to the current process.
>>>>>>>>>>> -      * If true, then return false as any KFD process needs all its BOs to
>>>>>>>>>>> -      * be resident to run successfully
>>>>>>>>>>> -      */
>>>>>>>>>>> -     flist = dma_resv_shared_list(bo->base.resv);
>>>>>>>>>>> -     if (flist) {
>>>>>>>>>>> -             for (i = 0; i < flist->shared_count; ++i) {
>>>>>>>>>>> -                     f = rcu_dereference_protected(flist->shared[i],
>>>>>>>>>>> -                             dma_resv_held(bo->base.resv));
>>>>>>>>>>> -                     if (amdkfd_fence_check_mm(f, current->mm))
>>>>>>>>>>> -                             return false;
>>>>>>>>>>> -             }
>>>>>>>>>>> -     }
>>>>>>>>>>>
>>>>>>>>>>>            switch (bo->resource->mem_type) {
>>>>>>>>>>>            case AMDGPU_PL_PREEMPT:
>>>>>>>>>>> @@ -1377,10 +1343,53 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
>>>>>>>>>>>                    return false;
>>>>>>>>>>>
>>>>>>>>>>>            default:
>>>>>>>>>>> -             break;
>>>>>>>>>>> +             return ttm_bo_eviction_valuable(bo, place);
>>>>>>>>>>>            }
>>>>>>>>>>> +}
>>>>>>>>>>>
>>>>>>>>>>> -     return ttm_bo_eviction_valuable(bo, place);
>>>>>>>>>>> +/*
>>>>>>>>>>> + * amdgpu_ttm_bo_eviction_valuable - Check to see if we can evict a buffer
>>>>>>>>>>> + * object.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Return true if eviction is sensible. Called by ttm_mem_evict_first() on
>>>>>>>>>>> + * behalf of ttm_bo_mem_force_space() which tries to evict buffer objects until
>>>>>>>>>>> + * it can find space for a new object and by ttm_bo_force_list_clean() which is
>>>>>>>>>>> + * used to clean out a memory space.
>>>>>>>>>>> + */
>>>>>>>>>>> +static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
>>>>>>>>>>> +                                         const struct ttm_place *place)
>>>>>>>>>>> +{
>>>>>>>>>>> +     struct dma_resv_list *flist;
>>>>>>>>>>> +     struct dma_fence *f;
>>>>>>>>>>> +     int i;
>>>>>>>>>>> +
>>>>>>>>>>> +     /* Swapout? */
>>>>>>>>>>> +     if (bo->resource->mem_type == TTM_PL_SYSTEM)
>>>>>>>>>>> +             return true;
>>>>>>>>>>> +
>>>>>>>>>>> +     /* If bo is a KFD BO, check if the bo belongs to the current process.
>>>>>>>>>>> +      * If true, then return false as any KFD process needs all its BOs to
>>>>>>>>>>> +      * be resident to run successfully
>>>>>>>>>>> +      */
>>>>>>>>>>> +     flist = dma_resv_shared_list(bo->base.resv);
>>>>>>>>>>> +     if (flist) {
>>>>>>>>>>> +             for (i = 0; i < flist->shared_count; ++i) {
>>>>>>>>>>> +                     f = rcu_dereference_protected(flist->shared[i],
>>>>>>>>>>> +                             dma_resv_held(bo->base.resv));
>>>>>>>>>>> +                     if (amdkfd_fence_check_mm(f, current->mm))
>>>>>>>>>>> +                             return false;
>>>>>>>>>>> +             }
>>>>>>>>>>> +     }
>>>>>>>>>>> +
>>>>>>>>>>> +     /* Check by different mem type. */
>>>>>>>>>>> +     if (!amdgpu_ttm_mem_eviction_valuable(bo, place))
>>>>>>>>>>> +             return false;
>>>>>>>>>>> +
>>>>>>>>>>> +     /* VM bo should be checked at last because it will mark VM evicting. */
>>>>>>>>>>> +     if (bo->type == ttm_bo_type_kernel)
>>>>>>>>>>> +             return amdgpu_vm_evictable(ttm_to_amdgpu_bo(bo));
>>>>>>>>>>> +
>>>>>>>>>>> +     return true;
>>>>>>>>>>>        }
>>>>>>>>>>>
>>>>>>>>>>>        static void amdgpu_ttm_vram_mm_access(struct amdgpu_device *adev, loff_t pos,

Qiang Yu Feb. 18, 2022, 10:16 a.m. UTC | #12

On Fri, Feb 18, 2022 at 5:27 PM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 18.02.22 um 09:58 schrieb Qiang Yu:
> > On Fri, Feb 18, 2022 at 3:46 PM Christian König
> > <ckoenig.leichtzumerken@gmail.com> wrote:
> >> Am 18.02.22 um 04:08 schrieb Qiang Yu:
> >>> On Thu, Feb 17, 2022 at 8:22 PM Christian König
> >>> <christian.koenig@amd.com> wrote:
> >>>> Am 17.02.22 um 11:58 schrieb Qiang Yu:
> >>>>> On Thu, Feb 17, 2022 at 6:39 PM Christian König
> >>>>> <christian.koenig@amd.com> wrote:
> >>>>>> Am 17.02.22 um 11:13 schrieb Qiang Yu:
> >>>>>>> On Thu, Feb 17, 2022 at 5:46 PM Christian König
> >>>>>>> <christian.koenig@amd.com> wrote:
> >>>>>>>> Am 17.02.22 um 10:40 schrieb Qiang Yu:
> >>>>>>>>> On Thu, Feb 17, 2022 at 5:15 PM Christian König
> >>>>>>>>> <christian.koenig@amd.com> wrote:
> >>>>>>>>>> Am 17.02.22 um 10:04 schrieb Qiang Yu:
> >>>>>>>>>>> Workstation application ANSA/META get this error dmesg:
> >>>>>>>>>>> [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)
> >>>>>>>>>>>
> >>>>>>>>>>> This is caused by:
> >>>>>>>>>>> 1. create a 256MB buffer in invisible VRAM
> >>>>>>>>>>> 2. CPU map the buffer and access it causes vm_fault and try to move
> >>>>>>>>>>>          it to visible VRAM
> >>>>>>>>>>> 3. force visible VRAM space and traverse all VRAM bos to check if
> >>>>>>>>>>>          evicting this bo is valuable
> >>>>>>>>>>> 4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable()
> >>>>>>>>>>>          will set amdgpu_vm->evicting, but latter due to not in visible
> >>>>>>>>>>>          VRAM, won't really evict it so not add it to amdgpu_vm->evicted
> >>>>>>>>>>> 5. before next CS to clear the amdgpu_vm->evicting, user VM ops
> >>>>>>>>>>>          ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted)
> >>>>>>>>>>>          but fail in amdgpu_vm_bo_update_mapping() (check
> >>>>>>>>>>>          amdgpu_vm->evicting) and get this error log
> >>>>>>>>>>>
> >>>>>>>>>>> This error won't affect functionality as next CS will finish the
> >>>>>>>>>>> waiting VM ops. But we'd better make the amdgpu_vm->evicting
> >>>>>>>>>>> correctly reflact the vm status and clear the error log.
> >>>>>>>>>> Well NAK, that is intentional behavior.
> >>>>>>>>>>
> >>>>>>>>>> The VM page tables where considered for eviction, so setting the flag is
> >>>>>>>>>> correct even when the page tables later on are not actually evicted.
> >>>>>>>>>>
> >>>>>>>>> But this will unnecessarily stop latter user VM ops in ioctl before CS
> >>>>>>>>> even when the VM bos are not evicted.
> >>>>>>>>> Won't this have any negative effect when could do better?
> >>>>>>>> No, this will have a positive effect. See the VM was already considered
> >>>>>>>> for eviction because it is idle.
> >>>>>>>>
> >>>>>>>> Updating it immediately doesn't necessarily make sense, we should wait
> >>>>>>>> with that until its next usage.
> >>>>>>>>
> >>>>>>>> Additional to that this patch doesn't really fix the problem, it just
> >>>>>>>> mitigates it.
> >>>>>>>>
> >>>>>>>> Eviction can fail later on for a couple of reasons and we absolutely
> >>>>>>>> need to check the flag instead of the list in amdgpu_vm_ready().
> >>>>>>> The flag only for both flag and list? Looks like should be both as
> >>>>>>> the list indicate some vm page table need to be updated and could
> >>>>>>> delay the user update with the same logic as you described above.
> >>>>>> I think checking the flag should be enough. The issue is that the list
> >>>>>> was there initially, but to avoid race conditions we added the flag with
> >>>>>> separate lock protection later on.
> >>>>>>
> >>>>> But list and flag does not align always, there are cases like
> >>>>> list-empty/flag-set (this problem) and list-non-empty/flag-unset (non-vm bo
> >>>>> eviction). If only check flag list-non-empty/flag-unset change behavior.
> >>>> Yeah, but I think that the flag unset list-non-empty case would be
> >>>> correctly handled if we only test the flag.
> >>>>
> >>>> In other words we can update the page tables as long as they are not
> >>>> partially or fully evicted and that's not the case when non-vm BOs are
> >>>> evicted.
> >>>>
> >>> This sounds like two standard for the same thing, because this problem
> >>> does not evict page tables too. But I see your point is:
> >>> There's a difference that this problem's case can make sure vm is idle,
> >>> and we prefer to delay vm updates when vm is idle.
> >>>
> >>> If so, why not just stop user vm update by checking vm busy in
> >>> amdgpu_gem_va_ioctl() to skip amdgpu_gem_va_update_vm()?
> >> That's exactly what amdgpu_gem_va_update_vm() is doing by calling
> >> amdgpu_vm_ready(). The problem is that amdgpu_vm_ready() looks at the
> >> wrong thing.
> >>
> > If amdgpu_vm_ready() use evicting flag, it's still not equivalent to check
> > vm idle: true -> vm idle, false -> vm may be idle or busy.
>
> Yeah, but why should that be relevant?
>
> The amdgpu_vm_ready() return if we can do page table updates or not. If
> the VM is idle or not is only relevant for eviction.
>
> In other words any CS or page table update makes the VM busy, but that
> only affects if the VM can be evicted or not.
>
My point is: we can't use amdgpu_vm_ready() to replace vm_is_busy(), so
currently we update vm even when vm is busy. So why not use:
if (!amdgpu_vm_ready() || vm_is_busy()) return;
in amdgpu_gem_va_update_vm(), as you mentioned we prefer to not
update vm when it's idle.

> >>> Then we can keep the evicting flag accurate (after solving your
> >>> concern for this patch that eviction may fail latter by further delay
> >>> the flag update after eviction success).
> >> That won't work. See we need to mark the VM as evicted before we
> >> actually evict them because otherwise somebody could use the VM in
> >> parallel and add another fence to it.
> >>
> > I see, make this too accurate should cost too much like holding the
> > eviction_lock when eviction. But just delay it in
> > amdgpu_ttm_bo_eviction_valuable()
> > could avoid most false positive case.
>
> Partially correct. Another fundamental problem is that we can't hold the
> eviction lock because that would result in lock inversion and potential
> deadlock.
>
> We could set the flag later on, but as I said before that when we set
> the evicted flag when the VM is already idle is a desired effect.
>
As above, this confuses me: we can explicitly check whether the VM is
idle when the user updates it, so why embed that check implicitly in the
evicting flag?

Checking whether the VM is idle requires holding the resv lock. As I read
it, your patch that added the evicting flag was meant to allow VM updates
without the resv lock. But user VM ops in amdgpu_gem_va_update_vm() do
hold the resv lock, so the difference only shows up when
amdgpu_vm_bo_update_mapping() is called from svm_range_(un)map_to_gpu().
So is embedding VM idle in the evicting flag meant to make
svm_range_(un)map_to_gpu() also do nothing when the VM is idle?

Regards,
Qiang


Christian König Feb. 18, 2022, 10:24 a.m. UTC | #13

Am 18.02.22 um 11:16 schrieb Qiang Yu:
> [SNIP]
>>> If amdgpu_vm_ready() use evicting flag, it's still not equivalent to check
>>> vm idle: true -> vm idle, false -> vm may be idle or busy.
>> Yeah, but why should that be relevant?
>>
>> The amdgpu_vm_ready() return if we can do page table updates or not. If
>> the VM is idle or not is only relevant for eviction.
>>
>> In other words any CS or page table update makes the VM busy, but that
>> only affects if the VM can be evicted or not.
>>
> My point is: we can't use amdgpu_vm_ready() to replace vm_is_busy(), so
> currently we update vm even when vm is busy. So why not use:
> if (!amdgpu_vm_ready() || vm_is_busy()) return;
> in amdgpu_gem_va_update_vm(), as you mentioned we prefer to not
> update vm when it's idle.

Because updating the VM while it is busy is perfectly fine, we do it all 
the time.

We should just not update it when it is already idle and was considered
for eviction. In this situation it usually makes sense to keep it idle
and postpone the update until the next command submission.

>>>>> Then we can keep the evicting flag accurate (after solving your
>>>>> concern for this patch that eviction may fail latter by further delay
>>>>> the flag update after eviction success).
>>>> That won't work. See we need to mark the VM as evicted before we
>>>> actually evict them because otherwise somebody could use the VM in
>>>> parallel and add another fence to it.
>>>>
>>> I see, make this too accurate should cost too much like holding the
>>> eviction_lock when eviction. But just delay it in
>>> amdgpu_ttm_bo_eviction_valuable()
>>> could avoid most false positive case.
>> Partially correct. Another fundamental problem is that we can't hold the
>> eviction lock because that would result in lock inversion and potential
>> deadlock.
>>
>> We could set the flag later on, but as I said before that when we set
>> the evicted flag when the VM is already idle is a desired effect.
>>
> As above, this confuse me as we can explicitly check vm idle when
> user update vm, why bother to embed it in evicting flag implicitly?

Well, as I said, whether the VM is idle or not is irrelevant for the update.

To summarize the rules once more:
1. When VM page tables are used by a CS or by page table updates, the VM
is considered busy, i.e. not idle.

2. When we want to evict a VM it must be idle. As soon as we have
considered this we should set the evicted flag to make sure to keep it
idle as much as possible.

3. When we want to update the page tables we just need to check if the
VM is idle or not.

4. When a CS happens we have no other choice but to make the VM busy
again and do all postponed page table updates.
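
The four rules above can be modelled as a small state machine. The following is only a toy sketch in plain C, not kernel code: struct toy_vm and every toy_* function are hypothetical names invented for illustration, and the reading of rule 3 as "postpone the update while the flag is set" is an assumption based on the rest of this thread.

```c
#include <stdbool.h>

/* Toy stand-in for a VM; this is NOT the kernel's struct amdgpu_vm. */
struct toy_vm {
	bool busy;      /* rule 1: page tables in use by a CS or an update */
	bool evicting;  /* rule 2: set once the VM was considered for eviction */
	int postponed;  /* page table updates deferred until the next CS */
};

/* Rule 2: only an idle VM may be evicted; considering it sets the flag. */
bool toy_vm_consider_eviction(struct toy_vm *vm)
{
	if (vm->busy)
		return false;
	vm->evicting = true;
	return true;
}

/*
 * Rule 3 (as assumed here): an update is postponed while the flag is set.
 * Otherwise it runs, and per rule 1 the update itself makes the VM busy.
 */
bool toy_vm_update(struct toy_vm *vm)
{
	if (vm->evicting) {
		vm->postponed++;
		return false;
	}
	vm->busy = true;
	return true;
}

/*
 * Rule 4: a CS has to use the VM, so it clears the flag, makes the VM
 * busy again and flushes the postponed updates. Returns how many
 * updates were flushed.
 */
int toy_vm_command_submission(struct toy_vm *vm)
{
	int flushed = vm->postponed;

	vm->evicting = false;
	vm->busy = true;
	vm->postponed = 0;
	return flushed;
}
```

In this model a VM that was merely considered for eviction stays untouched until the next command submission, which is the effect described as desired above.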

Regards,
Christian.

>
> Check vm idle need to hold resv lock. Read your patch for adding
> evicting flag is to update vm without resv lock. But user vm ops in
> amdgpu_gem_va_update_vm() do hold the resv lock, so the difference
> happens when calling amdgpu_vm_bo_update_mapping() from
> svm_range_(un)map_to_gpu(). So embed vm idle in evicting flag
> is for svm_range_(un)map_to_gpu() also do nothing when vm idle?




Qiang Yu Feb. 21, 2022, 3:28 a.m. UTC | #14

On Fri, Feb 18, 2022 at 6:24 PM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Am 18.02.22 um 11:16 schrieb Qiang Yu:
> > [SNIP]
> >>> If amdgpu_vm_ready() use evicting flag, it's still not equivalent to check
> >>> vm idle: true -> vm idle, false -> vm may be idle or busy.
> >> Yeah, but why should that be relevant?
> >>
> >> The amdgpu_vm_ready() return if we can do page table updates or not. If
> >> the VM is idle or not is only relevant for eviction.
> >>
> >> In other words any CS or page table update makes the VM busy, but that
> >> only affects if the VM can be evicted or not.
> >>
> > My point is: we can't use amdgpu_vm_ready() to replace vm_is_busy(), so
> > currently we update vm even when vm is busy. So why not use:
Sorry, should be "vm is idle".

> > if (!amdgpu_vm_ready() || vm_is_busy()) return;
> > in amdgpu_gem_va_update_vm(), as you mentioned we prefer to not
> > update vm when it's idle.
>
> Because updating the VM while it is busy is perfectly fine, we do it all
> the time.
>
Yeah, as above, my typo.

> We should just not update it when it is already idle and was considered
> for eviction.
"and", not "or"?

> In this situation it makes most of the time sense to keep
> it idle and postpone the update till the next command submission.
>
> >>>>> Then we can keep the evicting flag accurate (after solving your
> >>>>> concern for this patch that eviction may fail latter by further delay
> >>>>> the flag update after eviction success).
> >>>> That won't work. See we need to mark the VM as evicted before we
> >>>> actually evict them because otherwise somebody could use the VM in
> >>>> parallel and add another fence to it.
> >>>>
> >>> I see, make this too accurate should cost too much like holding the
> >>> eviction_lock when eviction. But just delay it in
> >>> amdgpu_ttm_bo_eviction_valuable()
> >>> could avoid most false positive case.
> >> Partially correct. Another fundamental problem is that we can't hold the
> >> eviction lock because that would result in lock inversion and potential
> >> deadlock.
> >>
> >> We could set the flag later on, but as I said before that when we set
> >> the evicted flag when the VM is already idle is a desired effect.
> >>
> > As above, this confuse me as we can explicitly check vm idle when
> > user update vm, why bother to embed it in evicting flag implicitly?
>
> Well as I said it's irrelevant for the update if the VM is idle or not.
>
> To summarize the rules once more:
> 1. When VM page tables are used by CS or page tables updates it is
> considered busy, e.g. not idle.
>
> 2. When we want to evict a VM it must be idle. As soon as we considered
> this we should set the evicted flag to make sure to keep it idle as much
> as possible.
>
> 3. When we want to update the page tables we just need to check if the
> VM is idle or not.
>
But currently we do not check whether the VM is idle directly in
amdgpu_gem_va_update_vm(). If a VM BO has not been considered for
eviction, it could be either idle or busy.

I just want to confirm whether the fix should only change
amdgpu_vm_ready() to use the evicting flag, or whether, besides using the
evicting flag, it should also check vm_idle() in amdgpu_gem_va_update_vm().
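
To make the first option concrete, here is a toy comparison of a readiness check based on the evicted list with one based on the evicting flag. It is a sketch in plain C under stated assumptions, not the driver code: struct toy_vm_state, the toy_* functions and the TOY_* constants are all hypothetical, and only the value 16 is taken from the -16 in the error log.

```c
#include <stdbool.h>
#include <stddef.h>

enum {
	TOY_POSTPONED = 0,  /* update skipped quietly, left for the next CS */
	TOY_UPDATED = 1,    /* page table update was performed */
	TOY_EBUSY = 16,     /* matches the -16 seen in the error log */
};

/* Toy stand-in for the two pieces of VM state discussed in this thread. */
struct toy_vm_state {
	size_t evicted_count;  /* stand-in for the length of the evicted list */
	bool evicting;         /* stand-in for the evicting flag */
};

/* Readiness judged by the list only. */
bool toy_ready_by_list(const struct toy_vm_state *vm)
{
	return vm->evicted_count == 0;
}

/* Readiness judged by the flag only. */
bool toy_ready_by_flag(const struct toy_vm_state *vm)
{
	return !vm->evicting;
}

/* The mapping update itself refuses to run while the flag is set. */
int toy_update_mapping(const struct toy_vm_state *vm)
{
	return vm->evicting ? -TOY_EBUSY : TOY_UPDATED;
}

/* ioctl-like path: skip quietly when not ready, otherwise try to update. */
int toy_va_update(const struct toy_vm_state *vm,
		  bool (*ready)(const struct toy_vm_state *))
{
	if (!ready(vm))
		return TOY_POSTPONED;
	return toy_update_mapping(vm);
}
```

With the list empty and the flag set, the list-based check lets the update through and it fails with -16, while the flag-based check postpones it quietly. With the list non-empty and the flag unset, the list-based check postpones the update and the flag-based check performs it, which is the behavior change raised earlier in the thread.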

Regards,
Qiang

> 4. When a CS happens we don't have another chance and make the VM busy
> again. And do all postponed page table updates.
>
Anyway,

> Regards,
> Christian.
>
> >
> > Check vm idle need to hold resv lock. Read your patch for adding
> > evicting flag is to update vm without resv lock. But user vm ops in
> > amdgpu_gem_va_update_vm() do hold the resv lock, so the difference
> > happens when calling amdgpu_vm_bo_update_mapping() from
> > svm_range_(un)map_to_gpu(). So embed vm idle in evicting flag
> > is for svm_range_(un)map_to_gpu() also do nothing when vm idle?
>
>
>
> >
> > Regards,
> > Qiang
> >
> >> Regards,
> >> Christian.
> >>
> >>> Regards,
> >>> Qiang
> >>>
> >>>> Regards,
> >>>> Christian.
> >>>>
> >>>>> Regards,
> >>>>> Qiang
> >>>>>
> >>>>>
> >>>>>> Regards,
> >>>>>> Christian.
> >>>>>>
> >>>>>>> Regards,
> >>>>>>> Qiang
> >>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Christian.
> >>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Qiang
> >>>>>>>>>
> >>>>>>>>>> Regards,
> >>>>>>>>>> Christian.
> >>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Qiang
> >>>>>>>>>>>
> >>>>>>>>>>>> What we should rather do is to fix amdgpu_vm_ready() to take a look at
> >>>>>>>>>>>> the flag instead of the linked list.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards,
> >>>>>>>>>>>> Christian.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Signed-off-by: Qiang Yu <qiang.yu@amd.com>
> >>>>>>>>>>>>> ---
> >>>>>>>>>>>>>         drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 85 ++++++++++++++-----------
> >>>>>>>>>>>>>         1 file changed, 47 insertions(+), 38 deletions(-)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>>>>>>>>>>>> index 5a32ee66d8c8..88a27911054f 100644
> >>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>>>>>>>>>>>> @@ -1306,45 +1306,11 @@ uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device *adev, struct ttm_tt *ttm,
> >>>>>>>>>>>>>             return flags;
> >>>>>>>>>>>>>         }
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -/*
> >>>>>>>>>>>>> - * amdgpu_ttm_bo_eviction_valuable - Check to see if we can evict a buffer
> >>>>>>>>>>>>> - * object.
> >>>>>>>>>>>>> - *
> >>>>>>>>>>>>> - * Return true if eviction is sensible. Called by ttm_mem_evict_first() on
> >>>>>>>>>>>>> - * behalf of ttm_bo_mem_force_space() which tries to evict buffer objects until
> >>>>>>>>>>>>> - * it can find space for a new object and by ttm_bo_force_list_clean() which is
> >>>>>>>>>>>>> - * used to clean out a memory space.
> >>>>>>>>>>>>> - */
> >>>>>>>>>>>>> -static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
> >>>>>>>>>>>>> -                                         const struct ttm_place *place)
> >>>>>>>>>>>>> +static bool amdgpu_ttm_mem_eviction_valuable(struct ttm_buffer_object *bo,
> >>>>>>>>>>>>> +                                          const struct ttm_place *place)
> >>>>>>>>>>>>>         {
> >>>>>>>>>>>>>             unsigned long num_pages = bo->resource->num_pages;
> >>>>>>>>>>>>>             struct amdgpu_res_cursor cursor;
> >>>>>>>>>>>>> -     struct dma_resv_list *flist;
> >>>>>>>>>>>>> -     struct dma_fence *f;
> >>>>>>>>>>>>> -     int i;
> >>>>>>>>>>>>> -
> >>>>>>>>>>>>> -     /* Swapout? */
> >>>>>>>>>>>>> -     if (bo->resource->mem_type == TTM_PL_SYSTEM)
> >>>>>>>>>>>>> -             return true;
> >>>>>>>>>>>>> -
> >>>>>>>>>>>>> -     if (bo->type == ttm_bo_type_kernel &&
> >>>>>>>>>>>>> -         !amdgpu_vm_evictable(ttm_to_amdgpu_bo(bo)))
> >>>>>>>>>>>>> -             return false;
> >>>>>>>>>>>>> -
> >>>>>>>>>>>>> -     /* If bo is a KFD BO, check if the bo belongs to the current process.
> >>>>>>>>>>>>> -      * If true, then return false as any KFD process needs all its BOs to
> >>>>>>>>>>>>> -      * be resident to run successfully
> >>>>>>>>>>>>> -      */
> >>>>>>>>>>>>> -     flist = dma_resv_shared_list(bo->base.resv);
> >>>>>>>>>>>>> -     if (flist) {
> >>>>>>>>>>>>> -             for (i = 0; i < flist->shared_count; ++i) {
> >>>>>>>>>>>>> -                     f = rcu_dereference_protected(flist->shared[i],
> >>>>>>>>>>>>> -                             dma_resv_held(bo->base.resv));
> >>>>>>>>>>>>> -                     if (amdkfd_fence_check_mm(f, current->mm))
> >>>>>>>>>>>>> -                             return false;
> >>>>>>>>>>>>> -             }
> >>>>>>>>>>>>> -     }
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>             switch (bo->resource->mem_type) {
> >>>>>>>>>>>>>             case AMDGPU_PL_PREEMPT:
> >>>>>>>>>>>>> @@ -1377,10 +1343,53 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
> >>>>>>>>>>>>>                     return false;
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>             default:
> >>>>>>>>>>>>> -             break;
> >>>>>>>>>>>>> +             return ttm_bo_eviction_valuable(bo, place);
> >>>>>>>>>>>>>             }
> >>>>>>>>>>>>> +}
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -     return ttm_bo_eviction_valuable(bo, place);
> >>>>>>>>>>>>> +/*
> >>>>>>>>>>>>> + * amdgpu_ttm_bo_eviction_valuable - Check to see if we can evict a buffer
> >>>>>>>>>>>>> + * object.
> >>>>>>>>>>>>> + *
> >>>>>>>>>>>>> + * Return true if eviction is sensible. Called by ttm_mem_evict_first() on
> >>>>>>>>>>>>> + * behalf of ttm_bo_mem_force_space() which tries to evict buffer objects until
> >>>>>>>>>>>>> + * it can find space for a new object and by ttm_bo_force_list_clean() which is
> >>>>>>>>>>>>> + * used to clean out a memory space.
> >>>>>>>>>>>>> + */
> >>>>>>>>>>>>> +static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
> >>>>>>>>>>>>> +                                         const struct ttm_place *place)
> >>>>>>>>>>>>> +{
> >>>>>>>>>>>>> +     struct dma_resv_list *flist;
> >>>>>>>>>>>>> +     struct dma_fence *f;
> >>>>>>>>>>>>> +     int i;
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>> +     /* Swapout? */
> >>>>>>>>>>>>> +     if (bo->resource->mem_type == TTM_PL_SYSTEM)
> >>>>>>>>>>>>> +             return true;
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>> +     /* If bo is a KFD BO, check if the bo belongs to the current process.
> >>>>>>>>>>>>> +      * If true, then return false as any KFD process needs all its BOs to
> >>>>>>>>>>>>> +      * be resident to run successfully
> >>>>>>>>>>>>> +      */
> >>>>>>>>>>>>> +     flist = dma_resv_shared_list(bo->base.resv);
> >>>>>>>>>>>>> +     if (flist) {
> >>>>>>>>>>>>> +             for (i = 0; i < flist->shared_count; ++i) {
> >>>>>>>>>>>>> +                     f = rcu_dereference_protected(flist->shared[i],
> >>>>>>>>>>>>> +                             dma_resv_held(bo->base.resv));
> >>>>>>>>>>>>> +                     if (amdkfd_fence_check_mm(f, current->mm))
> >>>>>>>>>>>>> +                             return false;
> >>>>>>>>>>>>> +             }
> >>>>>>>>>>>>> +     }
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>> +     /* Check by different mem type. */
> >>>>>>>>>>>>> +     if (!amdgpu_ttm_mem_eviction_valuable(bo, place))
> >>>>>>>>>>>>> +             return false;
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>> +     /* VM bo should be checked at last because it will mark VM evicting. */
> >>>>>>>>>>>>> +     if (bo->type == ttm_bo_type_kernel)
> >>>>>>>>>>>>> +             return amdgpu_vm_evictable(ttm_to_amdgpu_bo(bo));
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>> +     return true;
> >>>>>>>>>>>>>         }
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>         static void amdgpu_ttm_vram_mm_access(struct amdgpu_device *adev, loff_t pos,
>

Christian König Feb. 21, 2022, 8:24 a.m. UTC | #15

On 21.02.22 at 04:28, Qiang Yu wrote:
> On Fri, Feb 18, 2022 at 6:24 PM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> On 18.02.22 at 11:16, Qiang Yu wrote:
>>> [SNIP]
>>>>> If amdgpu_vm_ready() uses the evicting flag, it's still not equivalent to
>>>>> checking VM idle: true -> VM idle, false -> VM may be idle or busy.
>>>> Yeah, but why should that be relevant?
>>>>
>>>> amdgpu_vm_ready() returns whether we can do page table updates or not.
>>>> Whether the VM is idle or not is only relevant for eviction.
>>>>
>>>> In other words any CS or page table update makes the VM busy, but that
>>>> only affects if the VM can be evicted or not.
>>>>
>>> My point is: we can't use amdgpu_vm_ready() to replace vm_is_busy(), so
>>> currently we update vm even when vm is busy. So why not use:
> Sorry, should be "vm is idle".
>
>>> if (!amdgpu_vm_ready() || vm_is_busy()) return;
>>> in amdgpu_gem_va_update_vm(), as you mentioned we prefer to not
>>> update vm when it's idle.
>> Because updating the VM while it is busy is perfectly fine, we do it all
>> the time.
>>
> Yeah, as above, my typo.
>
>> We should just not update it when it is already idle and was considered
>> for eviction.
> "and", not "or"?
>
>> In this situation it makes sense most of the time to keep
>> it idle and postpone the update till the next command submission.
>>
>>>>>>> Then we can keep the evicting flag accurate (after solving your
>>>>>>> concern for this patch that eviction may fail later, by further delaying
>>>>>>> the flag update until after eviction succeeds).
>>>>>> That won't work. See we need to mark the VM as evicted before we
>>>>>> actually evict them because otherwise somebody could use the VM in
>>>>>> parallel and add another fence to it.
>>>>>>
>>>>> I see, making this more accurate would cost too much, like holding the
>>>>> eviction_lock during eviction. But just delaying it in
>>>>> amdgpu_ttm_bo_eviction_valuable()
>>>>> could avoid most false positive cases.
>>>> Partially correct. Another fundamental problem is that we can't hold the
>>>> eviction lock because that would result in lock inversion and potential
>>>> deadlock.
>>>>
>>>> We could set the flag later on, but as I said before, setting
>>>> the evicted flag when the VM is already idle is a desired effect.
>>>>
>>> As above, this confuses me: we can explicitly check VM idle when the
>>> user updates the VM, so why bother to embed it in the evicting flag implicitly?
>> Well as I said it's irrelevant for the update if the VM is idle or not.
>>
>> To summarize the rules once more:
>> 1. When VM page tables are used by CS or page table updates, the VM is
>> considered busy, i.e. not idle.
>>
>> 2. When we want to evict a VM it must be idle. As soon as we considered
>> this we should set the evicted flag to make sure to keep it idle as much
>> as possible.
>>
>> 3. When we want to update the page tables we just need to check if the
>> VM is idle or not.
>>
> But now we do not check VM idle directly in amdgpu_gem_va_update_vm().
> If a VM BO has not been considered for eviction, it could be either idle or busy.
>
> Just want to confirm whether the fix should only change amdgpu_vm_ready()
> to use the evicting flag or, besides using the evicting flag, also check
> vm_idle() in amdgpu_gem_va_update_vm().

Only changing amdgpu_vm_ready() should be enough. That may bring further
issues to the surface, but those need to be taken care of separately.
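
To illustrate the difference between the two checks, here is a minimal
user-space sketch. It is not the kernel code: struct vm_model, n_evicted,
vm_ready_by_list() and vm_ready_by_flag() are hypothetical names made up for
this example, and locking is left out entirely. It models the state from the
commit message, where a VM has been marked as evicting but none of its BOs
ended up on the evicted list.

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified stand-in for struct amdgpu_vm.
 * This is an illustration only, not the kernel type.
 */
struct vm_model {
	bool evicting;          /* set once a VM BO was considered for eviction */
	unsigned int n_evicted; /* models the number of entries on vm->evicted */
};

/* List-based check: "ready" as long as nothing sits on the evicted list. */
bool vm_ready_by_list(const struct vm_model *vm)
{
	return vm->n_evicted == 0;
}

/* Flag-based check as suggested in this thread: "ready" unless marked evicting. */
bool vm_ready_by_flag(const struct vm_model *vm)
{
	return !vm->evicting;
}
```

For a VM in the state { .evicting = true, .n_evicted = 0 },
vm_ready_by_list() returns true and vm_ready_by_flag() returns false.
Assuming the update path rejects work on a VM that is marked as evicting,
which would match the -16 (-EBUSY) in the report, the flag-based check skips
the update instead and leaves it for the next command submission.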

Regards,
Christian.

>
> Regards,
> Qiang
>
>> 4. When a CS happens we have no other choice but to make the VM busy
>> again, and do all postponed page table updates.
>>
> Anyway,
>
>> Regards,
>> Christian.
>>
>>> Checking VM idle needs to hold the resv lock. Reading your patch, adding
>>> the evicting flag is to update the VM without the resv lock. But user VM
>>> ops in amdgpu_gem_va_update_vm() do hold the resv lock, so the difference
>>> happens when calling amdgpu_vm_bo_update_mapping() from
>>> svm_range_(un)map_to_gpu(). So is embedding VM idle in the evicting flag
>>> there so that svm_range_(un)map_to_gpu() also does nothing when the VM is idle?
>>
>>
