Message ID | 20230112013157.750568-2-Felix.Kuehling@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Enable KFD to use render node BO mappings | expand |
On 1/11/2023 7:31 PM, Felix Kuehling wrote: > Use proper amdgpu_gem_prime_import function to handle all kinds of > imports. Remember the dmabuf reference to enable proper multi-GPU > attachment to multiple VMs without erroneously re-exporting the > underlying BO multiple times. > > Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> > --- > .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 38 ++++++++++--------- > 1 file changed, 21 insertions(+), 17 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > index 6f236ded5f12..e13c3493b786 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > @@ -2209,30 +2209,27 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, > struct amdgpu_bo *bo; > int ret; > > - if (dma_buf->ops != &amdgpu_dmabuf_ops) > - /* Can't handle non-graphics buffers */ > - return -EINVAL; > - > - obj = dma_buf->priv; > - if (drm_to_adev(obj->dev) != adev) > - /* Can't handle buffers from other devices */ > - return -EINVAL; > + obj = amdgpu_gem_prime_import(adev_to_drm(adev), dma_buf); > + if (IS_ERR(obj)) > + return PTR_ERR(obj); > > bo = gem_to_amdgpu_bo(obj); > if (!(bo->preferred_domains & (AMDGPU_GEM_DOMAIN_VRAM | > - AMDGPU_GEM_DOMAIN_GTT))) > + AMDGPU_GEM_DOMAIN_GTT))) { > /* Only VRAM and GTT BOs are supported */ > - return -EINVAL; > + ret = -EINVAL; > + goto err_put_obj; > + } > > *mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL); > - if (!*mem) > - return -ENOMEM; > + if (!*mem) { > + ret = -ENOMEM; > + goto err_put_obj; > + } > > ret = drm_vma_node_allow(&obj->vma_node, drm_priv); > - if (ret) { > - kfree(*mem); > - return ret; > - } > + if (ret) > + goto err_free_mem; > > if (size) > *size = amdgpu_bo_size(bo); Hi Felix: I have a question when using amdgpu_gem_prime_import. 
It will allow importing a dmabuf to different gpus, then can we still call amdgpu_bo_mmap_offset on the generated bo if amdgpu_amdkfd_gpuvm_import_dmabuf also ask mmap_offset? > @@ -2249,7 +2246,8 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, > | KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE > | KFD_IOC_ALLOC_MEM_FLAGS_EXECUTABLE; > > - drm_gem_object_get(&bo->tbo.base); > + get_dma_buf(dma_buf); > + (*mem)->dmabuf = dma_buf; > (*mem)->bo = bo; > (*mem)->va = va; > (*mem)->domain = (bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM) ? > @@ -2261,6 +2259,12 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, > (*mem)->is_imported = true; > > return 0; > + > +err_free_mem: > + kfree(*mem); > +err_put_obj: > + drm_gem_object_put(obj); > + return ret; > } > > /* Evict a userptr BO by stopping the queues if necessary
On 2023-01-12 17:41, Chen, Xiaogang wrote: > > On 1/11/2023 7:31 PM, Felix Kuehling wrote: >> Use proper amdgpu_gem_prime_import function to handle all kinds of >> imports. Remember the dmabuf reference to enable proper multi-GPU >> attachment to multiple VMs without erroneously re-exporting the >> underlying BO multiple times. >> >> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> >> --- >> .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 38 ++++++++++--------- >> 1 file changed, 21 insertions(+), 17 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >> index 6f236ded5f12..e13c3493b786 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >> @@ -2209,30 +2209,27 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct >> amdgpu_device *adev, >> struct amdgpu_bo *bo; >> int ret; >> - if (dma_buf->ops != &amdgpu_dmabuf_ops) >> - /* Can't handle non-graphics buffers */ >> - return -EINVAL; >> - >> - obj = dma_buf->priv; >> - if (drm_to_adev(obj->dev) != adev) >> - /* Can't handle buffers from other devices */ >> - return -EINVAL; >> + obj = amdgpu_gem_prime_import(adev_to_drm(adev), dma_buf); >> + if (IS_ERR(obj)) >> + return PTR_ERR(obj); >> bo = gem_to_amdgpu_bo(obj); >> if (!(bo->preferred_domains & (AMDGPU_GEM_DOMAIN_VRAM | >> - AMDGPU_GEM_DOMAIN_GTT))) >> + AMDGPU_GEM_DOMAIN_GTT))) { >> /* Only VRAM and GTT BOs are supported */ >> - return -EINVAL; >> + ret = -EINVAL; >> + goto err_put_obj; >> + } >> *mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL); >> - if (!*mem) >> - return -ENOMEM; >> + if (!*mem) { >> + ret = -ENOMEM; >> + goto err_put_obj; >> + } >> ret = drm_vma_node_allow(&obj->vma_node, drm_priv); >> - if (ret) { >> - kfree(*mem); >> - return ret; >> - } >> + if (ret) >> + goto err_free_mem; >> if (size) >> *size = amdgpu_bo_size(bo); > > Hi Felix: > > I have a question when using amdgpu_gem_prime_import. 
It will allow > importing a dmabuf to different gpus, then can we still call > amdgpu_bo_mmap_offset on the generated bo if > amdgpu_amdkfd_gpuvm_import_dmabuf also ask mmap_offset? The mmap offset comes from drm_vma_node_offset_addr. The DRM VMA address is allocated when ttm_bo_init_reserved calls drm_vma_offset_add, so there should be no problem querying the mmap_offset. Whether mmapping of an imported BO is actually supported is a different question. As far as I can see, it should be possible. That said, currently ROCm (libhsakmt) uses only original BOs for CPU mappings, not imported BOs. Regards, Felix > >> @@ -2249,7 +2246,8 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct >> amdgpu_device *adev, >> | KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE >> | KFD_IOC_ALLOC_MEM_FLAGS_EXECUTABLE; >> - drm_gem_object_get(&bo->tbo.base); >> + get_dma_buf(dma_buf); >> + (*mem)->dmabuf = dma_buf; >> (*mem)->bo = bo; >> (*mem)->va = va; >> (*mem)->domain = (bo->preferred_domains & >> AMDGPU_GEM_DOMAIN_VRAM) ? >> @@ -2261,6 +2259,12 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct >> amdgpu_device *adev, >> (*mem)->is_imported = true; >> return 0; >> + >> +err_free_mem: >> + kfree(*mem); >> +err_put_obj: >> + drm_gem_object_put(obj); >> + return ret; >> } >> /* Evict a userptr BO by stopping the queues if necessary
On 1/13/2023 4:26 PM, Felix Kuehling wrote: > On 2023-01-12 17:41, Chen, Xiaogang wrote: >> >> On 1/11/2023 7:31 PM, Felix Kuehling wrote: >>> Use proper amdgpu_gem_prime_import function to handle all kinds of >>> imports. Remember the dmabuf reference to enable proper multi-GPU >>> attachment to multiple VMs without erroneously re-exporting the >>> underlying BO multiple times. >>> >>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> >>> --- >>> .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 38 >>> ++++++++++--------- >>> 1 file changed, 21 insertions(+), 17 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >>> index 6f236ded5f12..e13c3493b786 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >>> @@ -2209,30 +2209,27 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct >>> amdgpu_device *adev, >>> struct amdgpu_bo *bo; >>> int ret; >>> - if (dma_buf->ops != &amdgpu_dmabuf_ops) >>> - /* Can't handle non-graphics buffers */ >>> - return -EINVAL; >>> - >>> - obj = dma_buf->priv; >>> - if (drm_to_adev(obj->dev) != adev) >>> - /* Can't handle buffers from other devices */ >>> - return -EINVAL; >>> + obj = amdgpu_gem_prime_import(adev_to_drm(adev), dma_buf); >>> + if (IS_ERR(obj)) >>> + return PTR_ERR(obj); >>> bo = gem_to_amdgpu_bo(obj); >>> if (!(bo->preferred_domains & (AMDGPU_GEM_DOMAIN_VRAM | >>> - AMDGPU_GEM_DOMAIN_GTT))) >>> + AMDGPU_GEM_DOMAIN_GTT))) { >>> /* Only VRAM and GTT BOs are supported */ >>> - return -EINVAL; >>> + ret = -EINVAL; >>> + goto err_put_obj; >>> + } >>> *mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL); >>> - if (!*mem) >>> - return -ENOMEM; >>> + if (!*mem) { >>> + ret = -ENOMEM; >>> + goto err_put_obj; >>> + } >>> ret = drm_vma_node_allow(&obj->vma_node, drm_priv); >>> - if (ret) { >>> - kfree(*mem); >>> - return ret; >>> - } >>> + if (ret) >>> + goto err_free_mem; >>> if 
(size) >>> *size = amdgpu_bo_size(bo); >> >> Hi Felix: >> >> I have a question when using amdgpu_gem_prime_import. It will allow >> importing a dmabuf to different gpus, then can we still call >> amdgpu_bo_mmap_offset on the generated bo if >> amdgpu_amdkfd_gpuvm_import_dmabuf also ask mmap_offset? > > The mmap offset comes from drm_vma_node_offset_addr. The DRM VMA > address is allocated when ttm_bo_init_reserved calls > drm_vma_offset_add, so there should be no problem querying the > mmap_offset. Whether mmapping of an imported BO is actually supported > is a different question. As far as I can see, it should be possible. > That said, currently ROCm (libhsakmt) uses only original BOs for CPU > mappings, not imported BOs. > > Regards, > Felix The mmap_offset is actually not returned to user space. I just wonder if here we should get mmap_offset of imported vram buffer if allow bo be imported to difference gpus. If a vram buffer is imported to same gpu device amdgpu_bo_mmap_offset is ok, otherwise I think amdgpu_bo_mmap_offset would not give correct mmap_offset for the device that the buffer is imported to. Maybe we should remove mmap_offset parameter of amdgpu_amdkfd_gpuvm_import_dmabuf since we do not return it to user space anyway. With that: Reviewed-by: Xiaogang Chen <Xiaoganng.Chen@amd.com> Regards Xiaogang > > >> >>> @@ -2249,7 +2246,8 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct >>> amdgpu_device *adev, >>> | KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE >>> | KFD_IOC_ALLOC_MEM_FLAGS_EXECUTABLE; >>> - drm_gem_object_get(&bo->tbo.base); >>> + get_dma_buf(dma_buf); >>> + (*mem)->dmabuf = dma_buf; >>> (*mem)->bo = bo; >>> (*mem)->va = va; >>> (*mem)->domain = (bo->preferred_domains & >>> AMDGPU_GEM_DOMAIN_VRAM) ? 
>>> @@ -2261,6 +2259,12 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct >>> amdgpu_device *adev, >>> (*mem)->is_imported = true; >>> return 0; >>> + >>> +err_free_mem: >>> + kfree(*mem); >>> +err_put_obj: >>> + drm_gem_object_put(obj); >>> + return ret; >>> } >>> /* Evict a userptr BO by stopping the queues if necessary
On 2023-01-13 18:00, Chen, Xiaogang wrote: > > On 1/13/2023 4:26 PM, Felix Kuehling wrote: >> On 2023-01-12 17:41, Chen, Xiaogang wrote: >>> >>> On 1/11/2023 7:31 PM, Felix Kuehling wrote: >>>> Use proper amdgpu_gem_prime_import function to handle all kinds of >>>> imports. Remember the dmabuf reference to enable proper multi-GPU >>>> attachment to multiple VMs without erroneously re-exporting the >>>> underlying BO multiple times. >>>> >>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> >>>> --- >>>> .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 38 >>>> ++++++++++--------- >>>> 1 file changed, 21 insertions(+), 17 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >>>> index 6f236ded5f12..e13c3493b786 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >>>> @@ -2209,30 +2209,27 @@ int >>>> amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, >>>> struct amdgpu_bo *bo; >>>> int ret; >>>> - if (dma_buf->ops != &amdgpu_dmabuf_ops) >>>> - /* Can't handle non-graphics buffers */ >>>> - return -EINVAL; >>>> - >>>> - obj = dma_buf->priv; >>>> - if (drm_to_adev(obj->dev) != adev) >>>> - /* Can't handle buffers from other devices */ >>>> - return -EINVAL; >>>> + obj = amdgpu_gem_prime_import(adev_to_drm(adev), dma_buf); >>>> + if (IS_ERR(obj)) >>>> + return PTR_ERR(obj); >>>> bo = gem_to_amdgpu_bo(obj); >>>> if (!(bo->preferred_domains & (AMDGPU_GEM_DOMAIN_VRAM | >>>> - AMDGPU_GEM_DOMAIN_GTT))) >>>> + AMDGPU_GEM_DOMAIN_GTT))) { >>>> /* Only VRAM and GTT BOs are supported */ >>>> - return -EINVAL; >>>> + ret = -EINVAL; >>>> + goto err_put_obj; >>>> + } >>>> *mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL); >>>> - if (!*mem) >>>> - return -ENOMEM; >>>> + if (!*mem) { >>>> + ret = -ENOMEM; >>>> + goto err_put_obj; >>>> + } >>>> ret = drm_vma_node_allow(&obj->vma_node, drm_priv); >>>> - if (ret) 
{ >>>> - kfree(*mem); >>>> - return ret; >>>> - } >>>> + if (ret) >>>> + goto err_free_mem; >>>> if (size) >>>> *size = amdgpu_bo_size(bo); >>> >>> Hi Felix: >>> >>> I have a question when using amdgpu_gem_prime_import. It will allow >>> importing a dmabuf to different gpus, then can we still call >>> amdgpu_bo_mmap_offset on the generated bo if >>> amdgpu_amdkfd_gpuvm_import_dmabuf also ask mmap_offset? >> >> The mmap offset comes from drm_vma_node_offset_addr. The DRM VMA >> address is allocated when ttm_bo_init_reserved calls >> drm_vma_offset_add, so there should be no problem querying the >> mmap_offset. Whether mmapping of an imported BO is actually supported >> is a different question. As far as I can see, it should be possible. >> That said, currently ROCm (libhsakmt) uses only original BOs for CPU >> mappings, not imported BOs. >> >> Regards, >> Felix > > The mmap_offset is actually not returned to user space. I just wonder > if here we should get mmap_offset of imported vram buffer if allow bo > be imported to difference gpus. If a vram buffer is imported to same > gpu device amdgpu_bo_mmap_offset is ok, otherwise I think > amdgpu_bo_mmap_offset would not give correct mmap_offset for the > device that the buffer is imported to. When the BO is imported into the same GPU, you get a reference to the same BO, so the imported BO has the same mmap_offset as the original BO. When the BO is imported into a different GPU, it is a new BO with a new mmap_offset. I don't think this is incorrect. mmapping the memory with that new offset should still work. The imported BO is created with ttm_bo_type_sg, and AFAICT ttm_bo_vm.c supports mapping of SG BOs. Regards, Felix > > Maybe we should remove mmap_offset parameter of > amdgpu_amdkfd_gpuvm_import_dmabuf since we do not return it to user > space anyway. 
With that: > > Reviewed-by: Xiaogang Chen <Xiaoganng.Chen@amd.com> > > Regards > > Xiaogang > > >> >> >>> >>>> @@ -2249,7 +2246,8 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct >>>> amdgpu_device *adev, >>>> | KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE >>>> | KFD_IOC_ALLOC_MEM_FLAGS_EXECUTABLE; >>>> - drm_gem_object_get(&bo->tbo.base); >>>> + get_dma_buf(dma_buf); >>>> + (*mem)->dmabuf = dma_buf; >>>> (*mem)->bo = bo; >>>> (*mem)->va = va; >>>> (*mem)->domain = (bo->preferred_domains & >>>> AMDGPU_GEM_DOMAIN_VRAM) ? >>>> @@ -2261,6 +2259,12 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct >>>> amdgpu_device *adev, >>>> (*mem)->is_imported = true; >>>> return 0; >>>> + >>>> +err_free_mem: >>>> + kfree(*mem); >>>> +err_put_obj: >>>> + drm_gem_object_put(obj); >>>> + return ret; >>>> } >>>> /* Evict a userptr BO by stopping the queues if necessary
Am 14.01.23 um 00:15 schrieb Felix Kuehling: > On 2023-01-13 18:00, Chen, Xiaogang wrote: >> >> On 1/13/2023 4:26 PM, Felix Kuehling wrote: >>> On 2023-01-12 17:41, Chen, Xiaogang wrote: >>>> >>>> On 1/11/2023 7:31 PM, Felix Kuehling wrote: >>>>> Use proper amdgpu_gem_prime_import function to handle all kinds of >>>>> imports. Remember the dmabuf reference to enable proper multi-GPU >>>>> attachment to multiple VMs without erroneously re-exporting the >>>>> underlying BO multiple times. >>>>> >>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> >>>>> --- >>>>> .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 38 >>>>> ++++++++++--------- >>>>> 1 file changed, 21 insertions(+), 17 deletions(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >>>>> index 6f236ded5f12..e13c3493b786 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >>>>> @@ -2209,30 +2209,27 @@ int >>>>> amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, >>>>> struct amdgpu_bo *bo; >>>>> int ret; >>>>> - if (dma_buf->ops != &amdgpu_dmabuf_ops) >>>>> - /* Can't handle non-graphics buffers */ >>>>> - return -EINVAL; >>>>> - >>>>> - obj = dma_buf->priv; >>>>> - if (drm_to_adev(obj->dev) != adev) >>>>> - /* Can't handle buffers from other devices */ >>>>> - return -EINVAL; >>>>> + obj = amdgpu_gem_prime_import(adev_to_drm(adev), dma_buf); >>>>> + if (IS_ERR(obj)) >>>>> + return PTR_ERR(obj); >>>>> bo = gem_to_amdgpu_bo(obj); >>>>> if (!(bo->preferred_domains & (AMDGPU_GEM_DOMAIN_VRAM | >>>>> - AMDGPU_GEM_DOMAIN_GTT))) >>>>> + AMDGPU_GEM_DOMAIN_GTT))) { >>>>> /* Only VRAM and GTT BOs are supported */ >>>>> - return -EINVAL; >>>>> + ret = -EINVAL; >>>>> + goto err_put_obj; >>>>> + } >>>>> *mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL); >>>>> - if (!*mem) >>>>> - return -ENOMEM; >>>>> + if (!*mem) { >>>>> + ret = -ENOMEM; >>>>> + 
goto err_put_obj; >>>>> + } >>>>> ret = drm_vma_node_allow(&obj->vma_node, drm_priv); >>>>> - if (ret) { >>>>> - kfree(*mem); >>>>> - return ret; >>>>> - } >>>>> + if (ret) >>>>> + goto err_free_mem; >>>>> if (size) >>>>> *size = amdgpu_bo_size(bo); >>>> >>>> Hi Felix: >>>> >>>> I have a question when using amdgpu_gem_prime_import. It will allow >>>> importing a dmabuf to different gpus, then can we still call >>>> amdgpu_bo_mmap_offset on the generated bo if >>>> amdgpu_amdkfd_gpuvm_import_dmabuf also ask mmap_offset? >>> >>> The mmap offset comes from drm_vma_node_offset_addr. The DRM VMA >>> address is allocated when ttm_bo_init_reserved calls >>> drm_vma_offset_add, so there should be no problem querying the >>> mmap_offset. Whether mmapping of an imported BO is actually >>> supported is a different question. As far as I can see, it should be >>> possible. That said, currently ROCm (libhsakmt) uses only original >>> BOs for CPU mappings, not imported BOs. >>> >>> Regards, >>> Felix >> >> The mmap_offset is actually not returned to user space. I just wonder >> if here we should get mmap_offset of imported vram buffer if allow bo >> be imported to difference gpus. If a vram buffer is imported to same >> gpu device amdgpu_bo_mmap_offset is ok, otherwise I think >> amdgpu_bo_mmap_offset would not give correct mmap_offset for the >> device that the buffer is imported to. > > When the BO is imported into the same GPU, you get a reference to the > same BO, so the imported BO has the same mmap_offset as the original BO. > > When the BO is imported into a different GPU, it is a new BO with a > new mmap_offset. That won't work. > I don't think this is incorrect. No, this is completely incorrect. It mixes up the reverse tracking of mappings and might crash the system. This is the reason why we can't mmap() imported BOs. > mmapping the memory with that new offset should still work. 
The > imported BO is created with ttm_bo_type_sg, and AFAICT ttm_bo_vm.c > supports mapping of SG BOs. Actually it shouldn't. This can go boom really easily. When you have imported a BO the only correct way to mmap() it is to do so on the original exporter. Regards, Christian. > > Regards, > Felix > > >> >> Maybe we should remove mmap_offset parameter of >> amdgpu_amdkfd_gpuvm_import_dmabuf since we do not return it to user >> space anyway. With that: >> >> Reviewed-by: Xiaogang Chen <Xiaoganng.Chen@amd.com> >> >> Regards >> >> Xiaogang >> >> >>> >>> >>>> >>>>> @@ -2249,7 +2246,8 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct >>>>> amdgpu_device *adev, >>>>> | KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE >>>>> | KFD_IOC_ALLOC_MEM_FLAGS_EXECUTABLE; >>>>> - drm_gem_object_get(&bo->tbo.base); >>>>> + get_dma_buf(dma_buf); >>>>> + (*mem)->dmabuf = dma_buf; >>>>> (*mem)->bo = bo; >>>>> (*mem)->va = va; >>>>> (*mem)->domain = (bo->preferred_domains & >>>>> AMDGPU_GEM_DOMAIN_VRAM) ? >>>>> @@ -2261,6 +2259,12 @@ int >>>>> amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, >>>>> (*mem)->is_imported = true; >>>>> return 0; >>>>> + >>>>> +err_free_mem: >>>>> + kfree(*mem); >>>>> +err_put_obj: >>>>> + drm_gem_object_put(obj); >>>>> + return ret; >>>>> } >>>>> /* Evict a userptr BO by stopping the queues if necessary
Am 2023-01-15 um 11:43 schrieb Christian König: > > > Am 14.01.23 um 00:15 schrieb Felix Kuehling: >> On 2023-01-13 18:00, Chen, Xiaogang wrote: >>> >>> On 1/13/2023 4:26 PM, Felix Kuehling wrote: >>>> On 2023-01-12 17:41, Chen, Xiaogang wrote: >>>>> >>>>> On 1/11/2023 7:31 PM, Felix Kuehling wrote: >>>>>> Use proper amdgpu_gem_prime_import function to handle all kinds of >>>>>> imports. Remember the dmabuf reference to enable proper multi-GPU >>>>>> attachment to multiple VMs without erroneously re-exporting the >>>>>> underlying BO multiple times. >>>>>> >>>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> >>>>>> --- >>>>>> .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 38 >>>>>> ++++++++++--------- >>>>>> 1 file changed, 21 insertions(+), 17 deletions(-) >>>>>> >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >>>>>> index 6f236ded5f12..e13c3493b786 100644 >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c >>>>>> @@ -2209,30 +2209,27 @@ int >>>>>> amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, >>>>>> struct amdgpu_bo *bo; >>>>>> int ret; >>>>>> - if (dma_buf->ops != &amdgpu_dmabuf_ops) >>>>>> - /* Can't handle non-graphics buffers */ >>>>>> - return -EINVAL; >>>>>> - >>>>>> - obj = dma_buf->priv; >>>>>> - if (drm_to_adev(obj->dev) != adev) >>>>>> - /* Can't handle buffers from other devices */ >>>>>> - return -EINVAL; >>>>>> + obj = amdgpu_gem_prime_import(adev_to_drm(adev), dma_buf); >>>>>> + if (IS_ERR(obj)) >>>>>> + return PTR_ERR(obj); >>>>>> bo = gem_to_amdgpu_bo(obj); >>>>>> if (!(bo->preferred_domains & (AMDGPU_GEM_DOMAIN_VRAM | >>>>>> - AMDGPU_GEM_DOMAIN_GTT))) >>>>>> + AMDGPU_GEM_DOMAIN_GTT))) { >>>>>> /* Only VRAM and GTT BOs are supported */ >>>>>> - return -EINVAL; >>>>>> + ret = -EINVAL; >>>>>> + goto err_put_obj; >>>>>> + } >>>>>> *mem = kzalloc(sizeof(struct kgd_mem), 
GFP_KERNEL); >>>>>> - if (!*mem) >>>>>> - return -ENOMEM; >>>>>> + if (!*mem) { >>>>>> + ret = -ENOMEM; >>>>>> + goto err_put_obj; >>>>>> + } >>>>>> ret = drm_vma_node_allow(&obj->vma_node, drm_priv); >>>>>> - if (ret) { >>>>>> - kfree(*mem); >>>>>> - return ret; >>>>>> - } >>>>>> + if (ret) >>>>>> + goto err_free_mem; >>>>>> if (size) >>>>>> *size = amdgpu_bo_size(bo); >>>>> >>>>> Hi Felix: >>>>> >>>>> I have a question when using amdgpu_gem_prime_import. It will >>>>> allow importing a dmabuf to different gpus, then can we still call >>>>> amdgpu_bo_mmap_offset on the generated bo if >>>>> amdgpu_amdkfd_gpuvm_import_dmabuf also ask mmap_offset? >>>> >>>> The mmap offset comes from drm_vma_node_offset_addr. The DRM VMA >>>> address is allocated when ttm_bo_init_reserved calls >>>> drm_vma_offset_add, so there should be no problem querying the >>>> mmap_offset. Whether mmapping of an imported BO is actually >>>> supported is a different question. As far as I can see, it should >>>> be possible. That said, currently ROCm (libhsakmt) uses only >>>> original BOs for CPU mappings, not imported BOs. >>>> >>>> Regards, >>>> Felix >>> >>> The mmap_offset is actually not returned to user space. I just >>> wonder if here we should get mmap_offset of imported vram buffer if >>> allow bo be imported to difference gpus. If a vram buffer is >>> imported to same gpu device amdgpu_bo_mmap_offset is ok, otherwise I >>> think amdgpu_bo_mmap_offset would not give correct mmap_offset for >>> the device that the buffer is imported to. >> >> When the BO is imported into the same GPU, you get a reference to the >> same BO, so the imported BO has the same mmap_offset as the original BO. >> >> When the BO is imported into a different GPU, it is a new BO with a >> new mmap_offset. > > That won't work. > >> I don't think this is incorrect. > > No, this is completely incorrect. It mixes up the reverse tracking of > mappings and might crash the system. I don't understand that. 
The imported BO is a different BO with a different mmap offset in a different device file. I don't see how that messes with the tracking of mappings. > This is the reason why we can't mmap() imported BOs. I don't see anything preventing that. For userptr BOs, there is this code in amdgpu_gem_object_mmap: if (amdgpu_ttm_tt_get_usermm(bo->tbo.ttm)) return -EPERM; I don't see anything like this preventing mmapping of imported dmabuf BOs. What am I missing? > >> mmapping the memory with that new offset should still work. The >> imported BO is created with ttm_bo_type_sg, and AFAICT ttm_bo_vm.c >> supports mapping of SG BOs. > > Actually it shouldn't. This can go boom really easily. OK. I don't think we're doing this, but after Xiaogang raised the question I went looking through the code whether it's theoretically possible. I didn't find anything in the code that says that mmapping imported dmabufs would be prohibited or even dangerous. On the contrary, I found that ttm_bo_vm explicitly supports mmapping SG BOs. > > When you have imported a BO the only correct way of to mmap() it is to > do so on the original exporter. That seems sensible, and this is what we do today. That said, if mmapping an imported BO is dangerous, I'm missing a mechanism to protect against this. It could be as simple as setting AMDGPU_GEM_CREATE_NO_CPU_ACCESS in amdgpu_dma_buf_create_obj. Regards, Felix > > Regards, > Christian. > >> >> Regards, >> Felix >> >> >>> >>> Maybe we should remove mmap_offset parameter of >>> amdgpu_amdkfd_gpuvm_import_dmabuf since we do not return it to user >>> space anyway. 
With that: >>> >>> Reviewed-by: Xiaogang Chen <Xiaoganng.Chen@amd.com> >>> >>> Regards >>> >>> Xiaogang >>> >>> >>>> >>>> >>>>> >>>>>> @@ -2249,7 +2246,8 @@ int >>>>>> amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, >>>>>> | KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE >>>>>> | KFD_IOC_ALLOC_MEM_FLAGS_EXECUTABLE; >>>>>> - drm_gem_object_get(&bo->tbo.base); >>>>>> + get_dma_buf(dma_buf); >>>>>> + (*mem)->dmabuf = dma_buf; >>>>>> (*mem)->bo = bo; >>>>>> (*mem)->va = va; >>>>>> (*mem)->domain = (bo->preferred_domains & >>>>>> AMDGPU_GEM_DOMAIN_VRAM) ? >>>>>> @@ -2261,6 +2259,12 @@ int >>>>>> amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, >>>>>> (*mem)->is_imported = true; >>>>>> return 0; >>>>>> + >>>>>> +err_free_mem: >>>>>> + kfree(*mem); >>>>>> +err_put_obj: >>>>>> + drm_gem_object_put(obj); >>>>>> + return ret; >>>>>> } >>>>>> /* Evict a userptr BO by stopping the queues if necessary >
[SNIP] >>> When the BO is imported into the same GPU, you get a reference to >>> the same BO, so the imported BO has the same mmap_offset as the >>> original BO. >>> >>> When the BO is imported into a different GPU, it is a new BO with a >>> new mmap_offset. >> >> That won't work. >> >>> I don't think this is incorrect. >> >> No, this is completely incorrect. It mixes up the reverse tracking of >> mappings and might crash the system. > > I don't understand that. The imported BO is a different BO with a > different mmap offset in a different device file. I don't see how that > messes with the tracking of mappings. The tracking keeps note of which piece of information is accessible through which address space object and offset. If you suddenly have two address spaces and offsets pointing to the same piece of information, that won't work any more. > >> This is the reason why we can't mmap() imported BOs. > > I don't see anything preventing that. For userptr BOs, there is this > code in amdgpu_gem_object_mmap: > > if (amdgpu_ttm_tt_get_usermm(bo->tbo.ttm)) > return -EPERM; > > I don't see anything like this preventing mmapping of imported dmabuf > BOs. What am I missing? > At some point I really need to make a big presentation about all this stuff, we had the same discussion multiple times now :) It's the same reason why you can't mmap() VRAM through the kfd node: Each file can have only one address space object associated with it. See dma_buf_mmap() and vma_set_file() for how this is worked around in DMA-buf. >> >>> mmapping the memory with that new offset should still work. The >>> imported BO is created with ttm_bo_type_sg, and AFAICT ttm_bo_vm.c >>> supports mapping of SG BOs. >> >> Actually it shouldn't. This can go boom really easily. > > OK. I don't think we're doing this, but after Xiaogang raised the > question I went looking through the code whether it's theoretically > possible. 
I didn't find anything in the code that says that mmapping > imported dmabufs would be prohibited or even dangerous. On the > contrary, I found that ttm_bo_vm explicitly supports mmapping SG BOs. > > >> >> When you have imported a BO the only correct way of to mmap() it is >> to do so on the original exporter. > > That seems sensible, and this is what we do today. That said, if > mmapping an imported BO is dangerous, I'm missing a mechanism to > protect against this. It could be as simple as setting > AMDGPU_GEM_CREATE_NO_CPU_ACCESS in amdgpu_dma_buf_create_obj. At least for the GEM mmap() handler this is double checked very early by looking at obj->import_attach and then either rejecting it or redirecting the request to the DMA-buf file instead. We probably need something similar when stuff is mapped through the KFD node. But I think we don't do that any more for "normal" BOs anyway, don't we? Regards, Christian. > > Regards, > Felix
Am 2023-01-16 um 06:42 schrieb Christian König: > [SNIP] >>>> When the BO is imported into the same GPU, you get a reference to >>>> the same BO, so the imported BO has the same mmap_offset as the >>>> original BO. >>>> >>>> When the BO is imported into a different GPU, it is a new BO with a >>>> new mmap_offset. >>> >>> That won't work. >>> >>>> I don't think this is incorrect. >>> >>> No, this is completely incorrect. It mixes up the reverse tracking >>> of mappings and might crash the system. >> >> I don't understand that. The imported BO is a different BO with a >> different mmap offset in a different device file. I don't see how >> that messes with the tracking of mappings. > > The tracking keeps note which piece of information is accessible > through which address space object and offset. I you suddenly have two > address spaces and offsets pointing to the same piece of information > that won't work any more. How do you identify a "piece of information". I don't think it's the physical page. VRAM doesn't even have struct pages. I think it's the BO that's being tracked. With a dmabuf import you have a second BO aliasing the same physical pages as the original BO. Then those two BOs are seen as two distinct "pieces of information" that can each have their own mapping. > >> >>> This is the reason why we can't mmap() imported BOs. >> >> I don't see anything preventing that. For userptr BOs, there is this >> code in amdgpu_gem_object_mmap: >> >> if (amdgpu_ttm_tt_get_usermm(bo->tbo.ttm)) >> return -EPERM; >> >> I don't see anything like this preventing mmapping of imported dmabuf >> BOs. What am I missing? >> > > At some point I really need to make a big presentation about all this > stuff, we had the same discussion multiple times now :) > > It's the same reason why you can't mmap() VRAM through the kfd node: > Each file can have only one address space object associated with it. I remember that. We haven't used KFD to mmap BOs for a long time for that reason. 
> > See dma_buf_mmap() and vma_set_file() how this is worked around in > DMA-buf. These are for mmapping memory through the dmabuf fd. I'm not sure that's a good example. drm_gem_prime_mmap creates a temporary struct file and struct drm_file that are destroyed immediately after calling obj->dev->driver->fops->mmap. I think that would break any reverse mapping. > >>> >>>> mmapping the memory with that new offset should still work. The >>>> imported BO is created with ttm_bo_type_sg, and AFAICT ttm_bo_vm.c >>>> supports mapping of SG BOs. >>> >>> Actually it shouldn't. This can go boom really easily. >> >> OK. I don't think we're doing this, but after Xiaogang raised the >> question I went looking through the code whether it's theoretically >> possible. I didn't find anything in the code that says that mmapping >> imported dmabufs would be prohibited or even dangerous. On the >> contrary, I found that ttm_bo_vm explicitly supports mmapping SG BOs. >> >> >>> >>> When you have imported a BO the only correct way of to mmap() it is >>> to do so on the original exporter. >> >> That seems sensible, and this is what we do today. That said, if >> mmapping an imported BO is dangerous, I'm missing a mechanism to >> protect against this. It could be as simple as setting >> AMDGPU_GEM_CREATE_NO_CPU_ACCESS in amdgpu_dma_buf_create_obj. > > At least for the GEM mmap() handler this is double checked very early > by looking at obj->import_attach and then either rejecting it or > redirecting the request to the DMA-buf file instead. Can you point me at where this check is? I see a check for obj->import_attach in drm_gem_dumb_map_offset. But I can't see how this function is called in amdgpu. I don't think it is used at all. > > We probably need something similar when stuff is mapped through the > KFD node. But I think we don't do that any more for "normal" BOs > anyway, don't we? Correct, we don't map BOs through the KFD device file. 
The only mappings we still use it for are: * Doorbells on APUs * Events page on APUs * MMIO page for HDP flushing The code for mmapping regular BOs through /dev/kfd was never upstream. Regards, Felix > > Regards, > Christian. > >> >> Regards, >> Felix >
Am 16.01.23 um 15:52 schrieb Felix Kuehling: > Am 2023-01-16 um 06:42 schrieb Christian König: >> [SNIP] >>>>> When the BO is imported into the same GPU, you get a reference to >>>>> the same BO, so the imported BO has the same mmap_offset as the >>>>> original BO. >>>>> >>>>> When the BO is imported into a different GPU, it is a new BO with >>>>> a new mmap_offset. >>>> >>>> That won't work. >>>> >>>>> I don't think this is incorrect. >>>> >>>> No, this is completely incorrect. It mixes up the reverse tracking >>>> of mappings and might crash the system. >>> >>> I don't understand that. The imported BO is a different BO with a >>> different mmap offset in a different device file. I don't see how >>> that messes with the tracking of mappings. >> >> The tracking keeps note which piece of information is accessible >> through which address space object and offset. I you suddenly have >> two address spaces and offsets pointing to the same piece of >> information that won't work any more. > > How do you identify a "piece of information". I don't think it's the > physical page. VRAM doesn't even have struct pages. I think it's the > BO that's being tracked. Both struct page as well as pfn. Essentially everything you can give vm_insert_pfn() or vm_insert_page() as parameter. > With a dmabuf import you have a second BO aliasing the same physical > pages as the original BO. No, exactly that's wrong. You have a BO which contains a bunch of dma addresses. The pages or whatever lies behind those as backing store are intentionally hidden from the importer. Daniel even started mangling the page pointer to prevent people from using them: https://elixir.bootlin.com/linux/latest/source/drivers/dma-buf/dma-buf.c#L769 > Then those two BOs are seen as two distinct "pieces of information" > that can each have their own mapping. Well I need to correct myself: Creating a "fake" offset for the BO is ok, but we should just never ever mmap() through two different address spaces. 
> >>> >>>> This is the reason why we can't mmap() imported BOs. >>> >>> I don't see anything preventing that. For userptr BOs, there is this >>> code in amdgpu_gem_object_mmap: >>> >>> if (amdgpu_ttm_tt_get_usermm(bo->tbo.ttm)) >>> return -EPERM; >>> >>> I don't see anything like this preventing mmapping of imported >>> dmabuf BOs. What am I missing? >>> >> >> At some point I really need to make a big presentation about all this >> stuff, we had the same discussion multiple times now :) >> >> It's the same reason why you can't mmap() VRAM through the kfd node: >> Each file can have only one address space object associated with it. > > I remember that. We haven't used KFD to mmap BOs for a long time for > that reason. > > >> >> See dma_buf_mmap() and vma_set_file() how this is worked around in >> DMA-buf. > > These are for mmapping memory through the dmabuf fd. I'm not sure > that's a good example. drm_gem_prime_mmap creates a temporary struct > file and struct drm_file that are destroyed immediately after calling > obj->dev->driver->fops->mmap. I think that would break any reverse > mapping. I was hoping we have removed that code by now. Today obj->funcs->mmap should be used and not the hack with temporary creating an file/fpriv any more. >>>> >>>>> mmapping the memory with that new offset should still work. The >>>>> imported BO is created with ttm_bo_type_sg, and AFAICT ttm_bo_vm.c >>>>> supports mapping of SG BOs. >>>> >>>> Actually it shouldn't. This can go boom really easily. >>> >>> OK. I don't think we're doing this, but after Xiaogang raised the >>> question I went looking through the code whether it's theoretically >>> possible. I didn't find anything in the code that says that mmapping >>> imported dmabufs would be prohibited or even dangerous. On the >>> contrary, I found that ttm_bo_vm explicitly supports mmapping SG BOs. >>> >>> >>>> >>>> When you have imported a BO the only correct way of to mmap() it is >>>> to do so on the original exporter. 
>>> >>> That seems sensible, and this is what we do today. That said, if >>> mmapping an imported BO is dangerous, I'm missing a mechanism to >>> protect against this. It could be as simple as setting >>> AMDGPU_GEM_CREATE_NO_CPU_ACCESS in amdgpu_dma_buf_create_obj. >> >> At least for the GEM mmap() handler this is double checked very early >> by looking at obj->import_attach and then either rejecting it or >> redirecting the request to the DMA-buf file instead. > > Can you point me at where this check is? I see a check for > obj->import_attach in drm_gem_dumb_map_offset. But I can't see how > this function is called in amdgpu. I don't think it is used at all. Uff, good question! @Thomas and @Dmitry: I clearly remember that one of you guys was involved in the DRM/GEM mmap cleanup and DMA-buf with workarounds for the KFD and DMA-buf. What was the final solution to this? I can't find it of hand any more. >> >> We probably need something similar when stuff is mapped through the >> KFD node. But I think we don't do that any more for "normal" BOs >> anyway, don't we? > > Correct, we don't map BOs through the KFD device file. The only > mappings we still use it for are: > > * Doorbells on APUs > * Events page on APUs > * MMIO page for HDP flushing > > The code for mmapping regular BOs through /dev/kfd was never upstream. Good, that's probably much less problematic. Regards, Christian. > > Regards, > Felix > > >> >> Regards, >> Christian. >> >>> >>> Regards, >>> Felix >>
[AMD Official Use Only - General] A minor comment, unrelated to the patch. The comments are inline. Regards, Ramesh -----Original Message----- From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Felix Kuehling Sent: Thursday, January 12, 2023 7:02 AM To: amd-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org Cc: Chen, Xiaogang <Xiaogang.Chen@amd.com>; Koenig, Christian <Christian.Koenig@amd.com> Subject: [PATCH 1/6] drm/amdgpu: Generalize KFD dmabuf import Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. Use proper amdgpu_gem_prime_import function to handle all kinds of imports. Remember the dmabuf reference to enable proper multi-GPU attachment to multiple VMs without erroneously re-exporting the underlying BO multiple times. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> --- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 38 ++++++++++--------- 1 file changed, 21 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c index 6f236ded5f12..e13c3493b786 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c @@ -2209,30 +2209,27 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, struct amdgpu_bo *bo; int ret; - if (dma_buf->ops != &amdgpu_dmabuf_ops) - /* Can't handle non-graphics buffers */ - return -EINVAL; - - obj = dma_buf->priv; - if (drm_to_adev(obj->dev) != adev) - /* Can't handle buffers from other devices */ - return -EINVAL; + obj = amdgpu_gem_prime_import(adev_to_drm(adev), dma_buf); + if (IS_ERR(obj)) + return PTR_ERR(obj); bo = gem_to_amdgpu_bo(obj); if (!(bo->preferred_domains & (AMDGPU_GEM_DOMAIN_VRAM | - AMDGPU_GEM_DOMAIN_GTT))) + AMDGPU_GEM_DOMAIN_GTT))) { /* Only VRAM and GTT BOs are supported */ - return -EINVAL; + ret = -EINVAL; + goto err_put_obj; + } *mem = 
kzalloc(sizeof(struct kgd_mem), GFP_KERNEL); - if (!*mem) - return -ENOMEM; + if (!*mem) { + ret = -ENOMEM; + goto err_put_obj; + } ret = drm_vma_node_allow(&obj->vma_node, drm_priv); - if (ret) { - kfree(*mem); - return ret; - } + if (ret) + goto err_free_mem; if (size) *size = amdgpu_bo_size(bo); @@ -2249,7 +2246,8 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, | KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE | KFD_IOC_ALLOC_MEM_FLAGS_EXECUTABLE; Ramesh: Is it correct to assign WRITE & EXECUTABLE permissions to alloc_flags variable. - drm_gem_object_get(&bo->tbo.base); + get_dma_buf(dma_buf); + (*mem)->dmabuf = dma_buf; (*mem)->bo = bo; (*mem)->va = va; (*mem)->domain = (bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM) ? @@ -2261,6 +2259,12 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, (*mem)->is_imported = true; return 0; + +err_free_mem: + kfree(*mem); +err_put_obj: + drm_gem_object_put(obj); + return ret; } /* Evict a userptr BO by stopping the queues if necessary -- 2.34.1
On 2023-01-16 17:04, Errabolu, Ramesh wrote: > [AMD Official Use Only - General] > > A minor comment, unrelated to the patch. The comments are inline. > > Regards, > Ramesh > > -----Original Message----- > From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Felix Kuehling > Sent: Thursday, January 12, 2023 7:02 AM > To: amd-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org > Cc: Chen, Xiaogang <Xiaogang.Chen@amd.com>; Koenig, Christian <Christian.Koenig@amd.com> > Subject: [PATCH 1/6] drm/amdgpu: Generalize KFD dmabuf import > > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. > > > Use proper amdgpu_gem_prime_import function to handle all kinds of imports. Remember the dmabuf reference to enable proper multi-GPU attachment to multiple VMs without erroneously re-exporting the underlying BO multiple times. > > Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> > --- > .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 38 ++++++++++--------- > 1 file changed, 21 insertions(+), 17 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > index 6f236ded5f12..e13c3493b786 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > @@ -2209,30 +2209,27 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, > struct amdgpu_bo *bo; > int ret; > > - if (dma_buf->ops != &amdgpu_dmabuf_ops) > - /* Can't handle non-graphics buffers */ > - return -EINVAL; > - > - obj = dma_buf->priv; > - if (drm_to_adev(obj->dev) != adev) > - /* Can't handle buffers from other devices */ > - return -EINVAL; > + obj = amdgpu_gem_prime_import(adev_to_drm(adev), dma_buf); > + if (IS_ERR(obj)) > + return PTR_ERR(obj); > > bo = gem_to_amdgpu_bo(obj); > if (!(bo->preferred_domains & (AMDGPU_GEM_DOMAIN_VRAM | > - AMDGPU_GEM_DOMAIN_GTT))) > + 
AMDGPU_GEM_DOMAIN_GTT))) { > /* Only VRAM and GTT BOs are supported */ > - return -EINVAL; > + ret = -EINVAL; > + goto err_put_obj; > + } > > *mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL); > - if (!*mem) > - return -ENOMEM; > + if (!*mem) { > + ret = -ENOMEM; > + goto err_put_obj; > + } > > ret = drm_vma_node_allow(&obj->vma_node, drm_priv); > - if (ret) { > - kfree(*mem); > - return ret; > - } > + if (ret) > + goto err_free_mem; > > if (size) > *size = amdgpu_bo_size(bo); @@ -2249,7 +2246,8 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, > | KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE > | KFD_IOC_ALLOC_MEM_FLAGS_EXECUTABLE; > Ramesh: Is it correct to assign WRITE & EXECUTABLE permissions to alloc_flags variable. These flags affect GPUVM mappings of the BO. If we didn't set these flags, the imported BO would be read-only. The existing graphics-interop API (struct kfd_ioctl_import_dmabuf_args) doesn't have a way for user mode to provide these flags. Changing this would break the API. Regards, Felix > > - drm_gem_object_get(&bo->tbo.base); > + get_dma_buf(dma_buf); > + (*mem)->dmabuf = dma_buf; > (*mem)->bo = bo; > (*mem)->va = va; > (*mem)->domain = (bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM) ? > @@ -2261,6 +2259,12 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, > (*mem)->is_imported = true; > > return 0; > + > +err_free_mem: > + kfree(*mem); > +err_put_obj: > + drm_gem_object_put(obj); > + return ret; > } > > /* Evict a userptr BO by stopping the queues if necessary > -- > 2.34.1
16.01.2023 18:11, Christian König пишет: > >>>>> >>>>>> mmapping the memory with that new offset should still work. The >>>>>> imported BO is created with ttm_bo_type_sg, and AFAICT ttm_bo_vm.c >>>>>> supports mapping of SG BOs. >>>>> >>>>> Actually it shouldn't. This can go boom really easily. >>>> >>>> OK. I don't think we're doing this, but after Xiaogang raised the >>>> question I went looking through the code whether it's theoretically >>>> possible. I didn't find anything in the code that says that mmapping >>>> imported dmabufs would be prohibited or even dangerous. On the >>>> contrary, I found that ttm_bo_vm explicitly supports mmapping SG BOs. >>>> >>>> >>>>> >>>>> When you have imported a BO the only correct way of to mmap() it is >>>>> to do so on the original exporter. >>>> >>>> That seems sensible, and this is what we do today. That said, if >>>> mmapping an imported BO is dangerous, I'm missing a mechanism to >>>> protect against this. It could be as simple as setting >>>> AMDGPU_GEM_CREATE_NO_CPU_ACCESS in amdgpu_dma_buf_create_obj. >>> >>> At least for the GEM mmap() handler this is double checked very early >>> by looking at obj->import_attach and then either rejecting it or >>> redirecting the request to the DMA-buf file instead. >> >> Can you point me at where this check is? I see a check for >> obj->import_attach in drm_gem_dumb_map_offset. But I can't see how >> this function is called in amdgpu. I don't think it is used at all. > > Uff, good question! @Thomas and @Dmitry: I clearly remember that one of > you guys was involved in the DRM/GEM mmap cleanup and DMA-buf with > workarounds for the KFD and DMA-buf. > > What was the final solution to this? I can't find it of hand any more. I was looking at it. The AMDGPU indeed allows to map imported GEMs, but then touching the mapped area by CPU results in a bus fault. 
You, Christian, suggested that this is an AMDGPU bug that should be fixed by prohibiting the mapping in the first place, and I was going to fix it, but then the plan changed from prohibiting the mapping into fixing it. The first proposal was to make the DRM core handle the dma-buf mappings for all drivers universally [1]. Then we decided that it would be better to prohibit mapping of imported GEMs [2]. In the end, Rob Clark argued that it is better to implement [1], otherwise current userspace (Android) will be broken if mapping is prohibited. The last question was about the cache syncing of imported dma-bufs, how to ensure that drivers will do the cache maintenance/syncing properly. Rob suggested that it should be a problem for drivers and not for the DRM core. I was going to re-send [1], but other things were getting priority. It's good that you reminded me about it :) I may re-send it sometime soon if there are no new objections. [1] https://patchwork.freedesktop.org/patch/487481/ [2] https://lore.kernel.org/all/20220701090240.1896131-1-dmitry.osipenko@collabora.com/
On Tue, Jan 17, 2023 at 04:06:05AM +0300, Dmitry Osipenko wrote: > 16.01.2023 18:11, Christian König пишет: > > > >>>>> > >>>>>> mmapping the memory with that new offset should still work. The > >>>>>> imported BO is created with ttm_bo_type_sg, and AFAICT ttm_bo_vm.c > >>>>>> supports mapping of SG BOs. > >>>>> > >>>>> Actually it shouldn't. This can go boom really easily. > >>>> > >>>> OK. I don't think we're doing this, but after Xiaogang raised the > >>>> question I went looking through the code whether it's theoretically > >>>> possible. I didn't find anything in the code that says that mmapping > >>>> imported dmabufs would be prohibited or even dangerous. On the > >>>> contrary, I found that ttm_bo_vm explicitly supports mmapping SG BOs. > >>>> > >>>> > >>>>> > >>>>> When you have imported a BO the only correct way of to mmap() it is > >>>>> to do so on the original exporter. > >>>> > >>>> That seems sensible, and this is what we do today. That said, if > >>>> mmapping an imported BO is dangerous, I'm missing a mechanism to > >>>> protect against this. It could be as simple as setting > >>>> AMDGPU_GEM_CREATE_NO_CPU_ACCESS in amdgpu_dma_buf_create_obj. > >>> > >>> At least for the GEM mmap() handler this is double checked very early > >>> by looking at obj->import_attach and then either rejecting it or > >>> redirecting the request to the DMA-buf file instead. > >> > >> Can you point me at where this check is? I see a check for > >> obj->import_attach in drm_gem_dumb_map_offset. But I can't see how > >> this function is called in amdgpu. I don't think it is used at all. > > > > Uff, good question! @Thomas and @Dmitry: I clearly remember that one of > > you guys was involved in the DRM/GEM mmap cleanup and DMA-buf with > > workarounds for the KFD and DMA-buf. > > > > What was the final solution to this? I can't find it of hand any more. > > I was looking at it. 
The AMDGPU indeed allows to map imported GEMs, but > then touching the mapped area by CPU results in a bus fault. You, > Christian, suggested that this an AMDGPU bug that should be fixed by > prohibiting the mapping in the first place and I was going to fix it, > but then the plan changed from prohibiting the mapping into fixing it. > > The first proposal was to make DRM core to handle the dma-buf mappings > for all drivers universally [1]. Then we decided that will be better to > prohibit mapping of imported GEMs [2]. In the end, Rob Clark argued that > better to implement the [1], otherwise current userspace (Android) will > be broken if mapping will be prohibited. > > The last question was about the cache syncing of imported dma-bufs, how > to ensure that drivers will do the cache maintenance/syncing properly. > Rob suggested that it should be a problem for drivers and not for DRM core. > > I was going to re-send the [1], but other things were getting priority. > It's good that you reminded me about it :) I may re-send it sometime > soon if there are no new objections. > > [1] https://patchwork.freedesktop.org/patch/487481/ > > [2] > https://lore.kernel.org/all/20220701090240.1896131-1-dmitry.osipenko@collabora.com/ Hm I still don't like allowing this in general, because in general it just doesn't work. I think more like a per-driver opt-in or something might be needed, so that drivers which "know" that it's ok to just mmap without coherency can allow that. Allowing this in general essentially gives up on the entire idea of dma-buf cache flushing completely. -Daniel
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c index 6f236ded5f12..e13c3493b786 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c @@ -2209,30 +2209,27 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, struct amdgpu_bo *bo; int ret; - if (dma_buf->ops != &amdgpu_dmabuf_ops) - /* Can't handle non-graphics buffers */ - return -EINVAL; - - obj = dma_buf->priv; - if (drm_to_adev(obj->dev) != adev) - /* Can't handle buffers from other devices */ - return -EINVAL; + obj = amdgpu_gem_prime_import(adev_to_drm(adev), dma_buf); + if (IS_ERR(obj)) + return PTR_ERR(obj); bo = gem_to_amdgpu_bo(obj); if (!(bo->preferred_domains & (AMDGPU_GEM_DOMAIN_VRAM | - AMDGPU_GEM_DOMAIN_GTT))) + AMDGPU_GEM_DOMAIN_GTT))) { /* Only VRAM and GTT BOs are supported */ - return -EINVAL; + ret = -EINVAL; + goto err_put_obj; + } *mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL); - if (!*mem) - return -ENOMEM; + if (!*mem) { + ret = -ENOMEM; + goto err_put_obj; + } ret = drm_vma_node_allow(&obj->vma_node, drm_priv); - if (ret) { - kfree(*mem); - return ret; - } + if (ret) + goto err_free_mem; if (size) *size = amdgpu_bo_size(bo); @@ -2249,7 +2246,8 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, | KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE | KFD_IOC_ALLOC_MEM_FLAGS_EXECUTABLE; - drm_gem_object_get(&bo->tbo.base); + get_dma_buf(dma_buf); + (*mem)->dmabuf = dma_buf; (*mem)->bo = bo; (*mem)->va = va; (*mem)->domain = (bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM) ? @@ -2261,6 +2259,12 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, (*mem)->is_imported = true; return 0; + +err_free_mem: + kfree(*mem); +err_put_obj: + drm_gem_object_put(obj); + return ret; } /* Evict a userptr BO by stopping the queues if necessary
Use proper amdgpu_gem_prime_import function to handle all kinds of imports. Remember the dmabuf reference to enable proper multi-GPU attachment to multiple VMs without erroneously re-exporting the underlying BO multiple times. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> --- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 38 ++++++++++--------- 1 file changed, 21 insertions(+), 17 deletions(-)