diff mbox series

[RFC] driver: dma-buf: use vmf_insert_page for cma_heap_vm_fault

Message ID 20250115061805.3495048-1-zhaoyang.huang@unisoc.com (mailing list archive)
State New
Headers show
Series [RFC] driver: dma-buf: use vmf_insert_page for cma_heap_vm_fault | expand

Commit Message

zhaoyang.huang Jan. 15, 2025, 6:18 a.m. UTC
From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>

When using dma-buf as memory pool for VMM. The vmf_insert_pfn will
apply PTE_SPECIAL on pte which have vm_normal_page report bad_pte and
return NULL. This commit would like to suggest to replace
vmf_insert_pfn by vmf_insert_page.

[  103.402787] kvm [5276]: gfn(ipa)=0x80000 hva=0x7d4a400000 write_fault=0
[  103.403822] BUG: Bad page map in process crosvm_vcpu0  pte:168000140000f43 pmd:8000000c1ca0003
[  103.405144] addr:0000007d4a400000 vm_flags:040400fb anon_vma:0000000000000000 mapping:ffffff8085163df0 index:0
[  103.406536] file:dmabuf fault:cma_heap_vm_fault [cma_heap] mmap:dma_buf_mmap_internal read_folio:0x0
[  103.407877] CPU: 3 PID: 5276 Comm: crosvm_vcpu0 Tainted: G        W  OE      6.6.46-android15-8-g8bab72b63c20-dirty-4k #1 1e474a12dac4553a3ebba3a911f3b744176a5d2d
[  103.409818] Hardware name: Unisoc UMS9632-base Board (DT)
[  103.410613] Call trace:
[  103.411038] dump_backtrace+0xf4/0x140
[  103.411641] show_stack+0x20/0x30
[  103.412184] dump_stack_lvl+0x60/0x84
[  103.412766] dump_stack+0x18/0x24
[  103.413304] print_bad_pte+0x1b8/0x1cc
[  103.413909] vm_normal_page+0xc8/0xd0
[  103.414491] follow_page_pte+0xb0/0x304
[  103.415096] follow_page_mask+0x108/0x240
[  103.415721] __get_user_pages+0x168/0x4ac
[  103.416342] __gup_longterm_locked+0x15c/0x864
[  103.417023] pin_user_pages+0x70/0xcc
[  103.417609] pkvm_mem_abort+0xf8/0x5c0
[  103.418207] kvm_handle_guest_abort+0x3e0/0x3e4
[  103.418906] handle_exit+0xac/0x33c
[  103.419472] kvm_arch_vcpu_ioctl_run+0x48c/0x8d8
[  103.420176] kvm_vcpu_ioctl+0x504/0x5bc
[  103.420785] __arm64_sys_ioctl+0xb0/0xec
[  103.421401] invoke_syscall+0x60/0x11c
[  103.422000] el0_svc_common+0xb4/0xe8
[  103.422590] do_el0_svc+0x24/0x30
[  103.423131] el0_svc+0x3c/0x70
[  103.423640] el0t_64_sync_handler+0x68/0xbc
[  103.424288] el0t_64_sync+0x1a8/0x1ac

Signed-off-by: Xiwei Wang <xiwei.wang1@unisoc.com>
Signed-off-by: Aijun Sun <aijun.sun@unisoc.com>
Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
---
 drivers/dma-buf/heaps/cma_heap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Christian König Jan. 15, 2025, 11:49 a.m. UTC | #1
Am 15.01.25 um 07:18 schrieb zhaoyang.huang:
> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
>
> When using dma-buf as memory pool for VMM. The vmf_insert_pfn will
> apply PTE_SPECIAL on pte which have vm_normal_page report bad_pte and
> return NULL. This commit would like to suggest to replace
> vmf_insert_pfn by vmf_insert_page.

Setting PTE_SPECIAL is completely intentional here to prevent 
get_user_pages() from working on DMA-buf mappings.

So absolutely clear NAK to this patch here.

What exactly are you trying to do?

Regards,
Christian.

>
> [  103.402787] kvm [5276]: gfn(ipa)=0x80000 hva=0x7d4a400000 write_fault=0
> [  103.403822] BUG: Bad page map in process crosvm_vcpu0  pte:168000140000f43 pmd:8000000c1ca0003
> [  103.405144] addr:0000007d4a400000 vm_flags:040400fb anon_vma:0000000000000000 mapping:ffffff8085163df0 index:0
> [  103.406536] file:dmabuf fault:cma_heap_vm_fault [cma_heap] mmap:dma_buf_mmap_internal read_folio:0x0
> [  103.407877] CPU: 3 PID: 5276 Comm: crosvm_vcpu0 Tainted: G        W  OE      6.6.46-android15-8-g8bab72b63c20-dirty-4k #1 1e474a12dac4553a3ebba3a911f3b744176a5d2d
> [  103.409818] Hardware name: Unisoc UMS9632-base Board (DT)
> [  103.410613] Call trace:
> [  103.411038] dump_backtrace+0xf4/0x140
> [  103.411641] show_stack+0x20/0x30
> [  103.412184] dump_stack_lvl+0x60/0x84
> [  103.412766] dump_stack+0x18/0x24
> [  103.413304] print_bad_pte+0x1b8/0x1cc
> [  103.413909] vm_normal_page+0xc8/0xd0
> [  103.414491] follow_page_pte+0xb0/0x304
> [  103.415096] follow_page_mask+0x108/0x240
> [  103.415721] __get_user_pages+0x168/0x4ac
> [  103.416342] __gup_longterm_locked+0x15c/0x864
> [  103.417023] pin_user_pages+0x70/0xcc
> [  103.417609] pkvm_mem_abort+0xf8/0x5c0
> [  103.418207] kvm_handle_guest_abort+0x3e0/0x3e4
> [  103.418906] handle_exit+0xac/0x33c
> [  103.419472] kvm_arch_vcpu_ioctl_run+0x48c/0x8d8
> [  103.420176] kvm_vcpu_ioctl+0x504/0x5bc
> [  103.420785] __arm64_sys_ioctl+0xb0/0xec
> [  103.421401] invoke_syscall+0x60/0x11c
> [  103.422000] el0_svc_common+0xb4/0xe8
> [  103.422590] do_el0_svc+0x24/0x30
> [  103.423131] el0_svc+0x3c/0x70
> [  103.423640] el0t_64_sync_handler+0x68/0xbc
> [  103.424288] el0t_64_sync+0x1a8/0x1ac
>
> Signed-off-by: Xiwei Wang <xiwei.wang1@unisoc.com>
> Signed-off-by: Aijun Sun <aijun.sun@unisoc.com>
> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> ---
>   drivers/dma-buf/heaps/cma_heap.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c
> index c384004b918e..b301fb63f16b 100644
> --- a/drivers/dma-buf/heaps/cma_heap.c
> +++ b/drivers/dma-buf/heaps/cma_heap.c
> @@ -168,7 +168,7 @@ static vm_fault_t cma_heap_vm_fault(struct vm_fault *vmf)
>   	if (vmf->pgoff > buffer->pagecount)
>   		return VM_FAULT_SIGBUS;
>   
> -	return vmf_insert_pfn(vma, vmf->address, page_to_pfn(buffer->pages[vmf->pgoff]));
> +	return vmf_insert_page(vma, vmf->address, buffer->pages[vmf->pgoff]);
>   }
>   
>   static const struct vm_operations_struct dma_heap_vm_ops = {
Zhaoyang Huang Jan. 16, 2025, 1:46 a.m. UTC | #2
On Wed, Jan 15, 2025 at 7:49 PM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 15.01.25 um 07:18 schrieb zhaoyang.huang:
> > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> >
> > When using dma-buf as memory pool for VMM. The vmf_insert_pfn will
> > apply PTE_SPECIAL on pte which have vm_normal_page report bad_pte and
> > return NULL. This commit would like to suggest to replace
> > vmf_insert_pfn by vmf_insert_page.
>
> Setting PTE_SPECIAL is completely intentional here to prevent
> get_user_pages() from working on DMA-buf mappings.
ok. May I ask the reason?
>
> So absolutely clear NAK to this patch here.
>
> What exactly are you trying to do?
I would like to have pkvm have guest kernel be faulted of its second
stage page fault(ARM64's memory virtualization method) on dma-buf
which use pin_user_pages.
>
> Regards,
> Christian.
>
> >
> > [  103.402787] kvm [5276]: gfn(ipa)=0x80000 hva=0x7d4a400000 write_fault=0
> > [  103.403822] BUG: Bad page map in process crosvm_vcpu0  pte:168000140000f43 pmd:8000000c1ca0003
> > [  103.405144] addr:0000007d4a400000 vm_flags:040400fb anon_vma:0000000000000000 mapping:ffffff8085163df0 index:0
> > [  103.406536] file:dmabuf fault:cma_heap_vm_fault [cma_heap] mmap:dma_buf_mmap_internal read_folio:0x0
> > [  103.407877] CPU: 3 PID: 5276 Comm: crosvm_vcpu0 Tainted: G        W  OE      6.6.46-android15-8-g8bab72b63c20-dirty-4k #1 1e474a12dac4553a3ebba3a911f3b744176a5d2d
> > [  103.409818] Hardware name: Unisoc UMS9632-base Board (DT)
> > [  103.410613] Call trace:
> > [  103.411038] dump_backtrace+0xf4/0x140
> > [  103.411641] show_stack+0x20/0x30
> > [  103.412184] dump_stack_lvl+0x60/0x84
> > [  103.412766] dump_stack+0x18/0x24
> > [  103.413304] print_bad_pte+0x1b8/0x1cc
> > [  103.413909] vm_normal_page+0xc8/0xd0
> > [  103.414491] follow_page_pte+0xb0/0x304
> > [  103.415096] follow_page_mask+0x108/0x240
> > [  103.415721] __get_user_pages+0x168/0x4ac
> > [  103.416342] __gup_longterm_locked+0x15c/0x864
> > [  103.417023] pin_user_pages+0x70/0xcc
> > [  103.417609] pkvm_mem_abort+0xf8/0x5c0
> > [  103.418207] kvm_handle_guest_abort+0x3e0/0x3e4
> > [  103.418906] handle_exit+0xac/0x33c
> > [  103.419472] kvm_arch_vcpu_ioctl_run+0x48c/0x8d8
> > [  103.420176] kvm_vcpu_ioctl+0x504/0x5bc
> > [  103.420785] __arm64_sys_ioctl+0xb0/0xec
> > [  103.421401] invoke_syscall+0x60/0x11c
> > [  103.422000] el0_svc_common+0xb4/0xe8
> > [  103.422590] do_el0_svc+0x24/0x30
> > [  103.423131] el0_svc+0x3c/0x70
> > [  103.423640] el0t_64_sync_handler+0x68/0xbc
> > [  103.424288] el0t_64_sync+0x1a8/0x1ac
> >
> > Signed-off-by: Xiwei Wang <xiwei.wang1@unisoc.com>
> > Signed-off-by: Aijun Sun <aijun.sun@unisoc.com>
> > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> > ---
> >   drivers/dma-buf/heaps/cma_heap.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c
> > index c384004b918e..b301fb63f16b 100644
> > --- a/drivers/dma-buf/heaps/cma_heap.c
> > +++ b/drivers/dma-buf/heaps/cma_heap.c
> > @@ -168,7 +168,7 @@ static vm_fault_t cma_heap_vm_fault(struct vm_fault *vmf)
> >       if (vmf->pgoff > buffer->pagecount)
> >               return VM_FAULT_SIGBUS;
> >
> > -     return vmf_insert_pfn(vma, vmf->address, page_to_pfn(buffer->pages[vmf->pgoff]));
> > +     return vmf_insert_page(vma, vmf->address, buffer->pages[vmf->pgoff]);
> >   }
> >
> >   static const struct vm_operations_struct dma_heap_vm_ops = {
>
Christian König Jan. 16, 2025, 3:31 p.m. UTC | #3
Am 16.01.25 um 02:46 schrieb Zhaoyang Huang:
> On Wed, Jan 15, 2025 at 7:49 PM Christian König
> <christian.koenig@amd.com>  wrote:
>> Am 15.01.25 um 07:18 schrieb zhaoyang.huang:
>>> From: Zhaoyang Huang<zhaoyang.huang@unisoc.com>
>>>
>>> When using dma-buf as memory pool for VMM. The vmf_insert_pfn will
>>> apply PTE_SPECIAL on pte which have vm_normal_page report bad_pte and
>>> return NULL. This commit would like to suggest to replace
>>> vmf_insert_pfn by vmf_insert_page.
>> Setting PTE_SPECIAL is completely intentional here to prevent
>> get_user_pages() from working on DMA-buf mappings.
> ok. May I ask the reason?

Drivers using this interface own the backing store for their specific 
use cases. There are a couple of things get_user_pages(), 
pin_user_pages(), direct I/O etc.. do which usually clash with those use 
cases. So that is intentionally completely disabled.

We have the possibility to create a DMA-buf from memfd object and you 
can then do direct I/O to the memfd and still use the DMA-buf with GPUs 
or V4L for example.

>> So absolutely clear NAK to this patch here.
>>
>> What exactly are you trying to do?
> I would like to have pkvm have guest kernel be faulted of its second
> stage page fault(ARM64's memory virtualization method) on dma-buf
> which use pin_user_pages.

Yeah, exactly that's one of the use case which we intentionally prevent 
here.

The backing store drivers use don't care about the pin count of the 
memory and happily give it back to memory pools and/or swap it with 
device local memory if necessary.

When this happens the ARM VM wouldn't be informed of the change and 
potentially accesses the wrong address.

So sorry, but this approach won't work.

You could try with the memfd+DMA-buf approach I mentioned earlier, but 
that won't give you all functionality on all DMA-buf supporting devices.

For example GPUs usually can't scan out to a monitor from such buffers 
because of hardware limitations.

Regards,
Christian.

>> Regards,
>> Christian.
>>
>>> [  103.402787] kvm [5276]: gfn(ipa)=0x80000 hva=0x7d4a400000 write_fault=0
>>> [  103.403822] BUG: Bad page map in process crosvm_vcpu0  pte:168000140000f43 pmd:8000000c1ca0003
>>> [  103.405144] addr:0000007d4a400000 vm_flags:040400fb anon_vma:0000000000000000 mapping:ffffff8085163df0 index:0
>>> [  103.406536]file:dmabuf  fault:cma_heap_vm_fault [cma_heap] mmap:dma_buf_mmap_internal read_folio:0x0
>>> [  103.407877] CPU: 3 PID: 5276 Comm: crosvm_vcpu0 Tainted: G        W  OE      6.6.46-android15-8-g8bab72b63c20-dirty-4k #1 1e474a12dac4553a3ebba3a911f3b744176a5d2d
>>> [  103.409818] Hardware name: Unisoc UMS9632-base Board (DT)
>>> [  103.410613] Call trace:
>>> [  103.411038] dump_backtrace+0xf4/0x140
>>> [  103.411641] show_stack+0x20/0x30
>>> [  103.412184] dump_stack_lvl+0x60/0x84
>>> [  103.412766] dump_stack+0x18/0x24
>>> [  103.413304] print_bad_pte+0x1b8/0x1cc
>>> [  103.413909] vm_normal_page+0xc8/0xd0
>>> [  103.414491] follow_page_pte+0xb0/0x304
>>> [  103.415096] follow_page_mask+0x108/0x240
>>> [  103.415721] __get_user_pages+0x168/0x4ac
>>> [  103.416342] __gup_longterm_locked+0x15c/0x864
>>> [  103.417023] pin_user_pages+0x70/0xcc
>>> [  103.417609] pkvm_mem_abort+0xf8/0x5c0
>>> [  103.418207] kvm_handle_guest_abort+0x3e0/0x3e4
>>> [  103.418906] handle_exit+0xac/0x33c
>>> [  103.419472] kvm_arch_vcpu_ioctl_run+0x48c/0x8d8
>>> [  103.420176] kvm_vcpu_ioctl+0x504/0x5bc
>>> [  103.420785] __arm64_sys_ioctl+0xb0/0xec
>>> [  103.421401] invoke_syscall+0x60/0x11c
>>> [  103.422000] el0_svc_common+0xb4/0xe8
>>> [  103.422590] do_el0_svc+0x24/0x30
>>> [  103.423131] el0_svc+0x3c/0x70
>>> [  103.423640] el0t_64_sync_handler+0x68/0xbc
>>> [  103.424288] el0t_64_sync+0x1a8/0x1ac
>>>
>>> Signed-off-by: Xiwei Wang<xiwei.wang1@unisoc.com>
>>> Signed-off-by: Aijun Sun<aijun.sun@unisoc.com>
>>> Signed-off-by: Zhaoyang Huang<zhaoyang.huang@unisoc.com>
>>> ---
>>>    drivers/dma-buf/heaps/cma_heap.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c
>>> index c384004b918e..b301fb63f16b 100644
>>> --- a/drivers/dma-buf/heaps/cma_heap.c
>>> +++ b/drivers/dma-buf/heaps/cma_heap.c
>>> @@ -168,7 +168,7 @@ static vm_fault_t cma_heap_vm_fault(struct vm_fault *vmf)
>>>        if (vmf->pgoff > buffer->pagecount)
>>>                return VM_FAULT_SIGBUS;
>>>
>>> -     return vmf_insert_pfn(vma, vmf->address, page_to_pfn(buffer->pages[vmf->pgoff]));
>>> +     return vmf_insert_page(vma, vmf->address, buffer->pages[vmf->pgoff]);
>>>    }
>>>
>>>    static const struct vm_operations_struct dma_heap_vm_ops = {
diff mbox series

Patch

diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c
index c384004b918e..b301fb63f16b 100644
--- a/drivers/dma-buf/heaps/cma_heap.c
+++ b/drivers/dma-buf/heaps/cma_heap.c
@@ -168,7 +168,7 @@  static vm_fault_t cma_heap_vm_fault(struct vm_fault *vmf)
 	if (vmf->pgoff > buffer->pagecount)
 		return VM_FAULT_SIGBUS;
 
-	return vmf_insert_pfn(vma, vmf->address, page_to_pfn(buffer->pages[vmf->pgoff]));
+	return vmf_insert_page(vma, vmf->address, buffer->pages[vmf->pgoff]);
 }
 
 static const struct vm_operations_struct dma_heap_vm_ops = {