Message ID | 20240324234542.2038726-2-hch@lst.de (mailing list archive) |
---|---|
State | New |
Series | [1/3] virt: acrn: stop using follow_pfn |
On 25.03.24 00:45, Christoph Hellwig wrote:
> Switch from follow_pfn to follow_pte so that we can get rid of
> follow_pfn.  Note that this doesn't fix any of the pre-existing
> raciness and lack of permission checking in the code.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/virt/acrn/mm.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/virt/acrn/mm.c b/drivers/virt/acrn/mm.c
> index fa5d9ca6be5706..69c3f619f88196 100644
> --- a/drivers/virt/acrn/mm.c
> +++ b/drivers/virt/acrn/mm.c
> @@ -171,18 +171,24 @@ int acrn_vm_ram_map(struct acrn_vm *vm, struct acrn_vm_memmap *memmap)
>  	mmap_read_lock(current->mm);
>  	vma = vma_lookup(current->mm, memmap->vma_base);
>  	if (vma && ((vma->vm_flags & VM_PFNMAP) != 0)) {
> +		spinlock_t *ptl;
> +		pte_t *ptep;
> +
>  		if ((memmap->vma_base + memmap->len) > vma->vm_end) {
>  			mmap_read_unlock(current->mm);
>  			return -EINVAL;
>  		}
>
> -		ret = follow_pfn(vma, memmap->vma_base, &pfn);
> -		mmap_read_unlock(current->mm);
> +		ret = follow_pte(vma->vm_mm, memmap->vma_base, &ptep, &ptl);
>  		if (ret < 0) {
> +			mmap_read_unlock(current->mm);
>  			dev_dbg(acrn_dev.this_device,
>  				"Failed to lookup PFN at VMA:%pK.\n", (void *)memmap->vma_base);
>  			return ret;
>  		}
> +		pfn = pte_pfn(ptep_get(ptep));
> +		pte_unmap_unlock(ptep, ptl);
> +		mmap_read_unlock(current->mm);
>
>  		return acrn_mm_region_add(vm, memmap->user_vm_pa,
>  			 PFN_PHYS(pfn), memmap->len,

... I have similar patches lying around here (see below). I added some
actual access permission checks.

(I also realized, that if we get an anon folio in a COW mapping via follow_pte()
here, I suspect one might be able to do some nasty things. Just imagine if we
munmap(), free the anon folio, and then it gets used in other context ... At
least KVM/vfio handle that using references+MMU notifiers.)

Reviewed-by: David Hildenbrand <david@redhat.com>


commit 812e577dea97327bcc68d34504e7387dff2ffd8f
Author: David Hildenbrand <david@redhat.com>
Date:   Fri Mar 8 13:53:04 2024 +0100

    virt/acrn/mm: use follow_pte() instead of follow_pfn()

    follow_pfn() should not be used. Instead, use follow_pte() and do some
    best-guess PTE permission checks.

    Should we simply always require pte_write()? Maybe. Performing no
    checks clearly looks wrong, and pin_user_pages_fast() is unconditionally
    called with FOLL_WRITE.

    Signed-off-by: David Hildenbrand <david@redhat.com>

diff --git a/drivers/virt/acrn/mm.c b/drivers/virt/acrn/mm.c
index fa5d9ca6be57..563c545adb2c 100644
--- a/drivers/virt/acrn/mm.c
+++ b/drivers/virt/acrn/mm.c
@@ -171,12 +171,22 @@ int acrn_vm_ram_map(struct acrn_vm *vm, struct acrn_vm_memmap *memmap)
 	mmap_read_lock(current->mm);
 	vma = vma_lookup(current->mm, memmap->vma_base);
 	if (vma && ((vma->vm_flags & VM_PFNMAP) != 0)) {
+		spinlock_t *ptl;
+		pte_t *ptep;
+
 		if ((memmap->vma_base + memmap->len) > vma->vm_end) {
 			mmap_read_unlock(current->mm);
 			return -EINVAL;
 		}

-		ret = follow_pfn(vma, memmap->vma_base, &pfn);
+		ret = follow_pte(vma, memmap->vma_base, &ptep, &ptl);
+		if (!ret) {
+			pfn = pte_pfn(ptep_get(ptep));
+			if (!pte_write(ptep_get(ptep)) &&
+			    (memmap->attr & ACRN_MEM_ACCESS_WRITE))
+				ret = -EFAULT;
+			pte_unmap_unlock(ptep, ptl);
+		}
 		mmap_read_unlock(current->mm);
 		if (ret < 0) {
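
For readers following the thread, the permission rule in the patch above can be modelled in plain userspace C. This is only an illustrative sketch, not kernel code: `model_check_pte_access()` and the `MODEL_ACCESS_*` bits are hypothetical names invented here, standing in for `pte_write()` and the request's `attr` bits, and the bit values are assumptions rather than the real UAPI constants.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the request's access bits (assumed values). */
#define MODEL_ACCESS_READ  0x1u
#define MODEL_ACCESS_WRITE 0x2u

/*
 * Userspace model of the "best-guess" check: a request that asks for
 * write access is refused with -EFAULT when the PTE is not writable.
 * Every other combination is accepted.
 */
int model_check_pte_access(bool pte_is_writable, unsigned int attr)
{
	if (!pte_is_writable && (attr & MODEL_ACCESS_WRITE))
		return -EFAULT;
	return 0;
}
```

Under this model a read-only request against a read-only PTE still succeeds; only the combination of a write request and a non-writable PTE fails. The commit message's open question ("always require pte_write()?") would correspond to dropping the `attr` test from the condition.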
On Mon, Mar 25, 2024 at 11:33:31AM +0100, David Hildenbrand wrote:
> ... I have similar patches lying around here (see below). I added some
> actual access permission checks.
>
> (I also realized, that if we get an anon folio in a COW mapping via follow_pte()
> here, I suspect one might be able to do some nasty things. Just imagine if we
> munmap(), free the anon folio, and then it gets used in other context ... At
> least KVM/vfio handle that using references+MMU notifiers.)

How about you just send out your series that seems to go further and
I retract mine?

On 26.03.24 07:04, Christoph Hellwig wrote:
> On Mon, Mar 25, 2024 at 11:33:31AM +0100, David Hildenbrand wrote:
>> ... I have similar patches lying around here (see below). I added some
>> actual access permission checks.
>>
>> (I also realized, that if we get an anon folio in a COW mapping via follow_pte()
>> here, I suspect one might be able to do some nasty things. Just imagine if we
>> munmap(), free the anon folio, and then it gets used in other context ... At
>> least KVM/vfio handle that using references+MMU notifiers.)
>
> How about you just send out your series that seems to go further and
> I retract mine?

Let's go with yours first and I'll rebase.

Regarding the issue above, I still have not made up my mind: likely we should
reject any PFN in acrn that has a valid "struct page" without PG_reserved
set. That's what VFIO effectively does IIRC.

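
The rule floated in the last paragraph can be written down as a small userspace model. It is a sketch under stated assumptions, not a proposed kernel change: `struct model_pfn_info` and `model_pfn_may_be_mapped()` are hypothetical names made up for illustration, with the two flags standing in for what the kernel would presumably learn from `pfn_valid()` and `PageReserved()`.

```c
#include <stdbool.h>

/* Hypothetical summary of what is known about a PFN; not a kernel type. */
struct model_pfn_info {
	bool has_struct_page;	/* models: the PFN has a valid "struct page" */
	bool page_reserved;	/* models: PG_reserved is set on that page */
};

/*
 * Model of the suggested policy: reject a PFN that is backed by a
 * "struct page" unless that page is marked reserved. PFNs without a
 * "struct page" are accepted.
 */
bool model_pfn_may_be_mapped(struct model_pfn_info info)
{
	if (!info.has_struct_page)
		return true;
	return info.page_reserved;
}
```

The case this policy is meant to exclude is the one raised earlier in the thread: an ordinary, refcounted page (for example an anon folio in a COW mapping) has a "struct page" and is not reserved, so the model rejects it.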
diff --git a/drivers/virt/acrn/mm.c b/drivers/virt/acrn/mm.c
index fa5d9ca6be5706..69c3f619f88196 100644
--- a/drivers/virt/acrn/mm.c
+++ b/drivers/virt/acrn/mm.c
@@ -171,18 +171,24 @@ int acrn_vm_ram_map(struct acrn_vm *vm, struct acrn_vm_memmap *memmap)
 	mmap_read_lock(current->mm);
 	vma = vma_lookup(current->mm, memmap->vma_base);
 	if (vma && ((vma->vm_flags & VM_PFNMAP) != 0)) {
+		spinlock_t *ptl;
+		pte_t *ptep;
+
 		if ((memmap->vma_base + memmap->len) > vma->vm_end) {
 			mmap_read_unlock(current->mm);
 			return -EINVAL;
 		}

-		ret = follow_pfn(vma, memmap->vma_base, &pfn);
-		mmap_read_unlock(current->mm);
+		ret = follow_pte(vma->vm_mm, memmap->vma_base, &ptep, &ptl);
 		if (ret < 0) {
+			mmap_read_unlock(current->mm);
 			dev_dbg(acrn_dev.this_device,
 				"Failed to lookup PFN at VMA:%pK.\n", (void *)memmap->vma_base);
 			return ret;
 		}
+		pfn = pte_pfn(ptep_get(ptep));
+		pte_unmap_unlock(ptep, ptl);
+		mmap_read_unlock(current->mm);

 		return acrn_mm_region_add(vm, memmap->user_vm_pa,
 			 PFN_PHYS(pfn), memmap->len,

Switch from follow_pfn to follow_pte so that we can get rid of
follow_pfn.  Note that this doesn't fix any of the pre-existing
raciness and lack of permission checking in the code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/virt/acrn/mm.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)