Message ID | 20181003185854.GA1174@jordon-HP-15-Notebook-PC (mailing list archive)
---|---
State | New, archived
Series | [v2] mm: Introduce new function vm_insert_kmem_page
Hi Souptick, On Wed, Oct 3, 2018 at 8:55 PM Souptick Joarder <jrdr.linux@gmail.com> wrote: > > vm_insert_kmem_page is similar to vm_insert_page and will > be used by drivers to map kernel (kmalloc/vmalloc/pages) > allocated memory to user vma. > > Going forward, the plan is to restrict future drivers not > to use vm_insert_page ( *it will generate new errno to > VM_FAULT_CODE mapping code for new drivers which were already > cleaned up for existing drivers*) in #PF (page fault handler) > context but to make use of vmf_insert_page which returns > VMF_FAULT_CODE and that is not possible until both vm_insert_page > and vmf_insert_page API exists. > > But there are some consumers of vm_insert_page which use it > outside #PF context. straight forward conversion of vm_insert_page > to vmf_insert_page won't work there as those function calls expects > errno not vm_fault_t in return. > > These are the approaches which could have been taken to handle > this scenario - > > * Replace vm_insert_page with vmf_insert_page and then write few > extra lines of code to convert VM_FAULT_CODE to errno which > makes driver users more complex ( also the reverse mapping errno to > VM_FAULT_CODE have been cleaned up as part of vm_fault_t migration , > not preferred to introduce anything similar again) > > * Maintain both vm_insert_page and vmf_insert_page and use it in > respective places. But it won't gurantee that vm_insert_page will > never be used in #PF context. > > * Introduce a similar API like vm_insert_page, convert all non #PF > consumer to use it and finally remove vm_insert_page by converting > it to vmf_insert_page. > > And the 3rd approach was taken by introducing vm_insert_kmem_page(). This looks better than the previous one of adding non-trivial code to each driver, thank you! A couple of comments below. > > In short, vmf_insert_page will be used in page fault handlers > context and vm_insert_kmem_page will be used to map kernel > memory to user vma outside page fault handlers context. > > Few drivers are converted to use vm_insert_kmem_page(). This will > allow both to review the api and that it serves it purpose. other > consumers of vm_insert_page (*used in non #PF context*) will be > replaced by vm_insert_kmem_page, but in separate patches. > other -> Other Also, as far as I can see, there are only a few vm_insert_page users remaining. With the new function, they should be trivial to convert, no? Therefore, could we do them all in one go, possibly in a patch series? Or, maybe, even better: wait until you remove the vm_* functions and simply reuse vm_insert_page for this -- that way you don't need a new name and you don't have to change any of the last users (I mean the drivers using it outside the page fault handlers). > Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> > --- > v2: Few non #PF consumers of vm_insert_page are converted > to use vm_insert_kmem_page in patch v2. > > Updated the change log. 
> > arch/arm/mm/dma-mapping.c | 2 +- > drivers/auxdisplay/cfag12864bfb.c | 2 +- > drivers/auxdisplay/ht16k33.c | 2 +- > drivers/firewire/core-iso.c | 2 +- > drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 2 +- > include/linux/mm.h | 2 + > kernel/kcov.c | 4 +- > mm/memory.c | 69 +++++++++++++++++++++++++++++ > mm/nommu.c | 7 +++ > mm/vmalloc.c | 2 +- > 10 files changed, 86 insertions(+), 8 deletions(-) > > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c > index 6656647..58d7971 100644 > --- a/arch/arm/mm/dma-mapping.c > +++ b/arch/arm/mm/dma-mapping.c > @@ -1598,7 +1598,7 @@ static int __arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma > pages += off; > > do { > - int ret = vm_insert_page(vma, uaddr, *pages++); > + int ret = vm_insert_kmem_page(vma, uaddr, *pages++); > if (ret) { > pr_err("Remapping memory failed: %d\n", ret); > return ret; > diff --git a/drivers/auxdisplay/cfag12864bfb.c b/drivers/auxdisplay/cfag12864bfb.c > index 40c8a55..82fd627 100644 > --- a/drivers/auxdisplay/cfag12864bfb.c > +++ b/drivers/auxdisplay/cfag12864bfb.c > @@ -52,7 +52,7 @@ > > static int cfag12864bfb_mmap(struct fb_info *info, struct vm_area_struct *vma) > { > - return vm_insert_page(vma, vma->vm_start, > + return vm_insert_kmem_page(vma, vma->vm_start, > virt_to_page(cfag12864b_buffer)); > } > > diff --git a/drivers/auxdisplay/ht16k33.c b/drivers/auxdisplay/ht16k33.c > index a43276c..64de30b 100644 > --- a/drivers/auxdisplay/ht16k33.c > +++ b/drivers/auxdisplay/ht16k33.c > @@ -224,7 +224,7 @@ static int ht16k33_mmap(struct fb_info *info, struct vm_area_struct *vma) > { > struct ht16k33_priv *priv = info->par; > > - return vm_insert_page(vma, vma->vm_start, > + return vm_insert_kmem_page(vma, vma->vm_start, > virt_to_page(priv->fbdev.buffer)); > } > > diff --git a/drivers/firewire/core-iso.c b/drivers/firewire/core-iso.c > index 051327a..5f1548d 100644 > --- a/drivers/firewire/core-iso.c > +++ b/drivers/firewire/core-iso.c > @@ -112,7 +112,7 @@ int fw_iso_buffer_map_vma(struct fw_iso_buffer *buffer, > > uaddr = vma->vm_start; > for (i = 0; i < buffer->page_count; i++) { > - err = vm_insert_page(vma, uaddr, buffer->pages[i]); > + err = vm_insert_kmem_page(vma, uaddr, buffer->pages[i]); > if (err) > return err; > > diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c > index a8db758..57eb7af 100644 > --- a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c > +++ b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c > @@ -234,7 +234,7 @@ static int rockchip_drm_gem_object_mmap_iommu(struct drm_gem_object *obj, > return -ENXIO; > > for (i = offset; i < end; i++) { > - ret = vm_insert_page(vma, uaddr, rk_obj->pages[i]); > + ret = vm_insert_kmem_page(vma, uaddr, rk_obj->pages[i]); > if (ret) > return ret; > uaddr += PAGE_SIZE; > diff --git a/include/linux/mm.h b/include/linux/mm.h > index a61ebe8..5f42d35 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -2477,6 +2477,8 @@ unsigned long change_prot_numa(struct vm_area_struct *vma, > struct vm_area_struct *find_extend_vma(struct mm_struct *, unsigned long addr); > int remap_pfn_range(struct vm_area_struct *, unsigned long addr, > unsigned long pfn, unsigned long size, pgprot_t); > +int vm_insert_kmem_page(struct vm_area_struct *vma, unsigned long addr, > + struct page *page); > int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *); > int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr, > unsigned long pfn); > diff --git a/kernel/kcov.c 
b/kernel/kcov.c > index 3ebd09e..2afaeb4 100644 > --- a/kernel/kcov.c > +++ b/kernel/kcov.c > @@ -293,8 +293,8 @@ static int kcov_mmap(struct file *filep, struct vm_area_struct *vma) > spin_unlock(&kcov->lock); > for (off = 0; off < size; off += PAGE_SIZE) { > page = vmalloc_to_page(kcov->area + off); > - if (vm_insert_page(vma, vma->vm_start + off, page)) > - WARN_ONCE(1, "vm_insert_page() failed"); > + if (vm_insert_kmem_page(vma, vma->vm_start + off, page)) > + WARN_ONCE(1, "vm_insert_kmem_page() failed"); > } > return 0; > } > diff --git a/mm/memory.c b/mm/memory.c > index c467102..b800c10 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -1682,6 +1682,75 @@ pte_t *__get_locked_pte(struct mm_struct *mm, unsigned long addr, > return pte_alloc_map_lock(mm, pmd, addr, ptl); > } > > +static int insert_kmem_page(struct vm_area_struct *vma, unsigned long addr, > + struct page *page, pgprot_t prot) > +{ > + struct mm_struct *mm = vma->vm_mm; > + int retval; > + pte_t *pte; > + spinlock_t *ptl; > + > + retval = -EINVAL; > + if (PageAnon(page)) > + goto out; > + retval = -ENOMEM; > + flush_dcache_page(page); > + pte = get_locked_pte(mm, addr, &ptl); > + if (!pte) > + goto out; > + retval = -EBUSY; > + if (!pte_none(*pte)) > + goto out_unlock; > + > + get_page(page); > + inc_mm_counter_fast(mm, mm_counter_file(page)); > + page_add_file_rmap(page, false); > + set_pte_at(mm, addr, pte, mk_pte(page, prot)); > + > + retval = 0; > + pte_unmap_unlock(pte, ptl); > + return retval; > +out_unlock: > + pte_unmap_unlock(pte, ptl); > +out: > + return retval; > +} > + > +/** > + * vm_insert_kmem_page - insert single page into user vma > + * @vma: user vma to map to > + * @addr: target user address of this page > + * @page: source kernel page > + * > + * This allows drivers to insert individual kernel memory into a user vma. > + * This API should be used outside page fault handlers context. > + * > + * Previously the same has been done with vm_insert_page by drivers. But > + * vm_insert_page will be converted to vmf_insert_page and will be used > + * in fault handlers context and return type of vmf_insert_page will be > + * vm_fault_t type. This is a "temporal" comment, i.e. it refers to things that are happening at the moment -- I would say that should be part of the commit message, not the code, since it will be obsolete soon. Also, consider that, in a way, vm_insert_page is actually being replaced by vmf_insert_page only in one of the use cases (the other being replaced by this). Maybe you could instead say something like: In the past, vm_insert_page was used for this purpose. Do not use vmf_insert_page because... and leave the full explanation in the commit. > + * > + * But there are places where drivers need to map kernel memory into user > + * vma outside fault handlers context. As vmf_insert_page will be restricted > + * to use within page fault handlers, vm_insert_kmem_page could be used > + * to map kernel memory to user vma outside fault handlers context. > + */ Ditto. 
> +int vm_insert_kmem_page(struct vm_area_struct *vma, unsigned long addr, > + struct page *page) > +{ > + if (addr < vma->vm_start || addr >= vma->vm_end) > + return -EFAULT; > + if (!page_count(page)) > + return -EINVAL; > + if (!(vma->vm_flags & VM_MIXEDMAP)) { > + BUG_ON(down_read_trylock(&vma->vm_mm->mmap_sem)); > + BUG_ON(vma->vm_flags & VM_PFNMAP); > + vma->vm_flags |= VM_MIXEDMAP; > + } > + return insert_kmem_page(vma, addr, page, vma->vm_page_prot); > +} > +EXPORT_SYMBOL(vm_insert_kmem_page); > + > /* > * This is the old fallback for page remapping. > * > diff --git a/mm/nommu.c b/mm/nommu.c > index e4aac33..153b8c8 100644 > --- a/mm/nommu.c > +++ b/mm/nommu.c > @@ -473,6 +473,13 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr, > } > EXPORT_SYMBOL(vm_insert_page); > > +int vm_insert_kmem_page(struct vm_area_struct *vma, unsigned long addr, > + struct page *page) > +{ > + return -EINVAL; > +} > +EXPORT_SYMBOL(vm_insert_kmem_page); > + > /* > * sys_brk() for the most part doesn't need the global kernel > * lock, except when an application is doing something nasty > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index a728fc4..61d279f 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -2251,7 +2251,7 @@ int remap_vmalloc_range_partial(struct vm_area_struct *vma, unsigned long uaddr, > struct page *page = vmalloc_to_page(kaddr); > int ret; > > - ret = vm_insert_page(vma, uaddr, page); > + ret = vm_insert_kmem_page(vma, uaddr, page); > if (ret) > return ret; > > -- > 1.9.1 > Cheers, Miguel
On Wed, Oct 03, 2018 at 11:14:45PM +0100, Russell King - ARM Linux wrote: > On Wed, Oct 03, 2018 at 01:00:03PM -0700, Matthew Wilcox wrote: > > On Thu, Oct 04, 2018 at 12:28:54AM +0530, Souptick Joarder wrote: > > > These are the approaches which could have been taken to handle > > > this scenario - > > > > > > * Replace vm_insert_page with vmf_insert_page and then write few > > > extra lines of code to convert VM_FAULT_CODE to errno which > > > makes driver users more complex ( also the reverse mapping errno to > > > VM_FAULT_CODE have been cleaned up as part of vm_fault_t migration , > > > not preferred to introduce anything similar again) > > > > > > * Maintain both vm_insert_page and vmf_insert_page and use it in > > > respective places. But it won't gurantee that vm_insert_page will > > > never be used in #PF context. > > > > > > * Introduce a similar API like vm_insert_page, convert all non #PF > > > consumer to use it and finally remove vm_insert_page by converting > > > it to vmf_insert_page. > > > > > > And the 3rd approach was taken by introducing vm_insert_kmem_page(). > > > > > > In short, vmf_insert_page will be used in page fault handlers > > > context and vm_insert_kmem_page will be used to map kernel > > > memory to user vma outside page fault handlers context. > > > > As far as I can tell, vm_insert_kmem_page() is line-for-line identical > > with vm_insert_page(). Seriously, here's a diff I just did: [...] > > What on earth are you trying to do? > > Reading the commit log, it seems that the intention is to split out > vm_insert_page() used outside of page-fault handling with the use > within page-fault handling, so that different return codes can be > used. Right, but we already did that. We now have vmf_insert_page() which returns a VM_FAULT_* code and vm_insert_page() which returns an errno.
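For reference, the split Matthew describes is visible in the shape of the wrapper itself. The sketch below approximates how vmf_insert_page() translates vm_insert_page()'s errno into a vm_fault_t code; the exact error mapping is an approximation and may differ between trees:

```c
/*
 * Approximate shape of vmf_insert_page(): call the errno-returning
 * vm_insert_page() and translate the result into a vm_fault_t code.
 * The precise errno-to-VM_FAULT mapping here is an approximation,
 * not a quote of any particular kernel tree.
 */
static inline vm_fault_t vmf_insert_page(struct vm_area_struct *vma,
					 unsigned long addr, struct page *page)
{
	int err = vm_insert_page(vma, addr, page);

	if (err == -ENOMEM)
		return VM_FAULT_OOM;
	if (err < 0 && err != -EBUSY)
		return VM_FAULT_SIGBUS;

	return VM_FAULT_NOPAGE;	/* page installed; nothing more to fault in */
}
```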
On Thu, Oct 4, 2018 at 1:28 AM Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> wrote: > > Hi Souptick, > > On Wed, Oct 3, 2018 at 8:55 PM Souptick Joarder <jrdr.linux@gmail.com> wrote: > > > > vm_insert_kmem_page is similar to vm_insert_page and will > > be used by drivers to map kernel (kmalloc/vmalloc/pages) > > allocated memory to user vma. > > > > Going forward, the plan is to restrict future drivers not > > to use vm_insert_page ( *it will generate new errno to > > VM_FAULT_CODE mapping code for new drivers which were already > > cleaned up for existing drivers*) in #PF (page fault handler) > > context but to make use of vmf_insert_page which returns > > VMF_FAULT_CODE and that is not possible until both vm_insert_page > > and vmf_insert_page API exists. > > > > But there are some consumers of vm_insert_page which use it > > outside #PF context. straight forward conversion of vm_insert_page > > to vmf_insert_page won't work there as those function calls expects > > errno not vm_fault_t in return. > > > > These are the approaches which could have been taken to handle > > this scenario - > > > > * Replace vm_insert_page with vmf_insert_page and then write few > > extra lines of code to convert VM_FAULT_CODE to errno which > > makes driver users more complex ( also the reverse mapping errno to > > VM_FAULT_CODE have been cleaned up as part of vm_fault_t migration , > > not preferred to introduce anything similar again) > > > > * Maintain both vm_insert_page and vmf_insert_page and use it in > > respective places. But it won't gurantee that vm_insert_page will > > never be used in #PF context. > > > > * Introduce a similar API like vm_insert_page, convert all non #PF > > consumer to use it and finally remove vm_insert_page by converting > > it to vmf_insert_page. > > > > And the 3rd approach was taken by introducing vm_insert_kmem_page(). > > This looks better than the previous one of adding non-trivial code to > each driver, thank you! > > A couple of comments below. > > > > > In short, vmf_insert_page will be used in page fault handlers > > context and vm_insert_kmem_page will be used to map kernel > > memory to user vma outside page fault handlers context. > > > > Few drivers are converted to use vm_insert_kmem_page(). This will > > allow both to review the api and that it serves it purpose. other > > consumers of vm_insert_page (*used in non #PF context*) will be > > replaced by vm_insert_kmem_page, but in separate patches. > > > > other -> Other > > Also, as far as I can see, there are only a few vm_insert_page users > remaining. With the new function, they should be trivial to convert, > no? Therefore, could we do them all in one go, possibly in a patch > series? yes, all the user can be converted in a patch series. > > Or, maybe, even better: wait until you remove the vm_* functions and > simply reuse vm_insert_page for this -- that way you don't need a new > name and you don't have to change any of the last users (I mean the > drivers using it outside the page fault handlers). > > > Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> > > --- > > v2: Few non #PF consumers of vm_insert_page are converted > > to use vm_insert_kmem_page in patch v2. > > > > Updated the change log. 
> > > > arch/arm/mm/dma-mapping.c | 2 +- > > drivers/auxdisplay/cfag12864bfb.c | 2 +- > > drivers/auxdisplay/ht16k33.c | 2 +- > > drivers/firewire/core-iso.c | 2 +- > > drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 2 +- > > include/linux/mm.h | 2 + > > kernel/kcov.c | 4 +- > > mm/memory.c | 69 +++++++++++++++++++++++++++++ > > mm/nommu.c | 7 +++ > > mm/vmalloc.c | 2 +- > > 10 files changed, 86 insertions(+), 8 deletions(-) > > > > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c > > index 6656647..58d7971 100644 > > --- a/arch/arm/mm/dma-mapping.c > > +++ b/arch/arm/mm/dma-mapping.c > > @@ -1598,7 +1598,7 @@ static int __arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma > > pages += off; > > > > do { > > - int ret = vm_insert_page(vma, uaddr, *pages++); > > + int ret = vm_insert_kmem_page(vma, uaddr, *pages++); > > if (ret) { > > pr_err("Remapping memory failed: %d\n", ret); > > return ret; > > diff --git a/drivers/auxdisplay/cfag12864bfb.c b/drivers/auxdisplay/cfag12864bfb.c > > index 40c8a55..82fd627 100644 > > --- a/drivers/auxdisplay/cfag12864bfb.c > > +++ b/drivers/auxdisplay/cfag12864bfb.c > > @@ -52,7 +52,7 @@ > > > > static int cfag12864bfb_mmap(struct fb_info *info, struct vm_area_struct *vma) > > { > > - return vm_insert_page(vma, vma->vm_start, > > + return vm_insert_kmem_page(vma, vma->vm_start, > > virt_to_page(cfag12864b_buffer)); > > } > > > > diff --git a/drivers/auxdisplay/ht16k33.c b/drivers/auxdisplay/ht16k33.c > > index a43276c..64de30b 100644 > > --- a/drivers/auxdisplay/ht16k33.c > > +++ b/drivers/auxdisplay/ht16k33.c > > @@ -224,7 +224,7 @@ static int ht16k33_mmap(struct fb_info *info, struct vm_area_struct *vma) > > { > > struct ht16k33_priv *priv = info->par; > > > > - return vm_insert_page(vma, vma->vm_start, > > + return vm_insert_kmem_page(vma, vma->vm_start, > > virt_to_page(priv->fbdev.buffer)); > > } > > > > diff --git a/drivers/firewire/core-iso.c b/drivers/firewire/core-iso.c > > index 051327a..5f1548d 100644 > > --- a/drivers/firewire/core-iso.c > > +++ b/drivers/firewire/core-iso.c > > @@ -112,7 +112,7 @@ int fw_iso_buffer_map_vma(struct fw_iso_buffer *buffer, > > > > uaddr = vma->vm_start; > > for (i = 0; i < buffer->page_count; i++) { > > - err = vm_insert_page(vma, uaddr, buffer->pages[i]); > > + err = vm_insert_kmem_page(vma, uaddr, buffer->pages[i]); > > if (err) > > return err; > > > > diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c > > index a8db758..57eb7af 100644 > > --- a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c > > +++ b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c > > @@ -234,7 +234,7 @@ static int rockchip_drm_gem_object_mmap_iommu(struct drm_gem_object *obj, > > return -ENXIO; > > > > for (i = offset; i < end; i++) { > > - ret = vm_insert_page(vma, uaddr, rk_obj->pages[i]); > > + ret = vm_insert_kmem_page(vma, uaddr, rk_obj->pages[i]); > > if (ret) > > return ret; > > uaddr += PAGE_SIZE; > > diff --git a/include/linux/mm.h b/include/linux/mm.h > > index a61ebe8..5f42d35 100644 > > --- a/include/linux/mm.h > > +++ b/include/linux/mm.h > > @@ -2477,6 +2477,8 @@ unsigned long change_prot_numa(struct vm_area_struct *vma, > > struct vm_area_struct *find_extend_vma(struct mm_struct *, unsigned long addr); > > int remap_pfn_range(struct vm_area_struct *, unsigned long addr, > > unsigned long pfn, unsigned long size, pgprot_t); > > +int vm_insert_kmem_page(struct vm_area_struct *vma, unsigned long addr, > > + struct page *page); > > int vm_insert_page(struct 
vm_area_struct *, unsigned long addr, struct page *); > > int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr, > > unsigned long pfn); > > diff --git a/kernel/kcov.c b/kernel/kcov.c > > index 3ebd09e..2afaeb4 100644 > > --- a/kernel/kcov.c > > +++ b/kernel/kcov.c > > @@ -293,8 +293,8 @@ static int kcov_mmap(struct file *filep, struct vm_area_struct *vma) > > spin_unlock(&kcov->lock); > > for (off = 0; off < size; off += PAGE_SIZE) { > > page = vmalloc_to_page(kcov->area + off); > > - if (vm_insert_page(vma, vma->vm_start + off, page)) > > - WARN_ONCE(1, "vm_insert_page() failed"); > > + if (vm_insert_kmem_page(vma, vma->vm_start + off, page)) > > + WARN_ONCE(1, "vm_insert_kmem_page() failed"); > > } > > return 0; > > } > > diff --git a/mm/memory.c b/mm/memory.c > > index c467102..b800c10 100644 > > --- a/mm/memory.c > > +++ b/mm/memory.c > > @@ -1682,6 +1682,75 @@ pte_t *__get_locked_pte(struct mm_struct *mm, unsigned long addr, > > return pte_alloc_map_lock(mm, pmd, addr, ptl); > > } > > > > +static int insert_kmem_page(struct vm_area_struct *vma, unsigned long addr, > > + struct page *page, pgprot_t prot) > > +{ > > + struct mm_struct *mm = vma->vm_mm; > > + int retval; > > + pte_t *pte; > > + spinlock_t *ptl; > > + > > + retval = -EINVAL; > > + if (PageAnon(page)) > > + goto out; > > + retval = -ENOMEM; > > + flush_dcache_page(page); > > + pte = get_locked_pte(mm, addr, &ptl); > > + if (!pte) > > + goto out; > > + retval = -EBUSY; > > + if (!pte_none(*pte)) > > + goto out_unlock; > > + > > + get_page(page); > > + inc_mm_counter_fast(mm, mm_counter_file(page)); > > + page_add_file_rmap(page, false); > > + set_pte_at(mm, addr, pte, mk_pte(page, prot)); > > + > > + retval = 0; > > + pte_unmap_unlock(pte, ptl); > > + return retval; > > +out_unlock: > > + pte_unmap_unlock(pte, ptl); > > +out: > > + return retval; > > +} > > + > > +/** > > + * vm_insert_kmem_page - insert single page into user vma > > + * @vma: user vma to map to > > + * @addr: target user address of this page > > + * @page: source kernel page > > + * > > + * This allows drivers to insert individual kernel memory into a user vma. > > + * This API should be used outside page fault handlers context. > > + * > > + * Previously the same has been done with vm_insert_page by drivers. But > > + * vm_insert_page will be converted to vmf_insert_page and will be used > > + * in fault handlers context and return type of vmf_insert_page will be > > + * vm_fault_t type. > > This is a "temporal" comment, i.e. it refers to things that are > happening at the moment -- I would say that should be part of the > commit message, not the code, since it will be obsolete soon. Also, > consider that, in a way, vm_insert_page is actually being replaced by > vmf_insert_page only in one of the use cases (the other being replaced > by this). Maybe you could instead say something like: > > In the past, vm_insert_page was used for this purpose. Do not use > vmf_insert_page because... > > and leave the full explanation in the commit. Sure , I will work on it. > > > + * > > + * But there are places where drivers need to map kernel memory into user > > + * vma outside fault handlers context. As vmf_insert_page will be restricted > > + * to use within page fault handlers, vm_insert_kmem_page could be used > > + * to map kernel memory to user vma outside fault handlers context. > > + */ > > Ditto. 
> > > +int vm_insert_kmem_page(struct vm_area_struct *vma, unsigned long addr, > > + struct page *page) > > +{ > > + if (addr < vma->vm_start || addr >= vma->vm_end) > > + return -EFAULT; > > + if (!page_count(page)) > > + return -EINVAL; > > + if (!(vma->vm_flags & VM_MIXEDMAP)) { > > + BUG_ON(down_read_trylock(&vma->vm_mm->mmap_sem)); > > + BUG_ON(vma->vm_flags & VM_PFNMAP); > > + vma->vm_flags |= VM_MIXEDMAP; > > + } > > + return insert_kmem_page(vma, addr, page, vma->vm_page_prot); > > +} > > +EXPORT_SYMBOL(vm_insert_kmem_page); > > + > > /* > > * This is the old fallback for page remapping. > > * > > diff --git a/mm/nommu.c b/mm/nommu.c > > index e4aac33..153b8c8 100644 > > --- a/mm/nommu.c > > +++ b/mm/nommu.c > > @@ -473,6 +473,13 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr, > > } > > EXPORT_SYMBOL(vm_insert_page); > > > > +int vm_insert_kmem_page(struct vm_area_struct *vma, unsigned long addr, > > + struct page *page) > > +{ > > + return -EINVAL; > > +} > > +EXPORT_SYMBOL(vm_insert_kmem_page); > > + > > /* > > * sys_brk() for the most part doesn't need the global kernel > > * lock, except when an application is doing something nasty > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > index a728fc4..61d279f 100644 > > --- a/mm/vmalloc.c > > +++ b/mm/vmalloc.c > > @@ -2251,7 +2251,7 @@ int remap_vmalloc_range_partial(struct vm_area_struct *vma, unsigned long uaddr, > > struct page *page = vmalloc_to_page(kaddr); > > int ret; > > > > - ret = vm_insert_page(vma, uaddr, page); > > + ret = vm_insert_kmem_page(vma, uaddr, page); > > if (ret) > > return ret; > > > > -- > > 1.9.1 > > > > Cheers, > Miguel
On Thu, Oct 4, 2018 at 3:45 AM Russell King - ARM Linux <linux@armlinux.org.uk> wrote: > > On Wed, Oct 03, 2018 at 01:00:03PM -0700, Matthew Wilcox wrote: > > On Thu, Oct 04, 2018 at 12:28:54AM +0530, Souptick Joarder wrote: > > > These are the approaches which could have been taken to handle > > > this scenario - > > > > > > * Replace vm_insert_page with vmf_insert_page and then write few > > > extra lines of code to convert VM_FAULT_CODE to errno which > > > makes driver users more complex ( also the reverse mapping errno to > > > VM_FAULT_CODE have been cleaned up as part of vm_fault_t migration , > > > not preferred to introduce anything similar again) > > > > > > * Maintain both vm_insert_page and vmf_insert_page and use it in > > > respective places. But it won't gurantee that vm_insert_page will > > > never be used in #PF context. > > > > > > * Introduce a similar API like vm_insert_page, convert all non #PF > > > consumer to use it and finally remove vm_insert_page by converting > > > it to vmf_insert_page. > > > > > > And the 3rd approach was taken by introducing vm_insert_kmem_page(). > > > > > > In short, vmf_insert_page will be used in page fault handlers > > > context and vm_insert_kmem_page will be used to map kernel > > > memory to user vma outside page fault handlers context. > > > > As far as I can tell, vm_insert_kmem_page() is line-for-line identical > > with vm_insert_page(). Seriously, here's a diff I just did: > > > > -static int insert_page(struct vm_area_struct *vma, unsigned long addr, > > - struct page *page, pgprot_t prot) > > +static int insert_kmem_page(struct vm_area_struct *vma, unsigned long addr, > > + struct page *page, pgprot_t prot) > > - /* Ok, finally just insert the thing.. */ > > -int vm_insert_page(struct vm_area_struct *vma, unsigned long addr, > > +int vm_insert_kmem_page(struct vm_area_struct *vma, unsigned long addr, > > - return insert_page(vma, addr, page, vma->vm_page_prot); > > + return insert_kmem_page(vma, addr, page, vma->vm_page_prot); > > -EXPORT_SYMBOL(vm_insert_page); > > +EXPORT_SYMBOL(vm_insert_kmem_page); > > > > What on earth are you trying to do? > > Reading the commit log, it seems that the intention is to split out > vm_insert_page() used outside of page-fault handling with the use > within page-fault handling, so that different return codes can be > used. > > I don't see that justifies the code duplication - can't > vm_insert_page() and vm_insert_kmem_page() use the same mechanics > to do their job, and just translate the error code from the most- > specific to the least-specific error code? Do we really need two > copies of the same code just to return different error codes. Sorry about it. can I take below approach in a patch series -> create a wrapper function vm_insert_kmem_page using vm_insert_page. Convert all the non #PF users to use it. Then make vm_insert_page static and convert inline vmf_insert_page to caller.
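A minimal sketch of the wrapper approach proposed here, assuming vm_insert_kmem_page() keeps vm_insert_page()'s errno contract; this is hypothetical, not code from the patch:

```c
/*
 * Hypothetical wrapper as proposed above: reuse vm_insert_page()
 * instead of duplicating its body, so non-#PF callers keep the errno
 * contract while fault handlers move to vmf_insert_page().
 */
int vm_insert_kmem_page(struct vm_area_struct *vma, unsigned long addr,
			struct page *page)
{
	return vm_insert_page(vma, addr, page);
}
EXPORT_SYMBOL(vm_insert_kmem_page);
```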
On Thu, Oct 04, 2018 at 05:45:13PM +0530, Souptick Joarder wrote: > On Thu, Oct 4, 2018 at 3:45 AM Russell King - ARM Linux > <linux@armlinux.org.uk> wrote: > > > > On Wed, Oct 03, 2018 at 01:00:03PM -0700, Matthew Wilcox wrote: > > > On Thu, Oct 04, 2018 at 12:28:54AM +0530, Souptick Joarder wrote: > > > > These are the approaches which could have been taken to handle > > > > this scenario - > > > > > > > > * Replace vm_insert_page with vmf_insert_page and then write few > > > > extra lines of code to convert VM_FAULT_CODE to errno which > > > > makes driver users more complex ( also the reverse mapping errno to > > > > VM_FAULT_CODE have been cleaned up as part of vm_fault_t migration , > > > > not preferred to introduce anything similar again) > > > > > > > > * Maintain both vm_insert_page and vmf_insert_page and use it in > > > > respective places. But it won't gurantee that vm_insert_page will > > > > never be used in #PF context. > > > > > > > > * Introduce a similar API like vm_insert_page, convert all non #PF > > > > consumer to use it and finally remove vm_insert_page by converting > > > > it to vmf_insert_page. > > > > > > > > And the 3rd approach was taken by introducing vm_insert_kmem_page(). > > > > > > > > In short, vmf_insert_page will be used in page fault handlers > > > > context and vm_insert_kmem_page will be used to map kernel > > > > memory to user vma outside page fault handlers context. > > > > > > As far as I can tell, vm_insert_kmem_page() is line-for-line identical > > > with vm_insert_page(). Seriously, here's a diff I just did: > > > > > > -static int insert_page(struct vm_area_struct *vma, unsigned long addr, > > > - struct page *page, pgprot_t prot) > > > +static int insert_kmem_page(struct vm_area_struct *vma, unsigned long addr, > > > + struct page *page, pgprot_t prot) > > > - /* Ok, finally just insert the thing.. */ > > > -int vm_insert_page(struct vm_area_struct *vma, unsigned long addr, > > > +int vm_insert_kmem_page(struct vm_area_struct *vma, unsigned long addr, > > > - return insert_page(vma, addr, page, vma->vm_page_prot); > > > + return insert_kmem_page(vma, addr, page, vma->vm_page_prot); > > > -EXPORT_SYMBOL(vm_insert_page); > > > +EXPORT_SYMBOL(vm_insert_kmem_page); > > > > > > What on earth are you trying to do? > > > > > Reading the commit log, it seems that the intention is to split out > > vm_insert_page() used outside of page-fault handling with the use > > within page-fault handling, so that different return codes can be > > used. > > > > I don't see that justifies the code duplication - can't > > vm_insert_page() and vm_insert_kmem_page() use the same mechanics > > to do their job, and just translate the error code from the most- > > specific to the least-specific error code? Do we really need two > > copies of the same code just to return different error codes. > > Sorry about it. > can I take below approach in a patch series -> > > create a wrapper function vm_insert_kmem_page using vm_insert_page. > Convert all the non #PF users to use it. > Then make vm_insert_page static and convert inline vmf_insert_page to caller. I'm confused, what are you trying to do? 
It seems that we already have: vm_insert_page() - returns an errno vmf_insert_page() - returns a VM_FAULT_* code From what I _think_ you're saying, you're trying to provide vm_insert_kmem_page() as a direct replacement for the existing vm_insert_page(), and then make vm_insert_page() behave as per vmf_insert_page(), so we end up with: vm_insert_kmem_page() - returns an errno vm_insert_page() - returns a VM_FAULT_* code vmf_insert_page() - returns a VM_FAULT_* code and is identical to vm_insert_page() Given that the documentation for vm_insert_page() says: * Usually this function is called from f_op->mmap() handler * under mm->mmap_sem write-lock, so it can change vma->vm_flags. * Caller must set VM_MIXEDMAP on vma if it wants to call this * function from other places, for example from page-fault handler. this says that the "usual" use method for vm_insert_page() is _outside_ of page fault handling - if it is used _inside_ page fault handling, then it states that additional fixups are required on the VMA. So I don't get why your patch commentry seems to be saying that users of vm_insert_page() outside of page fault handling all need to be patched - isn't this the use case that this function is defined to be handling? If you're going to be changing the semantics, doesn't this create a flag day where we could get new users of vm_insert_page() using the _existing_ semantics merged after you've changed its semantics (eg, the return code)? Maybe I don't understand fully what you're trying to achieve here.
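To illustrate the documented "usual" case Russell quotes, here is a minimal f_op->mmap handler in the style of the auxdisplay conversions in the diff above; demo_buffer is a hypothetical page-aligned kernel buffer, not a name from the patch:

```c
/*
 * The documented "usual" use of vm_insert_page(): called from an
 * f_op->mmap handler, outside page fault handling, with mmap_sem
 * held for write by the mmap path. demo_buffer is hypothetical.
 */
static int demo_fb_mmap(struct fb_info *info, struct vm_area_struct *vma)
{
	return vm_insert_page(vma, vma->vm_start, virt_to_page(demo_buffer));
}
```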
On Thu, Oct 4, 2018 at 6:04 PM Russell King - ARM Linux <linux@armlinux.org.uk> wrote: > > On Thu, Oct 04, 2018 at 05:45:13PM +0530, Souptick Joarder wrote: > > On Thu, Oct 4, 2018 at 3:45 AM Russell King - ARM Linux > > <linux@armlinux.org.uk> wrote: > > > > > > On Wed, Oct 03, 2018 at 01:00:03PM -0700, Matthew Wilcox wrote: > > > > On Thu, Oct 04, 2018 at 12:28:54AM +0530, Souptick Joarder wrote: > > > > > These are the approaches which could have been taken to handle > > > > > this scenario - > > > > > > > > > > * Replace vm_insert_page with vmf_insert_page and then write few > > > > > extra lines of code to convert VM_FAULT_CODE to errno which > > > > > makes driver users more complex ( also the reverse mapping errno to > > > > > VM_FAULT_CODE have been cleaned up as part of vm_fault_t migration , > > > > > not preferred to introduce anything similar again) > > > > > > > > > > * Maintain both vm_insert_page and vmf_insert_page and use it in > > > > > respective places. But it won't gurantee that vm_insert_page will > > > > > never be used in #PF context. > > > > > > > > > > * Introduce a similar API like vm_insert_page, convert all non #PF > > > > > consumer to use it and finally remove vm_insert_page by converting > > > > > it to vmf_insert_page. > > > > > > > > > > And the 3rd approach was taken by introducing vm_insert_kmem_page(). > > > > > > > > > > In short, vmf_insert_page will be used in page fault handlers > > > > > context and vm_insert_kmem_page will be used to map kernel > > > > > memory to user vma outside page fault handlers context. > > > > > > > > As far as I can tell, vm_insert_kmem_page() is line-for-line identical > > > > with vm_insert_page(). Seriously, here's a diff I just did: > > > > > > > > -static int insert_page(struct vm_area_struct *vma, unsigned long addr, > > > > - struct page *page, pgprot_t prot) > > > > +static int insert_kmem_page(struct vm_area_struct *vma, unsigned long addr, > > > > + struct page *page, pgprot_t prot) > > > > - /* Ok, finally just insert the thing.. */ > > > > -int vm_insert_page(struct vm_area_struct *vma, unsigned long addr, > > > > +int vm_insert_kmem_page(struct vm_area_struct *vma, unsigned long addr, > > > > - return insert_page(vma, addr, page, vma->vm_page_prot); > > > > + return insert_kmem_page(vma, addr, page, vma->vm_page_prot); > > > > -EXPORT_SYMBOL(vm_insert_page); > > > > +EXPORT_SYMBOL(vm_insert_kmem_page); > > > > > > > > What on earth are you trying to do? > > > > > > > > Reading the commit log, it seems that the intention is to split out > > > vm_insert_page() used outside of page-fault handling with the use > > > within page-fault handling, so that different return codes can be > > > used. > > > > > > I don't see that justifies the code duplication - can't > > > vm_insert_page() and vm_insert_kmem_page() use the same mechanics > > > to do their job, and just translate the error code from the most- > > > specific to the least-specific error code? Do we really need two > > > copies of the same code just to return different error codes. > > > > Sorry about it. > > can I take below approach in a patch series -> > > > > create a wrapper function vm_insert_kmem_page using vm_insert_page. > > Convert all the non #PF users to use it. > > Then make vm_insert_page static and convert inline vmf_insert_page to caller. > > I'm confused, what are you trying to do? 
> > It seems that we already have: > > vm_insert_page() - returns an errno > vmf_insert_page() - returns a VM_FAULT_* code > > From what I _think_ you're saying, you're trying to provide > vm_insert_kmem_page() as a direct replacement for the existing > vm_insert_page(), and then make vm_insert_page() behave as per > vmf_insert_page(), so we end up with: yes, vm_insert_kmem_page() can be a direct replacement of vm_insert_page or might be a wrapper function written using vm_insert_page whichever suites better based on feedback. > > vm_insert_kmem_page() - returns an errno > vm_insert_page() - returns a VM_FAULT_* code > vmf_insert_page() - returns a VM_FAULT_* code and is identical to > vm_insert_page() > After completion of conversion we end up with vm_insert_kmem_page() - returns an errno vmf_insert_page() - returns a VM_FAULT_* code > Given that the documentation for vm_insert_page() says: > > * Usually this function is called from f_op->mmap() handler > * under mm->mmap_sem write-lock, so it can change vma->vm_flags. > * Caller must set VM_MIXEDMAP on vma if it wants to call this > * function from other places, for example from page-fault handler. > > this says that the "usual" use method for vm_insert_page() is > _outside_ of page fault handling - if it is used _inside_ page fault > handling, then it states that additional fixups are required on the > VMA. So I don't get why your patch commentry seems to be saying that > users of vm_insert_page() outside of page fault handling all need to > be patched - isn't this the use case that this function is defined > to be handling? The answer is yes best of my knowledge. But as mentioned in change log -> Going forward, the plan is to restrict future drivers not to use vm_insert_page ( *it will generate new errno to VM_FAULT_CODE mapping code for new drivers which were already cleaned up for existing drivers*) in #PF (page fault handler) context but to make use of vmf_insert_page which returns VMF_FAULT_CODE and that is not possible until both vm_insert_page and vmf_insert_page API exists. But there are some consumers of vm_insert_page which use it outside #PF context. straight forward conversion of vm_insert_page to vmf_insert_page won't work there as those function calls expects errno not vm_fault_t in return. If both {vm, vmf}_insert_page exists, vm_insert_page might be used for #PF context which we want to protect by removing/ replacing vm_insert_page with another similar/ wrapper API. Is that the right answer of your question ? no ? > > If you're going to be changing the semantics, doesn't this create a > flag day where we could get new users of vm_insert_page() using the > _existing_ semantics merged after you've changed its semantics (eg, > the return code)? No, vm_insert_page API will remove/ replace only when all the user are converted. We will do it intermediately by first introducing new API, convert all user to use it and at final step remove the old API.
On Thu, Oct 04, 2018 at 11:42:18PM +0530, Souptick Joarder wrote: > On Thu, Oct 4, 2018 at 6:04 PM Russell King - ARM Linux > <linux@armlinux.org.uk> wrote: > > I'm confused, what are you trying to do? > > > > It seems that we already have: > > > > vm_insert_page() - returns an errno > > vmf_insert_page() - returns a VM_FAULT_* code > > > > From what I _think_ you're saying, you're trying to provide > > vm_insert_kmem_page() as a direct replacement for the existing > > vm_insert_page(), and then make vm_insert_page() behave as per > > vmf_insert_page(), so we end up with: > > yes, vm_insert_kmem_page() can be a direct replacement of vm_insert_page > or might be a wrapper function written using vm_insert_page whichever > suites better > based on feedback. > > > > > vm_insert_kmem_page() - returns an errno > > vm_insert_page() - returns a VM_FAULT_* code > > vmf_insert_page() - returns a VM_FAULT_* code and is identical to > > vm_insert_page() > > > > After completion of conversion we end up with > > vm_insert_kmem_page() - returns an errno > vmf_insert_page() - returns a VM_FAULT_* code > > > > Given that the documentation for vm_insert_page() says: > > > > * Usually this function is called from f_op->mmap() handler > > * under mm->mmap_sem write-lock, so it can change vma->vm_flags. > > * Caller must set VM_MIXEDMAP on vma if it wants to call this > > * function from other places, for example from page-fault handler. > > > > this says that the "usual" use method for vm_insert_page() is > > _outside_ of page fault handling - if it is used _inside_ page fault > > handling, then it states that additional fixups are required on the > > VMA. So I don't get why your patch commentry seems to be saying that > > users of vm_insert_page() outside of page fault handling all need to > > be patched - isn't this the use case that this function is defined > > to be handling? > > The answer is yes best of my knowledge. > > But as mentioned in change log -> > > Going forward, the plan is to restrict future drivers not > to use vm_insert_page ( *it will generate new errno to > VM_FAULT_CODE mapping code for new drivers which were already > cleaned up for existing drivers*) in #PF (page fault handler) > context but to make use of vmf_insert_page which returns > VMF_FAULT_CODE and that is not possible until both vm_insert_page > and vmf_insert_page API exists. > > But there are some consumers of vm_insert_page which use it > outside #PF context. straight forward conversion of vm_insert_page > to vmf_insert_page won't work there as those function calls expects > errno not vm_fault_t in return. > > If both {vm, vmf}_insert_page exists, vm_insert_page might be used for > #PF context which we want to protect by removing/ replacing vm_insert_page > with another similar/ wrapper API. > > Is that the right answer of your question ? no ? I think this is a bad plan. What we should rather do is examine the current users of vm_insert_page() and ask "What interface would better replace vm_insert_page()?" As I've said to you before, I believe the right answer is to have a vm_insert_range() which takes an array of struct page pointers. That fits the majority of remaining users. ---- If we do want to rename vm_insert_page() to vm_insert_kmem_page(), then the right answer is to _just do that_. Not duplicate vm_insert_page() in its entirety. I don't see the point to doing that.
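For context, a sketch of what the suggested vm_insert_range() could look like. The name and signature are taken only from this discussion, not from any merged interface:

```c
/*
 * Sketch of Matthew's vm_insert_range() suggestion: insert an array of
 * pages contiguously into a user vma with a single call and a single
 * errno. Signature is assumed from this thread, not from merged code.
 */
int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
		    struct page **pages, unsigned long num)
{
	unsigned long i;

	for (i = 0; i < num; i++) {
		int err = vm_insert_page(vma, addr, pages[i]);

		if (err)
			return err;	/* propagate errno of the failing page */
		addr += PAGE_SIZE;
	}
	return 0;
}
```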
Hi Matthew, On Thu, Oct 4, 2018 at 1:30 AM Matthew Wilcox <willy@infradead.org> wrote: > > On Thu, Oct 04, 2018 at 12:28:54AM +0530, Souptick Joarder wrote: > > These are the approaches which could have been taken to handle > > this scenario - > > > > * Replace vm_insert_page with vmf_insert_page and then write few > > extra lines of code to convert VM_FAULT_CODE to errno which > > makes driver users more complex ( also the reverse mapping errno to > > VM_FAULT_CODE have been cleaned up as part of vm_fault_t migration , > > not preferred to introduce anything similar again) > > > > * Maintain both vm_insert_page and vmf_insert_page and use it in > > respective places. But it won't gurantee that vm_insert_page will > > never be used in #PF context. > > > > * Introduce a similar API like vm_insert_page, convert all non #PF > > consumer to use it and finally remove vm_insert_page by converting > > it to vmf_insert_page. > > > > And the 3rd approach was taken by introducing vm_insert_kmem_page(). > > > > In short, vmf_insert_page will be used in page fault handlers > > context and vm_insert_kmem_page will be used to map kernel > > memory to user vma outside page fault handlers context. > > As far as I can tell, vm_insert_kmem_page() is line-for-line identical > with vm_insert_page(). Seriously, here's a diff I just did: > > -static int insert_page(struct vm_area_struct *vma, unsigned long addr, > - struct page *page, pgprot_t prot) > +static int insert_kmem_page(struct vm_area_struct *vma, unsigned long addr, > + struct page *page, pgprot_t prot) > - /* Ok, finally just insert the thing.. */ > -int vm_insert_page(struct vm_area_struct *vma, unsigned long addr, > +int vm_insert_kmem_page(struct vm_area_struct *vma, unsigned long addr, > - return insert_page(vma, addr, page, vma->vm_page_prot); > + return insert_kmem_page(vma, addr, page, vma->vm_page_prot); > -EXPORT_SYMBOL(vm_insert_page); > +EXPORT_SYMBOL(vm_insert_kmem_page); > > What on earth are you trying to do? Shall I take below approach rather than just creating a identical API same as vm_insert_page ?? 1. create a wrapper function vm_insert_kmem_page using vm_insert_page. 2. Convert all the non #PF users to use it. 3. Then make vm_insert_page static and convert inline vmf_insert_page to caller. In that way we will be having two functions vmf_insert_page (#PF) and vm_insert_kmem_page (non #PF) and both will be using common vm_insert_page which will be static. I am clear with the problem statement but not very clear on my solution.
On Thu, Oct 4, 2018 at 11:47 PM Matthew Wilcox <willy@infradead.org> wrote: > > On Thu, Oct 04, 2018 at 11:42:18PM +0530, Souptick Joarder wrote: > > On Thu, Oct 4, 2018 at 6:04 PM Russell King - ARM Linux > > <linux@armlinux.org.uk> wrote: > > > I'm confused, what are you trying to do? > > > > > > It seems that we already have: > > > > > > vm_insert_page() - returns an errno > > > vmf_insert_page() - returns a VM_FAULT_* code > > > > > > From what I _think_ you're saying, you're trying to provide > > > vm_insert_kmem_page() as a direct replacement for the existing > > > vm_insert_page(), and then make vm_insert_page() behave as per > > > vmf_insert_page(), so we end up with: > > > > yes, vm_insert_kmem_page() can be a direct replacement of vm_insert_page > > or might be a wrapper function written using vm_insert_page whichever > > suites better > > based on feedback. > > > > > > > > vm_insert_kmem_page() - returns an errno > > > vm_insert_page() - returns a VM_FAULT_* code > > > vmf_insert_page() - returns a VM_FAULT_* code and is identical to > > > vm_insert_page() > > > > > > > After completion of conversion we end up with > > > > vm_insert_kmem_page() - returns an errno > > vmf_insert_page() - returns a VM_FAULT_* code > > > > > > > Given that the documentation for vm_insert_page() says: > > > > > > * Usually this function is called from f_op->mmap() handler > > > * under mm->mmap_sem write-lock, so it can change vma->vm_flags. > > > * Caller must set VM_MIXEDMAP on vma if it wants to call this > > > * function from other places, for example from page-fault handler. > > > > > > this says that the "usual" use method for vm_insert_page() is > > > _outside_ of page fault handling - if it is used _inside_ page fault > > > handling, then it states that additional fixups are required on the > > > VMA. So I don't get why your patch commentry seems to be saying that > > > users of vm_insert_page() outside of page fault handling all need to > > > be patched - isn't this the use case that this function is defined > > > to be handling? > > > > The answer is yes best of my knowledge. > > > > But as mentioned in change log -> > > > > Going forward, the plan is to restrict future drivers not > > to use vm_insert_page ( *it will generate new errno to > > VM_FAULT_CODE mapping code for new drivers which were already > > cleaned up for existing drivers*) in #PF (page fault handler) > > context but to make use of vmf_insert_page which returns > > VMF_FAULT_CODE and that is not possible until both vm_insert_page > > and vmf_insert_page API exists. > > > > But there are some consumers of vm_insert_page which use it > > outside #PF context. straight forward conversion of vm_insert_page > > to vmf_insert_page won't work there as those function calls expects > > errno not vm_fault_t in return. > > > > If both {vm, vmf}_insert_page exists, vm_insert_page might be used for > > #PF context which we want to protect by removing/ replacing vm_insert_page > > with another similar/ wrapper API. > > > > Is that the right answer of your question ? no ? > > I think this is a bad plan. What we should rather do is examine the current > users of vm_insert_page() and ask "What interface would better replace > vm_insert_page()?" > > As I've said to you before, I believe the right answer is to have a > vm_insert_range() which takes an array of struct page pointers. That > fits the majority of remaining users. Ok, but it will take some time. 
Is it a good idea to introduce the final vm_fault_t patch and then start working on vm_insert_range as it will be a bit time consuming? > > ---- > > If we do want to rename vm_insert_page() to vm_insert_kmem_page(), then the right answer is to _just do that_. Not duplicate vm_insert_page() in its entirety. I don't see the point to doing that.
Hi Souptick, On Thu, Oct 4, 2018 at 8:49 PM Souptick Joarder <jrdr.linux@gmail.com> wrote: > > On Thu, Oct 4, 2018 at 11:47 PM Matthew Wilcox <willy@infradead.org> wrote: > > > > I think this is a bad plan. What we should rather do is examine the current > > users of vm_insert_page() and ask "What interface would better replace > > vm_insert_page()?" > > > > As I've said to you before, I believe the right answer is to have a > > vm_insert_range() which takes an array of struct page pointers. That > > fits the majority of remaining users. > > Ok, but it will take some time. > Is it a good idea to introduce the final vm_fault_t patch and then > start working on vm_insert_range as it will be a bit time consuming? > Well, why is there a rush? Development should be done in a patch series or a tree, and submitted as a whole, instead of sending partial patches. Also, not sure if you saw my comments/review: if the interface is not going to change, why the name change? Why can't we simply keep using vm_insert_page? Cheers, Miguel
On Fri, Oct 5, 2018 at 1:16 AM Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> wrote: > > Hi Souptick, > > On Thu, Oct 4, 2018 at 8:49 PM Souptick Joarder <jrdr.linux@gmail.com> wrote: > > > > On Thu, Oct 4, 2018 at 11:47 PM Matthew Wilcox <willy@infradead.org> wrote: > > > > > > I think this is a bad plan. What we should rather do is examine the current > > > users of vm_insert_page() and ask "What interface would better replace > > > vm_insert_page()?" > > > > > > As I've said to you before, I believe the right answer is to have a > > > vm_insert_range() which takes an array of struct page pointers. That > > > fits the majority of remaining users. > > > > Ok, but it will take some time. > > Is it a good idea to introduce the final vm_fault_t patch and then > > start working on vm_insert_range as it will be a bit time consuming? > > > > Well, why is there a rush? Development should be done in a patch > series or a tree, and submitted as a whole, instead of sending partial > patches. Not in a hurry, will do it in a patch series :-) > > Also, not sure if you saw my comments/review: if the interface is not > going to change, why the name change? Why can't we simply keep using > vm_insert_page? yes, changing the name without changing the interface is a bad approach and this can't be taken. As Matthew mentioned, "vm_insert_range() which takes an array of struct page pointers. That fits the majority of remaining users" would be a better approach to fit this use case. But yes, we can't keep vm_insert_page and vmf_insert_page together as it doesn't guarantee that future drivers will not use vm_insert_page in #PF context ( which will generate new errno to VM_FAULT_CODE). Any further comments from others on the vm_insert_range() approach?
Hi Souptick, On Fri, Oct 5, 2018 at 7:51 AM Souptick Joarder <jrdr.linux@gmail.com> wrote: > > On Fri, Oct 5, 2018 at 1:16 AM Miguel Ojeda > <miguel.ojeda.sandonis@gmail.com> wrote: > > > > > > Also, not sure if you saw my comments/review: if the interface is not > > going to change, why the name change? Why can't we simply keep using > > vm_insert_page? > > yes, changing the name without changing the interface is a > bad approach and this can't be taken. As Matthew mentioned, > "vm_insert_range() which takes an array of struct page pointers. > That fits the majority of remaining users" would be a better approach > to fit this use case. > > But yes, we can't keep vm_insert_page and vmf_insert_page together > as it doesn't guarantee that future drivers will not use vm_insert_page > in #PF context ( which will generate new errno to VM_FAULT_CODE). > Maybe I am hard of thinking, but aren't you planning to remove vm_insert_page with these changes? If yes, why you can't use the keep vm_insert_page name? In other words, keep returning what the drivers expect? Cheers, Miguel
On Fri, Oct 5, 2018 at 2:22 PM Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> wrote: > > Hi Souptick, > > On Fri, Oct 5, 2018 at 7:51 AM Souptick Joarder <jrdr.linux@gmail.com> wrote: > > > > On Fri, Oct 5, 2018 at 1:16 AM Miguel Ojeda > > <miguel.ojeda.sandonis@gmail.com> wrote: > > > > > > > > > Also, not sure if you saw my comments/review: if the interface is not > > > going to change, why the name change? Why can't we simply keep using > > > vm_insert_page? > > > > yes, changing the name without changing the interface is a > > bad approach and this can't be taken. As Matthew mentioned, > > "vm_insert_range() which takes an array of struct page pointers. > > That fits the majority of remaining users" would be a better approach > > to fit this use case. > > > > But yes, we can't keep vm_insert_page and vmf_insert_page together > > as it doesn't guarantee that future drivers will not use vm_insert_page > > in #PF context ( which will generate new errno to VM_FAULT_CODE). > > > > Maybe I am hard of thinking, but aren't you planning to remove > vm_insert_page with these changes? If yes, why you can't use the keep > vm_insert_page name? In other words, keep returning what the drivers > expect? The final goal is to remove vm_insert_page by converting it to vmf_insert_page. But to do that we have to first introduce the new API which is similar to vm_insert_page (for non #PF). I tried this by introducing vm_insert_kmem_page ( * identical as vm_insert_page except API name *) in this patch. But this looks like a bad approach. The new proposal is to introduce vm_insert_range() ( * which might be bit different from vm_insert_page but will serve all the non #PF use cases)
Hi Souptick, On Fri, Oct 5, 2018 at 12:01 PM Souptick Joarder <jrdr.linux@gmail.com> wrote: > > The final goal is to remove vm_insert_page by converting it to > vmf_insert_page. But to do that we have to first introduce the > new API which is similar to vm_insert_page (for non #PF). I tried this by > introducing vm_insert_kmem_page ( * identical as vm_insert_page > except API name *) in this patch. But this looks like a bad approach. We are going in circles here. That you want to convert vm_insert_page to vmf_insert_page for the PF case is fine and understood. However, you don't *need* to introduce a new name for the remaining non-PF cases if the function is going to be the exact same thing as before. You say "The final goal is to remove vm_insert_page", but you haven't justified *why* you need to remove that name. Now, if we want to rename the function for some reason (e.g. avoid confusion with vmf_insert_page), that is fine but is another topic. It may be or not a good idea, but it is orthogonal to the vmf_ work. Matthew, on this regard, told you that you shouldn't duplicate functions. If you want a rename, do so; but don't copy the code. In other words: nobody said introducing the vm_insert_kmem_page name is a bad idea -- what Matthew told you is that *duplicating* vm_insert_page just for that is bad. Further, you are copying the code (if I understand your thought process) because you want to change the callers of non-PF first, and then do the "full conversion from vm_* to vmf_*". However, that is confusing, because there is no need to change non-PF callers of vm_insert_page since they don't care about the new vmf_* functions. Instead, the proper way of doing this is: 1. Introduce the vmf_* API 2. Change all PF-users users to that (leaving all non-PF ones untouched!) -- if this is too big, you can split this patch into several patches, one per subsystem, etc. 3. Remove the vm_* functions (except the ones that are still used in non-PF contexts, e.g. vm_insert_page) Then, optionally, if you want to rename the function for the remaining non-PF users: 4. Rename vm_insert_page (justifying why the current name is confusing *on its own merits*). Otherwise, if you want to pursue Matthew's idea: 4. Introduce the vm_insert_range (possibly leveraging vm_insert_page, or not; you have to see what is best). 5. Replace those callers that can take advantage of vm_insert_range 6. Remove vm_insert_page and replace callers with vm_insert_range (only if it is not worth to keep vm_insert_range, again justifying it *on its own merits*) As you see, these are all logical step-by-step improvements, without duplicating functions temporarily, leaving temporary changes or changing current callers to new APIs for unrelated reasons (i.e. no need to introduce vm_insert_kmem_page simply to do a "conversion" to vmf_). Cheers, Miguel
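To make step 5 concrete: with such a helper, the per-page loops in the diff above (firewire, rockchip, arm dma-mapping) collapse into a single call. Below is a sketch against the firewire mapping, assuming the hypothetical vm_insert_range() signature sketched earlier in the thread:

```c
/*
 * Hypothetical rewrite of the firewire caller from the diff above,
 * using the assumed vm_insert_range() helper: the per-page loop
 * disappears and the errno semantics stay the same.
 */
int fw_iso_buffer_map_vma(struct fw_iso_buffer *buffer,
			  struct vm_area_struct *vma)
{
	return vm_insert_range(vma, vma->vm_start, buffer->pages,
			       buffer->page_count);
}
```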
On Fri, Oct 5, 2018 at 4:19 PM Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> wrote:
>
> Hi Souptick,
>
> On Fri, Oct 5, 2018 at 12:01 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
> >
> > The final goal is to remove vm_insert_page by converting it to
> > vmf_insert_page. But to do that we first have to introduce a new
> > API similar to vm_insert_page (for non-#PF use). I tried this by
> > introducing vm_insert_kmem_page (*identical to vm_insert_page
> > except for the API name*) in this patch, but that looks like a bad
> > approach.
>
> We are going in circles here. That you want to convert vm_insert_page
> to vmf_insert_page for the PF case is fine and understood. However,
> you don't *need* to introduce a new name for the remaining non-PF
> cases if the function is going to be the exact same thing as before.
> You say "The final goal is to remove vm_insert_page", but you haven't
> justified *why* you need to remove that name.
>
> Now, if we want to rename the function for some reason (e.g. to avoid
> confusion with vmf_insert_page), that is fine, but it is another topic.
> It may or may not be a good idea, but it is orthogonal to the vmf_
> work. Matthew, on this regard, told you that you shouldn't duplicate
> functions. If you want a rename, do so; but don't copy the code. In
> other words: nobody said introducing the vm_insert_kmem_page name is a
> bad idea -- what Matthew told you is that *duplicating* vm_insert_page
> just for that is bad.
>
> Further, you are copying the code (if I understand your thought
> process) because you want to change the non-PF callers first, and
> then do the "full conversion from vm_* to vmf_*". However, that is
> confusing, because there is no need to change non-PF callers of
> vm_insert_page, since they don't care about the new vmf_* functions.
>
> Instead, the proper way of doing this is:
>
> 1. Introduce the vmf_* API.
> 2. Change all PF users to that (leaving all non-PF ones
> untouched!) -- if this is too big, you can split this patch into
> several patches, one per subsystem, etc.

We are done with step 2. All the PF users are converted to use
vmf_insert_page. (Ref: linux-next-20181005)

> 3. Remove the vm_* functions (except the ones that are still used in
> non-PF contexts, e.g. vm_insert_page).

Step 3 is part of step 2. Already done.

>
> Then, optionally, if you want to rename the function for the remaining
> non-PF users:
>
> 4. Rename vm_insert_page (justifying why the current name is
> confusing *on its own merits*).
>
> Otherwise, if you want to pursue Matthew's idea:
>
> 4. Introduce vm_insert_range (possibly leveraging
> vm_insert_page, or not; you have to see what is best).
> 5. Replace those callers that can take advantage of vm_insert_range.
> 6. Remove vm_insert_page and replace callers with vm_insert_range
> (only if it is not worth keeping vm_insert_page, again justifying
> it *on its own merits*).

Steps 4 to 6: going to do them. They are part of the plan now :-)

>
> As you see, these are all logical step-by-step improvements, without
> duplicating functions temporarily, leaving temporary changes, or
> changing current callers to new APIs for unrelated reasons (i.e. no
> need to introduce vm_insert_kmem_page simply to do a "conversion" to
> vmf_).
>
> Cheers,
> Miguel
On Fri, Oct 5, 2018 at 2:11 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
>
> On Fri, Oct 5, 2018 at 4:19 PM Miguel Ojeda
> <miguel.ojeda.sandonis@gmail.com> wrote:
> >
> > 1. Introduce the vmf_* API.
> > 2. Change all PF users to that (leaving all non-PF ones
> > untouched!) -- if this is too big, you can split this patch into
> > several patches, one per subsystem, etc.
>
> We are done with step 2. All the PF users are converted to use
> vmf_insert_page. (Ref: linux-next-20181005)

They are not supposed to be "steps". You did it with 70+ commits (!!)
over the course of several months. Why wasn't a tree created, the
stuff developed there and, when done, submitted for review?

> >
> > Otherwise, if you want to pursue Matthew's idea:
> >
> > 4. Introduce vm_insert_range (possibly leveraging
> > vm_insert_page, or not; you have to see what is best).
> > 5. Replace those callers that can take advantage of vm_insert_range.
> > 6. Remove vm_insert_page and replace callers with vm_insert_range
> > (only if it is not worth keeping vm_insert_page, again justifying
> > it *on its own merits*).
>
> Steps 4 to 6: going to do them. They are part of the plan now :-)

Fine, but you haven't answered the other parts of my email: you
don't explain why you chose one alternative over the others, you
simply keep changing the approach.

Cheers,
Miguel
On Fri, Oct 5, 2018 at 11:39 PM Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> wrote:
>
> On Fri, Oct 5, 2018 at 2:11 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
> >
> > On Fri, Oct 5, 2018 at 4:19 PM Miguel Ojeda
> > <miguel.ojeda.sandonis@gmail.com> wrote:
> > >
> > > 1. Introduce the vmf_* API.
> > > 2. Change all PF users to that (leaving all non-PF ones
> > > untouched!) -- if this is too big, you can split this patch into
> > > several patches, one per subsystem, etc.
> >
> > We are done with step 2. All the PF users are converted to use
> > vmf_insert_page. (Ref: linux-next-20181005)
>
> They are not supposed to be "steps". You did it with 70+ commits (!!)
> over the course of several months. Why wasn't a tree created, the
> stuff developed there and, when done, submitted for review?

Because we already had a plan for the entire vm_fault_t migration and
the *instruction* was to send one patch per driver.

> > >
> > > Otherwise, if you want to pursue Matthew's idea:
> > >
> > > 4. Introduce vm_insert_range (possibly leveraging
> > > vm_insert_page, or not; you have to see what is best).
> > > 5. Replace those callers that can take advantage of vm_insert_range.
> > > 6. Remove vm_insert_page and replace callers with vm_insert_range
> > > (only if it is not worth keeping vm_insert_page, again justifying
> > > it *on its own merits*).
> >
> > Steps 4 to 6: going to do them. They are part of the plan now :-)
>
> Fine, but you haven't answered the other parts of my email: you
> don't explain why you chose one alternative over the others, you
> simply keep changing the approach.

We are going in circles here. That you want to convert vm_insert_page
to vmf_insert_page for the PF case is fine and understood. However,
you don't *need* to introduce a new name for the remaining non-PF
cases if the function is going to be the exact same thing as before.
You say "The final goal is to remove vm_insert_page", but you haven't
justified *why* you need to remove that name.

I think I have given that answer. If we don't remove vm_insert_page,
future #PF callers will have the option to use it. But those should be
restricted. How are we going to restrict vm_insert_page in one half
of the kernel when the other half is still using it? Is there any way?
(I don't know)
On Sat, Oct 6, 2018 at 7:11 AM Souptick Joarder <jrdr.linux@gmail.com> wrote:
>
> On Fri, Oct 5, 2018 at 11:39 PM Miguel Ojeda
> <miguel.ojeda.sandonis@gmail.com> wrote:
> > They are not supposed to be "steps". You did it with 70+ commits (!!)
> > over the course of several months. Why wasn't a tree created, the
> > stuff developed there and, when done, submitted for review?
>
> Because we already had a plan for the entire vm_fault_t migration and
> the *instruction* was to send one patch per driver.

The instruction?

>
> > Fine, but you haven't answered the other parts of my email: you
> > don't explain why you chose one alternative over the others, you
> > simply keep changing the approach.
>
> We are going in circles here. That you want to convert vm_insert_page
> to vmf_insert_page for the PF case is fine and understood. However,
> you don't *need* to introduce a new name for the remaining non-PF
> cases if the function is going to be the exact same thing as before.
> You say "The final goal is to remove vm_insert_page", but you haven't
> justified *why* you need to remove that name.
>
> I think I have given that answer. If we don't remove vm_insert_page,
> future #PF callers will have the option to use it. But those should be
> restricted. How are we going to restrict vm_insert_page in one half
> of the kernel when the other half is still using it? Is there any way?
> (I don't know)

Ah, so that is what you are concerned about: future misuses. Well, I
don't really see the problem. There are only ~18 calls to
vm_insert_page() in the entire kernel: checking whether people are
using it properly for a while should be easy. As long as the new
behavior is documented properly, it should be fine. If you are really
concerned about mistakes being made, then fine, we can rename it as I
suggested.

Now, the new vm_insert_range() is another topic. It simplifies a few
of the callers and buys us the rename at the same time, so I am also
OK with it.

As you see, I am not against the changes -- it is just that they
should be clearly justified. :-) It wasn't clear what your problem
with the current vm_insert_page() is.

Cheers,
Miguel
On Sat, Oct 6, 2018 at 4:19 PM Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> wrote:
>
> On Sat, Oct 6, 2018 at 7:11 AM Souptick Joarder <jrdr.linux@gmail.com> wrote:
> >
> > On Fri, Oct 5, 2018 at 11:39 PM Miguel Ojeda
> > <miguel.ojeda.sandonis@gmail.com> wrote:
> > > They are not supposed to be "steps". You did it with 70+ commits (!!)
> > > over the course of several months. Why wasn't a tree created, the
> > > stuff developed there and, when done, submitted for review?
> >
> > Because we already had a plan for the entire vm_fault_t migration and
> > the *instruction* was to send one patch per driver.
>
> The instruction?

Sorry for the delayed response.
Instruction from Matthew Wilcox, who is supervising the entire
vm_fault_t migration work :-)

>
> >
> > > Fine, but you haven't answered the other parts of my email: you
> > > don't explain why you chose one alternative over the others, you
> > > simply keep changing the approach.
> >
> > We are going in circles here. That you want to convert vm_insert_page
> > to vmf_insert_page for the PF case is fine and understood. However,
> > you don't *need* to introduce a new name for the remaining non-PF
> > cases if the function is going to be the exact same thing as before.
> > You say "The final goal is to remove vm_insert_page", but you haven't
> > justified *why* you need to remove that name.
> >
> > I think I have given that answer. If we don't remove vm_insert_page,
> > future #PF callers will have the option to use it. But those should be
> > restricted. How are we going to restrict vm_insert_page in one half
> > of the kernel when the other half is still using it? Is there any way?
> > (I don't know)
>
> Ah, so that is what you are concerned about: future misuses. Well, I
> don't really see the problem. There are only ~18 calls to
> vm_insert_page() in the entire kernel: checking whether people are
> using it properly for a while should be easy. As long as the new
> behavior is documented properly, it should be fine. If you are really
> concerned about mistakes being made, then fine, we can rename it as I
> suggested.
>
> Now, the new vm_insert_range() is another topic. It simplifies a few
> of the callers and buys us the rename at the same time, so I am also
> OK with it.
>
> As you see, I am not against the changes -- it is just that they
> should be clearly justified. :-) It wasn't clear what your problem
> with the current vm_insert_page() is.
>
> Cheers,
> Miguel
On Tue, Oct 23, 2018 at 05:44:32PM +0530, Souptick Joarder wrote:
> On Sat, Oct 6, 2018 at 4:19 PM Miguel Ojeda
> <miguel.ojeda.sandonis@gmail.com> wrote:
> >
> > On Sat, Oct 6, 2018 at 7:11 AM Souptick Joarder <jrdr.linux@gmail.com> wrote:
> > >
> > > On Fri, Oct 5, 2018 at 11:39 PM Miguel Ojeda
> > > <miguel.ojeda.sandonis@gmail.com> wrote:
> > > > They are not supposed to be "steps". You did it with 70+ commits (!!)
> > > > over the course of several months. Why wasn't a tree created, the
> > > > stuff developed there and, when done, submitted for review?
> > >
> > > Because we already had a plan for the entire vm_fault_t migration and
> > > the *instruction* was to send one patch per driver.
> >
> > The instruction?
>
> Sorry for the delayed response.
> Instruction from Matthew Wilcox, who is supervising the entire
> vm_fault_t migration work :-)

Hang on. That was for the initial vm_fault_t conversion, in which each
step was clearly an improvement. What you're looking at now is far
from that.
On Tue, Oct 23, 2018 at 5:54 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Oct 23, 2018 at 05:44:32PM +0530, Souptick Joarder wrote:
> > On Sat, Oct 6, 2018 at 4:19 PM Miguel Ojeda
> > <miguel.ojeda.sandonis@gmail.com> wrote:
> > >
> > > On Sat, Oct 6, 2018 at 7:11 AM Souptick Joarder <jrdr.linux@gmail.com> wrote:
> > > >
> > > > On Fri, Oct 5, 2018 at 11:39 PM Miguel Ojeda
> > > > <miguel.ojeda.sandonis@gmail.com> wrote:
> > > > > They are not supposed to be "steps". You did it with 70+ commits (!!)
> > > > > over the course of several months. Why wasn't a tree created, the
> > > > > stuff developed there and, when done, submitted for review?
> > > >
> > > > Because we already had a plan for the entire vm_fault_t migration and
> > > > the *instruction* was to send one patch per driver.
> > >
> > > The instruction?
> >
> > Sorry for the delayed response.
> > Instruction from Matthew Wilcox, who is supervising the entire
> > vm_fault_t migration work :-)
>
> Hang on. That was for the initial vm_fault_t conversion, in which each
> step was clearly an improvement. What you're looking at now is far
> from that.

Ok. But my understanding was that the approach of vm_insert_range came
into the discussion as part of converting vm_insert_page into
vmf_insert_page, which is still part of the original vm_fault_t
conversion discussion. No?
On Tue, Oct 23, 2018 at 06:03:42PM +0530, Souptick Joarder wrote:
> On Tue, Oct 23, 2018 at 5:54 PM Matthew Wilcox <willy@infradead.org> wrote:
> > On Tue, Oct 23, 2018 at 05:44:32PM +0530, Souptick Joarder wrote:
> > > Instruction from Matthew Wilcox, who is supervising the entire
> > > vm_fault_t migration work :-)
> >
> > Hang on. That was for the initial vm_fault_t conversion, in which each
> > step was clearly an improvement. What you're looking at now is far
> > from that.
>
> Ok. But my understanding was that the approach of vm_insert_range came
> into the discussion as part of converting vm_insert_page into
> vmf_insert_page, which is still part of the original vm_fault_t
> conversion discussion. No?

No. The initial part (converting all page fault methods to vm_fault_t)
is done. What remains undone (looking at akpm's tree) is changing the
typedef of vm_fault_t from int to unsigned int. That will prevent new
page fault handlers with the wrong type from being added.

I don't necessarily want to get rid of vm_insert_page(). Maybe it will
make sense to do that, and maybe not. What I do want to see is thought,
and not "Matthew told me to do it", when I didn't.
On Tue, Oct 23, 2018 at 6:29 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Oct 23, 2018 at 06:03:42PM +0530, Souptick Joarder wrote:
> > On Tue, Oct 23, 2018 at 5:54 PM Matthew Wilcox <willy@infradead.org> wrote:
> > > On Tue, Oct 23, 2018 at 05:44:32PM +0530, Souptick Joarder wrote:
> > > > Instruction from Matthew Wilcox, who is supervising the entire
> > > > vm_fault_t migration work :-)
> > >
> > > Hang on. That was for the initial vm_fault_t conversion, in which each
> > > step was clearly an improvement. What you're looking at now is far
> > > from that.
> >
> > Ok. But my understanding was that the approach of vm_insert_range came
> > into the discussion as part of converting vm_insert_page into
> > vmf_insert_page, which is still part of the original vm_fault_t
> > conversion discussion. No?
>
> No. The initial part (converting all page fault methods to vm_fault_t)
> is done. What remains undone (looking at akpm's tree) is changing the
> typedef of vm_fault_t from int to unsigned int. That will prevent new
> page fault handlers with the wrong type from being added.

Ok, I will post the final vm_fault_t typedef patch.

>
> I don't necessarily want to get rid of vm_insert_page(). Maybe it will
> make sense to do that, and maybe not. What I do want to see is thought,
> and not "Matthew told me to do it", when I didn't.

I didn't mean it in any other way. Sorry about that. I will work on it.
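[For context, the pending typedef change Matthew refers to is roughly
the following -- a sketch based on his description above; the exact
final form in akpm's tree (e.g. whether a sparse __bitwise annotation
is also added) is not shown here:]

	/*
	 * Today: vm_fault_t is a plain int, so an errno-returning
	 * handler silently type-checks as a fault handler.
	 */
	typedef int vm_fault_t;

	/*
	 * After the change: a distinct unsigned type. A fault handler
	 * still declared as returning int no longer matches the
	 * vm_operations_struct ->fault prototype, so the compiler
	 * flags new handlers with the wrong return type.
	 */
	typedef unsigned int vm_fault_t;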
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 6656647..58d7971 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1598,7 +1598,7 @@ static int __arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma
 	pages += off;
 
 	do {
-		int ret = vm_insert_page(vma, uaddr, *pages++);
+		int ret = vm_insert_kmem_page(vma, uaddr, *pages++);
 		if (ret) {
 			pr_err("Remapping memory failed: %d\n", ret);
 			return ret;
diff --git a/drivers/auxdisplay/cfag12864bfb.c b/drivers/auxdisplay/cfag12864bfb.c
index 40c8a55..82fd627 100644
--- a/drivers/auxdisplay/cfag12864bfb.c
+++ b/drivers/auxdisplay/cfag12864bfb.c
@@ -52,7 +52,7 @@
 
 static int cfag12864bfb_mmap(struct fb_info *info, struct vm_area_struct *vma)
 {
-	return vm_insert_page(vma, vma->vm_start,
+	return vm_insert_kmem_page(vma, vma->vm_start,
 		virt_to_page(cfag12864b_buffer));
 }
 
diff --git a/drivers/auxdisplay/ht16k33.c b/drivers/auxdisplay/ht16k33.c
index a43276c..64de30b 100644
--- a/drivers/auxdisplay/ht16k33.c
+++ b/drivers/auxdisplay/ht16k33.c
@@ -224,7 +224,7 @@ static int ht16k33_mmap(struct fb_info *info, struct vm_area_struct *vma)
 {
 	struct ht16k33_priv *priv = info->par;
 
-	return vm_insert_page(vma, vma->vm_start,
+	return vm_insert_kmem_page(vma, vma->vm_start,
 		virt_to_page(priv->fbdev.buffer));
 }
 
diff --git a/drivers/firewire/core-iso.c b/drivers/firewire/core-iso.c
index 051327a..5f1548d 100644
--- a/drivers/firewire/core-iso.c
+++ b/drivers/firewire/core-iso.c
@@ -112,7 +112,7 @@ int fw_iso_buffer_map_vma(struct fw_iso_buffer *buffer,
 
 	uaddr = vma->vm_start;
 	for (i = 0; i < buffer->page_count; i++) {
-		err = vm_insert_page(vma, uaddr, buffer->pages[i]);
+		err = vm_insert_kmem_page(vma, uaddr, buffer->pages[i]);
 		if (err)
 			return err;
 
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
index a8db758..57eb7af 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
@@ -234,7 +234,7 @@ static int rockchip_drm_gem_object_mmap_iommu(struct drm_gem_object *obj,
 		return -ENXIO;
 
 	for (i = offset; i < end; i++) {
-		ret = vm_insert_page(vma, uaddr, rk_obj->pages[i]);
+		ret = vm_insert_kmem_page(vma, uaddr, rk_obj->pages[i]);
 		if (ret)
 			return ret;
 		uaddr += PAGE_SIZE;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index a61ebe8..5f42d35 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2477,6 +2477,8 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
 struct vm_area_struct *find_extend_vma(struct mm_struct *, unsigned long addr);
 int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
 			unsigned long pfn, unsigned long size, pgprot_t);
+int vm_insert_kmem_page(struct vm_area_struct *vma, unsigned long addr,
+			struct page *page);
 int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
 int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 			unsigned long pfn);
diff --git a/kernel/kcov.c b/kernel/kcov.c
index 3ebd09e..2afaeb4 100644
--- a/kernel/kcov.c
+++ b/kernel/kcov.c
@@ -293,8 +293,8 @@ static int kcov_mmap(struct file *filep, struct vm_area_struct *vma)
 	spin_unlock(&kcov->lock);
 	for (off = 0; off < size; off += PAGE_SIZE) {
 		page = vmalloc_to_page(kcov->area + off);
-		if (vm_insert_page(vma, vma->vm_start + off, page))
-			WARN_ONCE(1, "vm_insert_page() failed");
+		if (vm_insert_kmem_page(vma, vma->vm_start + off, page))
+			WARN_ONCE(1, "vm_insert_kmem_page() failed");
 	}
 	return 0;
 }
diff --git a/mm/memory.c b/mm/memory.c
index c467102..b800c10 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1682,6 +1682,75 @@ pte_t *__get_locked_pte(struct mm_struct *mm, unsigned long addr,
 	return pte_alloc_map_lock(mm, pmd, addr, ptl);
 }
 
+static int insert_kmem_page(struct vm_area_struct *vma, unsigned long addr,
+			struct page *page, pgprot_t prot)
+{
+	struct mm_struct *mm = vma->vm_mm;
+	int retval;
+	pte_t *pte;
+	spinlock_t *ptl;
+
+	retval = -EINVAL;
+	if (PageAnon(page))
+		goto out;
+	retval = -ENOMEM;
+	flush_dcache_page(page);
+	pte = get_locked_pte(mm, addr, &ptl);
+	if (!pte)
+		goto out;
+	retval = -EBUSY;
+	if (!pte_none(*pte))
+		goto out_unlock;
+
+	get_page(page);
+	inc_mm_counter_fast(mm, mm_counter_file(page));
+	page_add_file_rmap(page, false);
+	set_pte_at(mm, addr, pte, mk_pte(page, prot));
+
+	retval = 0;
+	pte_unmap_unlock(pte, ptl);
+	return retval;
+out_unlock:
+	pte_unmap_unlock(pte, ptl);
+out:
+	return retval;
+}
+
+/**
+ * vm_insert_kmem_page - insert single page into user vma
+ * @vma: user vma to map to
+ * @addr: target user address of this page
+ * @page: source kernel page
+ *
+ * This allows drivers to insert individual kernel memory into a user vma.
+ * This API should be used outside page fault handlers context.
+ *
+ * Previously the same has been done with vm_insert_page by drivers. But
+ * vm_insert_page will be converted to vmf_insert_page and will be used
+ * in fault handlers context and return type of vmf_insert_page will be
+ * vm_fault_t type.
+ *
+ * But there are places where drivers need to map kernel memory into user
+ * vma outside fault handlers context. As vmf_insert_page will be restricted
+ * to use within page fault handlers, vm_insert_kmem_page could be used
+ * to map kernel memory to user vma outside fault handlers context.
+ */
+int vm_insert_kmem_page(struct vm_area_struct *vma, unsigned long addr,
+			struct page *page)
+{
+	if (addr < vma->vm_start || addr >= vma->vm_end)
+		return -EFAULT;
+	if (!page_count(page))
+		return -EINVAL;
+	if (!(vma->vm_flags & VM_MIXEDMAP)) {
+		BUG_ON(down_read_trylock(&vma->vm_mm->mmap_sem));
+		BUG_ON(vma->vm_flags & VM_PFNMAP);
+		vma->vm_flags |= VM_MIXEDMAP;
+	}
+	return insert_kmem_page(vma, addr, page, vma->vm_page_prot);
+}
+EXPORT_SYMBOL(vm_insert_kmem_page);
+
 /*
  * This is the old fallback for page remapping.
  *
diff --git a/mm/nommu.c b/mm/nommu.c
index e4aac33..153b8c8 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -473,6 +473,13 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
 }
 EXPORT_SYMBOL(vm_insert_page);
 
+int vm_insert_kmem_page(struct vm_area_struct *vma, unsigned long addr,
+			struct page *page)
+{
+	return -EINVAL;
+}
+EXPORT_SYMBOL(vm_insert_kmem_page);
+
 /*
  * sys_brk() for the most part doesn't need the global kernel
  * lock, except when an application is doing something nasty
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index a728fc4..61d279f 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2251,7 +2251,7 @@ int remap_vmalloc_range_partial(struct vm_area_struct *vma, unsigned long uaddr,
 		struct page *page = vmalloc_to_page(kaddr);
 		int ret;
 
-		ret = vm_insert_page(vma, uaddr, page);
+		ret = vm_insert_kmem_page(vma, uaddr, page);
 		if (ret)
 			return ret;
 
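[As a usage illustration: with this patch, a driver's mmap method -- a
non-#PF context -- would map a vmalloc'ed buffer along these lines.
This is a hypothetical driver sketch, not part of the patch; my_drv_buf
stands in for the driver's vmalloc'ed buffer:]

	static int my_drv_mmap(struct file *filp, struct vm_area_struct *vma)
	{
		unsigned long size = vma->vm_end - vma->vm_start;
		unsigned long off;
		struct page *page;
		int ret;

		for (off = 0; off < size; off += PAGE_SIZE) {
			/* my_drv_buf is assumed to be vmalloc'ed */
			page = vmalloc_to_page(my_drv_buf + off);
			ret = vm_insert_kmem_page(vma, vma->vm_start + off, page);
			if (ret)
				return ret;	/* plain errno, as ->mmap expects */
		}
		return 0;
	}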
vm_insert_kmem_page is similar to vm_insert_page and will be used by
drivers to map kernel (kmalloc/vmalloc/pages) allocated memory to a
user vma.

Going forward, the plan is to restrict future drivers from using
vm_insert_page in #PF (page fault handler) context (*it would generate
new errno to VM_FAULT_CODE mapping code for new drivers, which has
already been cleaned up for existing drivers*) and to make them use
vmf_insert_page, which returns a VM_FAULT_CODE. That is not possible
while both the vm_insert_page and vmf_insert_page APIs exist.

But there are some consumers of vm_insert_page which use it outside
#PF context. A straightforward conversion of vm_insert_page to
vmf_insert_page won't work there, as those call sites expect an errno,
not a vm_fault_t, in return.

These are the approaches which could have been taken to handle this
scenario:

* Replace vm_insert_page with vmf_insert_page and then write a few
  extra lines of code to convert VM_FAULT_CODE to errno, which makes
  the driver users more complex (also, the reverse mapping of errno to
  VM_FAULT_CODE has been cleaned up as part of the vm_fault_t
  migration, so introducing anything similar again is not preferred).

* Maintain both vm_insert_page and vmf_insert_page and use them in
  their respective places. But that won't guarantee that vm_insert_page
  will never be used in #PF context.

* Introduce a similar API like vm_insert_page, convert all non-#PF
  consumers to use it, and finally remove vm_insert_page by converting
  it to vmf_insert_page.

And the 3rd approach was taken by introducing vm_insert_kmem_page().

In short, vmf_insert_page will be used in page fault handler context
and vm_insert_kmem_page will be used to map kernel memory to a user
vma outside page fault handler context.

A few drivers are converted to use vm_insert_kmem_page(). This will
allow both a review of the API and a check that it serves its purpose.
Other consumers of vm_insert_page (*used in non-#PF context*) will be
replaced by vm_insert_kmem_page, but in separate patches.

Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
---
v2: A few non-#PF consumers of vm_insert_page are converted to use
    vm_insert_kmem_page in patch v2.

    Updated the change log.

 arch/arm/mm/dma-mapping.c                   |  2 +-
 drivers/auxdisplay/cfag12864bfb.c           |  2 +-
 drivers/auxdisplay/ht16k33.c                |  2 +-
 drivers/firewire/core-iso.c                 |  2 +-
 drivers/gpu/drm/rockchip/rockchip_drm_gem.c |  2 +-
 include/linux/mm.h                          |  2 +
 kernel/kcov.c                               |  4 +-
 mm/memory.c                                 | 69 +++++++++++++++++++++++++++++
 mm/nommu.c                                  |  7 +++
 mm/vmalloc.c                                |  2 +-
 10 files changed, 86 insertions(+), 8 deletions(-)
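[And for contrast, the #PF side of the split described above: a fault
handler uses vmf_insert_page and returns a vm_fault_t rather than an
errno. Again a hypothetical sketch; my_priv and its pages array are
illustrative only:]

	static vm_fault_t my_drv_fault(struct vm_fault *vmf)
	{
		struct my_priv *priv = vmf->vma->vm_private_data;

		if (vmf->pgoff >= priv->page_count)
			return VM_FAULT_SIGBUS;

		/*
		 * vmf_insert_page() returns a VM_FAULT_* code directly,
		 * so no errno-to-vm_fault_t translation is needed.
		 */
		return vmf_insert_page(vmf->vma, vmf->address,
				       priv->pages[vmf->pgoff]);
	}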