Message ID | 20181206183945.GA20932@jordon-HP-15-Notebook-PC (mailing list archive) |
---|---|
State | New, archived |
Series | [v3,1/9] mm: Introduce new vm_insert_range API |
On Fri, 7 Dec 2018 00:09:45 +0530
Souptick Joarder <jrdr.linux@gmail.com> wrote:

> Previously drivers have their own way of mapping range of
> kernel pages/memory into user vma and this was done by
> invoking vm_insert_page() within a loop.
>
> As this pattern is common across different drivers, it can
> be generalized by creating a new function and use it across
> the drivers.
>
> vm_insert_range is the new API which will be used to map a
> range of kernel memory/pages to user vma.
>
> This API is tested by Heiko for Rockchip drm driver, on rk3188,
> rk3288, rk3328 and rk3399 with graphics.
>
> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> Reviewed-by: Matthew Wilcox <willy@infradead.org>
> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
> Tested-by: Heiko Stuebner <heiko@sntech.de>

Looks good to me.

Reviewed-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>

> ---
>  include/linux/mm.h |  2 ++
>  mm/memory.c        | 38 ++++++++++++++++++++++++++++++++++++++
>  mm/nommu.c         |  7 +++++++
>  3 files changed, 47 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index fcf9cc9..2bc399f 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2506,6 +2506,8 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
>  int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
>  			unsigned long pfn, unsigned long size, pgprot_t);
>  int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> +			struct page **pages, unsigned long page_count);
>  vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
>  			unsigned long pfn);
>  vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
> diff --git a/mm/memory.c b/mm/memory.c
> index 15c417e..84ea46c 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1478,6 +1478,44 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,
>  }
>
>  /**
> + * vm_insert_range - insert range of kernel pages into user vma
> + * @vma: user vma to map to
> + * @addr: target user address of this page
> + * @pages: pointer to array of source kernel pages
> + * @page_count: number of pages need to insert into user vma
> + *
> + * This allows drivers to insert range of kernel pages they've allocated
> + * into a user vma. This is a generic function which drivers can use
> + * rather than using their own way of mapping range of kernel pages into
> + * user vma.
> + *
> + * If we fail to insert any page into the vma, the function will return
> + * immediately leaving any previously-inserted pages present. Callers
> + * from the mmap handler may immediately return the error as their caller
> + * will destroy the vma, removing any successfully-inserted pages. Other
> + * callers should make their own arrangements for calling unmap_region().
> + *
> + * Context: Process context. Called by mmap handlers.
> + * Return: 0 on success and error code otherwise
> + */
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> +			struct page **pages, unsigned long page_count)
> +{
> +	unsigned long uaddr = addr;
> +	int ret = 0, i;
> +
> +	for (i = 0; i < page_count; i++) {
> +		ret = vm_insert_page(vma, uaddr, pages[i]);
> +		if (ret < 0)
> +			return ret;
> +		uaddr += PAGE_SIZE;
> +	}
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(vm_insert_range);
> +
> +/**
>   * vm_insert_page - insert single page into user vma
>   * @vma: user vma to map to
>   * @addr: target user address of this page
> diff --git a/mm/nommu.c b/mm/nommu.c
> index 749276b..d6ef5c7 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -473,6 +473,13 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
>  }
>  EXPORT_SYMBOL(vm_insert_page);
>
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> +			struct page **pages, unsigned long page_count)
> +{
> +	return -EINVAL;
> +}
> +EXPORT_SYMBOL(vm_insert_range);
> +
>  /*
>   * sys_brk() for the most part doesn't need the global kernel
>   * lock, except when an application is doing something nasty

Thanks,
Mauro
On 06/12/2018 18:39, Souptick Joarder wrote:
> Previously drivers have their own way of mapping range of
> kernel pages/memory into user vma and this was done by
> invoking vm_insert_page() within a loop.
>
> As this pattern is common across different drivers, it can
> be generalized by creating a new function and use it across
> the drivers.
>
> vm_insert_range is the new API which will be used to map a
> range of kernel memory/pages to user vma.
>
> This API is tested by Heiko for Rockchip drm driver, on rk3188,
> rk3288, rk3328 and rk3399 with graphics.
>
> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> Reviewed-by: Matthew Wilcox <willy@infradead.org>
> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
> Tested-by: Heiko Stuebner <heiko@sntech.de>
> ---
>  include/linux/mm.h |  2 ++
>  mm/memory.c        | 38 ++++++++++++++++++++++++++++++++++++++
>  mm/nommu.c         |  7 +++++++
>  3 files changed, 47 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index fcf9cc9..2bc399f 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2506,6 +2506,8 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
>  int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
>  			unsigned long pfn, unsigned long size, pgprot_t);
>  int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> +			struct page **pages, unsigned long page_count);
>  vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
>  			unsigned long pfn);
>  vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
> diff --git a/mm/memory.c b/mm/memory.c
> index 15c417e..84ea46c 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1478,6 +1478,44 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,
>  }
>
>  /**
> + * vm_insert_range - insert range of kernel pages into user vma
> + * @vma: user vma to map to
> + * @addr: target user address of this page
> + * @pages: pointer to array of source kernel pages
> + * @page_count: number of pages need to insert into user vma
> + *
> + * This allows drivers to insert range of kernel pages they've allocated
> + * into a user vma. This is a generic function which drivers can use
> + * rather than using their own way of mapping range of kernel pages into
> + * user vma.
> + *
> + * If we fail to insert any page into the vma, the function will return
> + * immediately leaving any previously-inserted pages present. Callers
> + * from the mmap handler may immediately return the error as their caller
> + * will destroy the vma, removing any successfully-inserted pages. Other
> + * callers should make their own arrangements for calling unmap_region().
> + *
> + * Context: Process context. Called by mmap handlers.
> + * Return: 0 on success and error code otherwise
> + */
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> +			struct page **pages, unsigned long page_count)
> +{
> +	unsigned long uaddr = addr;
> +	int ret = 0, i;

Some of the sites being replaced were effectively ensuring that vma and
pages were mutually compatible as an initial condition - would it be worth
adding something here for robustness, e.g.:

+	if (page_count != vma_pages(vma))
+		return -ENXIO;

? (then you could also clean up a couple more places where you're not
already removing such checks)

Robin.

> +
> +	for (i = 0; i < page_count; i++) {
> +		ret = vm_insert_page(vma, uaddr, pages[i]);
> +		if (ret < 0)
> +			return ret;
> +		uaddr += PAGE_SIZE;
> +	}
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(vm_insert_range);
> +
> +/**
>   * vm_insert_page - insert single page into user vma
>   * @vma: user vma to map to
>   * @addr: target user address of this page
> diff --git a/mm/nommu.c b/mm/nommu.c
> index 749276b..d6ef5c7 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -473,6 +473,13 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
>  }
>  EXPORT_SYMBOL(vm_insert_page);
>
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> +			struct page **pages, unsigned long page_count)
> +{
> +	return -EINVAL;
> +}
> +EXPORT_SYMBOL(vm_insert_range);
> +
>  /*
>   * sys_brk() for the most part doesn't need the global kernel
>   * lock, except when an application is doing something nasty
>
On Fri, Dec 07, 2018 at 03:34:56PM +0000, Robin Murphy wrote:
> > +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> > +			struct page **pages, unsigned long page_count)
> > +{
> > +	unsigned long uaddr = addr;
> > +	int ret = 0, i;
>
> Some of the sites being replaced were effectively ensuring that vma and
> pages were mutually compatible as an initial condition - would it be worth
> adding something here for robustness, e.g.:
>
> +	if (page_count != vma_pages(vma))
> +		return -ENXIO;

I think we want to allow this to be used to populate part of a VMA.
So perhaps:

	if (page_count > vma_pages(vma))
		return -ENXIO;
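To make the suggestion above concrete, here is a minimal userspace sketch of the loop from the patch with the lenient check placed in front of it. It is not kernel code: `model_vma`, `model_page`, `model_vma_pages()`, `model_insert_page()` and the 4 KiB page size are hypothetical stand-ins for `struct vm_area_struct`, `struct page`, `vma_pages()`, `vm_insert_page()` and `PAGE_SIZE`, and the loop index is `unsigned long` here only to match the type of `page_count`.

```c
#include <errno.h>

/* Userspace stand-ins; none of these are the kernel's definitions. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)

struct model_page { int id; };

struct model_vma {
	unsigned long vm_start;  /* first address covered by the mapping */
	unsigned long vm_end;    /* first address past the mapping */
	unsigned long inserted;  /* how many pages the model has "mapped" */
};

/* Same arithmetic as vma_pages(): length of the mapping in pages. */
static unsigned long model_vma_pages(const struct model_vma *vma)
{
	return (vma->vm_end - vma->vm_start) >> MODEL_PAGE_SHIFT;
}

/* Stub for vm_insert_page(): it only range-checks and counts. */
static int model_insert_page(struct model_vma *vma, unsigned long addr,
			     struct model_page *page)
{
	(void)page;
	if (addr < vma->vm_start || addr >= vma->vm_end)
		return -EFAULT;
	vma->inserted++;
	return 0;
}

static int model_insert_range(struct model_vma *vma, unsigned long addr,
			      struct model_page **pages,
			      unsigned long page_count)
{
	unsigned long uaddr = addr;
	unsigned long i;
	int ret = 0;

	/* Lenient check: partial population is fine, an overrun is not. */
	if (page_count > model_vma_pages(vma))
		return -ENXIO;

	for (i = 0; i < page_count; i++) {
		ret = model_insert_page(vma, uaddr, pages[i]);
		if (ret < 0)
			return ret;
		uaddr += MODEL_PAGE_SIZE;
	}

	return ret;
}
```

With a four-page mapping, a request for two pages succeeds and a request for five is rejected before anything is inserted.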
On Fri, Dec 7, 2018 at 10:41 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Fri, Dec 07, 2018 at 03:34:56PM +0000, Robin Murphy wrote:
> > > +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> > > +			struct page **pages, unsigned long page_count)
> > > +{
> > > +	unsigned long uaddr = addr;
> > > +	int ret = 0, i;
> >
> > Some of the sites being replaced were effectively ensuring that vma and
> > pages were mutually compatible as an initial condition - would it be worth
> > adding something here for robustness, e.g.:
> >
> > +	if (page_count != vma_pages(vma))
> > +		return -ENXIO;
>
> I think we want to allow this to be used to populate part of a VMA.
> So perhaps:
>
> 	if (page_count > vma_pages(vma))
> 		return -ENXIO;

Ok, This can be added.

I think Patch [2/9] is the only leftover place where this
check could be removed.
On 2018-12-07 7:28 pm, Souptick Joarder wrote:
> On Fri, Dec 7, 2018 at 10:41 PM Matthew Wilcox <willy@infradead.org> wrote:
>>
>> On Fri, Dec 07, 2018 at 03:34:56PM +0000, Robin Murphy wrote:
>>>> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
>>>> +			struct page **pages, unsigned long page_count)
>>>> +{
>>>> +	unsigned long uaddr = addr;
>>>> +	int ret = 0, i;
>>>
>>> Some of the sites being replaced were effectively ensuring that vma and
>>> pages were mutually compatible as an initial condition - would it be worth
>>> adding something here for robustness, e.g.:
>>>
>>> +	if (page_count != vma_pages(vma))
>>> +		return -ENXIO;
>>
>> I think we want to allow this to be used to populate part of a VMA.
>> So perhaps:
>>
>> 	if (page_count > vma_pages(vma))
>> 		return -ENXIO;
>
> Ok, This can be added.
>
> I think Patch [2/9] is the only leftover place where this
> check could be removed.

Right, 9/9 could also have relied on my stricter check here, but since
it's really testing whether it actually managed to allocate vma_pages()
worth of pages earlier, Matthew's more lenient version won't help for
that one.

(Why privcmd_buf_mmap() doesn't clean up and return an error
as soon as that allocation loop fails, without taking the mutex under
which it still does a bunch more pointless work to only undo it again,
is a mind-boggling mystery, but that's not our problem here...)

Robin.
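The difference between the two checks for a caller whose allocation came up short can be shown with the predicates in isolation. This is only an illustrative sketch: `strict_check_ok()` and `lenient_check_ok()` are hypothetical names, and the second argument stands for whatever `vma_pages(vma)` would return.

```c
#include <stdbool.h>

/* Strict form: the page array must cover the VMA exactly. */
static bool strict_check_ok(unsigned long page_count,
			    unsigned long vma_page_count)
{
	return page_count == vma_page_count;
}

/* Lenient form: the page array may cover only part of the VMA. */
static bool lenient_check_ok(unsigned long page_count,
			     unsigned long vma_page_count)
{
	return page_count <= vma_page_count;
}
```

If a driver managed to allocate only 3 pages for a 4-page VMA, `strict_check_ok(3, 4)` is false while `lenient_check_ok(3, 4)` is true, so under the lenient form such a driver still has to detect the short allocation itself.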
On Sat, Dec 8, 2018 at 2:40 AM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2018-12-07 7:28 pm, Souptick Joarder wrote:
> > On Fri, Dec 7, 2018 at 10:41 PM Matthew Wilcox <willy@infradead.org> wrote:
> >>
> >> On Fri, Dec 07, 2018 at 03:34:56PM +0000, Robin Murphy wrote:
> >>>> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> >>>> +			struct page **pages, unsigned long page_count)
> >>>> +{
> >>>> +	unsigned long uaddr = addr;
> >>>> +	int ret = 0, i;
> >>>
> >>> Some of the sites being replaced were effectively ensuring that vma and
> >>> pages were mutually compatible as an initial condition - would it be worth
> >>> adding something here for robustness, e.g.:
> >>>
> >>> +	if (page_count != vma_pages(vma))
> >>> +		return -ENXIO;
> >>
> >> I think we want to allow this to be used to populate part of a VMA.
> >> So perhaps:
> >>
> >> 	if (page_count > vma_pages(vma))
> >> 		return -ENXIO;
> >
> > Ok, This can be added.
> >
> > I think Patch [2/9] is the only leftover place where this
> > check could be removed.
>
> Right, 9/9 could also have relied on my stricter check here, but since
> it's really testing whether it actually managed to allocate vma_pages()
> worth of pages earlier, Matthew's more lenient version won't help for
> that one. (Why privcmd_buf_mmap() doesn't clean up and return an error
> as soon as that allocation loop fails, without taking the mutex under
> which it still does a bunch more pointless work to only undo it again,
> is a mind-boggling mystery, but that's not our problem here...)

I think some clean up can be done here in a separate patch.
diff --git a/include/linux/mm.h b/include/linux/mm.h
index fcf9cc9..2bc399f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2506,6 +2506,8 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
 int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
 			unsigned long pfn, unsigned long size, pgprot_t);
 int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
+int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
+			struct page **pages, unsigned long page_count);
 vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 			unsigned long pfn);
 vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
diff --git a/mm/memory.c b/mm/memory.c
index 15c417e..84ea46c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1478,6 +1478,44 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,
 }
 
 /**
+ * vm_insert_range - insert range of kernel pages into user vma
+ * @vma: user vma to map to
+ * @addr: target user address of this page
+ * @pages: pointer to array of source kernel pages
+ * @page_count: number of pages need to insert into user vma
+ *
+ * This allows drivers to insert range of kernel pages they've allocated
+ * into a user vma. This is a generic function which drivers can use
+ * rather than using their own way of mapping range of kernel pages into
+ * user vma.
+ *
+ * If we fail to insert any page into the vma, the function will return
+ * immediately leaving any previously-inserted pages present. Callers
+ * from the mmap handler may immediately return the error as their caller
+ * will destroy the vma, removing any successfully-inserted pages. Other
+ * callers should make their own arrangements for calling unmap_region().
+ *
+ * Context: Process context. Called by mmap handlers.
+ * Return: 0 on success and error code otherwise
+ */
+int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
+			struct page **pages, unsigned long page_count)
+{
+	unsigned long uaddr = addr;
+	int ret = 0, i;
+
+	for (i = 0; i < page_count; i++) {
+		ret = vm_insert_page(vma, uaddr, pages[i]);
+		if (ret < 0)
+			return ret;
+		uaddr += PAGE_SIZE;
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL(vm_insert_range);
+
+/**
  * vm_insert_page - insert single page into user vma
  * @vma: user vma to map to
  * @addr: target user address of this page
diff --git a/mm/nommu.c b/mm/nommu.c
index 749276b..d6ef5c7 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -473,6 +473,13 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
 }
 EXPORT_SYMBOL(vm_insert_page);
 
+int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
+			struct page **pages, unsigned long page_count)
+{
+	return -EINVAL;
+}
+EXPORT_SYMBOL(vm_insert_range);
+
 /*
  * sys_brk() for the most part doesn't need the global kernel
  * lock, except when an application is doing something nasty
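The kerneldoc in the patch says that a failed insertion returns immediately and leaves earlier pages in place. A small userspace model of the same loop may help show what that means for a caller. Everything prefixed `sketch_` or `SKETCH_` is hypothetical; in particular, treating a NULL entry as a page that the per-page insert rejects is an assumption made purely for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins; none of these are the kernel's definitions. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_MAX_PAGES 8

struct sketch_page { int id; };

struct sketch_vma {
	unsigned long vm_start;
	unsigned long vm_end;
	int mapped[SKETCH_MAX_PAGES];  /* 1 once a slot holds a page */
};

/* Stub for vm_insert_page(): records which slot was populated. */
static int sketch_insert_page(struct sketch_vma *vma, unsigned long addr,
			      struct sketch_page *page)
{
	unsigned long slot;

	if (addr < vma->vm_start || addr >= vma->vm_end)
		return -EFAULT;
	if (page == NULL)  /* assumption: NULL models a rejected page */
		return -EINVAL;
	slot = (addr - vma->vm_start) / SKETCH_PAGE_SIZE;
	if (slot >= SKETCH_MAX_PAGES)
		return -EFAULT;
	vma->mapped[slot] = 1;
	return 0;
}

/* Same shape as the loop in the patch: stop at the first error, no unwind. */
static int sketch_insert_range(struct sketch_vma *vma, unsigned long addr,
			       struct sketch_page **pages,
			       unsigned long page_count)
{
	unsigned long uaddr = addr;
	unsigned long i;
	int ret = 0;

	for (i = 0; i < page_count; i++) {
		ret = sketch_insert_page(vma, uaddr, pages[i]);
		if (ret < 0)
			return ret;
		uaddr += SKETCH_PAGE_SIZE;
	}

	return ret;
}
```

If the third of four pages is rejected, the call returns the error with the first two slots still populated and the last two untouched. That is the state the kerneldoc expects mmap-handler callers to hand back to their caller, and other callers to clean up themselves.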