| Message ID | 20230130125504.2509710-3-fengwei.yin@intel.com (mailing list archive) |
|---|---|
| State | New |
| Series | folio based filemap_map_pages() |
On Mon, Jan 30, 2023 at 08:55:01PM +0800, Yin Fengwei wrote:
> Add function to do file page mapping based on folio and update
> filemap_map_pages() to use new function. So the filemap page
> mapping will deal with folio granularity instead of page
> granularity. This allow batched folio refcount update.
>
> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
> ---
>  mm/filemap.c | 82 ++++++++++++++++++++++++++++++----------------------
>  1 file changed, 48 insertions(+), 34 deletions(-)
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index c915ded191f0..fe0c226c8b1e 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -3351,6 +3351,43 @@ static inline struct folio *next_map_page(struct address_space *mapping,
>  				   mapping, xas, end_pgoff);
>  }
>
> +

I'd remove this blank line, we typically only have one blank line
between functions.

> +static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf,
> +	struct folio *folio, struct page *page, unsigned long addr,
> +	int len)

I see this under-indentation in other parts of the mm and it drives me
crazy. Two tabs to indent the arguments please, otherwise they look
like part of the function.

Also, 'len' is ambiguous. I'd call this 'nr' or 'nr_pages'. Also
it should be an unsigned int.

> +{
> +	vm_fault_t ret = 0;
> +	struct vm_area_struct *vma = vmf->vma;
> +	struct file *file = vma->vm_file;
> +	unsigned int mmap_miss = READ_ONCE(file->f_ra.mmap_miss);
> +	int ref_count = 0, count = 0;

Also make these unsigned.

> -	/*
> -	 * NOTE: If there're PTE markers, we'll leave them to be
> -	 * handled in the specific fault path, and it'll prohibit the
> -	 * fault-around logic.
> -	 */

I'd rather not lose this comment; can you move it into
filemap_map_folio_range() please?

> -	if (!pte_none(*vmf->pte))
> -		goto unlock;
> -
> -	/* We're about to handle the fault */
> -	if (vmf->address == addr)
> +	if (VM_FAULT_NOPAGE ==
> +		filemap_map_folio_range(vmf, folio, page, addr, len))
>  		ret = VM_FAULT_NOPAGE;

That indentation is also confusing. Try this:

	if (filemap_map_folio_range(vmf, folio, page, addr, len) ==
			VM_FAULT_NOPAGE)
		ret = VM_FAULT_NOPAGE;

Except there's an easier way to write it:

	ret |= filemap_map_folio_range(vmf, folio, page, addr, len);

Thanks for doing this! Looks so much better and performs better!
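[To make the review concrete: folding Matthew's comments into the posted code gives roughly the shape below. This is only a sketch of the reviewer feedback applied to v1 -- the 'nr_pages' name, the unsigned counters, the two-tab argument indent, and the relocated PTE-marker comment all come from the comments above, not from a posted v2.]

static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf,
		struct folio *folio, struct page *page,
		unsigned long addr, unsigned int nr_pages)
{
	vm_fault_t ret = 0;
	struct vm_area_struct *vma = vmf->vma;
	struct file *file = vma->vm_file;
	unsigned int mmap_miss = READ_ONCE(file->f_ra.mmap_miss);
	unsigned int ref_count = 0, count = 0;

	do {
		if (PageHWPoison(page))
			continue;

		if (mmap_miss > 0)
			mmap_miss--;

		/*
		 * NOTE: If there're PTE markers, we'll leave them to be
		 * handled in the specific fault path, and it'll prohibit the
		 * fault-around logic.
		 */
		if (!pte_none(*vmf->pte))
			continue;

		/* We're about to handle the fault */
		if (vmf->address == addr)
			ret = VM_FAULT_NOPAGE;

		ref_count++;

		do_set_pte(vmf, page, addr);
		update_mmu_cache(vma, addr, vmf->pte);
	} while (vmf->pte++, page++, addr += PAGE_SIZE, ++count < nr_pages);

	/* One atomic add for the whole batch instead of one per page. */
	folio_ref_add(folio, ref_count);
	WRITE_ONCE(file->f_ra.mmap_miss, mmap_miss);

	return ret;
}

[Note that 'continue' in a do/while still evaluates the controlling expression, so the pte/page/addr/count increments run on skipped pages too; the call site then becomes 'ret |= filemap_map_folio_range(vmf, folio, page, addr, nr_pages);' as Matthew suggests.]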
On 1/30/2023 9:35 PM, Matthew Wilcox wrote:
> On Mon, Jan 30, 2023 at 08:55:01PM +0800, Yin Fengwei wrote:
>> Add function to do file page mapping based on folio and update
>> filemap_map_pages() to use new function. So the filemap page
>> mapping will deal with folio granularity instead of page
>> granularity. This allow batched folio refcount update.
>>
>> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
>> ---
>>  mm/filemap.c | 82 ++++++++++++++++++++++++++++++----------------------
>>  1 file changed, 48 insertions(+), 34 deletions(-)
>>
>> diff --git a/mm/filemap.c b/mm/filemap.c
>> index c915ded191f0..fe0c226c8b1e 100644
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -3351,6 +3351,43 @@ static inline struct folio *next_map_page(struct address_space *mapping,
>>  				   mapping, xas, end_pgoff);
>>  }
>>
>> +
>
> I'd remove this blank line, we typically only have one blank line
> between functions.
OK.

>
>> +static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf,
>> +	struct folio *folio, struct page *page, unsigned long addr,
>> +	int len)
>
> I see this under-indentation in other parts of the mm and it drives me
> crazy. Two tabs to indent the arguments please, otherwise they look
> like part of the function.
OK. I will correct all the indent problems in this series in next version.

>
> Also, 'len' is ambiguous. I'd call this 'nr' or 'nr_pages'. Also
> it should be an unsigned int.
>
>> +{
>> +	vm_fault_t ret = 0;
>> +	struct vm_area_struct *vma = vmf->vma;
>> +	struct file *file = vma->vm_file;
>> +	unsigned int mmap_miss = READ_ONCE(file->f_ra.mmap_miss);
>> +	int ref_count = 0, count = 0;
>
> Also make these unsigned.
>
>> -	/*
>> -	 * NOTE: If there're PTE markers, we'll leave them to be
>> -	 * handled in the specific fault path, and it'll prohibit the
>> -	 * fault-around logic.
>> -	 */
>
> I'd rather not lose this comment; can you move it into
> filemap_map_folio_range() please?
I will keep all the comments in the right place in next version.

Regards
Yin, Fengwei

>
>> -	if (!pte_none(*vmf->pte))
>> -		goto unlock;
>> -
>> -	/* We're about to handle the fault */
>> -	if (vmf->address == addr)
>> +	if (VM_FAULT_NOPAGE ==
>> +		filemap_map_folio_range(vmf, folio, page, addr, len))
>>  		ret = VM_FAULT_NOPAGE;
>
> That indentation is also confusing. Try this:
>
>	if (filemap_map_folio_range(vmf, folio, page, addr, len) ==
>			VM_FAULT_NOPAGE)
>		ret = VM_FAULT_NOPAGE;
>
> Except there's an easier way to write it:
>
>	ret |= filemap_map_folio_range(vmf, folio, page, addr, len);
>
>
> Thanks for doing this! Looks so much better and performs better!
Yin Fengwei <fengwei.yin@intel.com> writes:

> Add function to do file page mapping based on folio and update
> filemap_map_pages() to use new function. So the filemap page
> mapping will deal with folio granularity instead of page
> granularity. This allow batched folio refcount update.
>
> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
> ---
>  mm/filemap.c | 82 ++++++++++++++++++++++++++++++----------------------
>  1 file changed, 48 insertions(+), 34 deletions(-)
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index c915ded191f0..fe0c226c8b1e 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -3351,6 +3351,43 @@ static inline struct folio *next_map_page(struct address_space *mapping,
>  				   mapping, xas, end_pgoff);
>  }
>
> +
> +static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf,
> +	struct folio *folio, struct page *page, unsigned long addr,
> +	int len)

As Matthew pointed out, we should rename 'len'. And some comments about
the meaning of the parameters should be good. For example,

/* Map sub-pages [start_page, start_page + nr_pages) of folio */
static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf,
		struct folio *folio, struct page *start_page,
		unsigned int nr_pages, unsigned long start)

Best Regards,
Huang, Ying

> [snip]
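[If a v2 adopts this signature -- the parameter order and the names 'start_page', 'nr_pages', and 'start' here are Huang Ying's proposal, not anything posted yet -- the call site in filemap_map_pages() would derive the batch size from the folio bounds exactly as the posted patch already computes 'len':]

	struct page *page = folio_file_page(folio, xas.xa_index);
	unsigned long end = folio->index + folio_nr_pages(folio) - 1;
	/* Cover the rest of this folio, clamped to the fault-around window. */
	unsigned int nr_pages = min(end, end_pgoff) - xas.xa_index + 1;

	ret |= filemap_map_folio_range(vmf, folio, page, nr_pages, addr);

[Here 'addr' supplies the 'start' argument. The subtraction of the previous batch size in 'vmf->pte += xas.xa_index - last_pgoff - len' in the posted patch exists because the callee advances vmf->pte by one per sub-page it visits.]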
On 1/31/2023 11:34 AM, Huang, Ying wrote:
> Yin Fengwei <fengwei.yin@intel.com> writes:
>
>> Add function to do file page mapping based on folio and update
>> filemap_map_pages() to use new function. So the filemap page
>> mapping will deal with folio granularity instead of page
>> granularity. This allow batched folio refcount update.
>>
>> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
>> ---
>>  mm/filemap.c | 82 ++++++++++++++++++++++++++++++----------------------
>>  1 file changed, 48 insertions(+), 34 deletions(-)
>>
>> diff --git a/mm/filemap.c b/mm/filemap.c
>> index c915ded191f0..fe0c226c8b1e 100644
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -3351,6 +3351,43 @@ static inline struct folio *next_map_page(struct address_space *mapping,
>>  				   mapping, xas, end_pgoff);
>>  }
>>
>> +
>> +static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf,
>> +	struct folio *folio, struct page *page, unsigned long addr,
>> +	int len)
>
> As Matthew pointed out, we should rename 'len'. And some comments about
> the meaning of the parameters should be good. For example,
>
> /* Map sub-pages [start_page, start_page + nr_pages) of folio */
> static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf,
>		struct folio *folio, struct page *start_page,
>		unsigned int nr_pages, unsigned long start)
Yes. I will address this in next version series. Thanks.

Regards
Yin, Fengwei

>
> Best Regards,
> Huang, Ying
>
> [snip]
diff --git a/mm/filemap.c b/mm/filemap.c
index c915ded191f0..fe0c226c8b1e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3351,6 +3351,43 @@ static inline struct folio *next_map_page(struct address_space *mapping,
 				   mapping, xas, end_pgoff);
 }
 
+
+static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf,
+	struct folio *folio, struct page *page, unsigned long addr,
+	int len)
+{
+	vm_fault_t ret = 0;
+	struct vm_area_struct *vma = vmf->vma;
+	struct file *file = vma->vm_file;
+	unsigned int mmap_miss = READ_ONCE(file->f_ra.mmap_miss);
+	int ref_count = 0, count = 0;
+
+	do {
+		if (PageHWPoison(page))
+			continue;
+
+		if (mmap_miss > 0)
+			mmap_miss--;
+
+		if (!pte_none(*vmf->pte))
+			continue;
+
+		if (vmf->address == addr)
+			ret = VM_FAULT_NOPAGE;
+
+		ref_count++;
+
+		do_set_pte(vmf, page, addr);
+		update_mmu_cache(vma, addr, vmf->pte);
+
+	} while (vmf->pte++, page++, addr += PAGE_SIZE, ++count < len);
+
+	folio_ref_add(folio, ref_count);
+	WRITE_ONCE(file->f_ra.mmap_miss, mmap_miss);
+
+	return ret;
+}
+
 vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 			     pgoff_t start_pgoff, pgoff_t end_pgoff)
 {
@@ -3361,9 +3398,9 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 	unsigned long addr;
 	XA_STATE(xas, &mapping->i_pages, start_pgoff);
 	struct folio *folio;
-	struct page *page;
 	unsigned int mmap_miss = READ_ONCE(file->f_ra.mmap_miss);
 	vm_fault_t ret = 0;
+	int len = 0;
 
 	rcu_read_lock();
 	folio = first_map_page(mapping, &xas, end_pgoff);
@@ -3378,45 +3415,22 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 	addr = vma->vm_start + ((start_pgoff - vma->vm_pgoff) << PAGE_SHIFT);
 	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, addr, &vmf->ptl);
 	do {
-again:
-		page = folio_file_page(folio, xas.xa_index);
-		if (PageHWPoison(page))
-			goto unlock;
-
-		if (mmap_miss > 0)
-			mmap_miss--;
+		struct page *page;
+		unsigned long end;
 
+		page = folio_file_page(folio, xas.xa_index);
 		addr += (xas.xa_index - last_pgoff) << PAGE_SHIFT;
-		vmf->pte += xas.xa_index - last_pgoff;
+		vmf->pte += xas.xa_index - last_pgoff - len;
 		last_pgoff = xas.xa_index;
+		end = folio->index + folio_nr_pages(folio) - 1;
+		len = min(end, end_pgoff) - xas.xa_index + 1;
 
-		/*
-		 * NOTE: If there're PTE markers, we'll leave them to be
-		 * handled in the specific fault path, and it'll prohibit the
-		 * fault-around logic.
-		 */
-		if (!pte_none(*vmf->pte))
-			goto unlock;
-
-		/* We're about to handle the fault */
-		if (vmf->address == addr)
+		if (VM_FAULT_NOPAGE ==
+			filemap_map_folio_range(vmf, folio, page, addr, len))
 			ret = VM_FAULT_NOPAGE;
 
-		do_set_pte(vmf, page, addr);
-		/* no need to invalidate: a not-present page won't be cached */
-		update_mmu_cache(vma, addr, vmf->pte);
-		if (folio_more_pages(folio, xas.xa_index, end_pgoff)) {
-			xas.xa_index++;
-			folio_ref_inc(folio);
-			goto again;
-		}
-		folio_unlock(folio);
-		continue;
-unlock:
-		if (folio_more_pages(folio, xas.xa_index, end_pgoff)) {
-			xas.xa_index++;
-			goto again;
-		}
+		xas.xa_index = end;
+
 		folio_unlock(folio);
 		folio_put(folio);
 	} while ((folio = next_map_page(mapping, &xas, end_pgoff)) != NULL);
Add function to do file page mapping based on folio and update
filemap_map_pages() to use new function. So the filemap page
mapping will deal with folio granularity instead of page
granularity. This allow batched folio refcount update.

Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
---
 mm/filemap.c | 82 ++++++++++++++++++++++++++++++----------------------
 1 file changed, 48 insertions(+), 34 deletions(-)
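[The performance win Matthew alludes to ("performs better") comes mainly from the batched refcount update named in the commit message. The sketch below shows that pattern in isolation; folio_map_batch_refs is a hypothetical name for illustration, and the PTE-setup details are elided, but folio_ref_inc() and folio_ref_add() are the real folio refcount APIs.]

/*
 * Hypothetical helper showing the shape of the refcount batching
 * that filemap_map_folio_range() performs.
 */
static void folio_map_batch_refs(struct folio *folio, unsigned int nr_pages)
{
	unsigned int i, ref_count = 0;

	for (i = 0; i < nr_pages; i++) {
		/*
		 * Sub-pages that are HWPoison or whose PTE is already
		 * populated get skipped in the real loop, so ref_count
		 * can end up smaller than nr_pages.
		 */
		ref_count++;	/* plain local increment, no atomic op */
	}

	/*
	 * The old code did folio_ref_inc(folio) once per mapped sub-page,
	 * i.e. one atomic RMW per page.  Batching replaces that with a
	 * single folio_ref_add() covering the whole range.
	 */
	folio_ref_add(folio, ref_count);
}

[The patch applies the same idea to the readahead state: mmap_miss is read once with READ_ONCE(), adjusted locally per page, and written back once with WRITE_ONCE(), instead of being updated per sub-page.]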