Message ID | 20230201081737.2330141-2-fengwei.yin@intel.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | folio based filemap_map_pages() | expand |
On Wed, Feb 01, 2023 at 04:17:33PM +0800, Yin Fengwei wrote: > The shared fault handler can also benefit from fault-around. While it > is uncommon to write to MAP_SHARED files, applications that do will see > a huge benefit with will-it-scale:page_fault3 (shared file write fault) > improving by 375%. > > Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> > Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> > --- > mm/memory.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/mm/memory.c b/mm/memory.c > index 7a04a1130ec1..51c04bb60724 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -4546,6 +4546,17 @@ static vm_fault_t do_shared_fault(struct vm_fault *vmf) > struct vm_area_struct *vma = vmf->vma; > vm_fault_t ret, tmp; > > + /* > + * Let's call ->map_pages() first and use ->fault() as fallback > + * if page by the offset is not ready to be mapped (cold cache or > + * something). > + */ > + if (should_fault_around(vmf)) { > + ret = do_fault_around(vmf); > + if (ret) > + return ret; > + } > + I believe it bypasses ->page_mkwrite() completely, no? So you get a writable PTEs without notifying the filesystem. Smells like a data loss. > ret = __do_fault(vmf); > if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY))) > return ret; > -- > 2.30.2 > >
On 2/1/2023 10:34 PM, Kirill A. Shutemov wrote: > On Wed, Feb 01, 2023 at 04:17:33PM +0800, Yin Fengwei wrote: >> The shared fault handler can also benefit from fault-around. While it >> is uncommon to write to MAP_SHARED files, applications that do will see >> a huge benefit with will-it-scale:page_fault3 (shared file write fault) >> improving by 375%. >> >> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> >> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> >> --- >> mm/memory.c | 11 +++++++++++ >> 1 file changed, 11 insertions(+) >> >> diff --git a/mm/memory.c b/mm/memory.c >> index 7a04a1130ec1..51c04bb60724 100644 >> --- a/mm/memory.c >> +++ b/mm/memory.c >> @@ -4546,6 +4546,17 @@ static vm_fault_t do_shared_fault(struct vm_fault *vmf) >> struct vm_area_struct *vma = vmf->vma; >> vm_fault_t ret, tmp; >> >> + /* >> + * Let's call ->map_pages() first and use ->fault() as fallback >> + * if page by the offset is not ready to be mapped (cold cache or >> + * something). >> + */ >> + if (should_fault_around(vmf)) { >> + ret = do_fault_around(vmf); >> + if (ret) >> + return ret; >> + } >> + > > I believe it bypasses ->page_mkwrite() completely, no? > > So you get a writable PTEs without notifying the filesystem. Smells like a > data loss. Yes. You are right. This may be the reason why fault around is not enabled for shared file write fault? I will drop this patch. Thanks. Regards Yin, Fengwei > >> ret = __do_fault(vmf); >> if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY))) >> return ret; >> -- >> 2.30.2 >> >> >
diff --git a/mm/memory.c b/mm/memory.c index 7a04a1130ec1..51c04bb60724 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4546,6 +4546,17 @@ static vm_fault_t do_shared_fault(struct vm_fault *vmf) struct vm_area_struct *vma = vmf->vma; vm_fault_t ret, tmp; + /* + * Let's call ->map_pages() first and use ->fault() as fallback + * if page by the offset is not ready to be mapped (cold cache or + * something). + */ + if (should_fault_around(vmf)) { + ret = do_fault_around(vmf); + if (ret) + return ret; + } + ret = __do_fault(vmf); if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY))) return ret;