| Message ID | 20190416134522.17540-21-ldufour@linux.ibm.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | Speculative page faults |
On Tue, Apr 16, 2019 at 03:45:11PM +0200, Laurent Dufour wrote:
> The final goal is to be able to use a VMA structure without holding the
> mmap_sem and to be sure that the structure will not be freed in our back.
>
> The lockless use of the VMA will be done through RCU protection and thus a
> dedicated freeing service is required to manage it asynchronously.
>
> As reported in a 2010's thread [1], this may impact file handling when a
> file is still referenced while the mapping is no more there. As the final
> goal is to handle anonymous VMA in a speculative way and not file backed
> mapping, we could close and free the file pointer in a synchronous way, as
> soon as we are guaranteed to not use it without holding the mmap_sem. For
> sanity reason, in a minimal effort, the vm_file file pointer is unset once
> the file pointer is put.
>
> [1] https://lore.kernel.org/linux-mm/20100104182429.833180340@chello.nl/
>
> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>

Using kref would have been better from my POV even with RCU freeing
but anyway:

Reviewed-by: Jérôme Glisse <jglisse@redhat.com>

> ---
>  include/linux/mm.h       |  4 ++++
>  include/linux/mm_types.h |  3 +++
>  mm/internal.h            | 27 +++++++++++++++++++++++++++
>  mm/mmap.c                | 13 +++++++++----
>  4 files changed, 43 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index f14b2c9ddfd4..f761a9c65c74 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -529,6 +529,9 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm)
>  	vma->vm_mm = mm;
>  	vma->vm_ops = &dummy_vm_ops;
>  	INIT_LIST_HEAD(&vma->anon_vma_chain);
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +	atomic_set(&vma->vm_ref_count, 1);
> +#endif
>  }
>
>  static inline void vma_set_anonymous(struct vm_area_struct *vma)
> @@ -1418,6 +1421,7 @@ static inline void INIT_VMA(struct vm_area_struct *vma)
>  	INIT_LIST_HEAD(&vma->anon_vma_chain);
>  #ifdef CONFIG_SPECULATIVE_PAGE_FAULT
>  	seqcount_init(&vma->vm_sequence);
> +	atomic_set(&vma->vm_ref_count, 1);
>  #endif
>  }
>
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 24b3f8ce9e42..6a6159e11a3f 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -285,6 +285,9 @@ struct vm_area_struct {
>  	/* linked list of VM areas per task, sorted by address */
>  	struct vm_area_struct *vm_next, *vm_prev;
>
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +	atomic_t vm_ref_count;
> +#endif
>  	struct rb_node vm_rb;
>
>  	/*
> diff --git a/mm/internal.h b/mm/internal.h
> index 9eeaf2b95166..302382bed406 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -40,6 +40,33 @@ void page_writeback_init(void);
>
>  vm_fault_t do_swap_page(struct vm_fault *vmf);
>
> +
> +extern void __free_vma(struct vm_area_struct *vma);
> +
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +static inline void get_vma(struct vm_area_struct *vma)
> +{
> +	atomic_inc(&vma->vm_ref_count);
> +}
> +
> +static inline void put_vma(struct vm_area_struct *vma)
> +{
> +	if (atomic_dec_and_test(&vma->vm_ref_count))
> +		__free_vma(vma);
> +}
> +
> +#else
> +
> +static inline void get_vma(struct vm_area_struct *vma)
> +{
> +}
> +
> +static inline void put_vma(struct vm_area_struct *vma)
> +{
> +	__free_vma(vma);
> +}
> +#endif /* CONFIG_SPECULATIVE_PAGE_FAULT */
> +
>  void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
>  		unsigned long floor, unsigned long ceiling);
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index f7f6027a7dff..c106440dcae7 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -188,6 +188,12 @@ static inline void mm_write_sequnlock(struct mm_struct *mm)
>  }
>  #endif /* CONFIG_SPECULATIVE_PAGE_FAULT */
>
> +void __free_vma(struct vm_area_struct *vma)
> +{
> +	mpol_put(vma_policy(vma));
> +	vm_area_free(vma);
> +}
> +
>  /*
>   * Close a vm structure and free it, returning the next.
>   */
> @@ -200,8 +206,8 @@ static struct vm_area_struct *remove_vma(struct vm_area_struct *vma)
>  		vma->vm_ops->close(vma);
>  	if (vma->vm_file)
>  		fput(vma->vm_file);
> -	mpol_put(vma_policy(vma));
> -	vm_area_free(vma);
> +	vma->vm_file = NULL;
> +	put_vma(vma);
>  	return next;
>  }
>
> @@ -990,8 +996,7 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start,
>  		if (next->anon_vma)
>  			anon_vma_merge(vma, next);
>  		mm->map_count--;
> -		mpol_put(vma_policy(next));
> -		vm_area_free(next);
> +		put_vma(next);
>  		/*
>  		 * In mprotect's case 6 (see comments on vma_merge),
>  		 * we must remove another next too. It would clutter
> --
> 2.21.0
On 22/04/2019 at 22:36, Jerome Glisse wrote:
> On Tue, Apr 16, 2019 at 03:45:11PM +0200, Laurent Dufour wrote:
>> The final goal is to be able to use a VMA structure without holding the
>> mmap_sem and to be sure that the structure will not be freed in our back.
>>
>> [...]
>>
>> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
>
> Using kref would have been better from my POV even with RCU freeing
> but anyway:
>
> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>

Thanks Jérôme, I think kref is a good option here, I'll give it a try.