Message ID | 20241216192419.2970941-12-surenb@google.com
---|---
State | New
Series | move per-vma lock into vm_area_struct
On Mon, Dec 16, 2024 at 11:24:14AM -0800, Suren Baghdasaryan wrote:
> exit_mmap() frees vmas without detaching them. This will become a problem
> when we introduce vma reuse. Ensure that vmas are always detached before
> being freed.
>
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> ---
>  kernel/fork.c | 4 ++++
>  mm/vma.c      | 10 ++++++++--
>  2 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 283909d082cb..f1ddfc7b3b48 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -473,6 +473,10 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
>
>  void __vm_area_free(struct vm_area_struct *vma)
>  {
> +#ifdef CONFIG_PER_VMA_LOCK
> +	/* The vma should be detached while being destroyed. */
> +	VM_BUG_ON_VMA(!is_vma_detached(vma), vma);
> +#endif
>  	vma_numab_state_free(vma);
>  	free_anon_vma_name(vma);
>  	kmem_cache_free(vm_area_cachep, vma);
> diff --git a/mm/vma.c b/mm/vma.c
> index fbd7254517d6..0436a7d21e01 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -413,9 +413,15 @@ void remove_vma(struct vm_area_struct *vma, bool unreachable)
>  	if (vma->vm_file)
>  		fput(vma->vm_file);
>  	mpol_put(vma_policy(vma));
> -	if (unreachable)
> +	if (unreachable) {
> +#ifdef CONFIG_PER_VMA_LOCK
> +		if (!is_vma_detached(vma)) {
> +			vma_start_write(vma);
> +			vma_mark_detached(vma);
> +		}
> +#endif
>  		__vm_area_free(vma);

Again, can't you race with lockless RCU lookups?
On Mon, Dec 16, 2024 at 10:16:35PM +0100, Peter Zijlstra wrote:
> On Mon, Dec 16, 2024 at 11:24:14AM -0800, Suren Baghdasaryan wrote:
> > exit_mmap() frees vmas without detaching them. This will become a problem
> > when we introduce vma reuse. Ensure that vmas are always detached before
> > being freed.
> >
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > ---
> >  kernel/fork.c | 4 ++++
> >  mm/vma.c      | 10 ++++++++--
> >  2 files changed, 12 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index 283909d082cb..f1ddfc7b3b48 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -473,6 +473,10 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
> >
> >  void __vm_area_free(struct vm_area_struct *vma)
> >  {
> > +#ifdef CONFIG_PER_VMA_LOCK
> > +	/* The vma should be detached while being destroyed. */
> > +	VM_BUG_ON_VMA(!is_vma_detached(vma), vma);
> > +#endif
> >  	vma_numab_state_free(vma);
> >  	free_anon_vma_name(vma);
> >  	kmem_cache_free(vm_area_cachep, vma);
> > diff --git a/mm/vma.c b/mm/vma.c
> > index fbd7254517d6..0436a7d21e01 100644
> > --- a/mm/vma.c
> > +++ b/mm/vma.c
> > @@ -413,9 +413,15 @@ void remove_vma(struct vm_area_struct *vma, bool unreachable)
> >  	if (vma->vm_file)
> >  		fput(vma->vm_file);
> >  	mpol_put(vma_policy(vma));
> > -	if (unreachable)
> > +	if (unreachable) {
> > +#ifdef CONFIG_PER_VMA_LOCK
> > +		if (!is_vma_detached(vma)) {
> > +			vma_start_write(vma);
> > +			vma_mark_detached(vma);
> > +		}
> > +#endif
> >  		__vm_area_free(vma);
>
> Again, can't you race with lockless RCU lookups?

Ah, no, removing a vma requires holding mmap_lock for writing and having
the vma write-locked, which ensures preceding RCU readers are complete
(per the LOCK_OFFSET waiter thing) and new RCU readers are rejected
(per the vma sequence check).
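To make the ordering described above easier to follow, here is a rough sketch. It is a simplified model rather than code from this series: vma_start_write(), vma_mark_detached(), __vm_area_free() and vma_start_read() are names taken from the patch and the existing per-VMA lock code, while lookup_vma_lockless() is a hypothetical placeholder, and the waiting/sequence details are only summarized in the comments.

```c
/*
 * Simplified model of the writer/reader ordering discussed above;
 * not the actual implementation.
 */

/* Writer side: unmap/exit path, with mmap_lock held for writing. */
static void remove_and_free_model(struct vm_area_struct *vma)
{
	vma_start_write(vma);	/* write-lock the vma: readers that already
				 * acquired the vma lock are waited for here */
	vma_mark_detached(vma);	/* later lockless walkers must skip it */
	__vm_area_free(vma);	/* no reader can still hold the vma lock */
}

/* Reader side: page-fault fast path, under RCU only. */
static struct vm_area_struct *lookup_model(struct mm_struct *mm,
					   unsigned long addr)
{
	struct vm_area_struct *vma;

	rcu_read_lock();
	vma = lookup_vma_lockless(mm, addr);	/* hypothetical placeholder */
	if (vma && !vma_start_read(vma))	/* rejected if the vma is
						 * write-locked for removal */
		vma = NULL;
	rcu_read_unlock();

	return vma;		/* NULL: caller falls back to mmap_lock */
}
```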
On Mon, Dec 16, 2024 at 1:18 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, Dec 16, 2024 at 10:16:35PM +0100, Peter Zijlstra wrote:
> > On Mon, Dec 16, 2024 at 11:24:14AM -0800, Suren Baghdasaryan wrote:
> > > exit_mmap() frees vmas without detaching them. This will become a problem
> > > when we introduce vma reuse. Ensure that vmas are always detached before
> > > being freed.
> > >
> > > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > > ---
> > >  kernel/fork.c | 4 ++++
> > >  mm/vma.c      | 10 ++++++++--
> > >  2 files changed, 12 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/kernel/fork.c b/kernel/fork.c
> > > index 283909d082cb..f1ddfc7b3b48 100644
> > > --- a/kernel/fork.c
> > > +++ b/kernel/fork.c
> > > @@ -473,6 +473,10 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
> > >
> > >  void __vm_area_free(struct vm_area_struct *vma)
> > >  {
> > > +#ifdef CONFIG_PER_VMA_LOCK
> > > +	/* The vma should be detached while being destroyed. */
> > > +	VM_BUG_ON_VMA(!is_vma_detached(vma), vma);
> > > +#endif
> > >  	vma_numab_state_free(vma);
> > >  	free_anon_vma_name(vma);
> > >  	kmem_cache_free(vm_area_cachep, vma);
> > > diff --git a/mm/vma.c b/mm/vma.c
> > > index fbd7254517d6..0436a7d21e01 100644
> > > --- a/mm/vma.c
> > > +++ b/mm/vma.c
> > > @@ -413,9 +413,15 @@ void remove_vma(struct vm_area_struct *vma, bool unreachable)
> > >  	if (vma->vm_file)
> > >  		fput(vma->vm_file);
> > >  	mpol_put(vma_policy(vma));
> > > -	if (unreachable)
> > > +	if (unreachable) {
> > > +#ifdef CONFIG_PER_VMA_LOCK
> > > +		if (!is_vma_detached(vma)) {
> > > +			vma_start_write(vma);
> > > +			vma_mark_detached(vma);
> > > +		}
> > > +#endif
> > >  		__vm_area_free(vma);
> >
> > Again, can't you race with lockless RCU lookups?
>
> Ah, no, removing a vma requires holding mmap_lock for writing and having
> the vma write-locked, which ensures preceding RCU readers are complete
> (per the LOCK_OFFSET waiter thing) and new RCU readers are rejected
> (per the vma sequence check).

Correct. Once the vma is detached it can't be found by new readers.
Possible existing readers are purged later in this patchset by calling
vma_ensure_detached() from vm_area_free(). I don't do that in this patch
because those existing temporary readers do not pose issues until we
start reusing the vmas.
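The "existing temporary readers" mentioned above are walkers that fetched the vma pointer before it was detached but have not finished validating it. A hedged, commented timeline of that window follows; the two CPUs are hypothetical, and vma_ensure_detached() is only referenced by name as the later patch in the series, not shown.

```c
/*
 * Hedged illustration of the benign window described above
 * (two hypothetical CPUs):
 *
 *   CPU0: exit_mmap()/remove_vma()        CPU1: lockless lookup
 *   -------------------------------       -------------------------------
 *                                         rcu_read_lock();
 *                                         vma = <lockless lookup>;
 *   vma_start_write(vma);
 *   vma_mark_detached(vma);
 *   __vm_area_free(vma);
 *                                         vma_start_read(vma);  // fails:
 *                                         // the reader only inspects the
 *                                         // lock/detached state, backs
 *                                         // off, and never uses the vma
 *                                         rcu_read_unlock();
 *
 * Such a reader may still inspect an already-freed vma, which, per the
 * reply above, is tolerable only until vmas can be reused; that is why a
 * later patch in this series has vm_area_free() call vma_ensure_detached()
 * to purge these readers before reuse is introduced.
 */
```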
diff --git a/kernel/fork.c b/kernel/fork.c
index 283909d082cb..f1ddfc7b3b48 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -473,6 +473,10 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
 
 void __vm_area_free(struct vm_area_struct *vma)
 {
+#ifdef CONFIG_PER_VMA_LOCK
+	/* The vma should be detached while being destroyed. */
+	VM_BUG_ON_VMA(!is_vma_detached(vma), vma);
+#endif
 	vma_numab_state_free(vma);
 	free_anon_vma_name(vma);
 	kmem_cache_free(vm_area_cachep, vma);
diff --git a/mm/vma.c b/mm/vma.c
index fbd7254517d6..0436a7d21e01 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -413,9 +413,15 @@ void remove_vma(struct vm_area_struct *vma, bool unreachable)
 	if (vma->vm_file)
 		fput(vma->vm_file);
 	mpol_put(vma_policy(vma));
-	if (unreachable)
+	if (unreachable) {
+#ifdef CONFIG_PER_VMA_LOCK
+		if (!is_vma_detached(vma)) {
+			vma_start_write(vma);
+			vma_mark_detached(vma);
+		}
+#endif
 		__vm_area_free(vma);
-	else
+	} else
 		vm_area_free(vma);
 }
exit_mmap() frees vmas without detaching them. This will become a problem
when we introduce vma reuse. Ensure that vmas are always detached before
being freed.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 kernel/fork.c | 4 ++++
 mm/vma.c      | 10 ++++++++--
 2 files changed, 12 insertions(+), 2 deletions(-)
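To make the effect of the new check concrete, here is a purely hypothetical example; buggy_teardown() does not exist in the kernel and is only meant to show which misuse the assertion now catches when CONFIG_PER_VMA_LOCK and CONFIG_DEBUG_VM are enabled.

```c
/*
 * Hypothetical misuse, for illustration only. With this patch applied,
 * freeing a vma that was never detached trips the new assertion at free
 * time, i.e. the class of bug the commit message says would otherwise
 * only surface once vma reuse is introduced.
 */
static void buggy_teardown(struct vm_area_struct *vma)
{
	/* Missing: vma_start_write(vma); vma_mark_detached(vma); */
	__vm_area_free(vma);	/* VM_BUG_ON_VMA(!is_vma_detached(vma), vma) fires */
}
```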