diff mbox

Don't mlock guardpage if the stack is growing up

Message ID alpine.DEB.2.00.1105091246550.32299@artax.karlin.mff.cuni.cz (mailing list archive)
State Awaiting Upstream

Commit Message

Mikulas Patocka May 9, 2011, 11:01 a.m. UTC
On Sun, 8 May 2011, Linus Torvalds wrote:

> On Sun, May 8, 2011 at 11:55 AM, Mikulas Patocka
> <mikulas@artax.karlin.mff.cuni.cz> wrote:
> >
> > This patch fixes lvm2 on PA-RISC (and possibly other architectures with
> > up-growing stack). lvm2 calculates the number of used pages when locking
> > and when unlocking and reports an internal error if the numbers mismatch.
> 
> This patch won't apply on current kernels (including stable) because
> of commit a1fde08c74e9 that changed the test of "pages" to instead
> just test "flags & FOLL_MLOCK".
> 
> That should be trivial to fix up.
> 
> However, I really don't much like this complex test:
> 
> >  static inline int stack_guard_page(struct vm_area_struct *vma, unsigned long addr)
> >  {
> > -       return (vma->vm_flags & VM_GROWSDOWN) &&
> > +       return ((vma->vm_flags & VM_GROWSDOWN) &&
> >                (vma->vm_start == addr) &&
> > -               !vma_stack_continue(vma->vm_prev, addr);
> > +               !vma_stack_continue(vma->vm_prev, addr)) ||
> > +              ((vma->vm_flags & VM_GROWSUP) &&
> > +               (vma->vm_end == addr + PAGE_SIZE) &&
> > +               !vma_stack_growsup_continue(vma->vm_next, addr + PAGE_SIZE));
> >  }
> 
> in that format. It gets really hard to read, and I think you'd be
> better off writing it as two helper functions (or macros) for the two
> cases, and then have
> 
>   static inline int stack_guard_page(struct vm_area_struct *vma,
> unsigned long addr)
>   {
>     return stack_guard_page_growsdown(vma, addr) ||
>       stack_guard_page_growsup(vma, addr);
>   }
> 
> I'd also like to verify that it doesn't actually generate any extra
> code for the common case (iirc VM_GROWSUP is 0 for the architectures
> that don't need it, and so the compiler shouldn't generate any extra
> code, but I'd like that mentioned and verified explicitly).
> 
> Hmm?
> 
> Other than that it looks ok to me.
> 
> That said, could we please fix LVM to not do that crazy sh*t in the
> first place? The STACK_GROWSUP case is never going to have a lot of
> testing, this is just sad.

LVM reads process maps from /proc/self/maps and locks them with mlock.

Why doesn't it use mlockall()? Because glibc maps all locales into the 
process. Glibc packs all locales into a 100MB file and maps that file into 
every process. Even if the process uses just one locale, glibc maps them all.

So, when LVM used mlockall(), it consumed >100MB of memory, which caused 
out-of-memory problems in system installers.

So an alternate way of locking was added to LVM --- read all the maps and 
lock them, except for the glibc locale file.

The real fix would be to fix glibc not to map 100MB to every process.

>                    Linus
> 

This is the updated patch. I tested it on x86-64, and it doesn't change 
the generated code.

Mikulas

---

Don't lock guardpage if the stack is growing up

The Linux kernel excludes the guard page when performing mlock on a VMA
with a down-growing stack. However, some architectures have an up-growing
stack, and locking the guard page should be excluded in that case too.

This patch fixes lvm2 on PA-RISC (and possibly other architectures with an
up-growing stack). lvm2 calculates the number of used pages when locking
and when unlocking, and reports an internal error if the numbers mismatch.

Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>

---
 include/linux/mm.h |   10 +++++++++-
 mm/memory.c        |   21 ++++++++++++++++++---
 2 files changed, 27 insertions(+), 4 deletions(-)

Comments

Zdenek Kabelac May 9, 2011, 11:43 a.m. UTC | #1
On 9.5.2011 13:01, Mikulas Patocka wrote:
> 
> 
> On Sun, 8 May 2011, Linus Torvalds wrote:
> 
>> On Sun, May 8, 2011 at 11:55 AM, Mikulas Patocka
>> <mikulas@artax.karlin.mff.cuni.cz> wrote:
>>>
>>> This patch fixes lvm2 on PA-RISC (and possibly other architectures with
>>> up-growing stack). lvm2 calculates the number of used pages when locking
>>> and when unlocking and reports an internal error if the numbers mismatch.
>>
>> This patch won't apply on current kernels (including stable) because
>> of commit a1fde08c74e9 that changed the test of "pages" to instead
>> just test "flags & FOLL_MLOCK".
>>
>> That should be trivial to fix up.
>>
>> However, I really don't much like this complex test:
>>
>>>  static inline int stack_guard_page(struct vm_area_struct *vma, unsigned long addr)
>>>  {
>>> -       return (vma->vm_flags & VM_GROWSDOWN) &&
>>> +       return ((vma->vm_flags & VM_GROWSDOWN) &&
>>>                (vma->vm_start == addr) &&
>>> -               !vma_stack_continue(vma->vm_prev, addr);
>>> +               !vma_stack_continue(vma->vm_prev, addr)) ||
>>> +              ((vma->vm_flags & VM_GROWSUP) &&
>>> +               (vma->vm_end == addr + PAGE_SIZE) &&
>>> +               !vma_stack_growsup_continue(vma->vm_next, addr + PAGE_SIZE));
>>>  }
>>
>> in that format. It gets really hard to read, and I think you'd be
>> better off writing it as two helper functions (or macros) for the two
>> cases, and then have
>>
>>   static inline int stack_guard_page(struct vm_area_struct *vma,
>> unsigned long addr)
>>   {
>>     return stack_guard_page_growsdown(vma, addr) ||
>>       stack_guard_page_growsup(vma, addr);
>>   }
>>
>> I'd also like to verify that it doesn't actually generate any extra
>> code for the common case (iirc VM_GROWSUP is 0 for the architectures
>> that don't need it, and so the compiler shouldn't generate any extra
>> code, but I'd like that mentioned and verified explicitly).
>>
>> Hmm?
>>
>> Other than that it looks ok to me.
>>
>> That said, could we please fix LVM to not do that crazy sh*t in the
>> first place? The STACK_GROWSUP case is never going to have a lot of
>> testing, this is just sad.
> 
> LVM reads process maps from /proc/self/maps and locks them with mlock.
> 
> Why it doesn't use mlockall()? Because glibc maps all locales to the 
> process. Glibc packs all locales to a 100MB file and maps that file to 
> every process. Even if the process uses just one locale, glibc maps all.
> 
> So, when LVM used mlockall, it consumed >100MB memory and it caused 
> out-of-memory problems in system installers.
> 
> So, alternate way of locking was added to LVM --- read all maps and lock 
> them, except for the glibc locale file.
> 
> The real fix would be to fix glibc not to map 100MB to every process.
> 

I should probably add a few words here.

Glibc knows a few more ways around this - it can work with one locale file
per language, or even without mmap, allocating the locales in memory. It
usually depends on the distribution - Fedora decided to combine all locales
into one huge file (>100MB), while Ubuntu/Debian mmaps each locale
individually (usually on the order of megabytes).

LVM supports both ways - the user may select in lvm.conf to always use
mlockall(), or may switch to mlock()-based locking of individual memory
areas, where the memory parts that cannot be executed during the suspend
state and cannot cause a memory deadlock are not locked into memory. As a
'bonus', it is internally used for tracking algorithmic bugs.

Zdenek
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alasdair G Kergon May 9, 2011, 9:08 p.m. UTC | #2
On Mon, May 09, 2011 at 01:43:59PM +0200, Zdenek Kabelac wrote:
> On 9.5.2011 13:01, Mikulas Patocka wrote:
> > So, when LVM used mlockall, it consumed >100MB memory and it caused 
> > out-of-memory problems in system installers.

The way glibc and some distros have implemented locales makes mlockall()
quite useless for many purposes now in practice IMHO.  We discussed the
options at the time and wrote the extra code to implement the only
solution we were offered.

(1) There was no possibility of some distros not using a single locale
database (reasons: optimisation and packaging convenience);
(2) there was no possibility of glibc providing a mechanism to unmap the
locale database (if you reset your locale to the default glibc will not
unmap the file and indeed doesn't rule out doing something similar with
other files in future - these are glibc internals and it's none of our
business).

Alasdair


Matthew Wilcox May 9, 2011, 10:45 p.m. UTC | #3
On Mon, May 09, 2011 at 01:43:59PM +0200, Zdenek Kabelac wrote:
> > Why it doesn't use mlockall()? Because glibc maps all locales to the 
> > process. Glibc packs all locales to a 100MB file and maps that file to 
> > every process. Even if the process uses just one locale, glibc maps all.
> > 
> > So, when LVM used mlockall, it consumed >100MB memory and it caused 
> > out-of-memory problems in system installers.
> > 
> > So, alternate way of locking was added to LVM --- read all maps and lock 
> > them, except for the glibc locale file.
> > 
> > The real fix would be to fix glibc not to map 100MB to every process.
> 
> I should add here probably few words.
> 
> Glibc knows few more ways around - so it could work only with one locale file
> per language, or even without using mmap and allocating them in memory.
> Depends on the distribution usually - Fedora decided to combine all locales
> into one huge file (>100MB) - Ubuntu/Debian mmaps each locales individually
> (usually ~MB)

Sounds to me like glibc should introduce an mlockmost() call that does all
the work for you ...
Linus Torvalds May 9, 2011, 10:56 p.m. UTC | #4
On Mon, May 9, 2011 at 3:45 PM, Matthew Wilcox <matthew@wil.cx> wrote:
>
> Sounds to me like glibc should introduce an mlockmost() call that does all
> the work for you ...

That sounds like the worst of all worlds. Nobody will use it, now
there's two places to break, and how do you describe what you don't
want mlocked?

I dunno. There are so few applications that use mlock at *all* that I
wonder if it's even worth worrying about. And this one case we've
hopefully now fixed anyway.

But if people really want to fix mlockall(), I'd suggest some way of
just marking certain mappings as "sparse mappings", and then we can
just teach mlockall to not lock them. And then glibc could just mark
its locale archive that way - and maybe others would mark other
mappings.

We could still make such a "sparse mapping" act as locked for the
actual pages that are mapped into it - so it would have some kind of
real semantics. You could do a "mlockall(..)" on the process, and then
touch the sparse pages you know you want, and they'd be guaranteed to
be mapped after that.

We might even have a "mlockall(MCL_SPARSE)" flag that does that for
everything - basically guarantee that "whatever is mapped in now will
remain mapped in, but I won't bother paging it in just because you ask
me to do a mlockall".

So there are sane semantics for the concept, and it would be easy to
do in the kernel. Whether it's worth doing, I dunno.

                    Linus
Alasdair G Kergon May 10, 2011, 10:57 p.m. UTC | #5
On Mon, May 09, 2011 at 03:56:11PM -0700, Linus Torvalds wrote:
> So there are sane semantics for the concept, and it would be easy to
> do in the kernel. Whether it's worth doing, I dunno.
 
At this point we have a workaround that seems to work for us.  I find it
ugly and it has needed some tweaking already but we can cope.

If others find similar problems to ours and start replicating the logic
we're using, then that would be the time to give serious thought to a
clean 'sparse' extension.  (Marking individual mappings as MAP_SPARSE and
adding an MCL_NO_SPARSE option to ignore them sounds most promising to me.)

(What other software packages make use of mlockall() and under what
circumstances?)

Alasdair

Milan Broz May 11, 2011, 8:42 a.m. UTC | #6
On 05/11/2011 12:57 AM, Alasdair G Kergon wrote:
> (What other software packages make use of mlockall() and under what
> circumstances?)

Another one is cryptsetup, for commands that manipulate keys in memory.
(Users of the libcryptsetup library are not forced to lock memory; it is an
optional call. But cryptsetup itself, as a libcryptsetup user, always locks
memory.)

And I am not happy with mlockall() either, but the lvm2 workaround
is quite complicated.

Basically it wants to lock the memory holding explicitly allocated keys
(this could be rewritten to use a specific locked page, though), but it
also needs to lock various libdevmapper buffers when issuing the dmcrypt
configuration ioctl (the mapping table and messages contain the key). So
that's why mlockall(MCL_CURRENT|MCL_FUTURE) was the simplest way (and no
problems have been reported yet).
(Not that it is perfect, but it is better than nothing... Of course, more
important is to wipe the memory holding keys after use.)

Milan
Linus Torvalds May 12, 2011, 2:12 a.m. UTC | #7
On Wed, May 11, 2011 at 1:42 AM, Milan Broz <mbroz@redhat.com> wrote:
>
> Another one is cryptsetup [..]

Quite frankly, all security-related uses should always be happy about
a "MCL_SPARSE" model, since there is no point in ever bringing in
pages that haven't been used. The whole (and only) point of
mlock[all]() for them is the "avoid to push to disk" issue.

I do wonder if we really should ever do the page-in at all. We might
simply be better off always just saying "we'll lock pages you've
touched, that's it".

                       Linus
Zdenek Kabelac May 12, 2011, 9:06 a.m. UTC | #8
On 12.5.2011 04:12, Linus Torvalds wrote:
> On Wed, May 11, 2011 at 1:42 AM, Milan Broz <mbroz@redhat.com> wrote:
>>
>> Another one is cryptsetup [..]
> 
> Quite frankly, all security-related uses should always be happy about
> a "MCL_SPARSE" model, since there is no point in ever bringing in
> pages that haven't been used. The whole (and only) point of
> mlock[all]() for them is the "avoid to push to disk" issue.
> 
> I do wonder if we really should ever do the page-in at all. We might
> simply be better off always just saying "we'll lock pages you've
> touched, that's it".
> 


For LVM we need to ensure that any code which might be executed during the
disk suspend state is paged and locked in - thus we would need MCL_SPARSE
only for several selected 'unneeded' libraries, as we are obviously not
really able to select which parts of glibc might be needed across all code
paths (though I guess we may find some limits).  But if we are sure that
some libraries and locale files will never be used during the suspend
state, we do not care about those pages at all.

So it's not that we would need only MCL_SPARSE all the time - we would
probably need some control to switch e.g. glibc into MCL_ALL.

Zdenek

Patch

Index: linux-2.6/include/linux/mm.h
===================================================================
--- linux-2.6.orig/include/linux/mm.h	2011-05-09 12:33:50.000000000 +0200
+++ linux-2.6/include/linux/mm.h	2011-05-09 12:42:05.000000000 +0200
@@ -1011,11 +1011,19 @@  int set_page_dirty_lock(struct page *pag
 int clear_page_dirty_for_io(struct page *page);
 
 /* Is the vma a continuation of the stack vma above it? */
-static inline int vma_stack_continue(struct vm_area_struct *vma, unsigned long addr)
+static inline int vma_stack_growsdown_continue(struct vm_area_struct *vma,
+					       unsigned long addr)
 {
 	return vma && (vma->vm_end == addr) && (vma->vm_flags & VM_GROWSDOWN);
 }
 
+/* Is the vma a continuation of the stack vma below it? */
+static inline int vma_stack_growsup_continue(struct vm_area_struct *vma,
+					     unsigned long addr)
+{
+	return vma && (vma->vm_start == addr) && (vma->vm_flags & VM_GROWSUP);
+}
+
 extern unsigned long move_page_tables(struct vm_area_struct *vma,
 		unsigned long old_addr, struct vm_area_struct *new_vma,
 		unsigned long new_addr, unsigned long len);
Index: linux-2.6/mm/memory.c
===================================================================
--- linux-2.6.orig/mm/memory.c	2011-05-09 12:33:51.000000000 +0200
+++ linux-2.6/mm/memory.c	2011-05-09 12:41:38.000000000 +0200
@@ -1410,11 +1410,21 @@  no_page_table:
 	return page;
 }
 
-static inline int stack_guard_page(struct vm_area_struct *vma, unsigned long addr)
+static inline int stack_guard_page_growsdown(struct vm_area_struct *vma,
+					     unsigned long addr)
 {
 	return (vma->vm_flags & VM_GROWSDOWN) &&
 		(vma->vm_start == addr) &&
-		!vma_stack_continue(vma->vm_prev, addr);
+		!vma_stack_growsdown_continue(vma->vm_prev, addr);
+}
+
+
+static inline int stack_guard_page_growsup(struct vm_area_struct *vma,
+					   unsigned long addr)
+{
+	return (vma->vm_flags & VM_GROWSUP) &&
+		(vma->vm_end == addr + PAGE_SIZE) &&
+		!vma_stack_growsup_continue(vma->vm_next, addr + PAGE_SIZE);
 }
 
 /**
@@ -1554,13 +1564,18 @@  int __get_user_pages(struct task_struct 
 		/*
 		 * For mlock, just skip the stack guard page.
 		 */
-		if ((gup_flags & FOLL_MLOCK) && stack_guard_page(vma, start))
+		if ((gup_flags & FOLL_MLOCK) &&
+		    stack_guard_page_growsdown(vma, start))
 			goto next_page;
 
 		do {
 			struct page *page;
 			unsigned int foll_flags = gup_flags;
 
+			if ((gup_flags & FOLL_MLOCK) &&
+			    stack_guard_page_growsup(vma, start))
+				goto next_page;
+
 			/*
 			 * If we have a pending SIGKILL, don't keep faulting
 			 * pages and potentially allocating memory.