Message ID | 20171113160637.jhekbdyfpccme3be@dhcp22.suse.cz (mailing list archive) |
---|---|
State | New, archived |
On 11/13/2017 09:06 AM, Michal Hocko wrote:
> OK, so this one should take care of the backward compatibility while
> still not touching the arch code
> ---
> commit 39ff9bf8597e79a032da0954aea1f0d77d137765
> Author: Michal Hocko <mhocko@suse.com>
> Date:   Mon Nov 13 17:06:24 2017 +0100
>
> mm: introduce MAP_FIXED_SAFE
>
> MAP_FIXED is used quite often but it is inherently dangerous because it
> unmaps an existing mapping covered by the requested range. While this
> might be the desired behavior in many cases there are
> others which would rather see a failure than a silent memory corruption.
> Introduce a new MAP_FIXED_SAFE flag for mmap to achieve this behavior.
> It is a MAP_FIXED extension with a single exception that it fails with
> ENOMEM if the requested address is already covered by an existing
> mapping. We still do rely on get_unmapped_area to handle all the arch
> specific MAP_FIXED treatment and check for a conflicting vma after it
> returns.
>
> Signed-off-by: Michal Hocko <mhocko@suse.com>
>
> ...... deleted .......
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 680506faceae..aad8d37f0205 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1358,6 +1358,10 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
>  	if (mm->map_count > sysctl_max_map_count)
>  		return -ENOMEM;
>
> +	/* force arch specific MAP_FIXED handling in get_unmapped_area */
> +	if (flags & MAP_FIXED_SAFE)
> +		flags |= MAP_FIXED;
> +
>  	/* Obtain the address to map to. we verify (or select) it and ensure
>  	 * that it represents a valid section of the address space.
>  	 */

Do you need to move this code above:

	if (!(flags & MAP_FIXED))
		addr = round_hint_to_min(addr);

	/* Careful about overflows.. */
	len = PAGE_ALIGN(len);
	if (!len)
		return -ENOMEM;

Not doing that might mean the hint address will end up being rounded for
MAP_FIXED_SAFE which would change the behavior from MAP_FIXED.

--
Khalid

On Mon 13-11-17 09:35:22, Khalid Aziz wrote:
> On 11/13/2017 09:06 AM, Michal Hocko wrote:
> > OK, so this one should take care of the backward compatibility while
> > still not touching the arch code
> > ---
> > commit 39ff9bf8597e79a032da0954aea1f0d77d137765
> > Author: Michal Hocko <mhocko@suse.com>
> > Date:   Mon Nov 13 17:06:24 2017 +0100
> >
> > mm: introduce MAP_FIXED_SAFE
> >
> > MAP_FIXED is used quite often but it is inherently dangerous because it
> > unmaps an existing mapping covered by the requested range. While this
> > might be the desired behavior in many cases there are
> > others which would rather see a failure than a silent memory corruption.
> > Introduce a new MAP_FIXED_SAFE flag for mmap to achieve this behavior.
> > It is a MAP_FIXED extension with a single exception that it fails with
> > ENOMEM if the requested address is already covered by an existing
> > mapping. We still do rely on get_unmapped_area to handle all the arch
> > specific MAP_FIXED treatment and check for a conflicting vma after it
> > returns.
> >
> > Signed-off-by: Michal Hocko <mhocko@suse.com>
> >
> > ...... deleted .......
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 680506faceae..aad8d37f0205 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -1358,6 +1358,10 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
> >  	if (mm->map_count > sysctl_max_map_count)
> >  		return -ENOMEM;
> >
> > +	/* force arch specific MAP_FIXED handling in get_unmapped_area */
> > +	if (flags & MAP_FIXED_SAFE)
> > +		flags |= MAP_FIXED;
> > +
> >  	/* Obtain the address to map to. we verify (or select) it and ensure
> >  	 * that it represents a valid section of the address space.
> >  	 */
>
> Do you need to move this code above:
>
> 	if (!(flags & MAP_FIXED))
> 		addr = round_hint_to_min(addr);
>
> 	/* Careful about overflows.. */
> 	len = PAGE_ALIGN(len);
> 	if (!len)
> 		return -ENOMEM;
>
> Not doing that might mean the hint address will end up being rounded for
> MAP_FIXED_SAFE which would change the behavior from MAP_FIXED.

Yes, I will move it.

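The ordering issue raised above can be illustrated with a small userspace model. This is only a sketch, not kernel code: `PAGE_SIZE`, `MMAP_MIN_ADDR` and the helpers `prepare_addr_late`, `prepare_addr_early` and `ordering_demo` are assumptions made up for the example, and `round_hint_to_min` is a simplified stand-in for the kernel helper of the same name.

```c
#include <assert.h>

/* Assumed values for illustration; a real kernel takes them from the arch and sysctl. */
#define PAGE_SIZE      4096UL
#define PAGE_MASK      (~(PAGE_SIZE - 1))
#define MMAP_MIN_ADDR  65536UL

#define MAP_FIXED       0x10UL
#define MAP_FIXED_SAFE  0x2000000UL	/* value proposed in the patch */

/* Simplified model: page-align the hint and lift a non-zero hint that lies
 * below the minimum mappable address up to that minimum. */
static unsigned long round_hint_to_min(unsigned long hint)
{
	hint &= PAGE_MASK;
	if (hint != 0 && hint < MMAP_MIN_ADDR)
		return MMAP_MIN_ADDR;
	return hint;
}

/* Ordering as posted: the hint is rounded first, MAP_FIXED is set afterwards. */
static unsigned long prepare_addr_late(unsigned long addr, unsigned long *flags)
{
	if (!(*flags & MAP_FIXED))
		addr = round_hint_to_min(addr);
	if (*flags & MAP_FIXED_SAFE)
		*flags |= MAP_FIXED;
	return addr;
}

/* Ordering requested in the review: set MAP_FIXED before the rounding check. */
static unsigned long prepare_addr_early(unsigned long addr, unsigned long *flags)
{
	if (*flags & MAP_FIXED_SAFE)
		*flags |= MAP_FIXED;
	if (!(*flags & MAP_FIXED))
		addr = round_hint_to_min(addr);
	return addr;
}

/* Usage sketch: only the "late" ordering rewrites a low, unaligned address. */
void ordering_demo(void)
{
	unsigned long late_flags = MAP_FIXED_SAFE;
	unsigned long early_flags = MAP_FIXED_SAFE;

	assert(prepare_addr_late(0x1234UL, &late_flags) == MMAP_MIN_ADDR);
	assert(prepare_addr_early(0x1234UL, &early_flags) == 0x1234UL);
}
```

In this model the late ordering silently turns the requested address into a different one, while the early ordering passes it through untouched, as plain MAP_FIXED would.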
Michal Hocko <mhocko@kernel.org> writes:

> [Sorry for spamming, this one is the last attempt hopefully]
>
> On Mon 13-11-17 16:49:39, Michal Hocko wrote:
>> On Mon 13-11-17 16:16:41, Michal Hocko wrote:
>> > On Mon 13-11-17 13:00:57, Michal Hocko wrote:
>> > [...]
>> > > Yes, I have mentioned that in the previous email but the amount of code
>> > > would be even larger. Basically every arch which reimplements
>> > > arch_get_unmapped_area would have to special case new MAP_FIXED flag to
>> > > do vma lookup.
>> >
>> > It turned out that this might be much easier than I thought after
>> > all. It seems we can really handle that in the common code. This would
>> > mean that we are exposing a new functionality to the userspace though.
>> > Maybe this would be useful on its own though. Just a quick draft (not
>> > even compile tested) whether this makes sense in general. I would be
>> > worried about unexpected behavior when somebody set other bit without a
>> > good reason and we might fail with ENOMEM for such a call now.
>>
>> Hmm, the bigger problem would be the backward compatibility actually. We
>> would get silent corruptions which is exactly what the flag is trying
>> to fix. mmap flags handling really sucks. So I guess we would have to make
>> the flag internal only :/
>
> OK, so this one should take care of the backward compatibility while
> still not touching the arch code

I'm not sure I understand your worries about backward compatibility?

If we add a new mmap flag which is currently unused then what is the
problem? Are you worried about user code that accidentally passes that
flag already?

> diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
> index 203268f9231e..03c518777f83 100644
> --- a/include/uapi/asm-generic/mman-common.h
> +++ b/include/uapi/asm-generic/mman-common.h
> @@ -25,6 +25,8 @@
>  # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
>  #endif
>
> +#define MAP_FIXED_SAFE 0x2000000	/* MAP_FIXED which doesn't unmap underlying mapping */
> +

As I said in my other mail I think this should be a modifier to
MAP_FIXED. That way all the existing code that checks for MAP_FIXED (in
the kernel) works exactly as it currently does - like the check Khalid
pointed out.

And I think MAP_NO_CLOBBER would be a better name.

cheers

On Tue 14-11-17 20:18:04, Michael Ellerman wrote:
> Michal Hocko <mhocko@kernel.org> writes:
>
> > [Sorry for spamming, this one is the last attempt hopefully]
> >
> > On Mon 13-11-17 16:49:39, Michal Hocko wrote:
> >> On Mon 13-11-17 16:16:41, Michal Hocko wrote:
> >> > On Mon 13-11-17 13:00:57, Michal Hocko wrote:
> >> > [...]
> >> > > Yes, I have mentioned that in the previous email but the amount of code
> >> > > would be even larger. Basically every arch which reimplements
> >> > > arch_get_unmapped_area would have to special case new MAP_FIXED flag to
> >> > > do vma lookup.
> >> >
> >> > It turned out that this might be much easier than I thought after
> >> > all. It seems we can really handle that in the common code. This would
> >> > mean that we are exposing a new functionality to the userspace though.
> >> > Maybe this would be useful on its own though. Just a quick draft (not
> >> > even compile tested) whether this makes sense in general. I would be
> >> > worried about unexpected behavior when somebody set other bit without a
> >> > good reason and we might fail with ENOMEM for such a call now.
> >>
> >> Hmm, the bigger problem would be the backward compatibility actually. We
> >> would get silent corruptions which is exactly what the flag is trying
> >> to fix. mmap flags handling really sucks. So I guess we would have to make
> >> the flag internal only :/
> >
> > OK, so this one should take care of the backward compatibility while
> > still not touching the arch code
>
> I'm not sure I understand your worries about backward compatibility?

Just imagine you are running an application which uses the new flag
combination on an older kernel. You will get no warning, yet you have
no way to check that you have actually clobbered an existing mapping
because MAP_FIXED will be used the old way.

> If we add a new mmap flag which is currently unused then what is the
> problem? Are you worried about user code that accidentally passes that
> flag already?

If we add a completely new flag, like in this patch, then the code using
the flag will not clobber an existing mapping on older kernels which do
not recognize it (we will simply fall back to the default hint based
implementation). You might not get the mapping you asked for which sucks
but that is not fixable AFAICS. You can at least do

	mapped_addr = mmap(addr, ... MAP_FIXED_SAFE...);
	assert(mapped_addr == addr);

So I do not think we can go with the modifier unfortunately.

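The check suggested in the reply above could be wrapped along the following lines. This is a minimal sketch for Linux, assuming the flag value from the patch: `MAP_FIXED_SAFE` is not defined by released kernel headers, and the helpers `map_fixed_safe` and `occupied_page_is_refused` are hypothetical names invented for the example.

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS with strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Flag value taken from the patch under discussion; declared here as an
 * assumption because system headers do not provide it. */
#ifndef MAP_FIXED_SAFE
#define MAP_FIXED_SAFE 0x2000000
#endif

/* Hypothetical helper: try to map len bytes exactly at addr.
 * Returns addr on success, or NULL if the call failed or the kernel placed
 * the mapping elsewhere (as a kernel that ignores the flag would). */
static void *map_fixed_safe(void *addr, size_t len)
{
	void *mapped = mmap(addr, len, PROT_READ | PROT_WRITE,
			    MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_SAFE, -1, 0);

	if (mapped == MAP_FAILED)
		return NULL;
	if (mapped != addr) {
		/* The address was only treated as a hint: undo the stray mapping. */
		munmap(mapped, len);
		return NULL;
	}
	return mapped;
}

/* Usage sketch: asking for an already mapped page must not succeed, and the
 * existing contents must survive. Returns 1 if so, 0 if not, -1 on setup failure. */
int occupied_page_is_refused(void)
{
	size_t len = 4096;
	unsigned char *first = mmap(NULL, len, PROT_READ | PROT_WRITE,
				    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	void *second;
	int ok;

	if (first == MAP_FAILED)
		return -1;
	first[0] = 0x5a;

	second = map_fixed_safe(first, len);
	ok = (second == NULL) && (first[0] == 0x5a);

	munmap(first, len);
	return ok;
}
```

The comparison against `addr` is what makes the standalone flag usable on kernels that do not know it: the caller may not get the mapping, but it can tell, and nothing has been unmapped behind its back.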
diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h
index 3b26cc62dadb..767bcb8a4c28 100644
--- a/arch/alpha/include/uapi/asm/mman.h
+++ b/arch/alpha/include/uapi/asm/mman.h
@@ -31,6 +31,8 @@
 #define MAP_STACK	0x80000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x100000	/* create a huge page mapping */

+#define MAP_FIXED_SAFE	0x2000000	/* MAP_FIXED which doesn't unmap underlying mapping */
+
 #define MS_ASYNC	1		/* sync memory asynchronously */
 #define MS_SYNC		2		/* synchronous memory sync */
 #define MS_INVALIDATE	4		/* invalidate the caches */
diff --git a/arch/mips/include/uapi/asm/mman.h b/arch/mips/include/uapi/asm/mman.h
index da3216007fe0..c2311eb7219b 100644
--- a/arch/mips/include/uapi/asm/mman.h
+++ b/arch/mips/include/uapi/asm/mman.h
@@ -49,6 +49,8 @@
 #define MAP_STACK	0x40000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x80000		/* create a huge page mapping */

+#define MAP_FIXED_SAFE	0x2000000	/* MAP_FIXED which doesn't unmap underlying mapping */
+
 /*
  * Flags for msync
  */
diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h
index cc9ba1d34779..b06fd830bc6f 100644
--- a/arch/parisc/include/uapi/asm/mman.h
+++ b/arch/parisc/include/uapi/asm/mman.h
@@ -25,6 +25,8 @@
 #define MAP_STACK	0x40000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x80000		/* create a huge page mapping */

+#define MAP_FIXED_SAFE	0x2000000	/* MAP_FIXED which doesn't unmap underlying mapping */
+
 #define MS_SYNC		1		/* synchronous memory sync */
 #define MS_ASYNC	2		/* sync memory asynchronously */
 #define MS_INVALIDATE	4		/* invalidate the caches */
diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h
index b15b278aa314..f4b291bca764 100644
--- a/arch/xtensa/include/uapi/asm/mman.h
+++ b/arch/xtensa/include/uapi/asm/mman.h
@@ -62,6 +62,8 @@
 # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
 #endif

+#define MAP_FIXED_SAFE 0x2000000	/* MAP_FIXED which doesn't unmap underlying mapping */
+
 /*
  * Flags for msync
  */
diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index 203268f9231e..03c518777f83 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -25,6 +25,8 @@
 # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
 #endif

+#define MAP_FIXED_SAFE 0x2000000	/* MAP_FIXED which doesn't unmap underlying mapping */
+
 /*
  * Flags for mlock
  */
diff --git a/mm/mmap.c b/mm/mmap.c
index 680506faceae..aad8d37f0205 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1358,6 +1358,10 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 	if (mm->map_count > sysctl_max_map_count)
 		return -ENOMEM;

+	/* force arch specific MAP_FIXED handling in get_unmapped_area */
+	if (flags & MAP_FIXED_SAFE)
+		flags |= MAP_FIXED;
+
 	/* Obtain the address to map to. we verify (or select) it and ensure
 	 * that it represents a valid section of the address space.
 	 */
@@ -1365,6 +1369,13 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 	if (offset_in_page(addr))
 		return addr;

+	if (flags & MAP_FIXED_SAFE) {
+		struct vm_area_struct *vma = find_vma(mm, addr);
+
+		if (vma && vma->vm_start <= addr)
+			return -ENOMEM;
+	}
+
 	if (prot == PROT_EXEC) {
 		pkey = execute_only_pkey(mm);
 		if (pkey < 0)
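
The conflict test in the last mm/mmap.c hunk can be tried out in a userspace toy model. This is a sketch under stated assumptions, not the kernel implementation: `struct toy_vma`, `toy_find_vma`, `addr_is_covered` and `conflict_check_demo` are made-up names, and the linear scan merely imitates the lookup contract of `find_vma` (first area whose end lies above the address).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for vm_area_struct: a half-open range [vm_start, vm_end). */
struct toy_vma {
	unsigned long vm_start;
	unsigned long vm_end;
};

/* Model of the lookup: first area whose end lies above addr, or NULL.
 * The array must be sorted by address and must not overlap. */
static const struct toy_vma *toy_find_vma(const struct toy_vma *vmas, size_t n,
					  unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr < vmas[i].vm_end)
			return &vmas[i];
	return NULL;
}

/* The test added by the hunk: does an existing area cover addr? */
static int addr_is_covered(const struct toy_vma *vmas, size_t n,
			   unsigned long addr)
{
	const struct toy_vma *vma = toy_find_vma(vmas, n, addr);

	return vma && vma->vm_start <= addr;
}

/* Usage sketch with two made-up areas. */
void conflict_check_demo(void)
{
	static const struct toy_vma vmas[] = {
		{ 0x10000, 0x20000 },
		{ 0x40000, 0x50000 },
	};
	size_t n = sizeof(vmas) / sizeof(vmas[0]);

	assert(addr_is_covered(vmas, n, 0x10000));	/* first byte of an area */
	assert(addr_is_covered(vmas, n, 0x1f000));	/* inside an area */
	assert(!addr_is_covered(vmas, n, 0x20000));	/* vm_end is exclusive */
	assert(!addr_is_covered(vmas, n, 0x30000));	/* hole between the areas */
	assert(!addr_is_covered(vmas, n, 0x60000));	/* above every area */
}
```

As far as this hunk shows, the test looks only at the start address; the length of the request does not enter into it, and the model mirrors that.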