Message ID | 20210827164926.1726765-18-agruenba@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | gfs2: Fix mmap + page fault deadlocks | expand |
On Fri, Aug 27, 2021 at 06:49:24PM +0200, Andreas Gruenbacher wrote: > Introduce a new FOLL_NOFAULT flag that causes get_user_pages to return > -EFAULT when it would otherwise trigger a page fault. This is roughly > similar to FOLL_FAST_ONLY but available on all architectures, and less > fragile. So, FOLL_FAST_ONLY only has one single user through get_user_pages_fast_only (pin_user_pages_fast_only is entirely unused, which makes totally sense given that give up on fault and pin are not exactly useful semantics). But it looks like they want to call it from atomic context, so we can't really share it. Sight, I hate all these single-user FOLL flags that make gup.c a complete mess. But otherwise this looks fine.
On Thu, Sep 9, 2021 at 4:36 AM Christoph Hellwig <hch@infradead.org> wrote: > > On Fri, Aug 27, 2021 at 06:49:24PM +0200, Andreas Gruenbacher wrote: > > Introduce a new FOLL_NOFAULT flag that causes get_user_pages to return > > -EFAULT when it would otherwise trigger a page fault. This is roughly > > similar to FOLL_FAST_ONLY but available on all architectures, and less > > fragile. > > So, FOLL_FAST_ONLY only has one single user through > get_user_pages_fast_only (pin_user_pages_fast_only is entirely unused, > which makes totally sense given that give up on fault and pin are not > exactly useful semantics). So I think we should treat FOLL_FAST_ONLY as a special "internal to gup.c" flag, and perhaps not really compare it to the new FOLL_NOFAULT. In fact, maybe we could even just make FOLL_FAST_ONLY be the high bit, and not expose it in <linux/mm.h> and make it entirely private as a name in gup.c. Because FOLL_FAST_ONLY really is meant more as a "this way we can share code easily inside gup.c, by having the internal helpers that *can* do everything, but not do it all when the user is one of the limited interfaces". Because we don't really expect people to use FOLL_FAST_ONLY externally - they'll use the explicit interfaces we have instead (ie "get_user_pages_fast()"). Those use-cases that want that fast-only thing really are so special that they need to be very explicitly so. FOLL_NOFAULT is different, in that that is something an external user _would_ use. Admittedly we'd only have one single case for now, but I think we may end up with other filesystems - or other cases entirely - having that same kind of "I am holding locks, so I can't fault into the MM, but I'm otherwise ok with the immediate mmap_sem lock usage and sleeping". End result: FOLL_FAST_ONLY and FOLL_NOFAULT have some similarities, but at the same time I think they are fundamentally different. The FAST_ONLY is the very very special "I can't sleep, I can't even take the fundamental MM lock, and we export special interfaces because it's _so_ special and can be used in interrupts etc". In contrast, NOFAULT is not _that_ special. It's just another flag, and has generic use. Linus
On Thu, Sep 09, 2021 at 10:17:14AM -0700, Linus Torvalds wrote: > So I think we should treat FOLL_FAST_ONLY as a special "internal to > gup.c" flag, and perhaps not really compare it to the new > FOLL_NOFAULT. > > In fact, maybe we could even just make FOLL_FAST_ONLY be the high bit, > and not expose it in <linux/mm.h> and make it entirely private as a > name in gup.c. There are quite a few bits like that. I've been wanting to make them private for 5.16.
diff --git a/include/linux/mm.h b/include/linux/mm.h index 7ca22e6e694a..958246aa343f 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2850,7 +2850,8 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address, #define FOLL_FORCE 0x10 /* get_user_pages read/write w/o permission */ #define FOLL_NOWAIT 0x20 /* if a disk transfer is needed, start the IO * and return without waiting upon it */ -#define FOLL_POPULATE 0x40 /* fault in page */ +#define FOLL_POPULATE 0x40 /* fault in pages (with FOLL_MLOCK) */ +#define FOLL_NOFAULT 0x80 /* do not fault in pages */ #define FOLL_HWPOISON 0x100 /* check page is hwpoisoned */ #define FOLL_NUMA 0x200 /* force NUMA hinting page fault */ #define FOLL_MIGRATION 0x400 /* wait for page to replace migration entry */ diff --git a/mm/gup.c b/mm/gup.c index 03ab03b68dc7..69056adcc8c9 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -932,6 +932,8 @@ static int faultin_page(struct vm_area_struct *vma, /* mlock all present pages, but do not fault in new pages */ if ((*flags & (FOLL_POPULATE | FOLL_MLOCK)) == FOLL_MLOCK) return -ENOENT; + if (*flags & FOLL_NOFAULT) + return -EFAULT; if (*flags & FOLL_WRITE) fault_flags |= FAULT_FLAG_WRITE; if (*flags & FOLL_REMOTE) @@ -2857,7 +2859,7 @@ static int internal_get_user_pages_fast(unsigned long start, if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM | FOLL_FORCE | FOLL_PIN | FOLL_GET | - FOLL_FAST_ONLY))) + FOLL_FAST_ONLY | FOLL_NOFAULT))) return -EINVAL; if (gup_flags & FOLL_PIN)
Introduce a new FOLL_NOFAULT flag that causes get_user_pages to return -EFAULT when it would otherwise trigger a page fault. This is roughly similar to FOLL_FAST_ONLY but available on all architectures, and less fragile. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> --- include/linux/mm.h | 3 ++- mm/gup.c | 4 +++- 2 files changed, 5 insertions(+), 2 deletions(-)