Message ID | 20200428032745.133556-6-jannh@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Fix ELF / FDPIC ELF core dumping, and use mmap_sem properly in there | expand |
On Mon, Apr 27, 2020 at 8:28 PM Jann Horn <jannh@google.com> wrote: > > Properly take the mmap_sem before calling into the GUP code from > get_dump_page(); and play nice, allowing __get_user_pages_locked() to drop > the mmap_sem if it has to sleep. This makes my skin crawl. The only reason for this all is that page cache flushing. My gut feeling is that it should be done by get_user_pages() anyway, since all the other users presumably want it to be coherent in the cache. And in fact, looking at __get_user_pages(), it already does that if (pages) { pages[i] = page; flush_anon_page(vma, page, start); flush_dcache_page(page); ctx.page_mask = 0; } and I think that the get_dump_page() logic is unnecessary to begin with. Linus
On Tue, Apr 28, 2020 at 5:50 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Mon, Apr 27, 2020 at 8:28 PM Jann Horn <jannh@google.com> wrote: > > > > Properly take the mmap_sem before calling into the GUP code from > > get_dump_page(); and play nice, allowing __get_user_pages_locked() to drop > > the mmap_sem if it has to sleep. > > This makes my skin crawl. > > The only reason for this all is that page cache flushing. > > My gut feeling is that it should be done by get_user_pages() anyway, > since all the other users presumably want it to be coherent in the > cache. > > And in fact, looking at __get_user_pages(), it already does that > > if (pages) { > pages[i] = page; > flush_anon_page(vma, page, start); > flush_dcache_page(page); > ctx.page_mask = 0; > } > > and I think that the get_dump_page() logic is unnecessary to begin with. Ah! And even though flush_cache_page() is broader than flush_dcache_page(), that's actually unnecessary, right? Since the kernel only wants to read from the page, and therefore e.g. the icache is irrelevant? Yay! :) I did think this was a bit gnarly, and it's nice to know that this can be simplified. (And now I'm going to avert my eyes from the GUP code before I start thinking too hard about how much it sucks that FOLL_LONGTERM doesn't drop the mmap_sem across the access and how much I dislike the whole idea of FOLL_LONGTERM in general...)
diff --git a/mm/gup.c b/mm/gup.c index 9a7e83772f1fe..4bb4149c0e259 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -1261,7 +1261,8 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk, if (locked) { /* if VM_FAULT_RETRY can be returned, vmas become invalid */ - BUG_ON(vmas); + if (WARN_ON(vmas && nr_pages != 1)) + return -EFAULT; /* check caller initialized locked */ BUG_ON(*locked != 1); } @@ -1548,18 +1549,28 @@ static long __get_user_pages_locked(struct task_struct *tsk, * NULL wherever the ZERO_PAGE, or an anonymous pte_none, has been found - * allowing a hole to be left in the corefile to save diskspace. * - * Called without mmap_sem, but after all other threads have been killed. + * Called without mmap_sem (takes and releases the mmap_sem by itself). */ struct page *get_dump_page(unsigned long addr) { + struct mm_struct *mm = current->mm; struct vm_area_struct *vma; struct page *page; + int locked = 1; + int ret; - if (__get_user_pages(current, current->mm, addr, 1, - FOLL_FORCE | FOLL_DUMP | FOLL_GET, &page, &vma, - NULL) < 1) + if (down_read_killable(&mm->mmap_sem)) + return NULL; + ret = __get_user_pages_locked(current, mm, addr, 1, &page, &vma, + &locked, + FOLL_FORCE | FOLL_DUMP | FOLL_GET); + if (ret != 1) { + if (locked) + up_read(&mm->mmap_sem); return NULL; + } flush_cache_page(vma, addr, page_to_pfn(page)); + up_read(&mm->mmap_sem); return page; }
Properly take the mmap_sem before calling into the GUP code from get_dump_page(); and play nice, allowing __get_user_pages_locked() to drop the mmap_sem if it has to sleep. This requires adjusting the check in __get_user_pages_locked() to be slightly less strict: While `vmas != NULL` is normally incompatible with the lock-dropping retry logic, it's fine if we only want a single page, because then retries can only happen when we haven't grabbed any pages yet. Signed-off-by: Jann Horn <jannh@google.com> --- mm/gup.c | 21 ++++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-)