From patchwork Mon Aug 29 19:11:20 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kani, Toshi" X-Patchwork-Id: 9304375 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 5669D60756 for ; Mon, 29 Aug 2016 19:12:40 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 47F2D27FBB for ; Mon, 29 Aug 2016 19:12:40 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3C9A228966; Mon, 29 Aug 2016 19:12:40 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C381627FBB for ; Mon, 29 Aug 2016 19:12:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755784AbcH2TMM (ORCPT ); Mon, 29 Aug 2016 15:12:12 -0400 Received: from g2t1383g.austin.hpe.com ([15.233.16.89]:33901 "EHLO g2t1383g.austin.hpe.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751691AbcH2TML (ORCPT ); Mon, 29 Aug 2016 15:12:11 -0400 Received: from g4t3425.houston.hpe.com (g4t3425.houston.hpe.com [15.241.140.78]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by g2t1383g.austin.hpe.com (Postfix) with ESMTPS id 3A3D0E13; Mon, 29 Aug 2016 19:12:10 +0000 (UTC) Received: from g4t3433.houston.hpecorp.net (g4t3433.houston.hpecorp.net [16.208.49.245]) by g4t3425.houston.hpe.com (Postfix) with ESMTP id 1D94F52; Mon, 29 Aug 2016 19:12:09 +0000 (UTC) Received: from misato.americas.hpqcorp.net (unknown [16.78.170.217]) by g4t3433.houston.hpecorp.net (Postfix) with ESMTP id 118D446; Mon, 29 Aug 2016 19:12:08 +0000 (UTC) From: Toshi Kani To: akpm@linux-foundation.org Cc: dan.j.williams@intel.com, mawilcox@microsoft.com, ross.zwisler@linux.intel.com, kirill.shutemov@linux.intel.com, david@fromorbit.com, jack@suse.cz, tytso@mit.edu, adilger.kernel@dilger.ca, mike.kravetz@oracle.com, toshi.kani@hpe.com, linux-nvdimm@lists.01.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v4 RESEND 1/2] thp, dax: add thp_get_unmapped_area for pmd mappings Date: Mon, 29 Aug 2016 13:11:20 -0600 Message-Id: <1472497881-9323-2-git-send-email-toshi.kani@hpe.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1472497881-9323-1-git-send-email-toshi.kani@hpe.com> References: <1472497881-9323-1-git-send-email-toshi.kani@hpe.com> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page size. This feature relies on both mmap virtual address and FS block (i.e. physical address) to be aligned by the pmd page size. Users can use mkfs options to specify FS to align block allocations. However, aligning mmap address requires code changes to existing applications for providing a pmd-aligned address to mmap(). For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1]. It calls mmap() with a NULL address, which needs to be changed to provide a pmd-aligned address for testing with DAX pmd mappings. Changing all applications that call mmap() with NULL is undesirable. Add thp_get_unmapped_area(), which can be called by filesystem's get_unmapped_area to align an mmap address by the pmd size for a DAX file. It calls the default handler, mm->get_unmapped_area(), to find a range and then aligns it for a DAX file. The patch is based on Matthew Wilcox's change that allows adding support of the pud page size easily. [1]: https://github.com/axboe/fio/blob/master/engines/mmap.c Signed-off-by: Toshi Kani Cc: Andrew Morton Cc: Dan Williams Cc: Matthew Wilcox Cc: Ross Zwisler Cc: Kirill A. Shutemov Cc: Dave Chinner Cc: Jan Kara Cc: Theodore Ts'o Cc: Andreas Dilger Cc: Mike Kravetz Reviewed-by: Dan Williams --- include/linux/huge_mm.h | 7 +++++++ mm/huge_memory.c | 43 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 50 insertions(+) -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 6f14de4..4fca526 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -87,6 +87,10 @@ extern bool is_vma_temporary_stack(struct vm_area_struct *vma); extern unsigned long transparent_hugepage_flags; +extern unsigned long thp_get_unmapped_area(struct file *filp, + unsigned long addr, unsigned long len, unsigned long pgoff, + unsigned long flags); + extern void prep_transhuge_page(struct page *page); extern void free_transhuge_page(struct page *page); @@ -169,6 +173,9 @@ void put_huge_zero_page(void); static inline void prep_transhuge_page(struct page *page) {} #define transparent_hugepage_flags 0UL + +#define thp_get_unmapped_area NULL + static inline int split_huge_page_to_list(struct page *page, struct list_head *list) { diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 2db2112..883f0ee 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -469,6 +469,49 @@ void prep_transhuge_page(struct page *page) set_compound_page_dtor(page, TRANSHUGE_PAGE_DTOR); } +unsigned long __thp_get_unmapped_area(struct file *filp, unsigned long len, + loff_t off, unsigned long flags, unsigned long size) +{ + unsigned long addr; + loff_t off_end = off + len; + loff_t off_align = round_up(off, size); + unsigned long len_pad; + + if (off_end <= off_align || (off_end - off_align) < size) + return 0; + + len_pad = len + size; + if (len_pad < len || (off + len_pad) < off) + return 0; + + addr = current->mm->get_unmapped_area(filp, 0, len_pad, + off >> PAGE_SHIFT, flags); + if (IS_ERR_VALUE(addr)) + return 0; + + addr += (off - addr) & (size - 1); + return addr; +} + +unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, + unsigned long len, unsigned long pgoff, unsigned long flags) +{ + loff_t off = (loff_t)pgoff << PAGE_SHIFT; + + if (addr) + goto out; + if (!IS_DAX(filp->f_mapping->host) || !IS_ENABLED(CONFIG_FS_DAX_PMD)) + goto out; + + addr = __thp_get_unmapped_area(filp, len, off, flags, PMD_SIZE); + if (addr) + return addr; + + out: + return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags); +} +EXPORT_SYMBOL_GPL(thp_get_unmapped_area); + static int __do_huge_pmd_anonymous_page(struct fault_env *fe, struct page *page, gfp_t gfp) {