From patchwork Fri Oct 2 21:21:37 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Logan Gunthorpe
X-Patchwork-Id: 7319021
Return-Path:
X-Original-To: patchwork-linux-nvdimm@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.136])
 by patchwork2.web.kernel.org (Postfix) with ESMTP id 501DDBEEA4
 for ; Fri, 2 Oct 2015 21:21:53 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
 by mail.kernel.org (Postfix) with ESMTP id 621BD20806
 for ; Fri, 2 Oct 2015 21:21:52 +0000 (UTC)
Received: from ml01.01.org (ml01.01.org [198.145.21.10])
 (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mail.kernel.org (Postfix) with ESMTPS id 90E06207EF
 for ; Fri, 2 Oct 2015 21:21:50 +0000 (UTC)
Received: from ml01.vlan14.01.org (localhost [IPv6:::1])
 by ml01.01.org (Postfix) with ESMTP id 69BE661AE1;
 Fri, 2 Oct 2015 14:21:50 -0700 (PDT)
X-Original-To: linux-nvdimm@lists.01.org
Delivered-To: linux-nvdimm@lists.01.org
Received: from ale.deltatee.com (ale.deltatee.com [207.54.116.67])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by ml01.01.org (Postfix) with ESMTPS id 3079861973
 for ; Fri, 2 Oct 2015 14:21:48 -0700 (PDT)
Received: from logang by ale.deltatee.com with local (Exim 4.84)
 (envelope-from ) id 1Zi7lp-0000UQ-Gp;
 Fri, 02 Oct 2015 15:21:37 -0600
Date: Fri, 2 Oct 2015 15:21:37 -0600
From: Logan Gunthorpe
To: Dan Williams
Subject: Re: [PATCH 14/15] mm, dax, pmem: introduce {get|put}_dev_pagemap() for dax-gup
Message-ID: <20151002212137.GB30448@deltatee.com>
References: <20150923043737.36490.70547.stgit@dwillia2-desk3.jf.intel.com>
 <20150923044227.36490.99741.stgit@dwillia2-desk3.jf.intel.com>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:
 <20150923044227.36490.99741.stgit@dwillia2-desk3.jf.intel.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-SA-Exim-Connect-IP:
X-SA-Exim-Rcpt-To: dan.j.williams@intel.com, akpm@linux-foundation.org,
 dave@sr71.net, linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org,
 willy@linux.intel.com, ross.zwisler@linux.intel.com, Stephen.Bates@pmcs.com
X-SA-Exim-Mail-From: logang@deltatee.com
X-SA-Exim-Scanned: No (on ale.deltatee.com); SAEximRunCond expanded to false
Cc: Stephen Bates , Dave Hansen , linux-nvdimm@lists.01.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, Alexander Viro ,
 linux-fsdevel@vger.kernel.org, akpm@linux-foundation.org
X-BeenThere: linux-nvdimm@lists.01.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: "Linux-nvdimm developer list."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: linux-nvdimm-bounces@lists.01.org
Sender: "Linux-nvdimm"
X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_LOW,
 T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

Hi Dan,

We've been doing some experimenting and testing with this patchset.
Specifically, we are trying to use your ZONE_DEVICE work to enable
peer-to-peer PCIe transfers. This is actually working pretty well
(though we're still testing and working through some things).

However, we've found a couple of issues:

On Wed, Sep 23, 2015 at 12:42:27AM -0400, Dan Williams wrote:
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 3d6baa7d4534..20097e7b679a 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -49,12 +49,16 @@ struct page {
>  					 * updated asynchronously */
>  	union {
>  		struct address_space *mapping;	/* If low bit clear, points to
> -						 * inode address_space, or NULL.
> +						 * inode address_space, unless
> +						 * the page is in ZONE_DEVICE
> +						 * then it points to its parent
> +						 * dev_pagemap, otherwise NULL.
>  						 * If page mapped as anonymous
>  						 * memory, low bit is set, and
>  						 * it points to anon_vma object:
>  						 * see PAGE_MAPPING_ANON below.
>  						 */
> +		struct dev_pagemap *pgmap;
>  		void *s_mem;			/* slab first object */
>  	};

When you add to this union and override the mapping value, we see bugs
in calls to set_page_dirty when it tries to dereference mapping. I
believe a change to page_mapping is required, such as the patch at the
end of this email.

> diff --git a/mm/gup.c b/mm/gup.c
> index a798293fc648..1064e9a489a4 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -98,7 +98,16 @@ retry:
>  	}
>
>  	page = vm_normal_page(vma, address, pte);
> -	if (unlikely(!page)) {
> +	if (!page && pte_devmap(pte) && (flags & FOLL_GET)) {
> +		/*
> +		 * Only return device mapping pages in the FOLL_GET case since
> +		 * they are only valid while holding the pgmap reference.
> +		 */
> +		if (get_dev_pagemap(pte_pfn(pte), NULL))
> +			page = pte_page(pte);
> +		else
> +			goto no_page;
> +	} else if (unlikely(!page)) {

I've found that if a driver creates a ZONE_DEVICE mapping but doesn't
create the pagemap (using devm_register_pagemap), then the
get_user_pages code will go into an infinite loop. I'm not really sure
if this is an issue or not, but it seems a bit undesirable for a buggy
driver to be able to cause this.

My thoughts are that either devm_register_pagemap needs to be done by
devm_memremap_pages so a driver cannot use one without the other, or
the GUP code needs to return EFAULT if no pagemap was registered so it
doesn't loop forever.

Thanks!

Logan

diff --git a/mm/util.c b/mm/util.c
index 68ff8a5..19af683 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -368,6 +368,9 @@ struct address_space *page_mapping(struct page *page)
 		return swap_address_space(entry);
 	}
 
+	if (unlikely(is_zone_device_page(page)))
+		return NULL;
+
 	mapping = (unsigned long)page->mapping;
 	if (mapping & PAGE_MAPPING_FLAGS)
 		return NULL;
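
To illustrate the second option from the GUP discussion (failing with
EFAULT when no pagemap was registered), here is a rough userspace
sketch of the retry behaviour. It is not kernel code and not a proposed
patch: every mock_* name, the lookup enum, and the max_tries bound are
hypothetical stand-ins, and the model assumes that a "no page" result
simply leads to a fault-in and another lookup. -ELOOP is used only to
mark the case where the real loop would keep spinning.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's real definitions. */
struct mock_pagemap { int id; };

struct mock_pte {
	bool present;                /* false: a fault can still populate it */
	bool devmap;                 /* models pte_devmap(pte) */
	struct mock_pagemap *pgmap;  /* NULL: driver never registered a pagemap */
};

enum lookup { LOOKUP_OK, LOOKUP_RETRY, LOOKUP_EFAULT };

/*
 * Models the page lookup step. With fail_fast == false, a devmap pte
 * without a pagemap is reported as "no page" (assumed current behaviour).
 * With fail_fast == true it is reported as a hard error instead.
 */
static enum lookup mock_follow_page(const struct mock_pte *pte, bool fail_fast)
{
	if (!pte->present)
		return LOOKUP_RETRY;
	if (pte->devmap && !pte->pgmap)
		return fail_fast ? LOOKUP_EFAULT : LOOKUP_RETRY;
	return LOOKUP_OK;
}

/* Models a fault: it can populate a pte, but cannot register a pagemap. */
static void mock_fault_in(struct mock_pte *pte)
{
	pte->present = true;
}

/*
 * Models the retry loop. Returns 1 when the page was obtained, -EFAULT
 * on a hard error, and -ELOOP once max_tries lookups have failed.
 */
static int mock_gup(struct mock_pte *pte, bool fail_fast, int max_tries)
{
	for (int i = 0; i < max_tries; i++) {
		switch (mock_follow_page(pte, fail_fast)) {
		case LOOKUP_OK:
			return 1;
		case LOOKUP_EFAULT:
			return -EFAULT;
		case LOOKUP_RETRY:
			mock_fault_in(pte);
			break;
		}
	}
	return -ELOOP;
}

/* Self-check of the model; call it from main() to run the sketch. */
static void mock_gup_selfcheck(void)
{
	struct mock_pagemap map = { 0 };
	struct mock_pte registered = { true, true, &map };
	struct mock_pte unregistered = { true, true, NULL };

	assert(mock_gup(&registered, false, 8) == 1);
	assert(mock_gup(&unregistered, false, 8) == -ELOOP);
	assert(mock_gup(&unregistered, true, 8) == -EFAULT);
}
```

In this model the only difference between the two modes is how a
devmap pte without a pagemap is reported; an ordinary not-present pte
still succeeds after one simulated fault either way.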