[v4,03/14] dax: use HPAGE_SIZE instead of PMD_SIZE

Message ID	20151108192739.9104.32105.stgit@dwillia2-desk3.amr.corp.intel.com (mailing list archive)
State	Superseded
Headers	show Return-Path: <linux-nvdimm-bounces@lists.01.org> Subject: [PATCH v4 03/14] dax: use HPAGE_SIZE instead of PMD_SIZE From: Dan Williams <dan.j.williams@intel.com> To: axboe@fb.com Date: Sun, 08 Nov 2015 14:27:39 -0500 Message-ID: <20151108192739.9104.32105.stgit@dwillia2-desk3.amr.corp.intel.com> In-Reply-To: <20151108192722.9104.86664.stgit@dwillia2-desk3.amr.corp.intel.com> References: <20151108192722.9104.86664.stgit@dwillia2-desk3.amr.corp.intel.com> User-Agent: StGit/0.17.1-9-g687f MIME-Version: 1.0 Cc: jack@suse.cz, linux-nvdimm@lists.01.org, Dave Hansen <dave.hansen@linux.intel.com>, david@fromorbit.com, linux-block@vger.kernel.org, hch@lst.de Precedence: list Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" <linux-nvdimm-bounces@lists.01.org>

Message ID

20151108192739.9104.32105.stgit@dwillia2-desk3.amr.corp.intel.com (mailing list archive)

State

Superseded

Headers

Subject: [PATCH v4 03/14] dax: use HPAGE_SIZE instead of PMD_SIZE
From: Dan Williams <dan.j.williams@intel.com>
To: axboe@fb.com
Date: Sun, 08 Nov 2015 14:27:39 -0500
Message-ID: <20151108192739.9104.32105.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <20151108192722.9104.86664.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <20151108192722.9104.86664.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.17.1-9-g687f
MIME-Version: 1.0
Cc: jack@suse.cz, linux-nvdimm@lists.01.org,
	Dave Hansen <dave.hansen@linux.intel.com>, david@fromorbit.com,
	linux-block@vger.kernel.org, hch@lst.de
Precedence: list
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: linux-nvdimm-bounces@lists.01.org
Sender: "Linux-nvdimm" <linux-nvdimm-bounces@lists.01.org>

Commit Message

Dan Williams Nov. 8, 2015, 7:27 p.m. UTC

As Dave points out when dealing with the contents of a page we use
PAGE_SIZE and PAGE_SHIFT, similary for huge pages use HPAGE_SIZE and
HPAGE_SHIFT.

Reported-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 fs/dax.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

Comments

Matthew Wilcox Nov. 9, 2015, 7:52 p.m. UTC | #1

On Sun, Nov 08, 2015 at 02:27:39PM -0500, Dan Williams wrote:
> As Dave points out when dealing with the contents of a page we use
> PAGE_SIZE and PAGE_SHIFT, similary for huge pages use HPAGE_SIZE and
> HPAGE_SHIFT.

Hang on, no.  We want to use PMD_SIZE here so that when we add PGD_SIZE
support, it's all nicely symmetrical.

Dave Hansen Nov. 9, 2015, 8:30 p.m. UTC | #2

On 11/09/2015 11:52 AM, Matthew Wilcox wrote:
> On Sun, Nov 08, 2015 at 02:27:39PM -0500, Dan Williams wrote:
>> As Dave points out when dealing with the contents of a page we use
>> PAGE_SIZE and PAGE_SHIFT, similary for huge pages use HPAGE_SIZE and
>> HPAGE_SHIFT.
> 
> Hang on, no.  We want to use PMD_SIZE here so that when we add PGD_SIZE
> support, it's all nicely symmetrical.

This is splitting hairs a bit, but here goes.

Do we care about the pagetables, or do we care about the page sizes?

If we are doing pagetable things, we use P{TE,MD,UD,GD}_SIZE.  If we are
doing something related to page (contents) themselves, we use
H?PAGE_SIZE.  This makes it clear whether we're dealing with pagetables,
or *contents*.

In the end, they're the same thing on our architecture at the moment, so
it matters very little.

Matthew Wilcox Nov. 9, 2015, 8:58 p.m. UTC | #3

On Mon, Nov 09, 2015 at 12:30:08PM -0800, Dave Hansen wrote:
> On 11/09/2015 11:52 AM, Matthew Wilcox wrote:
> > On Sun, Nov 08, 2015 at 02:27:39PM -0500, Dan Williams wrote:
> >> As Dave points out when dealing with the contents of a page we use
> >> PAGE_SIZE and PAGE_SHIFT, similary for huge pages use HPAGE_SIZE and
> >> HPAGE_SHIFT.
> > 
> > Hang on, no.  We want to use PMD_SIZE here so that when we add PGD_SIZE
> > support, it's all nicely symmetrical.
> 
> This is splitting hairs a bit, but here goes.
> 
> Do we care about the pagetables, or do we care about the page sizes?
> 
> If we are doing pagetable things, we use P{TE,MD,UD,GD}_SIZE.  If we are
> doing something related to page (contents) themselves, we use
> H?PAGE_SIZE.  This makes it clear whether we're dealing with pagetables,
> or *contents*.
> 
> In the end, they're the same thing on our architecture at the moment, so
> it matters very little.

Agreed we're splitting hairs here but I'm not sure why the hair can
be split :-)  We come here from the page fault handler, where we've
discovered that there is no page table for a PMD-shaped hole in the
address space, and we'd like a PMD inserted by the fault handler.
In that sense, we care about PMD_SIZE.

Another problem is that the name 'HPAGE' doesn't obviously say "I cover
a PMD-sized hole" -- hugetlbfs can use 1GB pages as well as 2MB pages.
Then there's the problem that there *is* no equivalent to HPAGE for PUDs
or PGDs -- if PMDs are "huge", then what are the names for pages pointed
to by PUDs and PGDs?

Why is it important to distinguish between "the size of something pointed
to by a PMD" and "the size of the contents"?  Actually, I'm not sure I
even understand how to summarise the distinction.

diff --git a/fs/dax.c b/fs/dax.c
index f8e543839e5c..149d6000d72a 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -511,7 +511,7 @@  EXPORT_SYMBOL_GPL(dax_fault);
  * The 'colour' (ie low bits) within a PMD of a page offset.  This comes up
  * more often than one might expect in the below function.
  */
-#define PG_PMD_COLOUR	((PMD_SIZE >> PAGE_SHIFT) - 1)
+#define PG_PMD_COLOUR	((HPAGE_SIZE >> PAGE_SHIFT) - 1)
 
 int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 		pmd_t *pmd, unsigned int flags, get_block_t get_block,
@@ -537,7 +537,7 @@  int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 	/* If the PMD would extend outside the VMA */
 	if (pmd_addr < vma->vm_start)
 		return VM_FAULT_FALLBACK;
-	if ((pmd_addr + PMD_SIZE) > vma->vm_end)
+	if ((pmd_addr + HPAGE_SIZE) > vma->vm_end)
 		return VM_FAULT_FALLBACK;
 
 	pgoff = linear_page_index(vma, pmd_addr);
@@ -551,7 +551,7 @@  int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 	memset(&bh, 0, sizeof(bh));
 	block = (sector_t)pgoff << (PAGE_SHIFT - blkbits);
 
-	bh.b_size = PMD_SIZE;
+	bh.b_size = HPAGE_SIZE;
 	length = get_block(inode, block, &bh, write);
 	if (length)
 		return VM_FAULT_SIGBUS;
@@ -562,7 +562,7 @@  int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 	 * just fall back to PTEs.  Calling get_block 512 times in a loop
 	 * would be silly.
 	 */
-	if (!buffer_size_valid(&bh) || bh.b_size < PMD_SIZE)
+	if (!buffer_size_valid(&bh) || bh.b_size < HPAGE_SIZE)
 		goto fallback;
 
 	/*
@@ -571,7 +571,7 @@  int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 	 */
 	if (buffer_new(&bh)) {
 		i_mmap_unlock_read(mapping);
-		unmap_mapping_range(mapping, pgoff << PAGE_SHIFT, PMD_SIZE, 0);
+		unmap_mapping_range(mapping, pgoff << PAGE_SHIFT, HPAGE_SIZE, 0);
 		i_mmap_lock_read(mapping);
 	}
 
@@ -616,7 +616,7 @@  int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 			result = VM_FAULT_SIGBUS;
 			goto out;
 		}
-		if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR))
+		if ((length < HPAGE_SIZE) || (pfn & PG_PMD_COLOUR))
 			goto fallback;
 
 		if (buffer_unwritten(&bh) || buffer_new(&bh)) {

[v4,03/14] dax: use HPAGE_SIZE instead of PMD_SIZE

Commit Message

Comments

Patch