mm/shmem: Zero out unused vma fields in shmem_pseudo_vma_init()

Message ID 20180531135602.20321-1-kirill.shutemov@linux.intel.com (mailing list archive)
State New, archived

Commit Message

Kirill A. Shutemov May 31, 2018, 1:56 p.m. UTC
shmem/tmpfs uses pseudo vma to allocate page with correct NUMA policy.

The pseudo vma doesn't have vm_page_prot set. We are going to encode
encryption KeyID in vm_page_prot. Having garbage there causes problems.

Zero out all unused fields in the pseudo vma.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/shmem.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Hugh Dickins May 31, 2018, 10:50 p.m. UTC | #1
On Thu, 31 May 2018, Kirill A. Shutemov wrote:

> shmem/tmpfs uses pseudo vma to allocate page with correct NUMA policy.
> 
> The pseudo vma doesn't have vm_page_prot set. We are going to encode
> encryption KeyID in vm_page_prot. Having garbage there causes problems.
> 
> Zero out all unused fields in the pseudo vma.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

I won't go so far as to say NAK, but personally I much prefer that we
document what fields actually get used, by initializing only those,
rather than having such a blanket memset.

And you say "We are going to ...": so this should really be part of
some future patchset, shouldn't it?

My opinion might be in the minority: you remind me of a similar
request from Josef some while ago, Cc'ing him.

(I'm very ashamed, by the way, of shmem's pseudo-vma, I think it's
horrid, and just reflects that shmem was an afterthought when NUMA
mempolicies were designed.  Internally, we replaced alloc_pages_vma()
throughout by alloc_pages_mpol(), which has no need for pseudo-vmas,
and the advantage of dropping mmap_sem across the bulk of NUMA page
migration. I shall be updating that work in coming months, and hope
to upstream, but no promise from me on the timing - your need for
vm_page_prot likely much sooner.)

Hugh

> ---
>  mm/shmem.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 9d6c7e595415..693fb82b4b42 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1404,10 +1404,9 @@ static void shmem_pseudo_vma_init(struct vm_area_struct *vma,
>  		struct shmem_inode_info *info, pgoff_t index)
>  {
>  	/* Create a pseudo vma that just contains the policy */
> -	vma->vm_start = 0;
> +	memset(vma, 0, sizeof(*vma));
>  	/* Bias interleave by inode number to distribute better across nodes */
>  	vma->vm_pgoff = index + info->vfs_inode.i_ino;
> -	vma->vm_ops = NULL;
>  	vma->vm_policy = mpol_shared_policy_lookup(&info->policy, index);
>  }
>  
> -- 
> 2.17.0

Andrew Morton May 31, 2018, 10:52 p.m. UTC | #2
On Thu, 31 May 2018 16:56:02 +0300 "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> wrote:

> shmem/tmpfs uses pseudo vma to allocate page with correct NUMA policy.
> 
> The pseudo vma doesn't have vm_page_prot set. We are going to encode
> encryption KeyID in vm_page_prot. Having garbage there causes problems.
> 
> Zero out all unused fields in the pseudo vma.
> 

So there are no known problems in the current mainline kernel?

> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1404,10 +1404,9 @@ static void shmem_pseudo_vma_init(struct vm_area_struct *vma,
>  		struct shmem_inode_info *info, pgoff_t index)
>  {
>  	/* Create a pseudo vma that just contains the policy */
> -	vma->vm_start = 0;
> +	memset(vma, 0, sizeof(*vma));
>  	/* Bias interleave by inode number to distribute better across nodes */
>  	vma->vm_pgoff = index + info->vfs_inode.i_ino;
> -	vma->vm_ops = NULL;
>  	vma->vm_policy = mpol_shared_policy_lookup(&info->policy, index);
> }

Hugh Dickins May 31, 2018, 11:37 p.m. UTC | #3
On Thu, 31 May 2018, Andrew Morton wrote:
> On Thu, 31 May 2018 16:56:02 +0300 "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> wrote:
> 
> > shmem/tmpfs uses pseudo vma to allocate page with correct NUMA policy.
> > 
> > The pseudo vma doesn't have vm_page_prot set. We are going to encode
> > encryption KeyID in vm_page_prot. Having garbage there causes problems.
> > 
> > Zero out all unused fields in the pseudo vma.
> > 
> 
> So there are no known problems in the current mainline kernel?

Correct - if we limit ourselves to the area of the shmem pseudo-vma :)

Hugh

Kirill A. Shutemov June 4, 2018, 11:32 a.m. UTC | #4
On Thu, May 31, 2018 at 10:50:36PM +0000, Hugh Dickins wrote:
> On Thu, 31 May 2018, Kirill A. Shutemov wrote:
> 
> > shmem/tmpfs uses pseudo vma to allocate page with correct NUMA policy.
> > 
> > The pseudo vma doesn't have vm_page_prot set. We are going to encode
> > encryption KeyID in vm_page_prot. Having garbage there causes problems.
> > 
> > Zero out all unused fields in the pseudo vma.
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> 
> I won't go so far as to say NAK, but personally I much prefer that we
> document what fields actually get used, by initializing only those,
> rather than having such a blanket memset.

I recognize the value of documentation here. But I still think leaving
garbage in the fields is not a great idea.

> 
> And you say "We are going to ...": so this should really be part of
> some future patchset, shouldn't it?

Yeah. It's for MKTME. I'm just trying to push the easy patches first.

> My opinion might be in the minority: you remind me of a similar
> request from Josef some while ago, Cc'ing him.
> 
> (I'm very ashamed, by the way, of shmem's pseudo-vma, I think it's
> horrid, and just reflects that shmem was an afterthought when NUMA
> mempolicies were designed.  Internally, we replaced alloc_pages_vma()
> throughout by alloc_pages_mpol(), which has no need for pseudo-vmas,
> and the advantage of dropping mmap_sem across the bulk of NUMA page
> migration. I shall be updating that work in coming months, and hope
> to upstream, but no promise from me on the timing - your need for
> vm_page_prot likely much sooner.)

I will try to look at how we can get alloc_pages_mpol() implemented.
(Although interleave bias is kinda confusing. I'll need to wrap my head
around the thing.)
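
For readers puzzling over the same interleave bias, here is a minimal userspace sketch of the idea, not the kernel's code. It assumes the interleave policy picks a node by taking the page offset modulo the number of allowed nodes, and that the allowed nodes are simply 0..nr_nodes-1. The helpers interleave_node(), biased_pgoff() and demo_bias() are hypothetical names used only for illustration.

```c
#include <assert.h>

/*
 * Simplified interleave: choose a node from the page offset.
 * Assumption: every node in 0..nr_nodes-1 is allowed by the policy.
 */
static unsigned int interleave_node(unsigned long pgoff, unsigned int nr_nodes)
{
	return (unsigned int)(pgoff % nr_nodes);
}

/* Mirrors "vma->vm_pgoff = index + info->vfs_inode.i_ino" from the patch. */
static unsigned long biased_pgoff(unsigned long index, unsigned long ino)
{
	return index + ino;
}

/* Compare placement of the first page of several files. */
static void demo_bias(void)
{
	const unsigned int nr_nodes = 4;

	/* Without a bias, page 0 of every file lands on node 0. */
	assert(interleave_node(0, nr_nodes) == 0);

	/* With the bias, files with consecutive inode numbers spread out. */
	assert(interleave_node(biased_pgoff(0, 100), nr_nodes) == 0);
	assert(interleave_node(biased_pgoff(0, 101), nr_nodes) == 1);
	assert(interleave_node(biased_pgoff(0, 102), nr_nodes) == 2);

	/* Within one file, consecutive pages still rotate across nodes. */
	assert(interleave_node(biased_pgoff(1, 100), nr_nodes) == 1);
}
```

Under these assumptions, the bias matters mostly for many small files: without it, each file's first page would start on the same node.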
diff mbox

Patch

diff --git a/mm/shmem.c b/mm/shmem.c
index 9d6c7e595415..693fb82b4b42 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1404,10 +1404,9 @@  static void shmem_pseudo_vma_init(struct vm_area_struct *vma,
 		struct shmem_inode_info *info, pgoff_t index)
 {
 	/* Create a pseudo vma that just contains the policy */
-	vma->vm_start = 0;
+	memset(vma, 0, sizeof(*vma));
 	/* Bias interleave by inode number to distribute better across nodes */
 	vma->vm_pgoff = index + info->vfs_inode.i_ino;
-	vma->vm_ops = NULL;
 	vma->vm_policy = mpol_shared_policy_lookup(&info->policy, index);
 }
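
The two initialization styles debated in this thread can be compared with a small userspace sketch. It is an illustration under stated assumptions, not kernel code: struct fake_vma is a hypothetical, heavily reduced stand-in for struct vm_area_struct, vm_page_prot is modelled as a plain unsigned long, and poison() simulates stale stack contents with a fixed byte pattern so the result is deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-in for struct vm_area_struct. */
struct fake_vma {
	unsigned long vm_start;
	unsigned long vm_pgoff;
	unsigned long vm_page_prot;	/* the field the patch is concerned about */
	const void *vm_ops;
	void *vm_policy;
};

/* Simulate leftover stack garbage with a known non-zero pattern. */
static void poison(struct fake_vma *vma)
{
	memset(vma, 0xAA, sizeof(*vma));
}

/* Pre-patch style: set only the fields known to be used. */
static void init_fields_only(struct fake_vma *vma, unsigned long pgoff)
{
	vma->vm_start = 0;
	vma->vm_pgoff = pgoff;
	vma->vm_ops = NULL;
	vma->vm_policy = NULL;
}

/* Patched style: zero everything, then set the fields that matter. */
static void init_with_memset(struct fake_vma *vma, unsigned long pgoff)
{
	memset(vma, 0, sizeof(*vma));
	vma->vm_pgoff = pgoff;
	vma->vm_policy = NULL;
}

/* Show what each style leaves behind in vm_page_prot. */
static void demo_zeroing(void)
{
	struct fake_vma a, b;

	poison(&a);
	init_fields_only(&a, 42);
	poison(&b);
	init_with_memset(&b, 42);

	assert(a.vm_page_prot != 0);	/* stale pattern survives */
	assert(b.vm_page_prot == 0);	/* blanket memset cleared it */
	assert(a.vm_pgoff == 42 && b.vm_pgoff == 42);
}
```

The field-by-field version documents which members are actually consumed, which is the point Hugh raises; the memset version guarantees that any member a later change starts reading, such as vm_page_prot, begins as zero.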