Message ID | 20180531135602.20321-1-kirill.shutemov@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |

On Thu, 31 May 2018, Kirill A. Shutemov wrote:

> shmem/tmpfs uses pseudo vma to allocate page with correct NUMA policy.
>
> The pseudo vma doesn't have vm_page_prot set. We are going to encode
> encryption KeyID in vm_page_prot. Having garbage there causes problems.
>
> Zero out all unused fields in the pseudo vma.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

I won't go so far as to say NAK, but personally I much prefer that we
document what fields actually get used, by initializing only those,
rather than having such a blanket memset.

And you say "We are going to ...": so this should really be part of
some future patchset, shouldn't it?

My opinion might be in the minority: you remind me of a similar
request from Josef some while ago, Cc'ing him.

(I'm very ashamed, by the way, of shmem's pseudo-vma, I think it's
horrid, and just reflects that shmem was an afterthought when NUMA
mempolicies were designed. Internally, we replaced alloc_pages_vma()
throughout by alloc_pages_mpol(), which has no need for pseudo-vmas,
and the advantage of dropping mmap_sem across the bulk of NUMA page
migration. I shall be updating that work in coming months, and hope
to upstream, but no promise from me on the timing - your need for
vm_page_prot likely much sooner.)

Hugh

> ---
>  mm/shmem.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 9d6c7e595415..693fb82b4b42 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1404,10 +1404,9 @@ static void shmem_pseudo_vma_init(struct vm_area_struct *vma,
>  		struct shmem_inode_info *info, pgoff_t index)
>  {
>  	/* Create a pseudo vma that just contains the policy */
> -	vma->vm_start = 0;
> +	memset(vma, 0, sizeof(*vma));
>  	/* Bias interleave by inode number to distribute better across nodes */
>  	vma->vm_pgoff = index + info->vfs_inode.i_ino;
> -	vma->vm_ops = NULL;
>  	vma->vm_policy = mpol_shared_policy_lookup(&info->policy, index);
>  }
>
> --
> 2.17.0

On Thu, 31 May 2018 16:56:02 +0300 "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> wrote:

> shmem/tmpfs uses pseudo vma to allocate page with correct NUMA policy.
>
> The pseudo vma doesn't have vm_page_prot set. We are going to encode
> encryption KeyID in vm_page_prot. Having garbage there causes problems.
>
> Zero out all unused fields in the pseudo vma.
>

So there are no known problems in the current mainline kernel?

> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1404,10 +1404,9 @@ static void shmem_pseudo_vma_init(struct vm_area_struct *vma,
>  		struct shmem_inode_info *info, pgoff_t index)
>  {
>  	/* Create a pseudo vma that just contains the policy */
> -	vma->vm_start = 0;
> +	memset(vma, 0, sizeof(*vma));
>  	/* Bias interleave by inode number to distribute better across nodes */
>  	vma->vm_pgoff = index + info->vfs_inode.i_ino;
> -	vma->vm_ops = NULL;
>  	vma->vm_policy = mpol_shared_policy_lookup(&info->policy, index);
>  }

On Thu, 31 May 2018, Andrew Morton wrote:

> On Thu, 31 May 2018 16:56:02 +0300 "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> wrote:
>
> > shmem/tmpfs uses pseudo vma to allocate page with correct NUMA policy.
> >
> > The pseudo vma doesn't have vm_page_prot set. We are going to encode
> > encryption KeyID in vm_page_prot. Having garbage there causes problems.
> >
> > Zero out all unused fields in the pseudo vma.
> >
>
> So there are no known problems in the current mainline kernel?

Correct - if we limit ourselves to the area of the shmem pseudo-vma :)

Hugh

On Thu, May 31, 2018 at 10:50:36PM +0000, Hugh Dickins wrote:

> On Thu, 31 May 2018, Kirill A. Shutemov wrote:
>
> > shmem/tmpfs uses pseudo vma to allocate page with correct NUMA policy.
> >
> > The pseudo vma doesn't have vm_page_prot set. We are going to encode
> > encryption KeyID in vm_page_prot. Having garbage there causes problems.
> >
> > Zero out all unused fields in the pseudo vma.
> >
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>
> I won't go so far as to say NAK, but personally I much prefer that we
> document what fields actually get used, by initializing only those,
> rather than having such a blanket memset.

I recognize value of documentation here. But I still think leaving garbage
in the fields is not a great idea.

>
> And you say "We are going to ...": so this should really be part of
> some future patchset, shouldn't it?

Yeah. It's for MKTME. I just try to push easy patches first.

> My opinion might be in the minority: you remind me of a similar
> request from Josef some while ago, Cc'ing him.
>
> (I'm very ashamed, by the way, of shmem's pseudo-vma, I think it's
> horrid, and just reflects that shmem was an afterthought when NUMA
> mempolicies were designed. Internally, we replaced alloc_pages_vma()
> throughout by alloc_pages_mpol(), which has no need for pseudo-vmas,
> and the advantage of dropping mmap_sem across the bulk of NUMA page
> migration. I shall be updating that work in coming months, and hope
> to upstream, but no promise from me on the timing - your need for
> vm_page_prot likely much sooner.)

I will try to look at how we can get alloc_pages_mpol() implemented.
(Although interleave bias is kinda confusing. I'll need to wrap my head
around the thing.)

diff --git a/mm/shmem.c b/mm/shmem.c
index 9d6c7e595415..693fb82b4b42 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1404,10 +1404,9 @@ static void shmem_pseudo_vma_init(struct vm_area_struct *vma,
 		struct shmem_inode_info *info, pgoff_t index)
 {
 	/* Create a pseudo vma that just contains the policy */
-	vma->vm_start = 0;
+	memset(vma, 0, sizeof(*vma));
 	/* Bias interleave by inode number to distribute better across nodes */
 	vma->vm_pgoff = index + info->vfs_inode.i_ino;
-	vma->vm_ops = NULL;
 	vma->vm_policy = mpol_shared_policy_lookup(&info->policy, index);
 }

shmem/tmpfs uses pseudo vma to allocate page with correct NUMA policy.

The pseudo vma doesn't have vm_page_prot set. We are going to encode
encryption KeyID in vm_page_prot. Having garbage there causes problems.

Zero out all unused fields in the pseudo vma.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/shmem.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)