Message ID | 20190419204435.16984-1-mike.kravetz@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | hugetlbfs: always use address space in inode for resv_map pointer | expand |
On 2019/4/20 4:44, Mike Kravetz wrote: > Continuing discussion about commit 58b6e5e8f1ad ("hugetlbfs: fix memory > leak for resv_map") brought up the issue that inode->i_mapping may not > point to the address space embedded within the inode at inode eviction > time. The hugetlbfs truncate routine handles this by explicitly using > inode->i_data. However, code cleaning up the resv_map will still use > the address space pointed to by inode->i_mapping. Luckily, private_data > is NULL for address spaces in all such cases today but, there is no > guarantee this will continue. > > Change all hugetlbfs code getting a resv_map pointer to explicitly get > it from the address space embedded within the inode. In addition, add > more comments in the code to indicate why this is being done. > > Reported-by: Yufen Yu <yuyufen@huawei.com> > Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> > --- > fs/hugetlbfs/inode.c | 11 +++++++++-- > mm/hugetlb.c | 19 ++++++++++++++++++- > 2 files changed, 27 insertions(+), 3 deletions(-) > > diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c > index 9285dd4f4b1c..cbc649cd1722 100644 > --- a/fs/hugetlbfs/inode.c > +++ b/fs/hugetlbfs/inode.c > @@ -499,8 +499,15 @@ static void hugetlbfs_evict_inode(struct inode *inode) > struct resv_map *resv_map; > > remove_inode_hugepages(inode, 0, LLONG_MAX); > - resv_map = (struct resv_map *)inode->i_mapping->private_data; > - /* root inode doesn't have the resv_map, so we should check it */ > + > + /* > + * Get the resv_map from the address space embedded in the inode. > + * This is the address space which points to any resv_map allocated > + * at inode creation time. If this is a device special inode, > + * i_mapping may not point to the original address space. > + */ > + resv_map = (struct resv_map *)(&inode->i_data)->private_data; > + /* Only regular and link inodes have associated reserve maps */ > if (resv_map) > resv_map_release(&resv_map->refs); > clear_inode(inode); > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index 6cdc7b2d9100..b30e97b0ef37 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -740,7 +740,15 @@ void resv_map_release(struct kref *ref) > > static inline struct resv_map *inode_resv_map(struct inode *inode) > { > - return inode->i_mapping->private_data; > + /* > + * At inode evict time, i_mapping may not point to the original > + * address space within the inode. This original address space > + * contains the pointer to the resv_map. So, always use the > + * address space embedded within the inode. > + * The VERY common case is inode->mapping == &inode->i_data but, > + * this may not be true for device special inodes. > + */ > + return (struct resv_map *)(&inode->i_data)->private_data; > } > > static struct resv_map *vma_resv_map(struct vm_area_struct *vma) > @@ -4477,6 +4485,11 @@ int hugetlb_reserve_pages(struct inode *inode, > * called to make the mapping read-write. Assume !vma is a shm mapping > */ > if (!vma || vma->vm_flags & VM_MAYSHARE) { > + /* > + * resv_map can not be NULL as hugetlb_reserve_pages is only > + * called for inodes for which resv_maps were created (see > + * hugetlbfs_get_inode). > + */ > resv_map = inode_resv_map(inode); > > chg = region_chg(resv_map, from, to); > @@ -4568,6 +4581,10 @@ long hugetlb_unreserve_pages(struct inode *inode, long start, long end, > struct hugepage_subpool *spool = subpool_inode(inode); > long gbl_reserve; > > + /* > + * Since this routine can be called in the evict inode path for all > + * hugetlbfs inodes, resv_map could be NULL. > + */ > if (resv_map) { > chg = region_del(resv_map, start, end); > /* Dose this patch have been applied? I think it is better to add fixes label, like: Fixes: 58b6e5e8f1ad ("hugetlbfs: fix memory leak for resv_map") Since the commit 58b6e5e8f1a has been merged to stable, this patch also be needed. https://www.spinics.net/lists/stable/msg298740.html
On 5/8/19 12:10 AM, yuyufen wrote: > On 2019/4/20 4:44, Mike Kravetz wrote: >> Continuing discussion about commit 58b6e5e8f1ad ("hugetlbfs: fix memory >> leak for resv_map") brought up the issue that inode->i_mapping may not >> point to the address space embedded within the inode at inode eviction >> time. The hugetlbfs truncate routine handles this by explicitly using >> inode->i_data. However, code cleaning up the resv_map will still use >> the address space pointed to by inode->i_mapping. Luckily, private_data >> is NULL for address spaces in all such cases today but, there is no >> guarantee this will continue. >> >> Change all hugetlbfs code getting a resv_map pointer to explicitly get >> it from the address space embedded within the inode. In addition, add >> more comments in the code to indicate why this is being done. >> >> Reported-by: Yufen Yu <yuyufen@huawei.com> >> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> ... > > Dose this patch have been applied? Andrew has pulled it into his tree. However, I do not believe there has been an ACK or Review, so am not sure of the exact status. > I think it is better to add fixes label, like: > Fixes: 58b6e5e8f1ad ("hugetlbfs: fix memory leak for resv_map") > > Since the commit 58b6e5e8f1a has been merged to stable, this patch also be needed. > https://www.spinics.net/lists/stable/msg298740.html It must have been the AI that decided 58b6e5e8f1a needed to go to stable. Even though this technically does not fix 58b6e5e8f1a, I'm OK with adding the Fixes: to force this to go to the same stable trees.
On Wed, 8 May 2019 13:16:09 -0700 Mike Kravetz <mike.kravetz@oracle.com> wrote: > > I think it is better to add fixes label, like: > > Fixes: 58b6e5e8f1ad ("hugetlbfs: fix memory leak for resv_map") > > > > Since the commit 58b6e5e8f1a has been merged to stable, this patch also be needed. > > https://www.spinics.net/lists/stable/msg298740.html > > It must have been the AI that decided 58b6e5e8f1a needed to go to stable. grr. > Even though this technically does not fix 58b6e5e8f1a, I'm OK with adding > the Fixes: to force this to go to the same stable trees. Why are we bothering with any of this, given that : Luckily, private_data is NULL for address spaces in all such cases : today but, there is no guarantee this will continue. ? Even though 58b6e5e8f1ad was inappropriately backported, the above still holds, so what problem does a backport of "hugetlbfs: always use address space in inode for resv_map pointer" actually solve? And yes, some review of this would be nice
On 5/9/19 4:11 PM, Andrew Morton wrote: > On Wed, 8 May 2019 13:16:09 -0700 Mike Kravetz <mike.kravetz@oracle.com> wrote: > >>> I think it is better to add fixes label, like: >>> Fixes: 58b6e5e8f1ad ("hugetlbfs: fix memory leak for resv_map") >>> >>> Since the commit 58b6e5e8f1a has been merged to stable, this patch also be needed. >>> https://www.spinics.net/lists/stable/msg298740.html >> >> It must have been the AI that decided 58b6e5e8f1a needed to go to stable. > > grr. > >> Even though this technically does not fix 58b6e5e8f1a, I'm OK with adding >> the Fixes: to force this to go to the same stable trees. > > Why are we bothering with any of this, given that > > : Luckily, private_data is NULL for address spaces in all such cases > : today but, there is no guarantee this will continue. > > ? You are right. For stable releases, I do not see any way for this to be an issue. We are lucky today (and in the past). The patch is there to guard against code changes which may cause this condition to change in the future. Yufen Yu, do you see this actually fixing a problem in stable releases? I believe you originally said this is not a problem today, which would also imply older releases. Just want to make sure I am not missing something.
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index 9285dd4f4b1c..cbc649cd1722 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -499,8 +499,15 @@ static void hugetlbfs_evict_inode(struct inode *inode) struct resv_map *resv_map; remove_inode_hugepages(inode, 0, LLONG_MAX); - resv_map = (struct resv_map *)inode->i_mapping->private_data; - /* root inode doesn't have the resv_map, so we should check it */ + + /* + * Get the resv_map from the address space embedded in the inode. + * This is the address space which points to any resv_map allocated + * at inode creation time. If this is a device special inode, + * i_mapping may not point to the original address space. + */ + resv_map = (struct resv_map *)(&inode->i_data)->private_data; + /* Only regular and link inodes have associated reserve maps */ if (resv_map) resv_map_release(&resv_map->refs); clear_inode(inode); diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 6cdc7b2d9100..b30e97b0ef37 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -740,7 +740,15 @@ void resv_map_release(struct kref *ref) static inline struct resv_map *inode_resv_map(struct inode *inode) { - return inode->i_mapping->private_data; + /* + * At inode evict time, i_mapping may not point to the original + * address space within the inode. This original address space + * contains the pointer to the resv_map. So, always use the + * address space embedded within the inode. + * The VERY common case is inode->mapping == &inode->i_data but, + * this may not be true for device special inodes. + */ + return (struct resv_map *)(&inode->i_data)->private_data; } static struct resv_map *vma_resv_map(struct vm_area_struct *vma) @@ -4477,6 +4485,11 @@ int hugetlb_reserve_pages(struct inode *inode, * called to make the mapping read-write. Assume !vma is a shm mapping */ if (!vma || vma->vm_flags & VM_MAYSHARE) { + /* + * resv_map can not be NULL as hugetlb_reserve_pages is only + * called for inodes for which resv_maps were created (see + * hugetlbfs_get_inode). + */ resv_map = inode_resv_map(inode); chg = region_chg(resv_map, from, to); @@ -4568,6 +4581,10 @@ long hugetlb_unreserve_pages(struct inode *inode, long start, long end, struct hugepage_subpool *spool = subpool_inode(inode); long gbl_reserve; + /* + * Since this routine can be called in the evict inode path for all + * hugetlbfs inodes, resv_map could be NULL. + */ if (resv_map) { chg = region_del(resv_map, start, end); /*
Continuing discussion about commit 58b6e5e8f1ad ("hugetlbfs: fix memory leak for resv_map") brought up the issue that inode->i_mapping may not point to the address space embedded within the inode at inode eviction time. The hugetlbfs truncate routine handles this by explicitly using inode->i_data. However, code cleaning up the resv_map will still use the address space pointed to by inode->i_mapping. Luckily, private_data is NULL for address spaces in all such cases today but, there is no guarantee this will continue. Change all hugetlbfs code getting a resv_map pointer to explicitly get it from the address space embedded within the inode. In addition, add more comments in the code to indicate why this is being done. Reported-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> --- fs/hugetlbfs/inode.c | 11 +++++++++-- mm/hugetlb.c | 19 ++++++++++++++++++- 2 files changed, 27 insertions(+), 3 deletions(-)