Message ID | 20190213112900.33963-1-minchan@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mm: Fix the pgtable leak | expand |
On Wed, Feb 13, 2019 at 08:29:00PM +0900, Minchan Kim wrote: > [1] was backported to v4.9 stable tree but it introduces pgtable > memory leak because with fault retrial, preallocated pagetable > could be leaked in second iteration. > To fix the problem, this patch backport [2]. > > [1] 5cf3e5ff95876, mm, memcg: fix reclaim deadlock with writeback > [2] b0b9b3df27d10, mm: stop leaking PageTables > > Fixes: 5cf3e5ff95876 ("mm, memcg: fix reclaim deadlock with writeback") > Cc: Johannes Weiner <hannes@cmpxchg.org> > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > Cc: Michal Hocko <mhocko@suse.com> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Hugh Dickins <hughd@google.com> > Cc: Liu Bo <bo.liu@linux.alibaba.com> > Cc: <stable@vger.kernel.org> [4.9] > Signed-off-by: Minchan Kim <minchan@kernel.org> Reported-by: Martin Liu <liumartin@google.com>
On Wed 13-02-19 20:29:00, Minchan Kim wrote: > [1] was backported to v4.9 stable tree but it introduces pgtable > memory leak because with fault retrial, preallocated pagetable > could be leaked in second iteration. > To fix the problem, this patch backport [2]. > > [1] 5cf3e5ff95876, mm, memcg: fix reclaim deadlock with writeback > [2] b0b9b3df27d10, mm: stop leaking PageTables > > Fixes: 5cf3e5ff95876 ("mm, memcg: fix reclaim deadlock with writeback") > Cc: Johannes Weiner <hannes@cmpxchg.org> > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > Cc: Michal Hocko <mhocko@suse.com> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Hugh Dickins <hughd@google.com> > Cc: Liu Bo <bo.liu@linux.alibaba.com> > Cc: <stable@vger.kernel.org> [4.9] > Signed-off-by: Minchan Kim <minchan@kernel.org> Thanks for catching this dependency. Do I assume it correctly that this is stable-4.9 only? > --- > mm/memory.c | 21 +++++++++++++++------ > 1 file changed, 15 insertions(+), 6 deletions(-) > > diff --git a/mm/memory.c b/mm/memory.c > index 35d8217bb0467..47248dc0b9e1a 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -3329,15 +3329,24 @@ static int do_fault(struct fault_env *fe) > { > struct vm_area_struct *vma = fe->vma; > pgoff_t pgoff = linear_page_index(vma, fe->address); > + int ret; > > /* The VMA was not fully populated on mmap() or missing VM_DONTEXPAND */ > if (!vma->vm_ops->fault) > - return VM_FAULT_SIGBUS; > - if (!(fe->flags & FAULT_FLAG_WRITE)) > - return do_read_fault(fe, pgoff); > - if (!(vma->vm_flags & VM_SHARED)) > - return do_cow_fault(fe, pgoff); > - return do_shared_fault(fe, pgoff); > + ret = VM_FAULT_SIGBUS; > + else if (!(fe->flags & FAULT_FLAG_WRITE)) > + ret = do_read_fault(fe, pgoff); > + else if (!(vma->vm_flags & VM_SHARED)) > + ret = do_cow_fault(fe, pgoff); > + else > + ret = do_shared_fault(fe, pgoff); > + > + /* preallocated pagetable is unused: free it */ > + if (fe->prealloc_pte) { > + pte_free(vma->vm_mm, fe->prealloc_pte); > + fe->prealloc_pte = 0; > + } > + return ret; > } > > static int numa_migrate_prep(struct page *page, struct vm_area_struct *vma, > -- > 2.20.1.791.gb4d0f1c61a-goog >
On Wed, Feb 13, 2019 at 01:03:30PM +0100, Michal Hocko wrote: > On Wed 13-02-19 20:29:00, Minchan Kim wrote: > > [1] was backported to v4.9 stable tree but it introduces pgtable > > memory leak because with fault retrial, preallocated pagetable > > could be leaked in second iteration. > > To fix the problem, this patch backport [2]. > > > > [1] 5cf3e5ff95876, mm, memcg: fix reclaim deadlock with writeback > > [2] b0b9b3df27d10, mm: stop leaking PageTables > > > > Fixes: 5cf3e5ff95876 ("mm, memcg: fix reclaim deadlock with writeback") > > Cc: Johannes Weiner <hannes@cmpxchg.org> > > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > Cc: Michal Hocko <mhocko@suse.com> > > Cc: Andrew Morton <akpm@linux-foundation.org> > > Cc: Hugh Dickins <hughd@google.com> > > Cc: Liu Bo <bo.liu@linux.alibaba.com> > > Cc: <stable@vger.kernel.org> [4.9] > > Signed-off-by: Minchan Kim <minchan@kernel.org> > > Thanks for catching this dependency. Do I assume it correctly that this > is stable-4.9 only? I have no idea how I could find it automatically that a stable patch of linus tree is spread out with several stable trees(Hope Greg has an answer). I just checked 4.4 longterm kernel and couldn't find it in there. Thanks.
On Wed 13-02-19 21:12:00, Minchan Kim wrote: > On Wed, Feb 13, 2019 at 01:03:30PM +0100, Michal Hocko wrote: > > On Wed 13-02-19 20:29:00, Minchan Kim wrote: > > > [1] was backported to v4.9 stable tree but it introduces pgtable > > > memory leak because with fault retrial, preallocated pagetable > > > could be leaked in second iteration. > > > To fix the problem, this patch backport [2]. > > > > > > [1] 5cf3e5ff95876, mm, memcg: fix reclaim deadlock with writeback > > > [2] b0b9b3df27d10, mm: stop leaking PageTables > > > > > > Fixes: 5cf3e5ff95876 ("mm, memcg: fix reclaim deadlock with writeback") > > > Cc: Johannes Weiner <hannes@cmpxchg.org> > > > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > > Cc: Michal Hocko <mhocko@suse.com> > > > Cc: Andrew Morton <akpm@linux-foundation.org> > > > Cc: Hugh Dickins <hughd@google.com> > > > Cc: Liu Bo <bo.liu@linux.alibaba.com> > > > Cc: <stable@vger.kernel.org> [4.9] > > > Signed-off-by: Minchan Kim <minchan@kernel.org> > > > > Thanks for catching this dependency. Do I assume it correctly that this > > is stable-4.9 only? > > I have no idea how I could find it automatically that a stable patch of > linus tree is spread out with several stable trees(Hope Greg has an > answer). I just checked 4.4 longterm kernel and couldn't find it in there. See http://lkml.kernel.org/r/20190115174036.GA24149@dhcp22.suse.cz But my question was more about "this is a stable only thing"? It was not obvious from the subject so I wanted to be sure that I am not missing anything.
On Wed, Feb 13, 2019 at 08:29:00PM +0900, Minchan Kim wrote: > [1] was backported to v4.9 stable tree but it introduces pgtable > memory leak because with fault retrial, preallocated pagetable > could be leaked in second iteration. > To fix the problem, this patch backport [2]. > > [1] 5cf3e5ff95876, mm, memcg: fix reclaim deadlock with writeback This is really commit 63f3655f9501 ("mm, memcg: fix reclaim deadlock with writeback") which was in 4.9.152, 4.14.94, 4.19.16, and 4.20.3 as well as 5.0-rc2. > [2] b0b9b3df27d10, mm: stop leaking PageTables This commit was in 4.10, so I am guessing that this really is just a backport of that commit? If so, it's not the full backport, why not take the whole thing? Why only cherry-pick one chunk of it? Why do we not need the other parts? thanks, greg k-h
On Wed, Feb 13, 2019 at 02:36:24PM +0100, Greg KH wrote: > On Wed, Feb 13, 2019 at 08:29:00PM +0900, Minchan Kim wrote: > > [1] was backported to v4.9 stable tree but it introduces pgtable > > memory leak because with fault retrial, preallocated pagetable > > could be leaked in second iteration. > > To fix the problem, this patch backport [2]. > > > > [1] 5cf3e5ff95876, mm, memcg: fix reclaim deadlock with writeback > > This is really commit 63f3655f9501 ("mm, memcg: fix reclaim deadlock > with writeback") which was in 4.9.152, 4.14.94, 4.19.16, and 4.20.3 as > well as 5.0-rc2. Since 4.10, we has [2] so it should be okay other (tree > 4.10) > > > [2] b0b9b3df27d10, mm: stop leaking PageTables > > This commit was in 4.10, so I am guessing that this really is just a > backport of that commit? Yub. > > If so, it's not the full backport, why not take the whole thing? Why > only cherry-pick one chunk of it? Why do we not need the other parts? Because [2] actually aims for fixing [3] which was introduced at 4.10. Since then, [1] relies on the chunk I sent. Thus we don't need other part for 4.9. [3] 953c66c2b22a ("mm: THP page cache support for ppc64") Thanks.
On Wed, Feb 13, 2019 at 01:24:58PM +0100, Michal Hocko wrote: > On Wed 13-02-19 21:12:00, Minchan Kim wrote: > > On Wed, Feb 13, 2019 at 01:03:30PM +0100, Michal Hocko wrote: > > > On Wed 13-02-19 20:29:00, Minchan Kim wrote: > > > > [1] was backported to v4.9 stable tree but it introduces pgtable > > > > memory leak because with fault retrial, preallocated pagetable > > > > could be leaked in second iteration. > > > > To fix the problem, this patch backport [2]. > > > > > > > > [1] 5cf3e5ff95876, mm, memcg: fix reclaim deadlock with writeback > > > > [2] b0b9b3df27d10, mm: stop leaking PageTables > > > > > > > > Fixes: 5cf3e5ff95876 ("mm, memcg: fix reclaim deadlock with writeback") > > > > Cc: Johannes Weiner <hannes@cmpxchg.org> > > > > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > > > Cc: Michal Hocko <mhocko@suse.com> > > > > Cc: Andrew Morton <akpm@linux-foundation.org> > > > > Cc: Hugh Dickins <hughd@google.com> > > > > Cc: Liu Bo <bo.liu@linux.alibaba.com> > > > > Cc: <stable@vger.kernel.org> [4.9] > > > > Signed-off-by: Minchan Kim <minchan@kernel.org> > > > > > > Thanks for catching this dependency. Do I assume it correctly that this > > > is stable-4.9 only? > > > > I have no idea how I could find it automatically that a stable patch of > > linus tree is spread out with several stable trees(Hope Greg has an > > answer). I just checked 4.4 longterm kernel and couldn't find it in there. > > See http://lkml.kernel.org/r/20190115174036.GA24149@dhcp22.suse.cz > > But my question was more about "this is a stable only thing"? It was not > obvious from the subject so I wanted to be sure that I am not missing > anything. Yub, I think only 4.9 stable tree need to be fixed because Hugh's patch was in there since v4.10. Thanks.
On Thu, Feb 14, 2019 at 04:23:52PM +0900, Minchan Kim wrote: > On Wed, Feb 13, 2019 at 02:36:24PM +0100, Greg KH wrote: > > On Wed, Feb 13, 2019 at 08:29:00PM +0900, Minchan Kim wrote: > > > [1] was backported to v4.9 stable tree but it introduces pgtable > > > memory leak because with fault retrial, preallocated pagetable > > > could be leaked in second iteration. > > > To fix the problem, this patch backport [2]. > > > > > > [1] 5cf3e5ff95876, mm, memcg: fix reclaim deadlock with writeback > > > > This is really commit 63f3655f9501 ("mm, memcg: fix reclaim deadlock > > with writeback") which was in 4.9.152, 4.14.94, 4.19.16, and 4.20.3 as > > well as 5.0-rc2. > > Since 4.10, we has [2] so it should be okay other (tree > 4.10) > > > > > > [2] b0b9b3df27d10, mm: stop leaking PageTables > > > > This commit was in 4.10, so I am guessing that this really is just a > > backport of that commit? > > Yub. > > > > > If so, it's not the full backport, why not take the whole thing? Why > > only cherry-pick one chunk of it? Why do we not need the other parts? > > Because [2] actually aims for fixing [3] which was introduced at 4.10. > Since then, [1] relies on the chunk I sent. Thus we don't need other part > for 4.9. > > [3] 953c66c2b22a ("mm: THP page cache support for ppc64") Hi Greg, Any chance to look into this patch?
On Mon, Feb 18, 2019 at 05:20:26PM +0900, Minchan Kim wrote: > On Thu, Feb 14, 2019 at 04:23:52PM +0900, Minchan Kim wrote: > > On Wed, Feb 13, 2019 at 02:36:24PM +0100, Greg KH wrote: > > > On Wed, Feb 13, 2019 at 08:29:00PM +0900, Minchan Kim wrote: > > > > [1] was backported to v4.9 stable tree but it introduces pgtable > > > > memory leak because with fault retrial, preallocated pagetable > > > > could be leaked in second iteration. > > > > To fix the problem, this patch backport [2]. > > > > > > > > [1] 5cf3e5ff95876, mm, memcg: fix reclaim deadlock with writeback > > > > > > This is really commit 63f3655f9501 ("mm, memcg: fix reclaim deadlock > > > with writeback") which was in 4.9.152, 4.14.94, 4.19.16, and 4.20.3 as > > > well as 5.0-rc2. > > > > Since 4.10, we has [2] so it should be okay other (tree > 4.10) > > > > > > > > > [2] b0b9b3df27d10, mm: stop leaking PageTables > > > > > > This commit was in 4.10, so I am guessing that this really is just a > > > backport of that commit? > > > > Yub. > > > > > > > > If so, it's not the full backport, why not take the whole thing? Why > > > only cherry-pick one chunk of it? Why do we not need the other parts? > > > > Because [2] actually aims for fixing [3] which was introduced at 4.10. > > Since then, [1] relies on the chunk I sent. Thus we don't need other part > > for 4.9. > > > > [3] 953c66c2b22a ("mm: THP page cache support for ppc64") > > Hi Greg, > > Any chance to look into this patch? You sent it on Thursday, it is Monday, I tried to actually take the weekend off :) It's in my queue, relax, it has good company with a few other hundred stable patches being requested... greg k-h
On Wed, Feb 13, 2019 at 08:29:00PM +0900, Minchan Kim wrote: > [1] was backported to v4.9 stable tree but it introduces pgtable > memory leak because with fault retrial, preallocated pagetable > could be leaked in second iteration. > To fix the problem, this patch backport [2]. > > [1] 5cf3e5ff95876, mm, memcg: fix reclaim deadlock with writeback > [2] b0b9b3df27d10, mm: stop leaking PageTables > > Fixes: 5cf3e5ff95876 ("mm, memcg: fix reclaim deadlock with writeback") > Cc: Johannes Weiner <hannes@cmpxchg.org> > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > Cc: Michal Hocko <mhocko@suse.com> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Hugh Dickins <hughd@google.com> > Cc: Liu Bo <bo.liu@linux.alibaba.com> > Cc: <stable@vger.kernel.org> [4.9] > Signed-off-by: Minchan Kim <minchan@kernel.org> > --- > mm/memory.c | 21 +++++++++++++++------ > 1 file changed, 15 insertions(+), 6 deletions(-) I fixed up the changelog text to be correct, and now queued this up, thanks. greg k-h
diff --git a/mm/memory.c b/mm/memory.c index 35d8217bb0467..47248dc0b9e1a 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3329,15 +3329,24 @@ static int do_fault(struct fault_env *fe) { struct vm_area_struct *vma = fe->vma; pgoff_t pgoff = linear_page_index(vma, fe->address); + int ret; /* The VMA was not fully populated on mmap() or missing VM_DONTEXPAND */ if (!vma->vm_ops->fault) - return VM_FAULT_SIGBUS; - if (!(fe->flags & FAULT_FLAG_WRITE)) - return do_read_fault(fe, pgoff); - if (!(vma->vm_flags & VM_SHARED)) - return do_cow_fault(fe, pgoff); - return do_shared_fault(fe, pgoff); + ret = VM_FAULT_SIGBUS; + else if (!(fe->flags & FAULT_FLAG_WRITE)) + ret = do_read_fault(fe, pgoff); + else if (!(vma->vm_flags & VM_SHARED)) + ret = do_cow_fault(fe, pgoff); + else + ret = do_shared_fault(fe, pgoff); + + /* preallocated pagetable is unused: free it */ + if (fe->prealloc_pte) { + pte_free(vma->vm_mm, fe->prealloc_pte); + fe->prealloc_pte = 0; + } + return ret; } static int numa_migrate_prep(struct page *page, struct vm_area_struct *vma,
[1] was backported to v4.9 stable tree but it introduces pgtable memory leak because with fault retrial, preallocated pagetable could be leaked in second iteration. To fix the problem, this patch backport [2]. [1] 5cf3e5ff95876, mm, memcg: fix reclaim deadlock with writeback [2] b0b9b3df27d10, mm: stop leaking PageTables Fixes: 5cf3e5ff95876 ("mm, memcg: fix reclaim deadlock with writeback") Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Hugh Dickins <hughd@google.com> Cc: Liu Bo <bo.liu@linux.alibaba.com> Cc: <stable@vger.kernel.org> [4.9] Signed-off-by: Minchan Kim <minchan@kernel.org> --- mm/memory.c | 21 +++++++++++++++------ 1 file changed, 15 insertions(+), 6 deletions(-)