Message ID: 1582889550-9101-3-git-send-email-kernelfans@gmail.com (mailing list archive)
State:      New, archived
Series:     fix omission of check on FOLL_LONGTERM in gup fast path
On Fri, Feb 28, 2020 at 07:32:29PM +0800, Pingfan Liu wrote:
> FOLL_LONGTERM suggests a pin which is going to be given to hardware and
> can't move. It would truncate CMA permanently and should be excluded.
>
> FOLL_LONGTERM has already been checked in the slow path, but not checked in
> the fast path, which means a possible leak of CMA page to longterm pinned
> requirement through this crack.
>
> Place a check in try_get_compound_head() in the fast path.
>
> Some note about the check:
>   Huge page's subpages have the same migrate type due to either
>   allocation from a free_list[] or alloc_contig_range() with param
>   MIGRATE_MOVABLE. So it is enough to check on a single subpage
>   by is_migrate_cma_page(subpage)
>
> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Mike Rapoport <rppt@linux.ibm.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: John Hubbard <jhubbard@nvidia.com>
> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
> Cc: Keith Busch <keith.busch@intel.com>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Shuah Khan <shuah@kernel.org>
> To: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  mm/gup.c | 26 +++++++++++++++++++-------
>  1 file changed, 19 insertions(+), 7 deletions(-)
>
> diff --git a/mm/gup.c b/mm/gup.c
> index cd8075e..f0d6804 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -33,9 +33,21 @@ struct follow_page_context {
>   * Return the compound head page with ref appropriately incremented,
>   * or NULL if that failed.
>   */
> -static inline struct page *try_get_compound_head(struct page *page, int refs)
> +static inline struct page *try_get_compound_head(struct page *page, int refs,
> +						 unsigned int flags)
>  {
> -	struct page *head = compound_head(page);
> +	struct page *head;
> +
> +	/*
> +	 * Huge page's subpages have the same migrate type due to either
> +	 * allocation from a free_list[] or alloc_contig_range() with param
> +	 * MIGRATE_MOVABLE. So it is enough to check on a single subpage.
> +	 */
> +	if (unlikely(flags & FOLL_LONGTERM) &&
> +	    is_migrate_cma_page(page))
> +		return NULL;

This doesn't seem very good actually.

If I understand properly, if the system has randomly decided to place,
say, an anonymous page in a CMA region when an application did mmap(),
then when the application tries to use this page with a LONGTERM pin
it gets an immediate failure because of the above.

This is not OK - the application should not be subject to random failures
related to long term pins beyond its direct control.

Essentially, failures should only originate from the application using
specific mmap scenarios, not randomly based on something the MM did,
and certainly never for anonymous memory.

I think the correct action here is to trigger migration of the page so
it is not in CMA.

Jason
On Fri, Feb 28, 2020 at 07:32:29PM +0800, Pingfan Liu wrote:
> FOLL_LONGTERM suggests a pin which is going to be given to hardware and
> can't move. It would truncate CMA permanently and should be excluded.

I don't understand what is 'truncated' here?

I generally agree with Jason that this is going to be confusing to the user.

Ira

[...]
On Fri, Feb 28, 2020 at 9:44 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Fri, Feb 28, 2020 at 07:32:29PM +0800, Pingfan Liu wrote:
> > FOLL_LONGTERM suggests a pin which is going to be given to hardware and
> > can't move. It would truncate CMA permanently and should be excluded.
[...]
> > +	if (unlikely(flags & FOLL_LONGTERM) &&
> > +	    is_migrate_cma_page(page))
> > +		return NULL;
>
> This doesn't seem very good actually.
>
> If I understand properly, if the system has randomly decided to place,
> say, an anonymous page in a CMA region when an application did mmap(),
> then when the application tries to use this page with a LONGTERM pin
> it gets an immediate failure because of the above.

No, actually, it will fall back to the slow path, which migrates the page
and then serves the LONGTERM pin.

This patch just aims to fix the leakage in the gup fast path; in the gup
slow path, there is already logic to guard CMA against LONGTERM pins.

> This is not OK - the application should not be subject to random failures
> related to long term pins beyond its direct control.
>
> Essentially, failures should only originate from the application using
> specific mmap scenarios, not randomly based on something the MM did,
> and certainly never for anonymous memory.
>
> I think the correct action here is to trigger migration of the page so
> it is not in CMA.

In fact, it does this. The failure in the gup fast path will fall back to
the slow path, where __gup_longterm_locked->check_and_migrate_cma_pages()
does the migration.

Thanks,
Pingfan
On Sat, Feb 29, 2020 at 6:34 AM Ira Weiny <ira.weiny@intel.com> wrote:
>
> On Fri, Feb 28, 2020 at 07:32:29PM +0800, Pingfan Liu wrote:
> > FOLL_LONGTERM suggests a pin which is going to be given to hardware and
> > can't move. It would truncate CMA permanently and should be excluded.
>
> I don't understand what is 'truncated' here?

A pinned page will truncate a contiguous area, and prevent CMA from
reclaiming the contiguous area.

> I generally agree with Jason that this is going to be confusing to the user.

Please see the reply in another mail.

Thanks,
Pingfan

[...]
On Mon, Mar 02, 2020 at 10:25:52AM +0800, Pingfan Liu wrote:
> On Fri, Feb 28, 2020 at 9:44 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
[...]
> > If I understand properly, if the system has randomly decided to place,
> > say, an anonymous page in a CMA region when an application did mmap(),
> > then when the application tries to use this page with a LONGTERM pin
> > it gets an immediate failure because of the above.
> No, actually, it will fall back to the slow path, which migrates the page
> and then serves the LONGTERM pin.
>
> This patch just aims to fix the leakage in the gup fast path; in the gup
> slow path, there is already logic to guard CMA against LONGTERM pins.
[...]
> > I think the correct action here is to trigger migration of the page so
> > it is not in CMA.
> In fact, it does this. The failure in the gup fast path will fall back to
> the slow path, where __gup_longterm_locked->check_and_migrate_cma_pages()
> does the migration.

It is probably worth revising the commit message so this flow is clear.

Jason
On 2/28/20 3:32 AM, Pingfan Liu wrote:
> FOLL_LONGTERM suggests a pin which is going to be given to hardware and
> can't move. It would truncate CMA permanently and should be excluded.
[...]
> -static inline struct page *try_get_compound_head(struct page *page, int refs)
> +static inline struct page *try_get_compound_head(struct page *page, int refs,
> +						 unsigned int flags)

ohhh...please please look at the latest gup.c in mmotm, and this one in
particular:

    commit 0ea2781c3de4 mm/gup: track FOLL_PIN pages

...where you'll see that there is a concept of "try_get*" vs. "try_grab*".
This is going to be a huge mess if we do it as above, from a code structure
point of view.

The "grab" functions take gup flags, the "get" functions do not.

Anyway, as I said in reply to the cover letter, I'm really uncomfortable with
this being applied to linux.git. So maybe if we see a fix to mmotm, it will
be clearer how to port that back to linux.git (assuming that you need 5.6
fixed--do you though?)

thanks,
On Tue, Mar 3, 2020 at 7:51 AM John Hubbard <jhubbard@nvidia.com> wrote:
>
> On 2/28/20 3:32 AM, Pingfan Liu wrote:
[...]
> ohhh...please please look at the latest gup.c in mmotm, and this one in
> particular:
>
>     commit 0ea2781c3de4 mm/gup: track FOLL_PIN pages
>
> ...where you'll see that there is a concept of "try_get*" vs. "try_grab*".
> This is going to be a huge mess if we do it as above, from a code structure
> point of view.
>
> The "grab" functions take gup flags, the "get" functions do not.
>
> Anyway, as I said in reply to the cover letter, I'm really uncomfortable
> with this being applied to linux.git. So maybe if we see a fix to mmotm,
> it will be clearer how to port that back to linux.git (assuming that you
> need 5.6 fixed--do you though?)

Sure, I will read your series and figure out the way to rebase my patches
on mmotm at first.

Thanks,
Pingfan
On Mon, Mar 2, 2020 at 9:08 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Mon, Mar 02, 2020 at 10:25:52AM +0800, Pingfan Liu wrote:
[...]
> > In fact, it does this. The failure in the gup fast path will fall back to
> > the slow path, where __gup_longterm_locked->check_and_migrate_cma_pages()
> > does the migration.
>
> It is probably worth revising the commit message so this flow is clear

OK.

Thanks,
Pingfan
diff --git a/mm/gup.c b/mm/gup.c
index cd8075e..f0d6804 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -33,9 +33,21 @@ struct follow_page_context {
  * Return the compound head page with ref appropriately incremented,
  * or NULL if that failed.
  */
-static inline struct page *try_get_compound_head(struct page *page, int refs)
+static inline struct page *try_get_compound_head(struct page *page, int refs,
+						 unsigned int flags)
 {
-	struct page *head = compound_head(page);
+	struct page *head;
+
+	/*
+	 * Huge page's subpages have the same migrate type due to either
+	 * allocation from a free_list[] or alloc_contig_range() with param
+	 * MIGRATE_MOVABLE. So it is enough to check on a single subpage.
+	 */
+	if (unlikely(flags & FOLL_LONGTERM) &&
+	    is_migrate_cma_page(page))
+		return NULL;
+
+	head = compound_head(page);
 
 	if (WARN_ON_ONCE(page_ref_count(head) < 0))
 		return NULL;
@@ -1908,7 +1920,7 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
 	VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
 	page = pte_page(pte);
 
-	head = try_get_compound_head(page, 1);
+	head = try_get_compound_head(page, 1, flags);
 	if (!head)
 		goto pte_unmap;
 
@@ -2083,7 +2095,7 @@ static int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
 	page = head + ((addr & (sz-1)) >> PAGE_SHIFT);
 	refs = record_subpages(page, addr, end, pages + *nr);
 
-	head = try_get_compound_head(head, refs);
+	head = try_get_compound_head(head, refs, flags);
 	if (!head)
 		return 0;
 
@@ -2142,7 +2154,7 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 	page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
 	refs = record_subpages(page, addr, end, pages + *nr);
 
-	head = try_get_compound_head(pmd_page(orig), refs);
+	head = try_get_compound_head(pmd_page(orig), refs, flags);
 	if (!head)
 		return 0;
 
@@ -2174,7 +2186,7 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
 	page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
 	refs = record_subpages(page, addr, end, pages + *nr);
 
-	head = try_get_compound_head(pud_page(orig), refs);
+	head = try_get_compound_head(pud_page(orig), refs, flags);
 	if (!head)
 		return 0;
 
@@ -2203,7 +2215,7 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr,
 	page = pgd_page(orig) + ((addr & ~PGDIR_MASK) >> PAGE_SHIFT);
 	refs = record_subpages(page, addr, end, pages + *nr);
 
-	head = try_get_compound_head(pgd_page(orig), refs);
+	head = try_get_compound_head(pgd_page(orig), refs, flags);
 	if (!head)
 		return 0;
FOLL_LONGTERM suggests a pin which is going to be given to hardware and
can't move. It would truncate CMA permanently and should be excluded.

FOLL_LONGTERM has already been checked in the slow path, but not checked in
the fast path, which means a possible leak of CMA page to longterm pinned
requirement through this crack.

Place a check in try_get_compound_head() in the fast path.

Some note about the check:
    Huge page's subpages have the same migrate type due to either
    allocation from a free_list[] or alloc_contig_range() with param
    MIGRATE_MOVABLE. So it is enough to check on a single subpage
    by is_migrate_cma_page(subpage)

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Shuah Khan <shuah@kernel.org>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/gup.c | 26 +++++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)