Message ID | 20170522133604.11392-3-punit.agrawal@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, May 22, 2017 at 02:36:00PM +0100, Punit Agrawal wrote: > When speculatively taking references to a hugepage using > page_cache_add_speculative() in gup_huge_pmd(), it is assumed that the > page returned by pmd_page() is the head page. Although normally true, > this assumption doesn't hold when the hugepage comprises of successive > page table entries such as when using contiguous bit on arm64 at PTE or > PMD levels. > > This can be addressed by ensuring that the page passed to > page_cache_add_speculative() is the real head or by de-referencing the > head page within the function. > > We take the first approach to keep the usage pattern aligned with > page_cache_get_speculative() where users already pass the appropriate > page, i.e., the de-referenced head. > > Apply the same logic to fix gup_huge_[pud|pgd]() as well. Hm. Okay. But I'm kinda surprise that this is the only place that need to be adjusted. Have you validated all other pmd_page() use-cases?
"Kirill A. Shutemov" <kirill@shutemov.name> writes: > On Mon, May 22, 2017 at 02:36:00PM +0100, Punit Agrawal wrote: >> When speculatively taking references to a hugepage using >> page_cache_add_speculative() in gup_huge_pmd(), it is assumed that the >> page returned by pmd_page() is the head page. Although normally true, >> this assumption doesn't hold when the hugepage comprises of successive >> page table entries such as when using contiguous bit on arm64 at PTE or >> PMD levels. >> >> This can be addressed by ensuring that the page passed to >> page_cache_add_speculative() is the real head or by de-referencing the >> head page within the function. >> >> We take the first approach to keep the usage pattern aligned with >> page_cache_get_speculative() where users already pass the appropriate >> page, i.e., the de-referenced head. >> >> Apply the same logic to fix gup_huge_[pud|pgd]() as well. > > Hm. Okay. But I'm kinda surprise that this is the only place that need to > be adjusted. > > Have you validated all other pmd_page() use-cases? I came across the gup issues were found while investigating a failing test from mce-tests. I think the problem here is not due to the use of pmd_page() but because page_cache_[add|get]_speculative() don't ensure they ref-count the head page as is done in get_page(). Having said that, I had a quick look at the other uses of pmd_page() - Quite a few of them are followed by an explicit BUG_ON() to check that the page returned is a head page. All other instances seem to be dealing with transparent hugepages where contiguous hugepages are not supported. I don't see any call sites that ring alarm bells. Did you have any particular part of the code in mind where pmd_page() usage might be a problem? Thanks, Punit
diff --git a/mm/gup.c b/mm/gup.c index ccf8cb38234f..be67996513be 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -1358,8 +1358,7 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr, return __gup_device_huge_pmd(orig, addr, end, pages, nr); refs = 0; - head = pmd_page(orig); - page = head + ((addr & ~PMD_MASK) >> PAGE_SHIFT); + page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); do { pages[*nr] = page; (*nr)++; @@ -1367,6 +1366,7 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr, refs++; } while (addr += PAGE_SIZE, addr != end); + head = compound_head(page); if (!page_cache_add_speculative(head, refs)) { *nr -= refs; return 0; @@ -1396,8 +1396,7 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr, return __gup_device_huge_pud(orig, addr, end, pages, nr); refs = 0; - head = pud_page(orig); - page = head + ((addr & ~PUD_MASK) >> PAGE_SHIFT); + page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT); do { pages[*nr] = page; (*nr)++; @@ -1405,6 +1404,7 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr, refs++; } while (addr += PAGE_SIZE, addr != end); + head = compound_head(page); if (!page_cache_add_speculative(head, refs)) { *nr -= refs; return 0; @@ -1433,8 +1433,7 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr, BUILD_BUG_ON(pgd_devmap(orig)); refs = 0; - head = pgd_page(orig); - page = head + ((addr & ~PGDIR_MASK) >> PAGE_SHIFT); + page = pgd_page(orig) + ((addr & ~PGDIR_MASK) >> PAGE_SHIFT); do { pages[*nr] = page; (*nr)++; @@ -1442,6 +1441,7 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr, refs++; } while (addr += PAGE_SIZE, addr != end); + head = compound_head(page); if (!page_cache_add_speculative(head, refs)) { *nr -= refs; return 0;