Message ID | 0-v1-69e7da97f81f+21c-ttm_pmd_jgg@nvidia.com (mailing list archive)
---|---
State | New, archived
Series | drm/ttm: Do not put non-struct page memory into PUD/PMDs
Hi,

On 10/19/21 20:21, Jason Gunthorpe wrote:
> PUD and PMD entries do not have a special bit.
>
> get_user_pages_fast() considers any page that passed pmd_huge() as
> usable:
>
>         if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
>                      pmd_devmap(pmd))) {
>
> And vmf_insert_pfn_pmd_prot() unconditionally sets
>
>         entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
>
> e.g. on x86 the page will be _PAGE_PRESENT | _PAGE_PSE.
>
> As such gup_huge_pmd() will try to deref a struct page:
>
>         head = try_grab_compound_head(pmd_page(orig), refs, flags);
>
> and thus crash.
>
> Prevent this by never using IO memory with vmf_insert_pfn_pud/pmd_prot().

Actually I think fast gup will break even page-backed memory, since the
backing drivers don't assume anybody else takes a refcount / pincount.
Normal pages have PTE_SPECIAL and VM_PFNMAP to block that.

(Side note: I was recommended to introduce a PTE_HUGESPECIAL bit for
this, and basically had a patch ready, but got scared off when trying to
handle 64-bit PTE atomic updates on x86-32.)

It might be that we (Intel) try to resurrect this code using
PTE_HUGESPECIAL in the near future for i915, but until then I think the
safest option is to disable it completely.

Thanks,
Thomas

> Fixes: 314b6580adc5 ("drm/ttm, drm/vmwgfx: Support huge TTM pagefaults")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/gpu/drm/ttm/ttm_bo_vm.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> AFAICT only the vmwgfx driver makes use of this, and I can't tell which path
> it is taking.
>
> This function will also try to install a PUD - does vmwgfx have a case where
> it has obtained a 1GB high order page - or is that dead code now?
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> index f56be5bc0861ec..91d02c345fbba8 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> @@ -203,10 +203,13 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault *vmf,
>  	if (page_offset + fault_page_size > bo->resource->num_pages)
>  		goto out_fallback;
>  
> +	/*
> +	 * vmf_insert_pfn_pud/pmd_prot() can only be called with struct page
> +	 * backed memory
> +	 */
>  	if (bo->resource->bus.is_iomem)
> -		pfn = ttm_bo_io_mem_pfn(bo, page_offset);
> -	else
> -		pfn = page_to_pfn(ttm->pages[page_offset]);
> +		goto out_fallback;
> +	pfn = page_to_pfn(ttm->pages[page_offset]);
>  
>  	/* pfn must be fault_page_size aligned. */
>  	if ((pfn & (fault_page_size - 1)) != 0)
>
> base-commit: 519d81956ee277b4419c723adfb154603c2565ba
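To make the asymmetry in this exchange concrete, here is a small self-contained toy model. Everything in it is invented for illustration; none of these types or helpers is kernel code:

	/*
	 * Toy model, NOT kernel code.  A PTE can carry a "special" bit
	 * (what the PTE-level insert paths set for VM_PFNMAP memory), so
	 * fast gup can refuse it.  A huge PMD carries no such bit, so
	 * fast gup has nothing to test before grabbing a struct page.
	 */
	#include <stdbool.h>
	#include <stdio.h>

	struct toy_pte { unsigned long pfn; bool present; bool special; };
	struct toy_pmd { unsigned long pfn; bool present; bool huge; /* no special bit */ };

	/* PTE-level fast gup: skips special (PFN-map) entries. */
	static bool toy_gup_fast_pte(struct toy_pte pte)
	{
		if (!pte.present || pte.special)
			return false;	/* fall back to the slow, VMA-aware path */
		return true;		/* safe: the pfn has a struct page to refcount */
	}

	/* PMD-level fast gup: any present huge entry looks usable. */
	static bool toy_gup_fast_pmd(struct toy_pmd pmd)
	{
		if (!pmd.present || !pmd.huge)
			return false;
		return true;		/* would deref pfn_to_page(pmd.pfn): a crash for io memory */
	}

	int main(void)
	{
		struct toy_pte io_pte = { .pfn = 0xf0000, .present = true, .special = true };
		struct toy_pmd io_pmd = { .pfn = 0xf0000, .present = true, .huge = true };

		printf("pte path grabs the page: %d\n", toy_gup_fast_pte(io_pte)); /* 0: refused */
		printf("pmd path grabs the page: %d\n", toy_gup_fast_pmd(io_pmd)); /* 1: would crash */
		return 0;
	}

The missing bit is also why the refcount concern above applies even to page-backed TTM memory: nothing at the PMD level tells gup that the backing driver never expected anyone else to take references on its pages.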
On Tue, Oct 19, 2021 at 08:49:29PM +0200, Thomas Hellström wrote:
> Hi,
>
> On 10/19/21 20:21, Jason Gunthorpe wrote:
> > PUD and PMD entries do not have a special bit.
> >
> > get_user_pages_fast() considers any page that passed pmd_huge() as
> > usable:
> >
> >         if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
> >                      pmd_devmap(pmd))) {
> >
> > And vmf_insert_pfn_pmd_prot() unconditionally sets
> >
> >         entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
> >
> > e.g. on x86 the page will be _PAGE_PRESENT | _PAGE_PSE.
> >
> > As such gup_huge_pmd() will try to deref a struct page:
> >
> >         head = try_grab_compound_head(pmd_page(orig), refs, flags);
> >
> > and thus crash.
> >
> > Prevent this by never using IO memory with vmf_insert_pfn_pud/pmd_prot().
>
> Actually I think fast gup will break even page-backed memory, since the
> backing drivers don't assume anybody else takes a refcount / pincount.
> Normal pages have PTE_SPECIAL and VM_PFNMAP to block that.

Erk, yes, that is even worse.

> (Side note: I was recommended to introduce a PTE_HUGESPECIAL bit for
> this, and basically had a patch ready, but got scared off when trying to
> handle 64-bit PTE atomic updates on x86-32.)

Right, a PMD_SPECIAL bit is needed to make this work.

> It might be that we (Intel) try to resurrect this code using
> PTE_HUGESPECIAL in the near future for i915, but until then I think the
> safest option is to disable it completely.

Okay, do you want a patch to just delete this function?

Jason
On 10/19/21 20:52, Jason Gunthorpe wrote:
> On Tue, Oct 19, 2021 at 08:49:29PM +0200, Thomas Hellström wrote:
>> Hi,
>>
>> On 10/19/21 20:21, Jason Gunthorpe wrote:
>>> PUD and PMD entries do not have a special bit.
>>>
>>> get_user_pages_fast() considers any page that passed pmd_huge() as
>>> usable:
>>>
>>>         if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
>>>                      pmd_devmap(pmd))) {
>>>
>>> And vmf_insert_pfn_pmd_prot() unconditionally sets
>>>
>>>         entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
>>>
>>> e.g. on x86 the page will be _PAGE_PRESENT | _PAGE_PSE.
>>>
>>> As such gup_huge_pmd() will try to deref a struct page:
>>>
>>>         head = try_grab_compound_head(pmd_page(orig), refs, flags);
>>>
>>> and thus crash.
>>>
>>> Prevent this by never using IO memory with vmf_insert_pfn_pud/pmd_prot().
>> Actually I think fast gup will break even page-backed memory, since the
>> backing drivers don't assume anybody else takes a refcount / pincount.
>> Normal pages have PTE_SPECIAL and VM_PFNMAP to block that.
> Erk, yes, that is even worse.
>
>> (Side note: I was recommended to introduce a PTE_HUGESPECIAL bit for
>> this, and basically had a patch ready, but got scared off when trying to
>> handle 64-bit PTE atomic updates on x86-32.)
> Right, a PMD_SPECIAL bit is needed to make this work.

Yes, PMD_SPECIAL it was :)

>> It might be that we (Intel) try to resurrect this code using
>> PTE_HUGESPECIAL in the near future for i915, but until then I think the
>> safest option is to disable it completely.
> Okay, do you want a patch to just delete this function?

That'd be great.

Thanks,
Thomas

> Jason
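For readers wondering what resurrecting this code with such a bit might look like: the shape would be a pmd_mkspecial() applied on insert and a pmd_special() predicate tested in the fast-gup PMD path — neither helper exists at the time of this thread, and both names are hypothetical. Extending the toy model from earlier in the thread (still invented code, not the kernel's):

	#include <stdbool.h>

	/* Hypothetical PMD_SPECIAL: the huge entry now carries the bit. */
	struct toy_pmd_v2 { unsigned long pfn; bool present; bool huge; bool special; };

	static bool toy_gup_fast_pmd_v2(struct toy_pmd_v2 pmd)
	{
		if (!pmd.present || !pmd.huge)
			return false;
		if (pmd.special)	/* PFN-map huge entry: refuse, take the slow path */
			return false;
		return true;
	}

The hard part Thomas alludes to is not this test but keeping the extra bit consistent under concurrent page-table updates, e.g. the 64-bit PMD atomicity problem on x86-32.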
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index f56be5bc0861ec..91d02c345fbba8 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -203,10 +203,13 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault *vmf,
 	if (page_offset + fault_page_size > bo->resource->num_pages)
 		goto out_fallback;
 
+	/*
+	 * vmf_insert_pfn_pud/pmd_prot() can only be called with struct page
+	 * backed memory
+	 */
 	if (bo->resource->bus.is_iomem)
-		pfn = ttm_bo_io_mem_pfn(bo, page_offset);
-	else
-		pfn = page_to_pfn(ttm->pages[page_offset]);
+		goto out_fallback;
+	pfn = page_to_pfn(ttm->pages[page_offset]);
 
 	/* pfn must be fault_page_size aligned. */
 	if ((pfn & (fault_page_size - 1)) != 0)
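For context on the goto out_fallback added above: the out_fallback label (outside this hunk) makes ttm_bo_vm_insert_huge() return VM_FAULT_FALLBACK, so io-memory faults are not dropped — the fault handler simply maps the range with ordinary PTEs instead, which do get the special-bit treatment. A rough sketch of that calling pattern, simplified rather than the exact TTM code:

	vm_fault_t ret;

	if (fault_page_size > 1) {
		ret = ttm_bo_vm_insert_huge(vmf, bo, page_offset,
					    fault_page_size, prot);
		if (ret != VM_FAULT_FALLBACK)
			return ret;	/* huge entry installed, or a hard error */
		/* otherwise fall through and insert 4K PTEs one at a time */
	}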
PUD and PMD entries do not have a special bit.

get_user_pages_fast() considers any page that passed pmd_huge() as
usable:

	if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
		     pmd_devmap(pmd))) {

And vmf_insert_pfn_pmd_prot() unconditionally sets

	entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));

e.g. on x86 the page will be _PAGE_PRESENT | _PAGE_PSE.

As such gup_huge_pmd() will try to deref a struct page:

	head = try_grab_compound_head(pmd_page(orig), refs, flags);

and thus crash.

Prevent this by never using IO memory with vmf_insert_pfn_pud/pmd_prot().

Fixes: 314b6580adc5 ("drm/ttm, drm/vmwgfx: Support huge TTM pagefaults")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

AFAICT only the vmwgfx driver makes use of this, and I can't tell which path
it is taking.

This function will also try to install a PUD - does vmwgfx have a case where
it has obtained a 1GB high order page - or is that dead code now?

base-commit: 519d81956ee277b4419c723adfb154603c2565ba
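The "unconditionally sets" observation is the crux on the insert side as well: the PTE-level insert paths mark a PFN-map entry special precisely so gup can refuse it, while the huge-PMD insert has no bit to set. Reusing struct toy_pte and struct toy_pmd from the sketch earlier in the thread (invented code, not the kernel's):

	/* Models the PTE insert path: the special bit travels with the entry. */
	static struct toy_pte toy_insert_pfn_pte(unsigned long pfn)
	{
		return (struct toy_pte){ .pfn = pfn, .present = true, .special = true };
	}

	/* Models vmf_insert_pfn_pmd_prot(): present + huge is all it can record,
	 * so the entry is indistinguishable from ordinary huge memory. */
	static struct toy_pmd toy_insert_pfn_pmd(unsigned long pfn)
	{
		return (struct toy_pmd){ .pfn = pfn, .present = true, .huge = true };
	}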