From patchwork Fri May 8 19:20:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Ralph Campbell X-Patchwork-Id: 11537477 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D9F4515E6 for ; Fri, 8 May 2020 19:20:27 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 8D17720CC7 for ; Fri, 8 May 2020 19:20:27 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=nvidia.com header.i=@nvidia.com header.b="FPmKRoqs" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8D17720CC7 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=nvidia.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id A5FBA8E0005; Fri, 8 May 2020 15:20:25 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 9E9B4900005; Fri, 8 May 2020 15:20:25 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 816648E0007; Fri, 8 May 2020 15:20:25 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0192.hostedemail.com [216.40.44.192]) by kanga.kvack.org (Postfix) with ESMTP id 658778E0005 for ; Fri, 8 May 2020 15:20:25 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 322F952BB for ; Fri, 8 May 2020 19:20:25 +0000 (UTC) X-FDA: 76794518010.09.nerve07_5ae5200256e60 X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,rcampbell@nvidia.com,,RULES_HIT:30012:30054:30064:30070:30079:30080,0,RBL:216.228.121.143:@nvidia.com:.lbl8.mailshell.net-62.18.0.100 64.10.201.10,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:32,LUA_SUMMARY:none X-HE-Tag: nerve07_5ae5200256e60 X-Filterd-Recvd-Size: 12169 Received: from hqnvemgate24.nvidia.com (hqnvemgate24.nvidia.com [216.228.121.143]) by imf38.hostedemail.com (Postfix) with ESMTP for ; Fri, 8 May 2020 19:20:24 +0000 (UTC) Received: from hqpgpgate101.nvidia.com (Not Verified[216.228.121.13]) by hqnvemgate24.nvidia.com (using TLS: TLSv1.2, DES-CBC3-SHA) id ; Fri, 08 May 2020 12:18:11 -0700 Received: from hqmail.nvidia.com ([172.20.161.6]) by hqpgpgate101.nvidia.com (PGP Universal service); Fri, 08 May 2020 12:20:23 -0700 X-PGP-Universal: processed; by hqpgpgate101.nvidia.com on Fri, 08 May 2020 12:20:23 -0700 Received: from HQMAIL109.nvidia.com (172.20.187.15) by HQMAIL111.nvidia.com (172.20.187.18) with Microsoft SMTP Server (TLS) id 15.0.1473.3; Fri, 8 May 2020 19:20:22 +0000 Received: from rnnvemgw01.nvidia.com (10.128.109.123) by HQMAIL109.nvidia.com (172.20.187.15) with Microsoft SMTP Server (TLS) id 15.0.1473.3 via Frontend Transport; Fri, 8 May 2020 19:20:21 +0000 Received: from rcampbell-dev.nvidia.com (Not Verified[10.110.48.66]) by rnnvemgw01.nvidia.com with Trustwave SEG (v7,5,8,10121) id ; Fri, 08 May 2020 12:20:21 -0700 From: Ralph Campbell To: , , , , CC: Jerome Glisse , John Hubbard , Christoph Hellwig , Jason Gunthorpe , "Ben Skeggs" , Andrew Morton , Shuah Khan , Ralph Campbell Subject: [PATCH 1/6] nouveau/hmm: map pages after migration Date: Fri, 8 May 2020 12:20:04 -0700 Message-ID: <20200508192009.15302-2-rcampbell@nvidia.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200508192009.15302-1-rcampbell@nvidia.com> References: <20200508192009.15302-1-rcampbell@nvidia.com> MIME-Version: 1.0 X-NVConfidentiality: public DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nvidia.com; s=n1; t=1588965491; bh=vLuVQ9M113BbmyP+rGSUUkFSbuwtcCONl7c0TMMZCiI=; h=X-PGP-Universal:From:To:CC:Subject:Date:Message-ID:X-Mailer: In-Reply-To:References:MIME-Version:X-NVConfidentiality: Content-Type:Content-Transfer-Encoding; b=FPmKRoqsJcuAhRo1ALZxuRV0elsE6l1HxO6iH/ZoS8nbviHozYvtTitwhRWctD36W D0ISoVSjGu0B7KxaBL262CGKA3qmrKQfQ7CrO4eDgLJSN+EYRnwiBg0UvjZKiEhRHC tLOa3EqE8k6hxCiRlArC8QSwCe4RRdaOKRWG3jQy9vc1Goe+GDAPXCn1exWlFa0fJy DuUFcI55n7vAA7thSCThwyKqqhp1xyznbccszbNebOhdiZgvS5cXgeWgPDda/zFJ3+ x0iXNOMyslqAptbz0oDJ7ZwOFBn+e/rvb8UBzbRhoOljD9ZtQd8pC1iNDfIczlu8l5 XNrAKUt6tXY2Q== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: When memory is migrated to the GPU, it is likely to be accessed by GPU code soon afterwards. Instead of waiting for a GPU fault, map the migrated memory into the GPU page tables with the same access permissions as the source CPU page table entries. This preserves copy on write semantics. Signed-off-by: Ralph Campbell Cc: Christoph Hellwig Cc: Jason Gunthorpe Cc: "Jérôme Glisse" Cc: Ben Skeggs --- drivers/gpu/drm/nouveau/nouveau_dmem.c | 46 +++++++++++++------- drivers/gpu/drm/nouveau/nouveau_dmem.h | 2 + drivers/gpu/drm/nouveau/nouveau_svm.c | 59 +++++++++++++++++++++++++- drivers/gpu/drm/nouveau/nouveau_svm.h | 5 +++ 4 files changed, 95 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c index 3364904eccff..b29b01a56d07 100644 --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c @@ -25,12 +25,14 @@ #include "nouveau_dma.h" #include "nouveau_mem.h" #include "nouveau_bo.h" +#include "nouveau_svm.h" #include #include #include #include #include +#include #include #include @@ -561,10 +563,11 @@ nouveau_dmem_init(struct nouveau_drm *drm) } static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm, - unsigned long src, dma_addr_t *dma_addr) + unsigned long src, dma_addr_t *dma_addr, u64 *pfn) { struct device *dev = drm->dev->dev; struct page *dpage, *spage; + unsigned long paddr; spage = migrate_pfn_to_page(src); if (!spage || !(src & MIGRATE_PFN_MIGRATE)) @@ -572,17 +575,21 @@ static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm, dpage = nouveau_dmem_page_alloc_locked(drm); if (!dpage) - return 0; + goto out; *dma_addr = dma_map_page(dev, spage, 0, PAGE_SIZE, DMA_BIDIRECTIONAL); if (dma_mapping_error(dev, *dma_addr)) goto out_free_page; + paddr = nouveau_dmem_page_addr(dpage); if (drm->dmem->migrate.copy_func(drm, 1, NOUVEAU_APER_VRAM, - nouveau_dmem_page_addr(dpage), NOUVEAU_APER_HOST, - *dma_addr)) + paddr, NOUVEAU_APER_HOST, *dma_addr)) goto out_dma_unmap; + *pfn = NVIF_VMM_PFNMAP_V0_V | NVIF_VMM_PFNMAP_V0_VRAM | + ((paddr >> PAGE_SHIFT) << NVIF_VMM_PFNMAP_V0_ADDR_SHIFT); + if (src & MIGRATE_PFN_WRITE) + *pfn |= NVIF_VMM_PFNMAP_V0_W; return migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED; out_dma_unmap: @@ -590,18 +597,20 @@ static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm, out_free_page: nouveau_dmem_page_free_locked(drm, dpage); out: + *pfn = NVIF_VMM_PFNMAP_V0_NONE; return 0; } static void nouveau_dmem_migrate_chunk(struct nouveau_drm *drm, - struct migrate_vma *args, dma_addr_t *dma_addrs) + struct nouveau_svmm *svmm, struct migrate_vma *args, + dma_addr_t *dma_addrs, u64 *pfns) { struct nouveau_fence *fence; unsigned long addr = args->start, nr_dma = 0, i; for (i = 0; addr < args->end; i++) { args->dst[i] = nouveau_dmem_migrate_copy_one(drm, args->src[i], - dma_addrs + nr_dma); + dma_addrs + nr_dma, pfns + i); if (args->dst[i]) nr_dma++; addr += PAGE_SIZE; @@ -610,20 +619,18 @@ static void nouveau_dmem_migrate_chunk(struct nouveau_drm *drm, nouveau_fence_new(drm->dmem->migrate.chan, false, &fence); migrate_vma_pages(args); nouveau_dmem_fence_done(&fence); + nouveau_pfns_map(svmm, args->vma->vm_mm, args->start, pfns, i); while (nr_dma--) { dma_unmap_page(drm->dev->dev, dma_addrs[nr_dma], PAGE_SIZE, DMA_BIDIRECTIONAL); } - /* - * FIXME optimization: update GPU page table to point to newly migrated - * memory. - */ migrate_vma_finalize(args); } int nouveau_dmem_migrate_vma(struct nouveau_drm *drm, + struct nouveau_svmm *svmm, struct vm_area_struct *vma, unsigned long start, unsigned long end) @@ -635,7 +642,8 @@ nouveau_dmem_migrate_vma(struct nouveau_drm *drm, .vma = vma, .start = start, }; - unsigned long c, i; + unsigned long i; + u64 *pfns; int ret = -ENOMEM; args.src = kcalloc(max, sizeof(*args.src), GFP_KERNEL); @@ -649,19 +657,25 @@ nouveau_dmem_migrate_vma(struct nouveau_drm *drm, if (!dma_addrs) goto out_free_dst; - for (i = 0; i < npages; i += c) { - c = min(SG_MAX_SINGLE_ALLOC, npages); - args.end = start + (c << PAGE_SHIFT); + pfns = nouveau_pfns_alloc(max); + if (!pfns) + goto out_free_dma; + + for (i = 0; i < npages; i += max) { + args.end = start + (max << PAGE_SHIFT); ret = migrate_vma_setup(&args); if (ret) - goto out_free_dma; + goto out_free_pfns; if (args.cpages) - nouveau_dmem_migrate_chunk(drm, &args, dma_addrs); + nouveau_dmem_migrate_chunk(drm, svmm, &args, dma_addrs, + pfns); args.start = args.end; } ret = 0; +out_free_pfns: + nouveau_pfns_free(pfns); out_free_dma: kfree(dma_addrs); out_free_dst: diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.h b/drivers/gpu/drm/nouveau/nouveau_dmem.h index db3b59b210af..64da5d3635c8 100644 --- a/drivers/gpu/drm/nouveau/nouveau_dmem.h +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.h @@ -25,6 +25,7 @@ struct drm_device; struct drm_file; struct nouveau_drm; +struct nouveau_svmm; struct hmm_range; #if IS_ENABLED(CONFIG_DRM_NOUVEAU_SVM) @@ -34,6 +35,7 @@ void nouveau_dmem_suspend(struct nouveau_drm *); void nouveau_dmem_resume(struct nouveau_drm *); int nouveau_dmem_migrate_vma(struct nouveau_drm *drm, + struct nouveau_svmm *svmm, struct vm_area_struct *vma, unsigned long start, unsigned long end); diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c index 407e34a5c0ab..22f054f7ee3e 100644 --- a/drivers/gpu/drm/nouveau/nouveau_svm.c +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c @@ -70,6 +70,12 @@ struct nouveau_svm { #define SVM_DBG(s,f,a...) NV_DEBUG((s)->drm, "svm: "f"\n", ##a) #define SVM_ERR(s,f,a...) NV_WARN((s)->drm, "svm: "f"\n", ##a) +struct nouveau_pfnmap_args { + struct nvif_ioctl_v0 i; + struct nvif_ioctl_mthd_v0 m; + struct nvif_vmm_pfnmap_v0 p; +}; + struct nouveau_ivmm { struct nouveau_svmm *svmm; u64 inst; @@ -187,7 +193,8 @@ nouveau_svmm_bind(struct drm_device *dev, void *data, addr = max(addr, vma->vm_start); next = min(vma->vm_end, end); /* This is a best effort so we ignore errors */ - nouveau_dmem_migrate_vma(cli->drm, vma, addr, next); + nouveau_dmem_migrate_vma(cli->drm, cli->svm.svmm, vma, addr, + next); addr = next; } @@ -814,6 +821,56 @@ nouveau_svm_fault(struct nvif_notify *notify) return NVIF_NOTIFY_KEEP; } +static struct nouveau_pfnmap_args * +nouveau_pfns_to_args(void *pfns) +{ + return container_of(pfns, struct nouveau_pfnmap_args, p.phys); +} + +u64 * +nouveau_pfns_alloc(unsigned long npages) +{ + struct nouveau_pfnmap_args *args; + + args = kzalloc(struct_size(args, p.phys, npages), GFP_KERNEL); + if (!args) + return NULL; + + args->i.type = NVIF_IOCTL_V0_MTHD; + args->m.method = NVIF_VMM_V0_PFNMAP; + args->p.page = PAGE_SHIFT; + + return args->p.phys; +} + +void +nouveau_pfns_free(u64 *pfns) +{ + struct nouveau_pfnmap_args *args = nouveau_pfns_to_args(pfns); + + kfree(args); +} + +void +nouveau_pfns_map(struct nouveau_svmm *svmm, struct mm_struct *mm, + unsigned long addr, u64 *pfns, unsigned long npages) +{ + struct nouveau_pfnmap_args *args = nouveau_pfns_to_args(pfns); + int ret; + + args->p.addr = addr; + args->p.size = npages << PAGE_SHIFT; + + mutex_lock(&svmm->mutex); + + svmm->vmm->vmm.object.client->super = true; + ret = nvif_object_ioctl(&svmm->vmm->vmm.object, args, sizeof(*args) + + npages * sizeof(args->p.phys[0]), NULL); + svmm->vmm->vmm.object.client->super = false; + + mutex_unlock(&svmm->mutex); +} + static void nouveau_svm_fault_buffer_fini(struct nouveau_svm *svm, int id) { diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.h b/drivers/gpu/drm/nouveau/nouveau_svm.h index e839d8189461..f0fcd1b72e8b 100644 --- a/drivers/gpu/drm/nouveau/nouveau_svm.h +++ b/drivers/gpu/drm/nouveau/nouveau_svm.h @@ -18,6 +18,11 @@ void nouveau_svmm_fini(struct nouveau_svmm **); int nouveau_svmm_join(struct nouveau_svmm *, u64 inst); void nouveau_svmm_part(struct nouveau_svmm *, u64 inst); int nouveau_svmm_bind(struct drm_device *, void *, struct drm_file *); + +u64 *nouveau_pfns_alloc(unsigned long npages); +void nouveau_pfns_free(u64 *pfns); +void nouveau_pfns_map(struct nouveau_svmm *svmm, struct mm_struct *mm, + unsigned long addr, u64 *pfns, unsigned long npages); #else /* IS_ENABLED(CONFIG_DRM_NOUVEAU_SVM) */ static inline void nouveau_svm_init(struct nouveau_drm *drm) {} static inline void nouveau_svm_fini(struct nouveau_drm *drm) {}