From patchwork Tue Oct 15 18:12:38 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jason Gunthorpe
X-Patchwork-Id: 11191379
From: Jason Gunthorpe
To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling@amd.com
Cc: linux-rdma@vger.kernel.org, linux-mm@kvack.org, Andrea Arcangeli,
 dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
 Ben Skeggs, Jason Gunthorpe, nouveau@lists.freedesktop.org
Subject: [PATCH hmm 11/15] nouveau: use mmu_range_notifier instead of hmm_mirror
Date: Tue, 15 Oct 2019 15:12:38 -0300
Message-Id: <20191015181242.8343-12-jgg@ziepe.ca>
X-Mailer: git-send-email 2.23.0
In-Reply-To: <20191015181242.8343-1-jgg@ziepe.ca>
References: <20191015181242.8343-1-jgg@ziepe.ca>

From: Jason Gunthorpe

Remove
the hmm_mirror object and use the mmu_range_notifier API instead for the
range, and use the normal mmu_notifier API for the general invalidation
callback.

While here re-organize the pagefault path so the locking pattern is clear.

nouveau is the only driver that uses a temporary range object and instead
forwards nearly every invalidation range directly to the HW. While this is
not how the mmu_range_notifier was intended to be used, the overheads on
the pagefaulting path are similar to the existing hmm_mirror version.
Particularly since the interval tree will be small.

Cc: Ben Skeggs
Cc: dri-devel@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Cc: Ralph Campbell
Signed-off-by: Jason Gunthorpe
---
 drivers/gpu/drm/nouveau/nouveau_svm.c | 178 ++++++++++++++------------
 1 file changed, 98 insertions(+), 80 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 577f8811925a59..712c99918551bc 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -96,8 +96,6 @@ struct nouveau_svmm {
 	} unmanaged;
 
 	struct mutex mutex;
-
-	struct hmm_mirror mirror;
 };
 
 #define SVMM_DBG(s,f,a...)                                                     \
@@ -293,23 +291,11 @@ static const struct mmu_notifier_ops nouveau_mn_ops = {
 	.free_notifier = nouveau_svmm_free_notifier,
 };
 
-static int
-nouveau_svmm_sync_cpu_device_pagetables(struct hmm_mirror *mirror,
-					const struct mmu_notifier_range *update)
-{
-	return 0;
-}
-
-static const struct hmm_mirror_ops nouveau_svmm = {
-	.sync_cpu_device_pagetables = nouveau_svmm_sync_cpu_device_pagetables,
-};
-
 void
 nouveau_svmm_fini(struct nouveau_svmm **psvmm)
 {
 	struct nouveau_svmm *svmm = *psvmm;
 	if (svmm) {
-		hmm_mirror_unregister(&svmm->mirror);
 		mutex_lock(&svmm->mutex);
 		svmm->vmm = NULL;
 		mutex_unlock(&svmm->mutex);
@@ -357,15 +343,10 @@ nouveau_svmm_init(struct drm_device *dev, void *data,
 		goto out_free;
 
 	down_write(&current->mm->mmap_sem);
-	svmm->mirror.ops = &nouveau_svmm;
-	ret = hmm_mirror_register(&svmm->mirror, current->mm);
-	if (ret)
-		goto out_mm_unlock;
-
 	svmm->notifier.ops = &nouveau_mn_ops;
 	ret = __mmu_notifier_register(&svmm->notifier, current->mm);
 	if (ret)
-		goto out_hmm_unregister;
+		goto out_mm_unlock;
 	/* Note, ownership of svmm transfers to mmu_notifier */
 
 	cli->svm.svmm = svmm;
@@ -374,8 +355,6 @@ nouveau_svmm_init(struct drm_device *dev, void *data,
 	mutex_unlock(&cli->mutex);
 	return 0;
 
-out_hmm_unregister:
-	hmm_mirror_unregister(&svmm->mirror);
 out_mm_unlock:
 	up_write(&current->mm->mmap_sem);
 out_free:
@@ -503,43 +482,89 @@ nouveau_svm_fault_cache(struct nouveau_svm *svm,
 		fault->inst, fault->addr, fault->access);
 }
 
-static inline bool
-nouveau_range_done(struct hmm_range *range)
+struct svm_notifier {
+	struct mmu_range_notifier notifier;
+	struct nouveau_svmm *svmm;
+};
+
+static bool nouveau_svm_range_invalidate(struct mmu_range_notifier *mrn,
+					 const struct mmu_notifier_range *range)
 {
-	bool ret = hmm_range_valid(range);
+	struct svm_notifier *sn =
+		container_of(mrn, struct svm_notifier, notifier);
 
-	hmm_range_unregister(range);
-	return ret;
+	/*
+	 * serializes the update to mrn->invalidate_seq done by caller and
+	 * prevents invalidation of the PTE from progressing while HW is being
+	 * programmed. This is very hacky and only works because the normal
+	 * notifier that does invalidation is always called after the range
+	 * notifier.
+	 */
+	if (mmu_notifier_range_blockable(range))
+		mutex_lock(&sn->svmm->mutex);
+	else if (!mutex_trylock(&sn->svmm->mutex))
+		return false;
+	mutex_unlock(&sn->svmm->mutex);
+	return true;
 }
 
-static int
-nouveau_range_fault(struct nouveau_svmm *svmm, struct hmm_range *range)
+static const struct mmu_range_notifier_ops nouveau_svm_mrn_ops = {
+	.invalidate = nouveau_svm_range_invalidate,
+};
+
+static int nouveau_range_fault(struct nouveau_svmm *svmm,
+			       struct nouveau_drm *drm, void *data, u32 size,
+			       u64 *pfns,
+			       struct svm_notifier *notifier)
 {
+	unsigned long timeout =
+		jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
+	/* Have HMM fault pages within the fault window to the GPU. */
+	struct hmm_range range = {
+		.notifier = &notifier->notifier,
+		.start = notifier->notifier.interval_tree.start,
+		.end = notifier->notifier.interval_tree.last + 1,
+		.pfns = pfns,
+		.flags = nouveau_svm_pfn_flags,
+		.values = nouveau_svm_pfn_values,
+		.pfn_shift = NVIF_VMM_PFNMAP_V0_ADDR_SHIFT,
+	};
+	struct mm_struct *mm = notifier->notifier.mm;
 	long ret;
 
-	range->default_flags = 0;
-	range->pfn_flags_mask = -1UL;
+	while (true) {
+		if (time_after(jiffies, timeout))
+			return -EBUSY;
 
-	ret = hmm_range_register(range, &svmm->mirror);
-	if (ret) {
-		up_read(&svmm->notifier.mm->mmap_sem);
-		return (int)ret;
-	}
+		range.notifier_seq = mmu_range_read_begin(range.notifier);
+		range.default_flags = 0;
+		range.pfn_flags_mask = -1UL;
+		down_read(&mm->mmap_sem);
+		ret = hmm_range_fault(&range, 0);
+		up_read(&mm->mmap_sem);
+		if (ret <= 0) {
+			if (ret == 0 || ret == -EBUSY)
+				continue;
+			return ret;
+		}
 
-	if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) {
-		up_read(&svmm->notifier.mm->mmap_sem);
-		return -EBUSY;
+		mutex_lock(&svmm->mutex);
+		if (mmu_range_read_retry(range.notifier,
+					 range.notifier_seq)) {
+			mutex_unlock(&svmm->mutex);
+			continue;
+		}
+		break;
 	}
 
-	ret = hmm_range_fault(range, 0);
-	if (ret <= 0) {
-		if (ret == 0)
-			ret = -EBUSY;
-		up_read(&svmm->notifier.mm->mmap_sem);
-		hmm_range_unregister(range);
-		return ret;
-	}
-	return 0;
+	nouveau_dmem_convert_pfn(drm, &range);
+
+	svmm->vmm->vmm.object.client->super = true;
+	ret = nvif_object_ioctl(&svmm->vmm->vmm.object, data, size, NULL);
+	svmm->vmm->vmm.object.client->super = false;
+	mutex_unlock(&svmm->mutex);
+
+	return ret;
 }
 
 static int
@@ -559,7 +584,6 @@ nouveau_svm_fault(struct nvif_notify *notify)
 		} i;
 		u64 phys[16];
 	} args;
-	struct hmm_range range;
 	struct vm_area_struct *vma;
 	u64 inst, start, limit;
 	int fi, fn, pi, fill;
@@ -615,6 +639,7 @@ nouveau_svm_fault(struct nvif_notify *notify)
 	args.i.p.version = 0;
 
 	for (fi = 0; fn = fi + 1, fi < buffer->fault_nr; fi = fn) {
+		struct svm_notifier notifier;
 		struct mm_struct *mm;
 
 		/* Cancel any faults from non-SVM channels. */
@@ -623,7 +648,6 @@ nouveau_svm_fault(struct nvif_notify *notify)
 			continue;
 		}
 		SVMM_DBG(svmm, "addr %016llx", buffer->fault[fi]->addr);
-		mm = svmm->notifier.mm;
 
 		/* We try and group handling of faults within a small
 		 * window into a single update.
@@ -637,6 +661,12 @@ nouveau_svm_fault(struct nvif_notify *notify)
 			start = max_t(u64, start, svmm->unmanaged.limit);
 		SVMM_DBG(svmm, "wndw %016llx-%016llx", start, limit);
 
+		mm = svmm->notifier.mm;
+		if (!mmget_not_zero(mm)) {
+			nouveau_svm_fault_cancel_fault(svm, buffer->fault[fi]);
+			continue;
+		}
+
 		/* Intersect fault window with the CPU VMA, cancelling
 		 * the fault if the address is invalid.
 		 */
@@ -645,16 +675,18 @@ nouveau_svm_fault(struct nvif_notify *notify)
 		if (!vma) {
 			SVMM_ERR(svmm, "wndw %016llx-%016llx", start, limit);
 			up_read(&mm->mmap_sem);
+			mmput(mm);
 			nouveau_svm_fault_cancel_fault(svm, buffer->fault[fi]);
 			continue;
 		}
 		start = max_t(u64, start, vma->vm_start);
 		limit = min_t(u64, limit, vma->vm_end);
+		up_read(&mm->mmap_sem);
 		SVMM_DBG(svmm, "wndw %016llx-%016llx", start, limit);
 
 		if (buffer->fault[fi]->addr != start) {
 			SVMM_ERR(svmm, "addr %016llx", buffer->fault[fi]->addr);
-			up_read(&mm->mmap_sem);
+			mmput(mm);
 			nouveau_svm_fault_cancel_fault(svm, buffer->fault[fi]);
 			continue;
 		}
@@ -710,33 +742,19 @@ nouveau_svm_fault(struct nvif_notify *notify)
 			 args.i.p.addr,
 			 args.i.p.addr + args.i.p.size, fn - fi);
 
-		/* Have HMM fault pages within the fault window to the GPU. */
-		range.start = args.i.p.addr;
-		range.end = args.i.p.addr + args.i.p.size;
-		range.pfns = args.phys;
-		range.flags = nouveau_svm_pfn_flags;
-		range.values = nouveau_svm_pfn_values;
-		range.pfn_shift = NVIF_VMM_PFNMAP_V0_ADDR_SHIFT;
-again:
-		ret = nouveau_range_fault(svmm, &range);
-		if (ret == 0) {
-			mutex_lock(&svmm->mutex);
-			if (!nouveau_range_done(&range)) {
-				mutex_unlock(&svmm->mutex);
-				goto again;
-			}
-
-			nouveau_dmem_convert_pfn(svm->drm, &range);
-
-			svmm->vmm->vmm.object.client->super = true;
-			ret = nvif_object_ioctl(&svmm->vmm->vmm.object,
-						&args, sizeof(args.i) +
-						pi * sizeof(args.phys[0]),
-						NULL);
-			svmm->vmm->vmm.object.client->super = false;
-			mutex_unlock(&svmm->mutex);
-			up_read(&mm->mmap_sem);
+		notifier.svmm = svmm;
+		notifier.notifier.ops = &nouveau_svm_mrn_ops;
+		ret = mmu_range_notifier_insert(&notifier.notifier,
+						args.i.p.addr, args.i.p.size,
+						svmm->notifier.mm);
+		if (!ret) {
+			ret = nouveau_range_fault(
+				svmm, svm->drm, &args,
+				sizeof(args.i) + pi * sizeof(args.phys[0]),
+				args.phys, &notifier);
+			mmu_range_notifier_remove(&notifier.notifier);
 		}
+		mmput(mm);
 
 		/* Cancel any faults in the window whose pages didn't manage
 		 * to keep their valid bit, or stay writeable when required.
@@ -745,10 +763,10 @@ nouveau_svm_fault(struct nvif_notify *notify)
 		 */
 		while (fi < fn) {
 			struct nouveau_svm_fault *fault = buffer->fault[fi++];
-			pi = (fault->addr - range.start) >> PAGE_SHIFT;
+			pi = (fault->addr - args.i.p.addr) >> PAGE_SHIFT;
 			if (ret ||
-			     !(range.pfns[pi] & NVIF_VMM_PFNMAP_V0_V) ||
-			    (!(range.pfns[pi] & NVIF_VMM_PFNMAP_V0_W) &&
+			     !(args.phys[pi] & NVIF_VMM_PFNMAP_V0_V) ||
+			    (!(args.phys[pi] & NVIF_VMM_PFNMAP_V0_W) &&
 			     fault->access != 0 && fault->access != 3)) {
 				nouveau_svm_fault_cancel_fault(svm, fault);
 				continue;
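
For readers following the locking pattern in nouveau_range_fault() above
(sample the notifier sequence, fault the pages without svmm->mutex held, then
take the mutex and re-check the sequence before programming the HW), here is
a minimal userspace sketch of the same collision-retry idea. It is not kernel
code and not part of the patch: struct range_tracker, the tracker_*() helpers,
fault_with_retry() and the demo_*() callbacks are hypothetical names invented
for illustration. It also assumes a plain counter and a bounded number of
attempts, where the patch uses mmu_range_read_begin(), mmu_range_read_retry()
and a jiffies timeout.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a range notifier; not kernel API. */
struct range_tracker {
	atomic_ulong seq;	/* bumped by every invalidation */
	pthread_mutex_t lock;	/* plays the role of svmm->mutex */
};

static unsigned long tracker_read_begin(struct range_tracker *t)
{
	return atomic_load(&t->seq);
}

static bool tracker_read_retry(struct range_tracker *t, unsigned long seq)
{
	return atomic_load(&t->seq) != seq;
}

/* Invalidation side: bump seq, then wait out any programming in progress. */
static void tracker_invalidate(struct range_tracker *t)
{
	atomic_fetch_add(&t->seq, 1);
	pthread_mutex_lock(&t->lock);
	pthread_mutex_unlock(&t->lock);
}

/*
 * Fault side: snapshot() does the slow lookup with the lock dropped and
 * program() consumes the result under the lock, but only if no invalidation
 * was seen since the sequence was sampled. Returns the number of attempts
 * used, or -1 once max_tries is exhausted.
 */
static int fault_with_retry(struct range_tracker *t,
			    void (*snapshot)(void *), void (*program)(void *),
			    void *ctx, int max_tries)
{
	for (int tries = 1; tries <= max_tries; tries++) {
		unsigned long seq = tracker_read_begin(t);

		snapshot(ctx);

		pthread_mutex_lock(&t->lock);
		if (tracker_read_retry(t, seq)) {
			/* Snapshot is stale: drop the lock and start over. */
			pthread_mutex_unlock(&t->lock);
			continue;
		}
		program(ctx);
		pthread_mutex_unlock(&t->lock);
		return tries;
	}
	return -1;
}

/* Demo context: an invalidation races with the first races_left snapshots. */
struct demo_ctx {
	struct range_tracker tracker;
	int races_left;
	int programmed;
};

static void demo_snapshot(void *p)
{
	struct demo_ctx *d = p;

	if (d->races_left > 0) {
		d->races_left--;
		tracker_invalidate(&d->tracker);
	}
}

static void demo_program(void *p)
{
	struct demo_ctx *d = p;

	d->programmed++;
}
```

With races_left set to 2, the first two attempts observe a changed sequence
and are discarded, and the third one programs the (simulated) hardware. The
point of the design is that the expensive lookup never runs under the mutex;
only the final re-check and the programming step do.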