From patchwork Thu Dec 13 17:13:28 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jerome Glisse X-Patchwork-Id: 10729225 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C3E7D112E for ; Thu, 13 Dec 2018 17:14:09 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A9BEA285B9 for ; Thu, 13 Dec 2018 17:14:09 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9DB832C347; Thu, 13 Dec 2018 17:14:09 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5DAB22C059 for ; Thu, 13 Dec 2018 17:14:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729524AbeLMRNr (ORCPT ); Thu, 13 Dec 2018 12:13:47 -0500 Received: from mx1.redhat.com ([209.132.183.28]:47958 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727511AbeLMRNq (ORCPT ); Thu, 13 Dec 2018 12:13:46 -0500 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 79BE34E92D; Thu, 13 Dec 2018 17:13:45 +0000 (UTC) Received: from localhost.localdomain.com (ovpn-121-185.rdu2.redhat.com [10.10.121.185]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6AFF56012D; Thu, 13 Dec 2018 17:13:43 +0000 (UTC) From: jglisse@redhat.com To: linux-kernel@vger.kernel.org Cc: =?utf-8?b?SsOpcsO0bWUgR2xpc3Nl?= , Andrew Morton , Matthew Wilcox , Ross Zwisler , Dan Williams , Paolo Bonzini , =?utf-8?b?UmFkaW0gS3LEjW3DocWZ?= , Michal Hocko , Christian Koenig , Ralph Campbell , John Hubbard , kvm@vger.kernel.org, dri-devel@lists.freedesktop.org, linux-rdma@vger.kernel.org, linux-fsdevel@vger.kernel.org, Arnd Bergmann Subject: [PATCH v3 1/3] mm/mmu_notifier: use structure for invalidate_range_start/end callback v2 Date: Thu, 13 Dec 2018 12:13:28 -0500 Message-Id: <20181213171330.8489-2-jglisse@redhat.com> In-Reply-To: <20181213171330.8489-1-jglisse@redhat.com> References: <20181213171330.8489-1-jglisse@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Thu, 13 Dec 2018 17:13:45 +0000 (UTC) Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Jérôme Glisse To avoid having to change many callback definition everytime we want to add a parameter use a structure to group all parameters for the mmu_notifier invalidate_range_start/end callback. No functional changes with this patch. Changed since v1: - fix make htmldocs warning in amdgpu_mn.c Signed-off-by: Jérôme Glisse Acked-by: Jan Kara Acked-by: Felix Kuehling Acked-by: Jason Gunthorpe Cc: Andrew Morton Cc: Matthew Wilcox Cc: Ross Zwisler Cc: Dan Williams Cc: Paolo Bonzini Cc: Radim Krčmář Cc: Michal Hocko Cc: Christian Koenig Cc: Ralph Campbell Cc: John Hubbard Cc: kvm@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linux-rdma@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Cc: Arnd Bergmann --- drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 47 +++++++++++-------------- drivers/gpu/drm/i915/i915_gem_userptr.c | 14 ++++---- drivers/gpu/drm/radeon/radeon_mn.c | 16 ++++----- drivers/infiniband/core/umem_odp.c | 20 +++++------ drivers/infiniband/hw/hfi1/mmu_rb.c | 13 +++---- drivers/misc/mic/scif/scif_dma.c | 11 ++---- drivers/misc/sgi-gru/grutlbpurge.c | 14 ++++---- drivers/xen/gntdev.c | 12 +++---- include/linux/mmu_notifier.h | 14 +++++--- mm/hmm.c | 23 +++++------- mm/mmu_notifier.c | 21 +++++++++-- virt/kvm/kvm_main.c | 14 +++----- 12 files changed, 103 insertions(+), 116 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c index e55508b39496..3e6823fdd939 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c @@ -238,44 +238,40 @@ static void amdgpu_mn_invalidate_node(struct amdgpu_mn_node *node, * amdgpu_mn_invalidate_range_start_gfx - callback to notify about mm change * * @mn: our notifier - * @mm: the mm this callback is about - * @start: start of updated range - * @end: end of updated range + * @range: mmu notifier context * * Block for operations on BOs to finish and mark pages as accessed and * potentially dirty. */ static int amdgpu_mn_invalidate_range_start_gfx(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long start, - unsigned long end, - bool blockable) + const struct mmu_notifier_range *range) { struct amdgpu_mn *amn = container_of(mn, struct amdgpu_mn, mn); struct interval_tree_node *it; + unsigned long end; /* notification is exclusive, but interval is inclusive */ - end -= 1; + end = range->end - 1; /* TODO we should be able to split locking for interval tree and * amdgpu_mn_invalidate_node */ - if (amdgpu_mn_read_lock(amn, blockable)) + if (amdgpu_mn_read_lock(amn, range->blockable)) return -EAGAIN; - it = interval_tree_iter_first(&amn->objects, start, end); + it = interval_tree_iter_first(&amn->objects, range->start, end); while (it) { struct amdgpu_mn_node *node; - if (!blockable) { + if (!range->blockable) { amdgpu_mn_read_unlock(amn); return -EAGAIN; } node = container_of(it, struct amdgpu_mn_node, it); - it = interval_tree_iter_next(it, start, end); + it = interval_tree_iter_next(it, range->start, end); - amdgpu_mn_invalidate_node(node, start, end); + amdgpu_mn_invalidate_node(node, range->start, end); } return 0; @@ -294,39 +290,38 @@ static int amdgpu_mn_invalidate_range_start_gfx(struct mmu_notifier *mn, * are restorted in amdgpu_mn_invalidate_range_end_hsa. */ static int amdgpu_mn_invalidate_range_start_hsa(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long start, - unsigned long end, - bool blockable) + const struct mmu_notifier_range *range) { struct amdgpu_mn *amn = container_of(mn, struct amdgpu_mn, mn); struct interval_tree_node *it; + unsigned long end; /* notification is exclusive, but interval is inclusive */ - end -= 1; + end = range->end - 1; - if (amdgpu_mn_read_lock(amn, blockable)) + if (amdgpu_mn_read_lock(amn, range->blockable)) return -EAGAIN; - it = interval_tree_iter_first(&amn->objects, start, end); + it = interval_tree_iter_first(&amn->objects, range->start, end); while (it) { struct amdgpu_mn_node *node; struct amdgpu_bo *bo; - if (!blockable) { + if (!range->blockable) { amdgpu_mn_read_unlock(amn); return -EAGAIN; } node = container_of(it, struct amdgpu_mn_node, it); - it = interval_tree_iter_next(it, start, end); + it = interval_tree_iter_next(it, range->start, end); list_for_each_entry(bo, &node->bos, mn_list) { struct kgd_mem *mem = bo->kfd_bo; if (amdgpu_ttm_tt_affect_userptr(bo->tbo.ttm, - start, end)) - amdgpu_amdkfd_evict_userptr(mem, mm); + range->start, + end)) + amdgpu_amdkfd_evict_userptr(mem, range->mm); } } @@ -344,9 +339,7 @@ static int amdgpu_mn_invalidate_range_start_hsa(struct mmu_notifier *mn, * Release the lock again to allow new command submissions. */ static void amdgpu_mn_invalidate_range_end(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long start, - unsigned long end) + const struct mmu_notifier_range *range) { struct amdgpu_mn *amn = container_of(mn, struct amdgpu_mn, mn); diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c index 2c9b284036d1..3df77020aada 100644 --- a/drivers/gpu/drm/i915/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c @@ -113,27 +113,25 @@ static void del_object(struct i915_mmu_object *mo) } static int i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn, - struct mm_struct *mm, - unsigned long start, - unsigned long end, - bool blockable) + const struct mmu_notifier_range *range) { struct i915_mmu_notifier *mn = container_of(_mn, struct i915_mmu_notifier, mn); struct i915_mmu_object *mo; struct interval_tree_node *it; LIST_HEAD(cancelled); + unsigned long end; if (RB_EMPTY_ROOT(&mn->objects.rb_root)) return 0; /* interval ranges are inclusive, but invalidate range is exclusive */ - end--; + end = range->end - 1; spin_lock(&mn->lock); - it = interval_tree_iter_first(&mn->objects, start, end); + it = interval_tree_iter_first(&mn->objects, range->start, end); while (it) { - if (!blockable) { + if (!range->blockable) { spin_unlock(&mn->lock); return -EAGAIN; } @@ -151,7 +149,7 @@ static int i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn, queue_work(mn->wq, &mo->work); list_add(&mo->link, &cancelled); - it = interval_tree_iter_next(it, start, end); + it = interval_tree_iter_next(it, range->start, end); } list_for_each_entry(mo, &cancelled, link) del_object(mo); diff --git a/drivers/gpu/drm/radeon/radeon_mn.c b/drivers/gpu/drm/radeon/radeon_mn.c index f8b35df44c60..b3019505065a 100644 --- a/drivers/gpu/drm/radeon/radeon_mn.c +++ b/drivers/gpu/drm/radeon/radeon_mn.c @@ -119,40 +119,38 @@ static void radeon_mn_release(struct mmu_notifier *mn, * unmap them by move them into system domain again. */ static int radeon_mn_invalidate_range_start(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long start, - unsigned long end, - bool blockable) + const struct mmu_notifier_range *range) { struct radeon_mn *rmn = container_of(mn, struct radeon_mn, mn); struct ttm_operation_ctx ctx = { false, false }; struct interval_tree_node *it; + unsigned long end; int ret = 0; /* notification is exclusive, but interval is inclusive */ - end -= 1; + end = range->end - 1; /* TODO we should be able to split locking for interval tree and * the tear down. */ - if (blockable) + if (range->blockable) mutex_lock(&rmn->lock); else if (!mutex_trylock(&rmn->lock)) return -EAGAIN; - it = interval_tree_iter_first(&rmn->objects, start, end); + it = interval_tree_iter_first(&rmn->objects, range->start, end); while (it) { struct radeon_mn_node *node; struct radeon_bo *bo; long r; - if (!blockable) { + if (!range->blockable) { ret = -EAGAIN; goto out_unlock; } node = container_of(it, struct radeon_mn_node, it); - it = interval_tree_iter_next(it, start, end); + it = interval_tree_iter_next(it, range->start, end); list_for_each_entry(bo, &node->bos, mn_list) { diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c index 676c1fd1119d..25db6ff68c70 100644 --- a/drivers/infiniband/core/umem_odp.c +++ b/drivers/infiniband/core/umem_odp.c @@ -146,15 +146,12 @@ static int invalidate_range_start_trampoline(struct ib_umem_odp *item, } static int ib_umem_notifier_invalidate_range_start(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long start, - unsigned long end, - bool blockable) + const struct mmu_notifier_range *range) { struct ib_ucontext_per_mm *per_mm = container_of(mn, struct ib_ucontext_per_mm, mn); - if (blockable) + if (range->blockable) down_read(&per_mm->umem_rwsem); else if (!down_read_trylock(&per_mm->umem_rwsem)) return -EAGAIN; @@ -169,9 +166,10 @@ static int ib_umem_notifier_invalidate_range_start(struct mmu_notifier *mn, return 0; } - return rbt_ib_umem_for_each_in_range(&per_mm->umem_tree, start, end, + return rbt_ib_umem_for_each_in_range(&per_mm->umem_tree, range->start, + range->end, invalidate_range_start_trampoline, - blockable, NULL); + range->blockable, NULL); } static int invalidate_range_end_trampoline(struct ib_umem_odp *item, u64 start, @@ -182,9 +180,7 @@ static int invalidate_range_end_trampoline(struct ib_umem_odp *item, u64 start, } static void ib_umem_notifier_invalidate_range_end(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long start, - unsigned long end) + const struct mmu_notifier_range *range) { struct ib_ucontext_per_mm *per_mm = container_of(mn, struct ib_ucontext_per_mm, mn); @@ -192,8 +188,8 @@ static void ib_umem_notifier_invalidate_range_end(struct mmu_notifier *mn, if (unlikely(!per_mm->active)) return; - rbt_ib_umem_for_each_in_range(&per_mm->umem_tree, start, - end, + rbt_ib_umem_for_each_in_range(&per_mm->umem_tree, range->start, + range->end, invalidate_range_end_trampoline, true, NULL); up_read(&per_mm->umem_rwsem); } diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c index 475b769e120c..14d2a90964c3 100644 --- a/drivers/infiniband/hw/hfi1/mmu_rb.c +++ b/drivers/infiniband/hw/hfi1/mmu_rb.c @@ -68,8 +68,7 @@ struct mmu_rb_handler { static unsigned long mmu_node_start(struct mmu_rb_node *); static unsigned long mmu_node_last(struct mmu_rb_node *); static int mmu_notifier_range_start(struct mmu_notifier *, - struct mm_struct *, - unsigned long, unsigned long, bool); + const struct mmu_notifier_range *); static struct mmu_rb_node *__mmu_rb_search(struct mmu_rb_handler *, unsigned long, unsigned long); static void do_remove(struct mmu_rb_handler *handler, @@ -284,10 +283,7 @@ void hfi1_mmu_rb_remove(struct mmu_rb_handler *handler, } static int mmu_notifier_range_start(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long start, - unsigned long end, - bool blockable) + const struct mmu_notifier_range *range) { struct mmu_rb_handler *handler = container_of(mn, struct mmu_rb_handler, mn); @@ -297,10 +293,11 @@ static int mmu_notifier_range_start(struct mmu_notifier *mn, bool added = false; spin_lock_irqsave(&handler->lock, flags); - for (node = __mmu_int_rb_iter_first(root, start, end - 1); + for (node = __mmu_int_rb_iter_first(root, range->start, range->end-1); node; node = ptr) { /* Guard against node removal. */ - ptr = __mmu_int_rb_iter_next(node, start, end - 1); + ptr = __mmu_int_rb_iter_next(node, range->start, + range->end - 1); trace_hfi1_mmu_mem_invalidate(node->addr, node->len); if (handler->ops->invalidate(handler->ops_arg, node)) { __mmu_int_rb_remove(node, root); diff --git a/drivers/misc/mic/scif/scif_dma.c b/drivers/misc/mic/scif/scif_dma.c index 18b8ed57c4ac..e0d97044d0e9 100644 --- a/drivers/misc/mic/scif/scif_dma.c +++ b/drivers/misc/mic/scif/scif_dma.c @@ -201,23 +201,18 @@ static void scif_mmu_notifier_release(struct mmu_notifier *mn, } static int scif_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long start, - unsigned long end, - bool blockable) + const struct mmu_notifier_range *range) { struct scif_mmu_notif *mmn; mmn = container_of(mn, struct scif_mmu_notif, ep_mmu_notifier); - scif_rma_destroy_tcw(mmn, start, end - start); + scif_rma_destroy_tcw(mmn, range->start, range->end - range->start); return 0; } static void scif_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long start, - unsigned long end) + const struct mmu_notifier_range *range) { /* * Nothing to do here, everything needed was done in diff --git a/drivers/misc/sgi-gru/grutlbpurge.c b/drivers/misc/sgi-gru/grutlbpurge.c index 03b49d52092e..ca2032afe035 100644 --- a/drivers/misc/sgi-gru/grutlbpurge.c +++ b/drivers/misc/sgi-gru/grutlbpurge.c @@ -220,9 +220,7 @@ void gru_flush_all_tlb(struct gru_state *gru) * MMUOPS notifier callout functions */ static int gru_invalidate_range_start(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long start, unsigned long end, - bool blockable) + const struct mmu_notifier_range *range) { struct gru_mm_struct *gms = container_of(mn, struct gru_mm_struct, ms_notifier); @@ -230,15 +228,14 @@ static int gru_invalidate_range_start(struct mmu_notifier *mn, STAT(mmu_invalidate_range); atomic_inc(&gms->ms_range_active); gru_dbg(grudev, "gms %p, start 0x%lx, end 0x%lx, act %d\n", gms, - start, end, atomic_read(&gms->ms_range_active)); - gru_flush_tlb_range(gms, start, end - start); + range->start, range->end, atomic_read(&gms->ms_range_active)); + gru_flush_tlb_range(gms, range->start, range->end - range->start); return 0; } static void gru_invalidate_range_end(struct mmu_notifier *mn, - struct mm_struct *mm, unsigned long start, - unsigned long end) + const struct mmu_notifier_range *range) { struct gru_mm_struct *gms = container_of(mn, struct gru_mm_struct, ms_notifier); @@ -247,7 +244,8 @@ static void gru_invalidate_range_end(struct mmu_notifier *mn, (void)atomic_dec_and_test(&gms->ms_range_active); wake_up_all(&gms->ms_wait_queue); - gru_dbg(grudev, "gms %p, start 0x%lx, end 0x%lx\n", gms, start, end); + gru_dbg(grudev, "gms %p, start 0x%lx, end 0x%lx\n", + gms, range->start, range->end); } static void gru_release(struct mmu_notifier *mn, struct mm_struct *mm) diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c index b0b02a501167..5efc5eee9544 100644 --- a/drivers/xen/gntdev.c +++ b/drivers/xen/gntdev.c @@ -520,26 +520,26 @@ static int unmap_if_in_range(struct gntdev_grant_map *map, } static int mn_invl_range_start(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long start, unsigned long end, - bool blockable) + const struct mmu_notifier_range *range) { struct gntdev_priv *priv = container_of(mn, struct gntdev_priv, mn); struct gntdev_grant_map *map; int ret = 0; - if (blockable) + if (range->blockable) mutex_lock(&priv->lock); else if (!mutex_trylock(&priv->lock)) return -EAGAIN; list_for_each_entry(map, &priv->maps, next) { - ret = unmap_if_in_range(map, start, end, blockable); + ret = unmap_if_in_range(map, range->start, range->end, + range->blockable); if (ret) goto out_unlock; } list_for_each_entry(map, &priv->freeable_maps, next) { - ret = unmap_if_in_range(map, start, end, blockable); + ret = unmap_if_in_range(map, range->start, range->end, + range->blockable); if (ret) goto out_unlock; } diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h index 9893a6432adf..368f0c1a049d 100644 --- a/include/linux/mmu_notifier.h +++ b/include/linux/mmu_notifier.h @@ -25,6 +25,13 @@ struct mmu_notifier_mm { spinlock_t lock; }; +struct mmu_notifier_range { + struct mm_struct *mm; + unsigned long start; + unsigned long end; + bool blockable; +}; + struct mmu_notifier_ops { /* * Called either by mmu_notifier_unregister or when the mm is @@ -146,12 +153,9 @@ struct mmu_notifier_ops { * */ int (*invalidate_range_start)(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long start, unsigned long end, - bool blockable); + const struct mmu_notifier_range *range); void (*invalidate_range_end)(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long start, unsigned long end); + const struct mmu_notifier_range *range); /* * invalidate_range() is either called between diff --git a/mm/hmm.c b/mm/hmm.c index 90c34f3d1243..1965f2caf5eb 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -189,35 +189,30 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm) } static int hmm_invalidate_range_start(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long start, - unsigned long end, - bool blockable) + const struct mmu_notifier_range *range) { struct hmm_update update; - struct hmm *hmm = mm->hmm; + struct hmm *hmm = range->mm->hmm; VM_BUG_ON(!hmm); - update.start = start; - update.end = end; + update.start = range->start; + update.end = range->end; update.event = HMM_UPDATE_INVALIDATE; - update.blockable = blockable; + update.blockable = range->blockable; return hmm_invalidate_range(hmm, true, &update); } static void hmm_invalidate_range_end(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long start, - unsigned long end) + const struct mmu_notifier_range *range) { struct hmm_update update; - struct hmm *hmm = mm->hmm; + struct hmm *hmm = range->mm->hmm; VM_BUG_ON(!hmm); - update.start = start; - update.end = end; + update.start = range->start; + update.end = range->end; update.event = HMM_UPDATE_INVALIDATE; update.blockable = true; hmm_invalidate_range(hmm, false, &update); diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c index 5119ff846769..5f6665ae3ee2 100644 --- a/mm/mmu_notifier.c +++ b/mm/mmu_notifier.c @@ -178,14 +178,20 @@ int __mmu_notifier_invalidate_range_start(struct mm_struct *mm, unsigned long start, unsigned long end, bool blockable) { + struct mmu_notifier_range _range, *range = &_range; struct mmu_notifier *mn; int ret = 0; int id; + range->blockable = blockable; + range->start = start; + range->end = end; + range->mm = mm; + id = srcu_read_lock(&srcu); hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) { if (mn->ops->invalidate_range_start) { - int _ret = mn->ops->invalidate_range_start(mn, mm, start, end, blockable); + int _ret = mn->ops->invalidate_range_start(mn, range); if (_ret) { pr_info("%pS callback failed with %d in %sblockable context.\n", mn->ops->invalidate_range_start, _ret, @@ -205,9 +211,20 @@ void __mmu_notifier_invalidate_range_end(struct mm_struct *mm, unsigned long end, bool only_end) { + struct mmu_notifier_range _range, *range = &_range; struct mmu_notifier *mn; int id; + /* + * The end call back will never be call if the start refused to go + * through because of blockable was false so here assume that we + * can block. + */ + range->blockable = true; + range->start = start; + range->end = end; + range->mm = mm; + id = srcu_read_lock(&srcu); hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) { /* @@ -226,7 +243,7 @@ void __mmu_notifier_invalidate_range_end(struct mm_struct *mm, if (!only_end && mn->ops->invalidate_range) mn->ops->invalidate_range(mn, mm, start, end); if (mn->ops->invalidate_range_end) - mn->ops->invalidate_range_end(mn, mm, start, end); + mn->ops->invalidate_range_end(mn, range); } srcu_read_unlock(&srcu, id); } diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 2679e476b6c3..f829f63f2b16 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -360,10 +360,7 @@ static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn, } static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long start, - unsigned long end, - bool blockable) + const struct mmu_notifier_range *range) { struct kvm *kvm = mmu_notifier_to_kvm(mn); int need_tlb_flush = 0, idx; @@ -377,7 +374,7 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, * count is also read inside the mmu_lock critical section. */ kvm->mmu_notifier_count++; - need_tlb_flush = kvm_unmap_hva_range(kvm, start, end); + need_tlb_flush = kvm_unmap_hva_range(kvm, range->start, range->end); need_tlb_flush |= kvm->tlbs_dirty; /* we've to flush the tlb before the pages can be freed */ if (need_tlb_flush) @@ -385,7 +382,8 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, spin_unlock(&kvm->mmu_lock); - ret = kvm_arch_mmu_notifier_invalidate_range(kvm, start, end, blockable); + ret = kvm_arch_mmu_notifier_invalidate_range(kvm, range->start, + range->end, range->blockable); srcu_read_unlock(&kvm->srcu, idx); @@ -393,9 +391,7 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, } static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long start, - unsigned long end) + const struct mmu_notifier_range *range) { struct kvm *kvm = mmu_notifier_to_kvm(mn); From patchwork Thu Dec 13 17:13:29 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jerome Glisse X-Patchwork-Id: 10729221 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 075DF14E2 for ; Thu, 13 Dec 2018 17:14:08 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DE386285B9 for ; Thu, 13 Dec 2018 17:14:07 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D0FEF2C08C; Thu, 13 Dec 2018 17:14:07 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3F826285B9 for ; Thu, 13 Dec 2018 17:14:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729631AbeLMRNu (ORCPT ); Thu, 13 Dec 2018 12:13:50 -0500 Received: from mx1.redhat.com ([209.132.183.28]:44320 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729612AbeLMRNt (ORCPT ); Thu, 13 Dec 2018 12:13:49 -0500 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id D701BC062D16; Thu, 13 Dec 2018 17:13:47 +0000 (UTC) Received: from localhost.localdomain.com (ovpn-121-185.rdu2.redhat.com [10.10.121.185]) by smtp.corp.redhat.com (Postfix) with ESMTP id A0E4460139; Thu, 13 Dec 2018 17:13:45 +0000 (UTC) From: jglisse@redhat.com To: linux-kernel@vger.kernel.org Cc: =?utf-8?b?SsOpcsO0bWUgR2xpc3Nl?= , Andrew Morton , Matthew Wilcox , Ross Zwisler , Dan Williams , Paolo Bonzini , =?utf-8?b?UmFkaW0gS3LEjW3DocWZ?= , Michal Hocko , Christian Koenig , Ralph Campbell , John Hubbard , kvm@vger.kernel.org, dri-devel@lists.freedesktop.org, linux-rdma@vger.kernel.org, linux-fsdevel@vger.kernel.org, Arnd Bergmann Subject: [PATCH v3 2/3] mm/mmu_notifier: use structure for invalidate_range_start/end calls v3 Date: Thu, 13 Dec 2018 12:13:29 -0500 Message-Id: <20181213171330.8489-3-jglisse@redhat.com> In-Reply-To: <20181213171330.8489-1-jglisse@redhat.com> References: <20181213171330.8489-1-jglisse@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.32]); Thu, 13 Dec 2018 17:13:48 +0000 (UTC) Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Jérôme Glisse To avoid having to change many call sites everytime we want to add a parameter use a structure to group all parameters for the mmu_notifier invalidate_range_start/end cakks. No functional changes with this patch. Changes since v2: - fix build warning in migrate.c when CONFIG_MMU_NOTIFIER=n Changes since v1: - introduce mmu_notifier_range_init() as an helper to initialize the range structure allowing to optimize out the case when mmu notifier is not enabled - fix mm/migrate.c migrate_vma_collect() Signed-off-by: Jérôme Glisse Acked-by: Christian König Acked-by: Jan Kara Acked-by: Felix Kuehling Acked-by: Jason Gunthorpe Cc: Andrew Morton Cc: Matthew Wilcox Cc: Ross Zwisler Cc: Dan Williams Cc: Paolo Bonzini Cc: Radim Krčmář Cc: Michal Hocko Cc: Christian Koenig Cc: Ralph Campbell Cc: John Hubbard Cc: kvm@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linux-rdma@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Cc: Arnd Bergmann --- fs/dax.c | 8 +-- fs/proc/task_mmu.c | 7 ++- include/linux/mm.h | 4 +- include/linux/mmu_notifier.h | 87 +++++++++++++++++++++----------- kernel/events/uprobes.c | 10 ++-- mm/huge_memory.c | 54 ++++++++++---------- mm/hugetlb.c | 52 ++++++++++--------- mm/khugepaged.c | 10 ++-- mm/ksm.c | 21 ++++---- mm/madvise.c | 21 ++++---- mm/memory.c | 97 ++++++++++++++++++------------------ mm/migrate.c | 28 +++++------ mm/mmu_notifier.c | 35 +++---------- mm/mprotect.c | 15 +++--- mm/mremap.c | 10 ++-- mm/oom_kill.c | 17 ++++--- mm/rmap.c | 30 ++++++----- 17 files changed, 259 insertions(+), 247 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 9bcce89ea18e..874085bacaf5 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -758,7 +758,8 @@ static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index, i_mmap_lock_read(mapping); vma_interval_tree_foreach(vma, &mapping->i_mmap, index, index) { - unsigned long address, start, end; + struct mmu_notifier_range range; + unsigned long address; cond_resched(); @@ -772,7 +773,8 @@ static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index, * call mmu_notifier_invalidate_range_start() on our behalf * before taking any lock. */ - if (follow_pte_pmd(vma->vm_mm, address, &start, &end, &ptep, &pmdp, &ptl)) + if (follow_pte_pmd(vma->vm_mm, address, &range, + &ptep, &pmdp, &ptl)) continue; /* @@ -814,7 +816,7 @@ static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index, pte_unmap_unlock(ptep, ptl); } - mmu_notifier_invalidate_range_end(vma->vm_mm, start, end); + mmu_notifier_invalidate_range_end(&range); } i_mmap_unlock_read(mapping); } diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 47c3764c469b..b3ddceb003bc 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -1096,6 +1096,7 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf, return -ESRCH; mm = get_task_mm(task); if (mm) { + struct mmu_notifier_range range; struct clear_refs_private cp = { .type = type, }; @@ -1139,11 +1140,13 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf, downgrade_write(&mm->mmap_sem); break; } - mmu_notifier_invalidate_range_start(mm, 0, -1); + + mmu_notifier_range_init(&range, mm, 0, -1UL); + mmu_notifier_invalidate_range_start(&range); } walk_page_range(0, mm->highest_vm_end, &clear_refs_walk); if (type == CLEAR_REFS_SOFT_DIRTY) - mmu_notifier_invalidate_range_end(mm, 0, -1); + mmu_notifier_invalidate_range_end(&range); tlb_finish_mmu(&tlb, 0, -1); up_read(&mm->mmap_sem); out_mm: diff --git a/include/linux/mm.h b/include/linux/mm.h index 5411de93a363..e7b6f2b30713 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1397,6 +1397,8 @@ struct mm_walk { void *private; }; +struct mmu_notifier_range; + int walk_page_range(unsigned long addr, unsigned long end, struct mm_walk *walk); int walk_page_vma(struct vm_area_struct *vma, struct mm_walk *walk); @@ -1405,7 +1407,7 @@ void free_pgd_range(struct mmu_gather *tlb, unsigned long addr, int copy_page_range(struct mm_struct *dst, struct mm_struct *src, struct vm_area_struct *vma); int follow_pte_pmd(struct mm_struct *mm, unsigned long address, - unsigned long *start, unsigned long *end, + struct mmu_notifier_range *range, pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp); int follow_pfn(struct vm_area_struct *vma, unsigned long address, unsigned long *pfn); diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h index 368f0c1a049d..39b06772427f 100644 --- a/include/linux/mmu_notifier.h +++ b/include/linux/mmu_notifier.h @@ -220,11 +220,8 @@ extern int __mmu_notifier_test_young(struct mm_struct *mm, unsigned long address); extern void __mmu_notifier_change_pte(struct mm_struct *mm, unsigned long address, pte_t pte); -extern int __mmu_notifier_invalidate_range_start(struct mm_struct *mm, - unsigned long start, unsigned long end, - bool blockable); -extern void __mmu_notifier_invalidate_range_end(struct mm_struct *mm, - unsigned long start, unsigned long end, +extern int __mmu_notifier_invalidate_range_start(struct mmu_notifier_range *); +extern void __mmu_notifier_invalidate_range_end(struct mmu_notifier_range *r, bool only_end); extern void __mmu_notifier_invalidate_range(struct mm_struct *mm, unsigned long start, unsigned long end); @@ -268,33 +265,37 @@ static inline void mmu_notifier_change_pte(struct mm_struct *mm, __mmu_notifier_change_pte(mm, address, pte); } -static inline void mmu_notifier_invalidate_range_start(struct mm_struct *mm, - unsigned long start, unsigned long end) +static inline void +mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range) { - if (mm_has_notifiers(mm)) - __mmu_notifier_invalidate_range_start(mm, start, end, true); + if (mm_has_notifiers(range->mm)) { + range->blockable = true; + __mmu_notifier_invalidate_range_start(range); + } } -static inline int mmu_notifier_invalidate_range_start_nonblock(struct mm_struct *mm, - unsigned long start, unsigned long end) +static inline int +mmu_notifier_invalidate_range_start_nonblock(struct mmu_notifier_range *range) { - if (mm_has_notifiers(mm)) - return __mmu_notifier_invalidate_range_start(mm, start, end, false); + if (mm_has_notifiers(range->mm)) { + range->blockable = false; + return __mmu_notifier_invalidate_range_start(range); + } return 0; } -static inline void mmu_notifier_invalidate_range_end(struct mm_struct *mm, - unsigned long start, unsigned long end) +static inline void +mmu_notifier_invalidate_range_end(struct mmu_notifier_range *range) { - if (mm_has_notifiers(mm)) - __mmu_notifier_invalidate_range_end(mm, start, end, false); + if (mm_has_notifiers(range->mm)) + __mmu_notifier_invalidate_range_end(range, false); } -static inline void mmu_notifier_invalidate_range_only_end(struct mm_struct *mm, - unsigned long start, unsigned long end) +static inline void +mmu_notifier_invalidate_range_only_end(struct mmu_notifier_range *range) { - if (mm_has_notifiers(mm)) - __mmu_notifier_invalidate_range_end(mm, start, end, true); + if (mm_has_notifiers(range->mm)) + __mmu_notifier_invalidate_range_end(range, true); } static inline void mmu_notifier_invalidate_range(struct mm_struct *mm, @@ -315,6 +316,17 @@ static inline void mmu_notifier_mm_destroy(struct mm_struct *mm) __mmu_notifier_mm_destroy(mm); } + +static inline void mmu_notifier_range_init(struct mmu_notifier_range *range, + struct mm_struct *mm, + unsigned long start, + unsigned long end) +{ + range->mm = mm; + range->start = start; + range->end = end; +} + #define ptep_clear_flush_young_notify(__vma, __address, __ptep) \ ({ \ int __young; \ @@ -428,6 +440,23 @@ extern void mmu_notifier_synchronize(void); #else /* CONFIG_MMU_NOTIFIER */ +struct mmu_notifier_range { + unsigned long start; + unsigned long end; +}; + +static inline void _mmu_notifier_range_init(struct mmu_notifier_range *range, + unsigned long start, + unsigned long end) +{ + range->start = start; + range->end = end; +} + +#define mmu_notifier_range_init(range, mm, start, end) \ + _mmu_notifier_range_init(range, start, end) + + static inline int mm_has_notifiers(struct mm_struct *mm) { return 0; @@ -455,24 +484,24 @@ static inline void mmu_notifier_change_pte(struct mm_struct *mm, { } -static inline void mmu_notifier_invalidate_range_start(struct mm_struct *mm, - unsigned long start, unsigned long end) +static inline void +mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range) { } -static inline int mmu_notifier_invalidate_range_start_nonblock(struct mm_struct *mm, - unsigned long start, unsigned long end) +static inline int +mmu_notifier_invalidate_range_start_nonblock(struct mmu_notifier_range *range) { return 0; } -static inline void mmu_notifier_invalidate_range_end(struct mm_struct *mm, - unsigned long start, unsigned long end) +static inline +void mmu_notifier_invalidate_range_end(struct mmu_notifier_range *range) { } -static inline void mmu_notifier_invalidate_range_only_end(struct mm_struct *mm, - unsigned long start, unsigned long end) +static inline void +mmu_notifier_invalidate_range_only_end(struct mmu_notifier_range *range) { } diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c index 322e97bbb437..1fc8a93709c3 100644 --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@ -171,11 +171,11 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr, .address = addr, }; int err; - /* For mmu_notifiers */ - const unsigned long mmun_start = addr; - const unsigned long mmun_end = addr + PAGE_SIZE; + struct mmu_notifier_range range; struct mem_cgroup *memcg; + mmu_notifier_range_init(&range, mm, addr, addr + PAGE_SIZE); + VM_BUG_ON_PAGE(PageTransHuge(old_page), old_page); err = mem_cgroup_try_charge(new_page, vma->vm_mm, GFP_KERNEL, &memcg, @@ -186,7 +186,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr, /* For try_to_free_swap() and munlock_vma_page() below */ lock_page(old_page); - mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end); + mmu_notifier_invalidate_range_start(&range); err = -EAGAIN; if (!page_vma_mapped_walk(&pvmw)) { mem_cgroup_cancel_charge(new_page, memcg, false); @@ -220,7 +220,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr, err = 0; unlock: - mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end); + mmu_notifier_invalidate_range_end(&range); unlock_page(old_page); return err; } diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 622cced74fd9..c1d3ce809416 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1144,8 +1144,7 @@ static vm_fault_t do_huge_pmd_wp_page_fallback(struct vm_fault *vmf, int i; vm_fault_t ret = 0; struct page **pages; - unsigned long mmun_start; /* For mmu_notifiers */ - unsigned long mmun_end; /* For mmu_notifiers */ + struct mmu_notifier_range range; pages = kmalloc_array(HPAGE_PMD_NR, sizeof(struct page *), GFP_KERNEL); @@ -1183,9 +1182,9 @@ static vm_fault_t do_huge_pmd_wp_page_fallback(struct vm_fault *vmf, cond_resched(); } - mmun_start = haddr; - mmun_end = haddr + HPAGE_PMD_SIZE; - mmu_notifier_invalidate_range_start(vma->vm_mm, mmun_start, mmun_end); + mmu_notifier_range_init(&range, vma->vm_mm, haddr, + haddr + HPAGE_PMD_SIZE); + mmu_notifier_invalidate_range_start(&range); vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd); if (unlikely(!pmd_same(*vmf->pmd, orig_pmd))) @@ -1230,8 +1229,7 @@ static vm_fault_t do_huge_pmd_wp_page_fallback(struct vm_fault *vmf, * No need to double call mmu_notifier->invalidate_range() callback as * the above pmdp_huge_clear_flush_notify() did already call it. */ - mmu_notifier_invalidate_range_only_end(vma->vm_mm, mmun_start, - mmun_end); + mmu_notifier_invalidate_range_only_end(&range); ret |= VM_FAULT_WRITE; put_page(page); @@ -1241,7 +1239,7 @@ static vm_fault_t do_huge_pmd_wp_page_fallback(struct vm_fault *vmf, out_free_pages: spin_unlock(vmf->ptl); - mmu_notifier_invalidate_range_end(vma->vm_mm, mmun_start, mmun_end); + mmu_notifier_invalidate_range_end(&range); for (i = 0; i < HPAGE_PMD_NR; i++) { memcg = (void *)page_private(pages[i]); set_page_private(pages[i], 0); @@ -1258,8 +1256,7 @@ vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd) struct page *page = NULL, *new_page; struct mem_cgroup *memcg; unsigned long haddr = vmf->address & HPAGE_PMD_MASK; - unsigned long mmun_start; /* For mmu_notifiers */ - unsigned long mmun_end; /* For mmu_notifiers */ + struct mmu_notifier_range range; gfp_t huge_gfp; /* for allocation and charge */ vm_fault_t ret = 0; @@ -1349,9 +1346,9 @@ vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd) vma, HPAGE_PMD_NR); __SetPageUptodate(new_page); - mmun_start = haddr; - mmun_end = haddr + HPAGE_PMD_SIZE; - mmu_notifier_invalidate_range_start(vma->vm_mm, mmun_start, mmun_end); + mmu_notifier_range_init(&range, vma->vm_mm, haddr, + haddr + HPAGE_PMD_SIZE); + mmu_notifier_invalidate_range_start(&range); spin_lock(vmf->ptl); if (page) @@ -1386,8 +1383,7 @@ vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd) * No need to double call mmu_notifier->invalidate_range() callback as * the above pmdp_huge_clear_flush_notify() did already call it. */ - mmu_notifier_invalidate_range_only_end(vma->vm_mm, mmun_start, - mmun_end); + mmu_notifier_invalidate_range_only_end(&range); out: return ret; out_unlock: @@ -2028,14 +2024,15 @@ void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud, unsigned long address) { spinlock_t *ptl; - struct mm_struct *mm = vma->vm_mm; - unsigned long haddr = address & HPAGE_PUD_MASK; + struct mmu_notifier_range range; - mmu_notifier_invalidate_range_start(mm, haddr, haddr + HPAGE_PUD_SIZE); - ptl = pud_lock(mm, pud); + mmu_notifier_range_init(&range, vma->vm_mm, address & HPAGE_PUD_MASK, + (address & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE); + mmu_notifier_invalidate_range_start(&range); + ptl = pud_lock(vma->vm_mm, pud); if (unlikely(!pud_trans_huge(*pud) && !pud_devmap(*pud))) goto out; - __split_huge_pud_locked(vma, pud, haddr); + __split_huge_pud_locked(vma, pud, range.start); out: spin_unlock(ptl); @@ -2043,8 +2040,7 @@ void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud, * No need to double call mmu_notifier->invalidate_range() callback as * the above pudp_huge_clear_flush_notify() did already call it. */ - mmu_notifier_invalidate_range_only_end(mm, haddr, haddr + - HPAGE_PUD_SIZE); + mmu_notifier_invalidate_range_only_end(&range); } #endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */ @@ -2244,11 +2240,12 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, unsigned long address, bool freeze, struct page *page) { spinlock_t *ptl; - struct mm_struct *mm = vma->vm_mm; - unsigned long haddr = address & HPAGE_PMD_MASK; + struct mmu_notifier_range range; - mmu_notifier_invalidate_range_start(mm, haddr, haddr + HPAGE_PMD_SIZE); - ptl = pmd_lock(mm, pmd); + mmu_notifier_range_init(&range, vma->vm_mm, address & HPAGE_PMD_MASK, + (address & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE); + mmu_notifier_invalidate_range_start(&range); + ptl = pmd_lock(vma->vm_mm, pmd); /* * If caller asks to setup a migration entries, we need a page to check @@ -2264,7 +2261,7 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, clear_page_mlock(page); } else if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd))) goto out; - __split_huge_pmd_locked(vma, pmd, haddr, freeze); + __split_huge_pmd_locked(vma, pmd, range.start, freeze); out: spin_unlock(ptl); /* @@ -2280,8 +2277,7 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, * any further changes to individual pte will notify. So no need * to call mmu_notifier->invalidate_range() */ - mmu_notifier_invalidate_range_only_end(mm, haddr, haddr + - HPAGE_PMD_SIZE); + mmu_notifier_invalidate_range_only_end(&range); } void split_huge_pmd_address(struct vm_area_struct *vma, unsigned long address, diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 705a3e9cc910..e7c179cbcd75 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3239,16 +3239,16 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, int cow; struct hstate *h = hstate_vma(vma); unsigned long sz = huge_page_size(h); - unsigned long mmun_start; /* For mmu_notifiers */ - unsigned long mmun_end; /* For mmu_notifiers */ + struct mmu_notifier_range range; int ret = 0; cow = (vma->vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE; - mmun_start = vma->vm_start; - mmun_end = vma->vm_end; - if (cow) - mmu_notifier_invalidate_range_start(src, mmun_start, mmun_end); + if (cow) { + mmu_notifier_range_init(&range, src, vma->vm_start, + vma->vm_end); + mmu_notifier_invalidate_range_start(&range); + } for (addr = vma->vm_start; addr < vma->vm_end; addr += sz) { spinlock_t *src_ptl, *dst_ptl; @@ -3324,7 +3324,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, } if (cow) - mmu_notifier_invalidate_range_end(src, mmun_start, mmun_end); + mmu_notifier_invalidate_range_end(&range); return ret; } @@ -3341,8 +3341,7 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, struct page *page; struct hstate *h = hstate_vma(vma); unsigned long sz = huge_page_size(h); - unsigned long mmun_start = start; /* For mmu_notifiers */ - unsigned long mmun_end = end; /* For mmu_notifiers */ + struct mmu_notifier_range range; WARN_ON(!is_vm_hugetlb_page(vma)); BUG_ON(start & ~huge_page_mask(h)); @@ -3358,8 +3357,9 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, /* * If sharing possible, alert mmu notifiers of worst case. */ - adjust_range_if_pmd_sharing_possible(vma, &mmun_start, &mmun_end); - mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end); + mmu_notifier_range_init(&range, mm, start, end); + adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end); + mmu_notifier_invalidate_range_start(&range); address = start; for (; address < end; address += sz) { ptep = huge_pte_offset(mm, address, sz); @@ -3427,7 +3427,7 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, if (ref_page) break; } - mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end); + mmu_notifier_invalidate_range_end(&range); tlb_end_vma(tlb, vma); } @@ -3545,9 +3545,8 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma, struct page *old_page, *new_page; int outside_reserve = 0; vm_fault_t ret = 0; - unsigned long mmun_start; /* For mmu_notifiers */ - unsigned long mmun_end; /* For mmu_notifiers */ unsigned long haddr = address & huge_page_mask(h); + struct mmu_notifier_range range; pte = huge_ptep_get(ptep); old_page = pte_page(pte); @@ -3626,9 +3625,8 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma, __SetPageUptodate(new_page); set_page_huge_active(new_page); - mmun_start = haddr; - mmun_end = mmun_start + huge_page_size(h); - mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end); + mmu_notifier_range_init(&range, mm, haddr, haddr + huge_page_size(h)); + mmu_notifier_invalidate_range_start(&range); /* * Retake the page table lock to check for racing updates @@ -3641,7 +3639,7 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma, /* Break COW */ huge_ptep_clear_flush(vma, haddr, ptep); - mmu_notifier_invalidate_range(mm, mmun_start, mmun_end); + mmu_notifier_invalidate_range(mm, range.start, range.end); set_huge_pte_at(mm, haddr, ptep, make_huge_pte(vma, new_page, 1)); page_remove_rmap(old_page, true); @@ -3650,7 +3648,7 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma, new_page = old_page; } spin_unlock(ptl); - mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end); + mmu_notifier_invalidate_range_end(&range); out_release_all: restore_reserve_on_error(h, vma, haddr, new_page); put_page(new_page); @@ -4339,21 +4337,21 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, pte_t pte; struct hstate *h = hstate_vma(vma); unsigned long pages = 0; - unsigned long f_start = start; - unsigned long f_end = end; bool shared_pmd = false; + struct mmu_notifier_range range; /* * In the case of shared PMDs, the area to flush could be beyond - * start/end. Set f_start/f_end to cover the maximum possible + * start/end. Set range.start/range.end to cover the maximum possible * range if PMD sharing is possible. */ - adjust_range_if_pmd_sharing_possible(vma, &f_start, &f_end); + mmu_notifier_range_init(&range, mm, start, end); + adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end); BUG_ON(address >= end); - flush_cache_range(vma, f_start, f_end); + flush_cache_range(vma, range.start, range.end); - mmu_notifier_invalidate_range_start(mm, f_start, f_end); + mmu_notifier_invalidate_range_start(&range); i_mmap_lock_write(vma->vm_file->f_mapping); for (; address < end; address += huge_page_size(h)) { spinlock_t *ptl; @@ -4404,7 +4402,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, * did unshare a page of pmds, flush the range corresponding to the pud. */ if (shared_pmd) - flush_hugetlb_tlb_range(vma, f_start, f_end); + flush_hugetlb_tlb_range(vma, range.start, range.end); else flush_hugetlb_tlb_range(vma, start, end); /* @@ -4414,7 +4412,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, * See Documentation/vm/mmu_notifier.rst */ i_mmap_unlock_write(vma->vm_file->f_mapping); - mmu_notifier_invalidate_range_end(mm, f_start, f_end); + mmu_notifier_invalidate_range_end(&range); return pages << h->order; } diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 8e2ff195ecb3..7736f6c37f19 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -944,8 +944,7 @@ static void collapse_huge_page(struct mm_struct *mm, int isolated = 0, result = 0; struct mem_cgroup *memcg; struct vm_area_struct *vma; - unsigned long mmun_start; /* For mmu_notifiers */ - unsigned long mmun_end; /* For mmu_notifiers */ + struct mmu_notifier_range range; gfp_t gfp; VM_BUG_ON(address & ~HPAGE_PMD_MASK); @@ -1017,9 +1016,8 @@ static void collapse_huge_page(struct mm_struct *mm, pte = pte_offset_map(pmd, address); pte_ptl = pte_lockptr(mm, pmd); - mmun_start = address; - mmun_end = address + HPAGE_PMD_SIZE; - mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end); + mmu_notifier_range_init(&range, mm, address, address + HPAGE_PMD_SIZE); + mmu_notifier_invalidate_range_start(&range); pmd_ptl = pmd_lock(mm, pmd); /* probably unnecessary */ /* * After this gup_fast can't run anymore. This also removes @@ -1029,7 +1027,7 @@ static void collapse_huge_page(struct mm_struct *mm, */ _pmd = pmdp_collapse_flush(vma, address, pmd); spin_unlock(pmd_ptl); - mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end); + mmu_notifier_invalidate_range_end(&range); spin_lock(pte_ptl); isolated = __collapse_huge_page_isolate(vma, address, pte); diff --git a/mm/ksm.c b/mm/ksm.c index 5b0894b45ee5..6239d2df7a8e 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -1042,8 +1042,7 @@ static int write_protect_page(struct vm_area_struct *vma, struct page *page, }; int swapped; int err = -EFAULT; - unsigned long mmun_start; /* For mmu_notifiers */ - unsigned long mmun_end; /* For mmu_notifiers */ + struct mmu_notifier_range range; pvmw.address = page_address_in_vma(page, vma); if (pvmw.address == -EFAULT) @@ -1051,9 +1050,9 @@ static int write_protect_page(struct vm_area_struct *vma, struct page *page, BUG_ON(PageTransCompound(page)); - mmun_start = pvmw.address; - mmun_end = pvmw.address + PAGE_SIZE; - mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end); + mmu_notifier_range_init(&range, mm, pvmw.address, + pvmw.address + PAGE_SIZE); + mmu_notifier_invalidate_range_start(&range); if (!page_vma_mapped_walk(&pvmw)) goto out_mn; @@ -1105,7 +1104,7 @@ static int write_protect_page(struct vm_area_struct *vma, struct page *page, out_unlock: page_vma_mapped_walk_done(&pvmw); out_mn: - mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end); + mmu_notifier_invalidate_range_end(&range); out: return err; } @@ -1129,8 +1128,7 @@ static int replace_page(struct vm_area_struct *vma, struct page *page, spinlock_t *ptl; unsigned long addr; int err = -EFAULT; - unsigned long mmun_start; /* For mmu_notifiers */ - unsigned long mmun_end; /* For mmu_notifiers */ + struct mmu_notifier_range range; addr = page_address_in_vma(page, vma); if (addr == -EFAULT) @@ -1140,9 +1138,8 @@ static int replace_page(struct vm_area_struct *vma, struct page *page, if (!pmd) goto out; - mmun_start = addr; - mmun_end = addr + PAGE_SIZE; - mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end); + mmu_notifier_range_init(&range, mm, addr, addr + PAGE_SIZE); + mmu_notifier_invalidate_range_start(&range); ptep = pte_offset_map_lock(mm, pmd, addr, &ptl); if (!pte_same(*ptep, orig_pte)) { @@ -1188,7 +1185,7 @@ static int replace_page(struct vm_area_struct *vma, struct page *page, pte_unmap_unlock(ptep, ptl); err = 0; out_mn: - mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end); + mmu_notifier_invalidate_range_end(&range); out: return err; } diff --git a/mm/madvise.c b/mm/madvise.c index 6cb1ca93e290..21a7881a2db4 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -458,29 +458,30 @@ static void madvise_free_page_range(struct mmu_gather *tlb, static int madvise_free_single_vma(struct vm_area_struct *vma, unsigned long start_addr, unsigned long end_addr) { - unsigned long start, end; struct mm_struct *mm = vma->vm_mm; + struct mmu_notifier_range range; struct mmu_gather tlb; /* MADV_FREE works for only anon vma at the moment */ if (!vma_is_anonymous(vma)) return -EINVAL; - start = max(vma->vm_start, start_addr); - if (start >= vma->vm_end) + range.start = max(vma->vm_start, start_addr); + if (range.start >= vma->vm_end) return -EINVAL; - end = min(vma->vm_end, end_addr); - if (end <= vma->vm_start) + range.end = min(vma->vm_end, end_addr); + if (range.end <= vma->vm_start) return -EINVAL; + mmu_notifier_range_init(&range, mm, range.start, range.end); lru_add_drain(); - tlb_gather_mmu(&tlb, mm, start, end); + tlb_gather_mmu(&tlb, mm, range.start, range.end); update_hiwater_rss(mm); - mmu_notifier_invalidate_range_start(mm, start, end); - madvise_free_page_range(&tlb, vma, start, end); - mmu_notifier_invalidate_range_end(mm, start, end); - tlb_finish_mmu(&tlb, start, end); + mmu_notifier_invalidate_range_start(&range); + madvise_free_page_range(&tlb, vma, range.start, range.end); + mmu_notifier_invalidate_range_end(&range); + tlb_finish_mmu(&tlb, range.start, range.end); return 0; } diff --git a/mm/memory.c b/mm/memory.c index 4ad2d293ddc2..574307f11464 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -973,8 +973,7 @@ int copy_page_range(struct mm_struct *dst_mm, struct mm_struct *src_mm, unsigned long next; unsigned long addr = vma->vm_start; unsigned long end = vma->vm_end; - unsigned long mmun_start; /* For mmu_notifiers */ - unsigned long mmun_end; /* For mmu_notifiers */ + struct mmu_notifier_range range; bool is_cow; int ret; @@ -1008,11 +1007,11 @@ int copy_page_range(struct mm_struct *dst_mm, struct mm_struct *src_mm, * is_cow_mapping() returns true. */ is_cow = is_cow_mapping(vma->vm_flags); - mmun_start = addr; - mmun_end = end; - if (is_cow) - mmu_notifier_invalidate_range_start(src_mm, mmun_start, - mmun_end); + + if (is_cow) { + mmu_notifier_range_init(&range, src_mm, addr, end); + mmu_notifier_invalidate_range_start(&range); + } ret = 0; dst_pgd = pgd_offset(dst_mm, addr); @@ -1029,7 +1028,7 @@ int copy_page_range(struct mm_struct *dst_mm, struct mm_struct *src_mm, } while (dst_pgd++, src_pgd++, addr = next, addr != end); if (is_cow) - mmu_notifier_invalidate_range_end(src_mm, mmun_start, mmun_end); + mmu_notifier_invalidate_range_end(&range); return ret; } @@ -1332,12 +1331,13 @@ void unmap_vmas(struct mmu_gather *tlb, struct vm_area_struct *vma, unsigned long start_addr, unsigned long end_addr) { - struct mm_struct *mm = vma->vm_mm; + struct mmu_notifier_range range; - mmu_notifier_invalidate_range_start(mm, start_addr, end_addr); + mmu_notifier_range_init(&range, vma->vm_mm, start_addr, end_addr); + mmu_notifier_invalidate_range_start(&range); for ( ; vma && vma->vm_start < end_addr; vma = vma->vm_next) unmap_single_vma(tlb, vma, start_addr, end_addr, NULL); - mmu_notifier_invalidate_range_end(mm, start_addr, end_addr); + mmu_notifier_invalidate_range_end(&range); } /** @@ -1351,18 +1351,18 @@ void unmap_vmas(struct mmu_gather *tlb, void zap_page_range(struct vm_area_struct *vma, unsigned long start, unsigned long size) { - struct mm_struct *mm = vma->vm_mm; + struct mmu_notifier_range range; struct mmu_gather tlb; - unsigned long end = start + size; lru_add_drain(); - tlb_gather_mmu(&tlb, mm, start, end); - update_hiwater_rss(mm); - mmu_notifier_invalidate_range_start(mm, start, end); - for ( ; vma && vma->vm_start < end; vma = vma->vm_next) - unmap_single_vma(&tlb, vma, start, end, NULL); - mmu_notifier_invalidate_range_end(mm, start, end); - tlb_finish_mmu(&tlb, start, end); + mmu_notifier_range_init(&range, vma->vm_mm, start, start + size); + tlb_gather_mmu(&tlb, vma->vm_mm, start, range.end); + update_hiwater_rss(vma->vm_mm); + mmu_notifier_invalidate_range_start(&range); + for ( ; vma && vma->vm_start < range.end; vma = vma->vm_next) + unmap_single_vma(&tlb, vma, start, range.end, NULL); + mmu_notifier_invalidate_range_end(&range); + tlb_finish_mmu(&tlb, start, range.end); } /** @@ -1377,17 +1377,17 @@ void zap_page_range(struct vm_area_struct *vma, unsigned long start, static void zap_page_range_single(struct vm_area_struct *vma, unsigned long address, unsigned long size, struct zap_details *details) { - struct mm_struct *mm = vma->vm_mm; + struct mmu_notifier_range range; struct mmu_gather tlb; - unsigned long end = address + size; lru_add_drain(); - tlb_gather_mmu(&tlb, mm, address, end); - update_hiwater_rss(mm); - mmu_notifier_invalidate_range_start(mm, address, end); - unmap_single_vma(&tlb, vma, address, end, details); - mmu_notifier_invalidate_range_end(mm, address, end); - tlb_finish_mmu(&tlb, address, end); + mmu_notifier_range_init(&range, vma->vm_mm, address, address + size); + tlb_gather_mmu(&tlb, vma->vm_mm, address, range.end); + update_hiwater_rss(vma->vm_mm); + mmu_notifier_invalidate_range_start(&range); + unmap_single_vma(&tlb, vma, address, range.end, details); + mmu_notifier_invalidate_range_end(&range); + tlb_finish_mmu(&tlb, address, range.end); } /** @@ -2247,9 +2247,8 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) struct page *new_page = NULL; pte_t entry; int page_copied = 0; - const unsigned long mmun_start = vmf->address & PAGE_MASK; - const unsigned long mmun_end = mmun_start + PAGE_SIZE; struct mem_cgroup *memcg; + struct mmu_notifier_range range; if (unlikely(anon_vma_prepare(vma))) goto oom; @@ -2272,7 +2271,9 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) __SetPageUptodate(new_page); - mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end); + mmu_notifier_range_init(&range, mm, vmf->address & PAGE_MASK, + (vmf->address & PAGE_MASK) + PAGE_SIZE); + mmu_notifier_invalidate_range_start(&range); /* * Re-check the pte - we dropped the lock @@ -2349,7 +2350,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) * No need to double call mmu_notifier->invalidate_range() callback as * the above ptep_clear_flush_notify() did already call it. */ - mmu_notifier_invalidate_range_only_end(mm, mmun_start, mmun_end); + mmu_notifier_invalidate_range_only_end(&range); if (old_page) { /* * Don't let another task, with possibly unlocked vma, @@ -4030,7 +4031,7 @@ int __pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address) #endif /* __PAGETABLE_PMD_FOLDED */ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address, - unsigned long *start, unsigned long *end, + struct mmu_notifier_range *range, pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp) { pgd_t *pgd; @@ -4058,10 +4059,10 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address, if (!pmdpp) goto out; - if (start && end) { - *start = address & PMD_MASK; - *end = *start + PMD_SIZE; - mmu_notifier_invalidate_range_start(mm, *start, *end); + if (range) { + mmu_notifier_range_init(range, mm, address & PMD_MASK, + (address & PMD_MASK) + PMD_SIZE); + mmu_notifier_invalidate_range_start(range); } *ptlp = pmd_lock(mm, pmd); if (pmd_huge(*pmd)) { @@ -4069,17 +4070,17 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address, return 0; } spin_unlock(*ptlp); - if (start && end) - mmu_notifier_invalidate_range_end(mm, *start, *end); + if (range) + mmu_notifier_invalidate_range_end(range); } if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd))) goto out; - if (start && end) { - *start = address & PAGE_MASK; - *end = *start + PAGE_SIZE; - mmu_notifier_invalidate_range_start(mm, *start, *end); + if (range) { + range->start = address & PAGE_MASK; + range->end = range->start + PAGE_SIZE; + mmu_notifier_invalidate_range_start(range); } ptep = pte_offset_map_lock(mm, pmd, address, ptlp); if (!pte_present(*ptep)) @@ -4088,8 +4089,8 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address, return 0; unlock: pte_unmap_unlock(ptep, *ptlp); - if (start && end) - mmu_notifier_invalidate_range_end(mm, *start, *end); + if (range) + mmu_notifier_invalidate_range_end(range); out: return -EINVAL; } @@ -4101,20 +4102,20 @@ static inline int follow_pte(struct mm_struct *mm, unsigned long address, /* (void) is needed to make gcc happy */ (void) __cond_lock(*ptlp, - !(res = __follow_pte_pmd(mm, address, NULL, NULL, + !(res = __follow_pte_pmd(mm, address, NULL, ptepp, NULL, ptlp))); return res; } int follow_pte_pmd(struct mm_struct *mm, unsigned long address, - unsigned long *start, unsigned long *end, + struct mmu_notifier_range *range, pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp) { int res; /* (void) is needed to make gcc happy */ (void) __cond_lock(*ptlp, - !(res = __follow_pte_pmd(mm, address, start, end, + !(res = __follow_pte_pmd(mm, address, range, ptepp, pmdpp, ptlp))); return res; } diff --git a/mm/migrate.c b/mm/migrate.c index f7e4bfdc13b7..522d310e91d4 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -2303,6 +2303,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, */ static void migrate_vma_collect(struct migrate_vma *migrate) { + struct mmu_notifier_range range; struct mm_walk mm_walk; mm_walk.pmd_entry = migrate_vma_collect_pmd; @@ -2314,13 +2315,11 @@ static void migrate_vma_collect(struct migrate_vma *migrate) mm_walk.mm = migrate->vma->vm_mm; mm_walk.private = migrate; - mmu_notifier_invalidate_range_start(mm_walk.mm, - migrate->start, - migrate->end); + mmu_notifier_range_init(&range, mm_walk.mm, migrate->start, + migrate->end); + mmu_notifier_invalidate_range_start(&range); walk_page_range(migrate->start, migrate->end, &mm_walk); - mmu_notifier_invalidate_range_end(mm_walk.mm, - migrate->start, - migrate->end); + mmu_notifier_invalidate_range_end(&range); migrate->end = migrate->start + (migrate->npages << PAGE_SHIFT); } @@ -2701,9 +2700,8 @@ static void migrate_vma_pages(struct migrate_vma *migrate) { const unsigned long npages = migrate->npages; const unsigned long start = migrate->start; - struct vm_area_struct *vma = migrate->vma; - struct mm_struct *mm = vma->vm_mm; - unsigned long addr, i, mmu_start; + struct mmu_notifier_range range; + unsigned long addr, i; bool notified = false; for (i = 0, addr = start; i < npages; addr += PAGE_SIZE, i++) { @@ -2722,11 +2720,12 @@ static void migrate_vma_pages(struct migrate_vma *migrate) continue; } if (!notified) { - mmu_start = addr; notified = true; - mmu_notifier_invalidate_range_start(mm, - mmu_start, - migrate->end); + + mmu_notifier_range_init(&range, + migrate->vma->vm_mm, + addr, migrate->end); + mmu_notifier_invalidate_range_start(&range); } migrate_vma_insert_page(migrate, addr, newpage, &migrate->src[i], @@ -2767,8 +2766,7 @@ static void migrate_vma_pages(struct migrate_vma *migrate) * did already call it. */ if (notified) - mmu_notifier_invalidate_range_only_end(mm, mmu_start, - migrate->end); + mmu_notifier_invalidate_range_only_end(&range); } /* diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c index 5f6665ae3ee2..4c52b3514c50 100644 --- a/mm/mmu_notifier.c +++ b/mm/mmu_notifier.c @@ -174,28 +174,20 @@ void __mmu_notifier_change_pte(struct mm_struct *mm, unsigned long address, srcu_read_unlock(&srcu, id); } -int __mmu_notifier_invalidate_range_start(struct mm_struct *mm, - unsigned long start, unsigned long end, - bool blockable) +int __mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range) { - struct mmu_notifier_range _range, *range = &_range; struct mmu_notifier *mn; int ret = 0; int id; - range->blockable = blockable; - range->start = start; - range->end = end; - range->mm = mm; - id = srcu_read_lock(&srcu); - hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) { + hlist_for_each_entry_rcu(mn, &range->mm->mmu_notifier_mm->list, hlist) { if (mn->ops->invalidate_range_start) { int _ret = mn->ops->invalidate_range_start(mn, range); if (_ret) { pr_info("%pS callback failed with %d in %sblockable context.\n", mn->ops->invalidate_range_start, _ret, - !blockable ? "non-" : ""); + !range->blockable ? "non-" : ""); ret = _ret; } } @@ -206,27 +198,14 @@ int __mmu_notifier_invalidate_range_start(struct mm_struct *mm, } EXPORT_SYMBOL_GPL(__mmu_notifier_invalidate_range_start); -void __mmu_notifier_invalidate_range_end(struct mm_struct *mm, - unsigned long start, - unsigned long end, +void __mmu_notifier_invalidate_range_end(struct mmu_notifier_range *range, bool only_end) { - struct mmu_notifier_range _range, *range = &_range; struct mmu_notifier *mn; int id; - /* - * The end call back will never be call if the start refused to go - * through because of blockable was false so here assume that we - * can block. - */ - range->blockable = true; - range->start = start; - range->end = end; - range->mm = mm; - id = srcu_read_lock(&srcu); - hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) { + hlist_for_each_entry_rcu(mn, &range->mm->mmu_notifier_mm->list, hlist) { /* * Call invalidate_range here too to avoid the need for the * subsystem of having to register an invalidate_range_end @@ -241,7 +220,9 @@ void __mmu_notifier_invalidate_range_end(struct mm_struct *mm, * already happen under page table lock. */ if (!only_end && mn->ops->invalidate_range) - mn->ops->invalidate_range(mn, mm, start, end); + mn->ops->invalidate_range(mn, range->mm, + range->start, + range->end); if (mn->ops->invalidate_range_end) mn->ops->invalidate_range_end(mn, range); } diff --git a/mm/mprotect.c b/mm/mprotect.c index 6d331620b9e5..36cb358db170 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -167,11 +167,12 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma, pgprot_t newprot, int dirty_accountable, int prot_numa) { pmd_t *pmd; - struct mm_struct *mm = vma->vm_mm; unsigned long next; unsigned long pages = 0; unsigned long nr_huge_updates = 0; - unsigned long mni_start = 0; + struct mmu_notifier_range range; + + range.start = 0; pmd = pmd_offset(pud, addr); do { @@ -183,9 +184,9 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma, goto next; /* invoke the mmu notifier if the pmd is populated */ - if (!mni_start) { - mni_start = addr; - mmu_notifier_invalidate_range_start(mm, mni_start, end); + if (!range.start) { + mmu_notifier_range_init(&range, vma->vm_mm, addr, end); + mmu_notifier_invalidate_range_start(&range); } if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || pmd_devmap(*pmd)) { @@ -214,8 +215,8 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma, cond_resched(); } while (pmd++, addr = next, addr != end); - if (mni_start) - mmu_notifier_invalidate_range_end(mm, mni_start, end); + if (range.start) + mmu_notifier_invalidate_range_end(&range); if (nr_huge_updates) count_vm_numa_events(NUMA_HUGE_PTE_UPDATES, nr_huge_updates); diff --git a/mm/mremap.c b/mm/mremap.c index 7f9f9180e401..def01d86e36f 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -197,16 +197,14 @@ unsigned long move_page_tables(struct vm_area_struct *vma, bool need_rmap_locks) { unsigned long extent, next, old_end; + struct mmu_notifier_range range; pmd_t *old_pmd, *new_pmd; - unsigned long mmun_start; /* For mmu_notifiers */ - unsigned long mmun_end; /* For mmu_notifiers */ old_end = old_addr + len; flush_cache_range(vma, old_addr, old_end); - mmun_start = old_addr; - mmun_end = old_end; - mmu_notifier_invalidate_range_start(vma->vm_mm, mmun_start, mmun_end); + mmu_notifier_range_init(&range, vma->vm_mm, old_addr, old_end); + mmu_notifier_invalidate_range_start(&range); for (; old_addr < old_end; old_addr += extent, new_addr += extent) { cond_resched(); @@ -247,7 +245,7 @@ unsigned long move_page_tables(struct vm_area_struct *vma, new_pmd, new_addr, need_rmap_locks); } - mmu_notifier_invalidate_range_end(vma->vm_mm, mmun_start, mmun_end); + mmu_notifier_invalidate_range_end(&range); return len + old_addr - old_end; /* how much done */ } diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 6589f60d5018..1eea8b04f27a 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -516,19 +516,20 @@ bool __oom_reap_task_mm(struct mm_struct *mm) * count elevated without a good reason. */ if (vma_is_anonymous(vma) || !(vma->vm_flags & VM_SHARED)) { - const unsigned long start = vma->vm_start; - const unsigned long end = vma->vm_end; + struct mmu_notifier_range range; struct mmu_gather tlb; - tlb_gather_mmu(&tlb, mm, start, end); - if (mmu_notifier_invalidate_range_start_nonblock(mm, start, end)) { - tlb_finish_mmu(&tlb, start, end); + mmu_notifier_range_init(&range, mm, vma->vm_start, + vma->vm_end); + tlb_gather_mmu(&tlb, mm, range.start, range.end); + if (mmu_notifier_invalidate_range_start_nonblock(&range)) { + tlb_finish_mmu(&tlb, range.start, range.end); ret = false; continue; } - unmap_page_range(&tlb, vma, start, end, NULL); - mmu_notifier_invalidate_range_end(mm, start, end); - tlb_finish_mmu(&tlb, start, end); + unmap_page_range(&tlb, vma, range.start, range.end, NULL); + mmu_notifier_invalidate_range_end(&range); + tlb_finish_mmu(&tlb, range.start, range.end); } } diff --git a/mm/rmap.c b/mm/rmap.c index 85b7f9423352..c75f72f6fe0e 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -889,15 +889,17 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma, .address = address, .flags = PVMW_SYNC, }; - unsigned long start = address, end; + struct mmu_notifier_range range; int *cleaned = arg; /* * We have to assume the worse case ie pmd for invalidation. Note that * the page can not be free from this function. */ - end = min(vma->vm_end, start + (PAGE_SIZE << compound_order(page))); - mmu_notifier_invalidate_range_start(vma->vm_mm, start, end); + mmu_notifier_range_init(&range, vma->vm_mm, address, + min(vma->vm_end, address + + (PAGE_SIZE << compound_order(page)))); + mmu_notifier_invalidate_range_start(&range); while (page_vma_mapped_walk(&pvmw)) { unsigned long cstart; @@ -949,7 +951,7 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma, (*cleaned)++; } - mmu_notifier_invalidate_range_end(vma->vm_mm, start, end); + mmu_notifier_invalidate_range_end(&range); return true; } @@ -1345,7 +1347,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma, pte_t pteval; struct page *subpage; bool ret = true; - unsigned long start = address, end; + struct mmu_notifier_range range; enum ttu_flags flags = (enum ttu_flags)arg; /* munlock has nothing to gain from examining un-locked vmas */ @@ -1369,15 +1371,18 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma, * Note that the page can not be free in this function as call of * try_to_unmap() must hold a reference on the page. */ - end = min(vma->vm_end, start + (PAGE_SIZE << compound_order(page))); + mmu_notifier_range_init(&range, vma->vm_mm, vma->vm_start, + min(vma->vm_end, vma->vm_start + + (PAGE_SIZE << compound_order(page)))); if (PageHuge(page)) { /* * If sharing is possible, start and end will be adjusted * accordingly. */ - adjust_range_if_pmd_sharing_possible(vma, &start, &end); + adjust_range_if_pmd_sharing_possible(vma, &range.start, + &range.end); } - mmu_notifier_invalidate_range_start(vma->vm_mm, start, end); + mmu_notifier_invalidate_range_start(&range); while (page_vma_mapped_walk(&pvmw)) { #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION @@ -1428,9 +1433,10 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma, * we must flush them all. start/end were * already adjusted above to cover this range. */ - flush_cache_range(vma, start, end); - flush_tlb_range(vma, start, end); - mmu_notifier_invalidate_range(mm, start, end); + flush_cache_range(vma, range.start, range.end); + flush_tlb_range(vma, range.start, range.end); + mmu_notifier_invalidate_range(mm, range.start, + range.end); /* * The ref count of the PMD page was dropped @@ -1650,7 +1656,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma, put_page(page); } - mmu_notifier_invalidate_range_end(vma->vm_mm, start, end); + mmu_notifier_invalidate_range_end(&range); return ret; } From patchwork Thu Dec 13 17:13:30 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jerome Glisse X-Patchwork-Id: 10729209 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2EC3914E2 for ; Thu, 13 Dec 2018 17:14:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1389327CAF for ; Thu, 13 Dec 2018 17:14:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 06E6D28683; Thu, 13 Dec 2018 17:14:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D529327CAF for ; Thu, 13 Dec 2018 17:13:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729667AbeLMRNw (ORCPT ); Thu, 13 Dec 2018 12:13:52 -0500 Received: from mx1.redhat.com ([209.132.183.28]:38668 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729654AbeLMRNv (ORCPT ); Thu, 13 Dec 2018 12:13:51 -0500 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 9481F753FB; Thu, 13 Dec 2018 17:13:50 +0000 (UTC) Received: from localhost.localdomain.com (ovpn-121-185.rdu2.redhat.com [10.10.121.185]) by smtp.corp.redhat.com (Postfix) with ESMTP id 08483600C5; Thu, 13 Dec 2018 17:13:47 +0000 (UTC) From: jglisse@redhat.com To: linux-kernel@vger.kernel.org Cc: =?utf-8?b?SsOpcsO0bWUgR2xpc3Nl?= , Andrew Morton , Matthew Wilcox , Ross Zwisler , Dan Williams , Paolo Bonzini , =?utf-8?b?UmFkaW0gS3LEjW3DocWZ?= , Michal Hocko , Christian Koenig , Ralph Campbell , John Hubbard , kvm@vger.kernel.org, dri-devel@lists.freedesktop.org, linux-rdma@vger.kernel.org, linux-fsdevel@vger.kernel.org, Arnd Bergmann Subject: [PATCH v3 3/3] mm/mmu_notifier: contextual information for event triggering invalidation v2 Date: Thu, 13 Dec 2018 12:13:30 -0500 Message-Id: <20181213171330.8489-4-jglisse@redhat.com> In-Reply-To: <20181213171330.8489-1-jglisse@redhat.com> References: <20181213171330.8489-1-jglisse@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Thu, 13 Dec 2018 17:13:50 +0000 (UTC) Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Jérôme Glisse CPU page table update can happens for many reasons, not only as a result of a syscall (munmap(), mprotect(), mremap(), madvise(), ...) but also as a result of kernel activities (memory compression, reclaim, migration, ...). Users of mmu notifier API track changes to the CPU page table and take specific action for them. While current API only provide range of virtual address affected by the change, not why the changes is happening. This patchset adds event information so that users of mmu notifier can differentiate among broad category: - UNMAP: munmap() or mremap() - CLEAR: page table is cleared (migration, compaction, reclaim, ...) - PROTECTION_VMA: change in access protections for the range - PROTECTION_PAGE: change in access protections for page in the range - SOFT_DIRTY: soft dirtyness tracking Being able to identify munmap() and mremap() from other reasons why the page table is cleared is important to allow user of mmu notifier to update their own internal tracking structure accordingly (on munmap or mremap it is not longer needed to track range of virtual address as it becomes invalid). Changes since v1: - use mmu_notifier_range_init() helper to to optimize out the case when mmu notifier is not enabled - use kernel doc format for describing the enum values Signed-off-by: Jérôme Glisse Acked-by: Christian König Acked-by: Jan Kara Acked-by: Felix Kuehling Acked-by: Jason Gunthorpe Cc: Andrew Morton Cc: Matthew Wilcox Cc: Ross Zwisler Cc: Dan Williams Cc: Paolo Bonzini Cc: Radim Krčmář Cc: Michal Hocko Cc: Christian Koenig Cc: Ralph Campbell Cc: John Hubbard Cc: kvm@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linux-rdma@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Cc: Arnd Bergmann --- fs/dax.c | 7 +++++++ fs/proc/task_mmu.c | 3 ++- include/linux/mmu_notifier.h | 35 +++++++++++++++++++++++++++++++++-- kernel/events/uprobes.c | 3 ++- mm/huge_memory.c | 12 ++++++++---- mm/hugetlb.c | 10 ++++++---- mm/khugepaged.c | 3 ++- mm/ksm.c | 6 ++++-- mm/madvise.c | 3 ++- mm/memory.c | 18 ++++++++++++------ mm/migrate.c | 5 +++-- mm/mprotect.c | 3 ++- mm/mremap.c | 3 ++- mm/oom_kill.c | 2 +- mm/rmap.c | 6 ++++-- 15 files changed, 90 insertions(+), 29 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 874085bacaf5..6056b03a1626 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -768,6 +768,13 @@ static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index, address = pgoff_address(index, vma); + /* + * All the field are populated by follow_pte_pmd() except + * the event field. + */ + mmu_notifier_range_init(&range, NULL, 0, -1UL, + MMU_NOTIFY_PROTECTION_PAGE); + /* * Note because we provide start/end to follow_pte_pmd it will * call mmu_notifier_invalidate_range_start() on our behalf diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index b3ddceb003bc..f68a9ebb0218 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -1141,7 +1141,8 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf, break; } - mmu_notifier_range_init(&range, mm, 0, -1UL); + mmu_notifier_range_init(&range, mm, 0, -1UL, + MMU_NOTIFY_SOFT_DIRTY); mmu_notifier_invalidate_range_start(&range); } walk_page_range(0, mm->highest_vm_end, &clear_refs_walk); diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h index 39b06772427f..d249e24acea5 100644 --- a/include/linux/mmu_notifier.h +++ b/include/linux/mmu_notifier.h @@ -25,10 +25,39 @@ struct mmu_notifier_mm { spinlock_t lock; }; +/** + * enum mmu_notifier_event - reason for the mmu notifier callback + * @MMU_NOTIFY_UNMAP: either munmap() that unmap the range or a mremap() that + * move the range + * + * @MMU_NOTIFY_CLEAR: clear page table entry (many reasons for this like + * madvise() or replacing a page by another one, ...). + * + * @MMU_NOTIFY_PROTECTION_VMA: update is due to protection change for the range + * ie using the vma access permission (vm_page_prot) to update the whole range + * is enough no need to inspect changes to the CPU page table (mprotect() + * syscall) + * + * @MMU_NOTIFY_PROTECTION_PAGE: update is due to change in read/write flag for + * pages in the range so to mirror those changes the user must inspect the CPU + * page table (from the end callback). + * + * @MMU_NOTIFY_SOFT_DIRTY: soft dirty accounting (still same page and same + * access flags) + */ +enum mmu_notifier_event { + MMU_NOTIFY_UNMAP = 0, + MMU_NOTIFY_CLEAR, + MMU_NOTIFY_PROTECTION_VMA, + MMU_NOTIFY_PROTECTION_PAGE, + MMU_NOTIFY_SOFT_DIRTY, +}; + struct mmu_notifier_range { struct mm_struct *mm; unsigned long start; unsigned long end; + enum mmu_notifier_event event; bool blockable; }; @@ -320,11 +349,13 @@ static inline void mmu_notifier_mm_destroy(struct mm_struct *mm) static inline void mmu_notifier_range_init(struct mmu_notifier_range *range, struct mm_struct *mm, unsigned long start, - unsigned long end) + unsigned long end, + enum mmu_notifier_event event) { range->mm = mm; range->start = start; range->end = end; + range->event = event; } #define ptep_clear_flush_young_notify(__vma, __address, __ptep) \ @@ -453,7 +484,7 @@ static inline void _mmu_notifier_range_init(struct mmu_notifier_range *range, range->end = end; } -#define mmu_notifier_range_init(range, mm, start, end) \ +#define mmu_notifier_range_init(range, mm, start, end, event) \ _mmu_notifier_range_init(range, start, end) diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c index 1fc8a93709c3..a70c3204f25d 100644 --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@ -174,7 +174,8 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr, struct mmu_notifier_range range; struct mem_cgroup *memcg; - mmu_notifier_range_init(&range, mm, addr, addr + PAGE_SIZE); + mmu_notifier_range_init(&range, mm, addr, addr + PAGE_SIZE, + MMU_NOTIFY_CLEAR); VM_BUG_ON_PAGE(PageTransHuge(old_page), old_page); diff --git a/mm/huge_memory.c b/mm/huge_memory.c index c1d3ce809416..b8d9029890c5 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1183,7 +1183,8 @@ static vm_fault_t do_huge_pmd_wp_page_fallback(struct vm_fault *vmf, } mmu_notifier_range_init(&range, vma->vm_mm, haddr, - haddr + HPAGE_PMD_SIZE); + haddr + HPAGE_PMD_SIZE, + MMU_NOTIFY_CLEAR); mmu_notifier_invalidate_range_start(&range); vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd); @@ -1347,7 +1348,8 @@ vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd) __SetPageUptodate(new_page); mmu_notifier_range_init(&range, vma->vm_mm, haddr, - haddr + HPAGE_PMD_SIZE); + haddr + HPAGE_PMD_SIZE, + MMU_NOTIFY_CLEAR); mmu_notifier_invalidate_range_start(&range); spin_lock(vmf->ptl); @@ -2027,7 +2029,8 @@ void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud, struct mmu_notifier_range range; mmu_notifier_range_init(&range, vma->vm_mm, address & HPAGE_PUD_MASK, - (address & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE); + (address & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE, + MMU_NOTIFY_CLEAR); mmu_notifier_invalidate_range_start(&range); ptl = pud_lock(vma->vm_mm, pud); if (unlikely(!pud_trans_huge(*pud) && !pud_devmap(*pud))) @@ -2243,7 +2246,8 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, struct mmu_notifier_range range; mmu_notifier_range_init(&range, vma->vm_mm, address & HPAGE_PMD_MASK, - (address & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE); + (address & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE, + MMU_NOTIFY_CLEAR); mmu_notifier_invalidate_range_start(&range); ptl = pmd_lock(vma->vm_mm, pmd); diff --git a/mm/hugetlb.c b/mm/hugetlb.c index e7c179cbcd75..fd1395918bb3 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3246,7 +3246,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, if (cow) { mmu_notifier_range_init(&range, src, vma->vm_start, - vma->vm_end); + vma->vm_end, MMU_NOTIFY_CLEAR); mmu_notifier_invalidate_range_start(&range); } @@ -3357,7 +3357,7 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, /* * If sharing possible, alert mmu notifiers of worst case. */ - mmu_notifier_range_init(&range, mm, start, end); + mmu_notifier_range_init(&range, mm, start, end, MMU_NOTIFY_CLEAR); adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end); mmu_notifier_invalidate_range_start(&range); address = start; @@ -3625,7 +3625,8 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma, __SetPageUptodate(new_page); set_page_huge_active(new_page); - mmu_notifier_range_init(&range, mm, haddr, haddr + huge_page_size(h)); + mmu_notifier_range_init(&range, mm, haddr, haddr + huge_page_size(h), + MMU_NOTIFY_CLEAR); mmu_notifier_invalidate_range_start(&range); /* @@ -4345,7 +4346,8 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, * start/end. Set range.start/range.end to cover the maximum possible * range if PMD sharing is possible. */ - mmu_notifier_range_init(&range, mm, start, end); + mmu_notifier_range_init(&range, mm, start, end, + MMU_NOTIFY_PROTECTION_VMA); adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end); BUG_ON(address >= end); diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 7736f6c37f19..331dc2738f48 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1016,7 +1016,8 @@ static void collapse_huge_page(struct mm_struct *mm, pte = pte_offset_map(pmd, address); pte_ptl = pte_lockptr(mm, pmd); - mmu_notifier_range_init(&range, mm, address, address + HPAGE_PMD_SIZE); + mmu_notifier_range_init(&range, mm, address, address + HPAGE_PMD_SIZE, + MMU_NOTIFY_CLEAR); mmu_notifier_invalidate_range_start(&range); pmd_ptl = pmd_lock(mm, pmd); /* probably unnecessary */ /* diff --git a/mm/ksm.c b/mm/ksm.c index 6239d2df7a8e..98a627c01eb0 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -1051,7 +1051,8 @@ static int write_protect_page(struct vm_area_struct *vma, struct page *page, BUG_ON(PageTransCompound(page)); mmu_notifier_range_init(&range, mm, pvmw.address, - pvmw.address + PAGE_SIZE); + pvmw.address + PAGE_SIZE, + MMU_NOTIFY_CLEAR); mmu_notifier_invalidate_range_start(&range); if (!page_vma_mapped_walk(&pvmw)) @@ -1138,7 +1139,8 @@ static int replace_page(struct vm_area_struct *vma, struct page *page, if (!pmd) goto out; - mmu_notifier_range_init(&range, mm, addr, addr + PAGE_SIZE); + mmu_notifier_range_init(&range, mm, addr, addr + PAGE_SIZE, + MMU_NOTIFY_CLEAR); mmu_notifier_invalidate_range_start(&range); ptep = pte_offset_map_lock(mm, pmd, addr, &ptl); diff --git a/mm/madvise.c b/mm/madvise.c index 21a7881a2db4..d220ad7087ed 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -472,7 +472,8 @@ static int madvise_free_single_vma(struct vm_area_struct *vma, range.end = min(vma->vm_end, end_addr); if (range.end <= vma->vm_start) return -EINVAL; - mmu_notifier_range_init(&range, mm, range.start, range.end); + mmu_notifier_range_init(&range, mm, range.start, range.end, + MMU_NOTIFY_CLEAR); lru_add_drain(); tlb_gather_mmu(&tlb, mm, range.start, range.end); diff --git a/mm/memory.c b/mm/memory.c index 574307f11464..893e3bc4b895 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1009,7 +1009,8 @@ int copy_page_range(struct mm_struct *dst_mm, struct mm_struct *src_mm, is_cow = is_cow_mapping(vma->vm_flags); if (is_cow) { - mmu_notifier_range_init(&range, src_mm, addr, end); + mmu_notifier_range_init(&range, src_mm, addr, end, + MMU_NOTIFY_PROTECTION_PAGE); mmu_notifier_invalidate_range_start(&range); } @@ -1333,7 +1334,8 @@ void unmap_vmas(struct mmu_gather *tlb, { struct mmu_notifier_range range; - mmu_notifier_range_init(&range, vma->vm_mm, start_addr, end_addr); + mmu_notifier_range_init(&range, vma->vm_mm, start_addr, + end_addr, MMU_NOTIFY_UNMAP); mmu_notifier_invalidate_range_start(&range); for ( ; vma && vma->vm_start < end_addr; vma = vma->vm_next) unmap_single_vma(tlb, vma, start_addr, end_addr, NULL); @@ -1355,7 +1357,8 @@ void zap_page_range(struct vm_area_struct *vma, unsigned long start, struct mmu_gather tlb; lru_add_drain(); - mmu_notifier_range_init(&range, vma->vm_mm, start, start + size); + mmu_notifier_range_init(&range, vma->vm_mm, start, + start + size, MMU_NOTIFY_CLEAR); tlb_gather_mmu(&tlb, vma->vm_mm, start, range.end); update_hiwater_rss(vma->vm_mm); mmu_notifier_invalidate_range_start(&range); @@ -1381,7 +1384,8 @@ static void zap_page_range_single(struct vm_area_struct *vma, unsigned long addr struct mmu_gather tlb; lru_add_drain(); - mmu_notifier_range_init(&range, vma->vm_mm, address, address + size); + mmu_notifier_range_init(&range, vma->vm_mm, address, + address + size, MMU_NOTIFY_CLEAR); tlb_gather_mmu(&tlb, vma->vm_mm, address, range.end); update_hiwater_rss(vma->vm_mm); mmu_notifier_invalidate_range_start(&range); @@ -2272,7 +2276,8 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) __SetPageUptodate(new_page); mmu_notifier_range_init(&range, mm, vmf->address & PAGE_MASK, - (vmf->address & PAGE_MASK) + PAGE_SIZE); + (vmf->address & PAGE_MASK) + PAGE_SIZE, + MMU_NOTIFY_CLEAR); mmu_notifier_invalidate_range_start(&range); /* @@ -4061,7 +4066,8 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address, if (range) { mmu_notifier_range_init(range, mm, address & PMD_MASK, - (address & PMD_MASK) + PMD_SIZE); + (address & PMD_MASK) + PMD_SIZE, + range->event); mmu_notifier_invalidate_range_start(range); } *ptlp = pmd_lock(mm, pmd); diff --git a/mm/migrate.c b/mm/migrate.c index 522d310e91d4..a83daa54da07 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -2316,7 +2316,7 @@ static void migrate_vma_collect(struct migrate_vma *migrate) mm_walk.private = migrate; mmu_notifier_range_init(&range, mm_walk.mm, migrate->start, - migrate->end); + migrate->end, MMU_NOTIFY_CLEAR); mmu_notifier_invalidate_range_start(&range); walk_page_range(migrate->start, migrate->end, &mm_walk); mmu_notifier_invalidate_range_end(&range); @@ -2724,7 +2724,8 @@ static void migrate_vma_pages(struct migrate_vma *migrate) mmu_notifier_range_init(&range, migrate->vma->vm_mm, - addr, migrate->end); + addr, migrate->end, + MMU_NOTIFY_CLEAR); mmu_notifier_invalidate_range_start(&range); } migrate_vma_insert_page(migrate, addr, newpage, diff --git a/mm/mprotect.c b/mm/mprotect.c index 36cb358db170..d5bbe5ca61ac 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -185,7 +185,8 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma, /* invoke the mmu notifier if the pmd is populated */ if (!range.start) { - mmu_notifier_range_init(&range, vma->vm_mm, addr, end); + mmu_notifier_range_init(&range, vma->vm_mm, addr, end, + MMU_NOTIFY_PROTECTION_VMA); mmu_notifier_invalidate_range_start(&range); } diff --git a/mm/mremap.c b/mm/mremap.c index def01d86e36f..386e3c492f6e 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -203,7 +203,8 @@ unsigned long move_page_tables(struct vm_area_struct *vma, old_end = old_addr + len; flush_cache_range(vma, old_addr, old_end); - mmu_notifier_range_init(&range, vma->vm_mm, old_addr, old_end); + mmu_notifier_range_init(&range, vma->vm_mm, old_addr, + old_end, MMU_NOTIFY_UNMAP); mmu_notifier_invalidate_range_start(&range); for (; old_addr < old_end; old_addr += extent, new_addr += extent) { diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 1eea8b04f27a..4ac95032b898 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -520,7 +520,7 @@ bool __oom_reap_task_mm(struct mm_struct *mm) struct mmu_gather tlb; mmu_notifier_range_init(&range, mm, vma->vm_start, - vma->vm_end); + vma->vm_end, MMU_NOTIFY_CLEAR); tlb_gather_mmu(&tlb, mm, range.start, range.end); if (mmu_notifier_invalidate_range_start_nonblock(&range)) { tlb_finish_mmu(&tlb, range.start, range.end); diff --git a/mm/rmap.c b/mm/rmap.c index c75f72f6fe0e..6ca019cdc789 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -898,7 +898,8 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma, */ mmu_notifier_range_init(&range, vma->vm_mm, address, min(vma->vm_end, address + - (PAGE_SIZE << compound_order(page)))); + (PAGE_SIZE << compound_order(page))), + MMU_NOTIFY_PROTECTION_PAGE); mmu_notifier_invalidate_range_start(&range); while (page_vma_mapped_walk(&pvmw)) { @@ -1373,7 +1374,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma, */ mmu_notifier_range_init(&range, vma->vm_mm, vma->vm_start, min(vma->vm_end, vma->vm_start + - (PAGE_SIZE << compound_order(page)))); + (PAGE_SIZE << compound_order(page))), + MMU_NOTIFY_CLEAR); if (PageHuge(page)) { /* * If sharing is possible, start and end will be adjusted