
[v3,hmm,06/12] mm/hmm: Hold on to the mmget for the lifetime of the range

Message ID 20190614004450.20252-7-jgg@ziepe.ca
State New, archived
Series mm/hmm: Various revisions from a locking/code review

Commit Message

Jason Gunthorpe June 14, 2019, 12:44 a.m. UTC
From: Jason Gunthorpe <jgg@mellanox.com>

Range functions like hmm_range_snapshot() and hmm_range_fault() call
find_vma, which requires holding the mmget() and the mmap_sem for the mm.

Make this simpler for the callers by holding the mmget() inside the range
for the lifetime of the range. Other functions that accept a range should
only be called if the range is registered.

This has the side effect of directly preventing hmm_release() from
happening while a range is registered. That means hmm->dead can never
be observed as true during the lifetime of a range, so remove dead and
hmm_mirror_mm_is_alive() entirely.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
---
v2:
 - Use Jerome's idea of just holding the mmget() for the range lifetime,
   rework the patch to use that as a simplification to remove dead in
   one step
---
 include/linux/hmm.h | 26 --------------------------
 mm/hmm.c            | 28 ++++++++++------------------
 2 files changed, 10 insertions(+), 44 deletions(-)
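
For context, a hedged sketch of what a caller's fault path looks like
once this patch is applied (hypothetical driver function; the
hmm_range_register() argument list is approximated for this point in
the series):

static long driver_populate_range(struct hmm_mirror *mirror,
				  unsigned long start, unsigned long end)
{
	struct mm_struct *mm = mirror->hmm->mm;
	struct hmm_range range;
	long ret;

	/*
	 * Takes the mmget(); hmm_release() cannot run until the
	 * matching hmm_range_unregister() below.
	 */
	ret = hmm_range_register(&range, mirror, start, end, PAGE_SHIFT);
	if (ret)
		return ret;

	/* The walk itself still requires the mmap_sem. */
	down_read(&mm->mmap_sem);
	ret = hmm_range_fault(&range, true);
	up_read(&mm->mmap_sem);

	hmm_range_unregister(&range);	/* drops the mmget() */
	return ret;
}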

Comments

Christoph Hellwig June 15, 2019, 2:14 p.m. UTC | #1
>  	mutex_lock(&hmm->lock);
> -	list_for_each_entry(range, &hmm->ranges, list)
> -		range->valid = false;
> -	wake_up_all(&hmm->wq);
> +	/*
> +	 * Since hmm_range_register() holds the mmget() lock hmm_release() is
> +	 * prevented as long as a range exists.
> +	 */
> +	WARN_ON(!list_empty(&hmm->ranges));
>  	mutex_unlock(&hmm->lock);

This can just use list_empty_careful and avoid the lock entirely.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>
Jason Gunthorpe June 18, 2019, 3:11 p.m. UTC | #2
On Sat, Jun 15, 2019 at 07:14:35AM -0700, Christoph Hellwig wrote:
> >  	mutex_lock(&hmm->lock);
> > -	list_for_each_entry(range, &hmm->ranges, list)
> > -		range->valid = false;
> > -	wake_up_all(&hmm->wq);
> > +	/*
> > +	 * Since hmm_range_register() holds the mmget() lock hmm_release() is
> > +	 * prevented as long as a range exists.
> > +	 */
> > +	WARN_ON(!list_empty(&hmm->ranges));
> >  	mutex_unlock(&hmm->lock);
> 
> This can just use list_empty_careful and avoid the lock entirely.

Sure, it is just a debugging helper and the mmput should serialize
things enough to be reliable. I had to move the RCU patch ahead of
this. Thanks

diff --git a/mm/hmm.c b/mm/hmm.c
index a9ace28984ea42..1eddda45cefae7 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -124,13 +124,11 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 	if (!kref_get_unless_zero(&hmm->kref))
 		return;
 
-	mutex_lock(&hmm->lock);
 	/*
 	 * Since hmm_range_register() holds the mmget() lock hmm_release() is
 	 * prevented as long as a range exists.
 	 */
-	WARN_ON(!list_empty(&hmm->ranges));
-	mutex_unlock(&hmm->lock);
+	WARN_ON(!list_empty_careful(&hmm->ranges));
 
 	down_write(&hmm->mirrors_sem);
 	mirror = list_first_entry_or_null(&hmm->mirrors, struct hmm_mirror,
@@ -938,7 +936,7 @@ void hmm_range_unregister(struct hmm_range *range)
 		return;
 
 	mutex_lock(&hmm->lock);
-	list_del(&range->list);
+	list_del_init(&range->list);
 	mutex_unlock(&hmm->lock);
 
 	/* Drop reference taken by hmm_range_register() */
Christoph Hellwig June 19, 2019, 8:18 a.m. UTC | #3
>  	mutex_lock(&hmm->lock);
> -	list_del(&range->list);
> +	list_del_init(&range->list);
>  	mutex_unlock(&hmm->lock);

I don't see the point why this is a list_del_init - that just
reinitializes range->list, but doesn't change anything for the list
head it was removed from.  (and if the list_del_init was intentional,
a later patch in your branch reverts it to plain list_del..)
Jason Gunthorpe June 19, 2019, 11:34 a.m. UTC | #4
On Wed, Jun 19, 2019 at 01:18:58AM -0700, Christoph Hellwig wrote:
> >  	mutex_lock(&hmm->lock);
> > -	list_del(&range->list);
> > +	list_del_init(&range->list);
> >  	mutex_unlock(&hmm->lock);
> 
> I don't see the point why this is a list_del_init - that just
> reinitializes range->list, but doesn't change anything for the list
> head it was removed from.  (and if the list_del_init was intentional,
> a later patch in your branch reverts it to plain list_del..)

Just following the instructions:

/**
 * list_empty_careful - tests whether a list is empty and not being modified
 * @head: the list to test
 *
 * Description:
 * tests whether a list is empty _and_ checks that no other CPU might be
 * in the process of modifying either member (next or prev)
 *
 * NOTE: using list_empty_careful() without synchronization
 * can only be safe if the only activity that can happen
 * to the list entry is list_del_init(). Eg. it cannot be used
 * if another CPU could re-list_add() it.
 */

Agree it doesn't seem obvious why this is relevant when checking the
list head..

Maybe the comment is a bit misleading?

Jason
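
For illustration, the case that NOTE seems to be written for is
lockless polling of the removed *entry*, not of the list head. A
hypothetical sketch (work_item and queue_lock are made up for the
example):

static DEFINE_MUTEX(queue_lock);

struct work_item {
	struct list_head list;
};

/* Remover, runs under the list lock. */
static void remove_item(struct work_item *item)
{
	mutex_lock(&queue_lock);
	list_del_init(&item->list);	/* entry now points at itself */
	mutex_unlock(&queue_lock);
}

/* Watcher, runs without the lock. */
static void wait_until_removed(struct work_item *item)
{
	/*
	 * Only safe because the remover uses list_del_init(); a plain
	 * list_del() leaves poisoned pointers that never compare equal,
	 * and a concurrent re-list_add() could tear next/prev mid-read.
	 */
	while (!list_empty_careful(&item->list))
		cpu_relax();
}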
Christoph Hellwig June 19, 2019, 11:54 a.m. UTC | #5
On Wed, Jun 19, 2019 at 08:34:52AM -0300, Jason Gunthorpe wrote:
> /**
>  * list_empty_careful - tests whether a list is empty and not being modified
>  * @head: the list to test
>  *
>  * Description:
>  * tests whether a list is empty _and_ checks that no other CPU might be
>  * in the process of modifying either member (next or prev)
>  *
>  * NOTE: using list_empty_careful() without synchronization
>  * can only be safe if the only activity that can happen
>  * to the list entry is list_del_init(). Eg. it cannot be used
>  * if another CPU could re-list_add() it.
>  */
> 
> Agree it doesn't seem obvious why this is relevant when checking the
> list head..
> 
> Maybe the comment is a bit misleading?

From looking at the commit log in the history tree, list_empty_careful
was initially added by Linus, and then mingo added that comment later.
I don't see how list_del_init would change anything here, so I suspect
the list_del_init in that comment was just used as a short hand for
either list_del or list_del_init.
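
For reference, the primitives under discussion, simplified from
include/linux/list.h circa v5.2: both deletion variants update the
neighbours (and therefore the list head) identically and differ only
in what they leave behind in the removed entry.

static inline void __list_del(struct list_head *prev, struct list_head *next)
{
	next->prev = prev;
	WRITE_ONCE(prev->next, next);
}

static inline void list_del(struct list_head *entry)
{
	__list_del(entry->prev, entry->next);
	entry->next = LIST_POISON1;	/* entry pointers are poisoned */
	entry->prev = LIST_POISON2;
}

static inline void list_del_init(struct list_head *entry)
{
	__list_del(entry->prev, entry->next);
	INIT_LIST_HEAD(entry);		/* entry points back at itself */
}

static inline int list_empty_careful(const struct list_head *head)
{
	struct list_head *next = head->next;
	return (next == head) && (next == head->prev);
}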

Patch

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 26e7c477490c4e..bf013e96525771 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -82,7 +82,6 @@ 
  * @mirrors_sem: read/write semaphore protecting the mirrors list
  * @wq: wait queue for user waiting on a range invalidation
  * @notifiers: count of active mmu notifiers
- * @dead: is the mm dead ?
  */
 struct hmm {
 	struct mm_struct	*mm;
@@ -95,7 +94,6 @@  struct hmm {
 	wait_queue_head_t	wq;
 	struct rcu_head		rcu;
 	long			notifiers;
-	bool			dead;
 };
 
 /*
@@ -459,30 +457,6 @@  struct hmm_mirror {
 int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm);
 void hmm_mirror_unregister(struct hmm_mirror *mirror);
 
-/*
- * hmm_mirror_mm_is_alive() - test if mm is still alive
- * @mirror: the HMM mm mirror for which we want to lock the mmap_sem
- * Return: false if the mm is dead, true otherwise
- *
- * This is an optimization, it will not always accurately return false if the
- * mm is dead; i.e., there can be false negatives (process is being killed but
- * HMM is not yet informed of that). It is only intended to be used to optimize
- * out cases where the driver is about to do something time consuming and it
- * would be better to skip it if the mm is dead.
- */
-static inline bool hmm_mirror_mm_is_alive(struct hmm_mirror *mirror)
-{
-	struct mm_struct *mm;
-
-	if (!mirror || !mirror->hmm)
-		return false;
-	mm = READ_ONCE(mirror->hmm->mm);
-	if (mirror->hmm->dead || !mm)
-		return false;
-
-	return true;
-}
-
 /*
  * Please see Documentation/vm/hmm.rst for how to use the range API.
  */
diff --git a/mm/hmm.c b/mm/hmm.c
index 4c64d4c32f4825..58712d74edd585 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -70,7 +70,6 @@  static struct hmm *hmm_get_or_create(struct mm_struct *mm)
 	mutex_init(&hmm->lock);
 	kref_init(&hmm->kref);
 	hmm->notifiers = 0;
-	hmm->dead = false;
 	hmm->mm = mm;
 
 	hmm->mmu_notifier.ops = &hmm_mmu_notifier_ops;
@@ -125,20 +124,17 @@  static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 {
 	struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
 	struct hmm_mirror *mirror;
-	struct hmm_range *range;
 
 	/* Bail out if hmm is in the process of being freed */
 	if (!kref_get_unless_zero(&hmm->kref))
 		return;
 
-	/* Report this HMM as dying. */
-	hmm->dead = true;
-
-	/* Wake-up everyone waiting on any range. */
 	mutex_lock(&hmm->lock);
-	list_for_each_entry(range, &hmm->ranges, list)
-		range->valid = false;
-	wake_up_all(&hmm->wq);
+	/*
+	 * Since hmm_range_register() holds the mmget() lock hmm_release() is
+	 * prevented as long as a range exists.
+	 */
+	WARN_ON(!list_empty(&hmm->ranges));
 	mutex_unlock(&hmm->lock);
 
 	down_write(&hmm->mirrors_sem);
@@ -908,8 +904,8 @@  int hmm_range_register(struct hmm_range *range,
 	range->start = start;
 	range->end = end;
 
-	/* Check if hmm_mm_destroy() was call. */
-	if (hmm->mm == NULL || hmm->dead)
+	/* Prevent hmm_release() from running while the range is valid */
+	if (!mmget_not_zero(hmm->mm))
 		return -EFAULT;
 
 	/* Initialize range to track CPU page table updates. */
@@ -952,6 +948,7 @@  void hmm_range_unregister(struct hmm_range *range)
 
 	/* Drop reference taken by hmm_range_register() */
 	range->valid = false;
+	mmput(hmm->mm);
 	hmm_put(hmm);
 	range->hmm = NULL;
 }
@@ -979,10 +976,7 @@  long hmm_range_snapshot(struct hmm_range *range)
 	struct vm_area_struct *vma;
 	struct mm_walk mm_walk;
 
-	/* Check if hmm_mm_destroy() was call. */
-	if (hmm->mm == NULL || hmm->dead)
-		return -EFAULT;
-
+	lockdep_assert_held(&hmm->mm->mmap_sem);
 	do {
 		/* If range is no longer valid force retry. */
 		if (!range->valid)
@@ -1077,9 +1071,7 @@  long hmm_range_fault(struct hmm_range *range, bool block)
 	struct mm_walk mm_walk;
 	int ret;
 
-	/* Check if hmm_mm_destroy() was call. */
-	if (hmm->mm == NULL || hmm->dead)
-		return -EFAULT;
+	lockdep_assert_held(&hmm->mm->mmap_sem);
 
 	do {
 		/* If range is no longer valid force retry. */
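
As a closing note on why the mmget() reference is sufficient:
hmm_release() is the mmu_notifier ->release() callback, which only
runs from the final mmput(). A simplified sketch of the core-kernel
path (from kernel/fork.c and mm/mmap.c of that era, unrelated teardown
elided):

void mmput(struct mm_struct *mm)
{
	might_sleep();

	if (atomic_dec_and_test(&mm->mm_users))
		__mmput(mm);
}

static void __mmput(struct mm_struct *mm)
{
	/* ... */
	exit_mmap(mm);	/* calls mmu_notifier_release(mm) -> hmm_release() */
	/* ... */
}

A registered range holds one mm_users reference via mmget_not_zero(),
so the atomic_dec_and_test() above cannot reach zero while any range
exists. That is what makes the WARN_ON(!list_empty(&hmm->ranges)) in
hmm_release() a hard invariant rather than a race check.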