[0/2] Fixes for hugetlb mapcount at most 1 for shared PMDs

Message ID 20230126222721.222195-1-mike.kravetz@oracle.com (mailing list archive)

Message

Mike Kravetz Jan. 26, 2023, 10:27 p.m. UTC
The issue of mapcount in hugetlb pages referenced by shared PMDs was
discussed in [1].  The following two patches address user-visible
behavior caused by this issue.

Patches apply to mm-stable as they can also target stable backports.

Ongoing folio conversions cause context conflicts in the second patch
when applied to mm-unstable/linux-next.  I can create separate patch(es)
if people agree with these.

[1] https://lore.kernel.org/linux-mm/Y9BF+OCdWnCSilEu@monkey/

Mike Kravetz (2):
  mm: hugetlb: proc: check for hugetlb shared PMD in /proc/PID/smaps
  migrate: hugetlb: Check for hugetlb shared PMD in node migration

 fs/proc/task_mmu.c      | 10 ++++++++--
 include/linux/hugetlb.h | 12 ++++++++++++
 mm/mempolicy.c          |  3 ++-
 3 files changed, 22 insertions(+), 3 deletions(-)

Comments

Andrew Morton Jan. 26, 2023, 10:47 p.m. UTC | #1
On Thu, 26 Jan 2023 14:27:19 -0800 Mike Kravetz <mike.kravetz@oracle.com> wrote:

> Ongoing folio conversions cause context conflicts in the second patch
> when applied to mm-unstable/linux-next.  I can create separate patch(es)
> if people agree with these.

I fixed things up.  queue_folios_hugetlb() is now

static int queue_folios_hugetlb(pte_t *pte, unsigned long hmask,
			       unsigned long addr, unsigned long end,
			       struct mm_walk *walk)
{
	int ret = 0;
#ifdef CONFIG_HUGETLB_PAGE
	struct queue_pages *qp = walk->private;
	unsigned long flags = (qp->flags & MPOL_MF_VALID);
	struct folio *folio;
	spinlock_t *ptl;
	pte_t entry;

	ptl = huge_pte_lock(hstate_vma(walk->vma), walk->mm, pte);
	entry = huge_ptep_get(pte);
	if (!pte_present(entry))
		goto unlock;
	folio = pfn_folio(pte_pfn(entry));
	if (!queue_folio_required(folio, qp))
		goto unlock;

	if (flags == MPOL_MF_STRICT) {
		/*
		 * STRICT alone means only detecting misplaced folio and no
		 * need to further check other vma.
		 */
		ret = -EIO;
		goto unlock;
	}

	if (!vma_migratable(walk->vma)) {
		/*
		 * Must be STRICT with MOVE*, otherwise .test_walk() have
		 * stopped walking current vma.
		 * Detecting misplaced folio but allow migrating folios which
		 * have been queued.
		 */
		ret = 1;
		goto unlock;
	}

	/*
	 * With MPOL_MF_MOVE, we try to migrate only unshared folios. If it
	 * is shared it is likely not worth migrating.
	 *
	 * To check if the folio is shared, ideally we want to make sure
	 * every page is mapped to the same process. Doing that is very
	 * expensive, so check the estimated mapcount of the folio instead.
	 */
	if (flags & (MPOL_MF_MOVE_ALL) ||
	    (flags & MPOL_MF_MOVE && folio_estimated_mapcount(folio) == 1 &&
	     !hugetlb_pmd_shared(pte))) {
		if (isolate_hugetlb(folio, qp->pagelist) &&
			(flags & MPOL_MF_STRICT))
			/*
			 * Failed to isolate folio but allow migrating pages
			 * which have been queued.
			 */
			ret = 1;
	}
unlock:
	spin_unlock(ptl);
#else
	BUG();
#endif
	return ret;
}
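
As a reading aid only (not part of the series): the final move decision in
queue_folios_hugetlb() above can be modeled in user space roughly as below.
This is a minimal sketch under stated assumptions. The MODEL_* flag bits,
struct model_folio and model_should_isolate() are hypothetical names invented
for illustration; the two struct fields merely stand in for the results of
folio_estimated_mapcount() and hugetlb_pmd_shared().

```c
#include <stdbool.h>

/* Stand-in flag bits for this model only; not taken from kernel headers. */
enum {
	MODEL_MF_STRICT   = 1 << 0,
	MODEL_MF_MOVE     = 1 << 1,
	MODEL_MF_MOVE_ALL = 1 << 2,
};

/* Hypothetical summary of one hugetlb folio as seen by the page walker. */
struct model_folio {
	int  mapcount;   /* stands in for folio_estimated_mapcount(folio) */
	bool pmd_shared; /* stands in for hugetlb_pmd_shared(pte) */
};

/*
 * Mirrors the condition guarding isolate_hugetlb():
 *   - MOVE_ALL queues the folio regardless of sharing;
 *   - MOVE queues it only if it looks unshared, i.e. the mapcount is 1
 *     AND the mapping does not go through a shared PMD.
 */
static bool model_should_isolate(unsigned int flags,
				 const struct model_folio *f)
{
	if (flags & MODEL_MF_MOVE_ALL)
		return true;

	return (flags & MODEL_MF_MOVE) &&
	       f->mapcount == 1 &&
	       !f->pmd_shared;
}
```

Under this model, a folio with mapcount 1 that is reached through a shared
PMD is skipped for MOVE but still queued for MOVE_ALL, which is the behavior
the added !hugetlb_pmd_shared(pte) test is meant to produce.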

Peter Xu Jan. 26, 2023, 10:48 p.m. UTC | #2
On Thu, Jan 26, 2023 at 02:27:19PM -0800, Mike Kravetz wrote:
> The issue of mapcount in hugetlb pages referenced by shared PMDs was
> discussed in [1].  The following two patches address user-visible
> behavior caused by this issue.
> 
> Patches apply to mm-stable as they can also target stable backports.
> 
> Ongoing folio conversions cause context conflicts in the second patch
> when applied to mm-unstable/linux-next.  I can create separate patch(es)
> if people agree with these.
> 
> [1] https://lore.kernel.org/linux-mm/Y9BF+OCdWnCSilEu@monkey/
> 
> Mike Kravetz (2):
>   mm: hugetlb: proc: check for hugetlb shared PMD in /proc/PID/smaps
>   migrate: hugetlb: Check for hugetlb shared PMD in node migration

Acked-by: Peter Xu <peterx@redhat.com>