mm/memremap: Introduce pgmap_request_folio() using pgmap offsets

A 'struct dev_pagemap' (pgmap) represents a collection of ZONE_DEVICE
pages. The pgmap is a reference counted object that serves a similar
role to a 'struct request_queue'. Live references are obtained for each
in-flight request / page, and once a page's reference count drops to
zero the associated pin of the pgmap is dropped as well. While a page is
idle nothing should be accessing it because that is effectively a
use-after-free situation. Unfortunately, all current ZONE_DEVICE
implementations rely on a layering violation to manage requests to
activate pages owned by a pgmap. Specifically, they take steps like
walking the pfns that were previously assigned at memremap_pages() time
and using pfn_to_page() to recall metadata like page->pgmap, or making
use of other data like page->zone_device_data.

The first step towards correcting that situation is to provide an
API to get access to a pgmap page that does not require the caller to
know the pfn, nor access any fields of an idle page. Ideally this API
would be able to support dynamic page creation instead of the current
status quo of pre-allocating and initializing pages.

On a prompt from Jason, introduce pgmap_request_folio() that operates on
an offset into a pgmap. It replaces the short-lived
pgmap_request_folios() that was continuing the layering violation of
assuming pages are available to be consulted before asking the pgmap to
make them available.
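
As a rough illustration of the calling convention described above, the
following userspace model shows a request keyed by a pgmap offset that
pins the pgmap when an idle page is activated and drops the pin when the
page goes idle again. It is only a sketch: 'struct toy_pgmap',
'struct toy_folio', toy_request_folio() and toy_put_folio() are
hypothetical names, not the kernel definitions, and rejecting a request
for an already-active page is a choice made for this model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio: only a reference count. */
struct toy_folio {
	unsigned int refcount;	/* 0 means idle */
};

/* Hypothetical stand-in for 'struct dev_pagemap'. */
struct toy_pgmap {
	unsigned long live_pins;	/* one pin per active page */
	size_t nr_pages;		/* pages owned by this pgmap */
	struct toy_folio *folios;	/* private to the pgmap */
};

/*
 * Activate the idle page at @offset. The caller supplies only the
 * pgmap and an offset; it never touches the idle page itself.
 * Returns NULL if @offset is out of range or the page is already active.
 */
struct toy_folio *toy_request_folio(struct toy_pgmap *pgmap, size_t offset)
{
	struct toy_folio *folio;

	assert(pgmap != NULL);
	if (offset >= pgmap->nr_pages)
		return NULL;

	folio = &pgmap->folios[offset];
	if (folio->refcount != 0)
		return NULL;

	folio->refcount = 1;
	pgmap->live_pins++;	/* page is live, so pin the pgmap */
	return folio;
}

/* Drop a page reference; the pgmap pin goes away with the last one. */
void toy_put_folio(struct toy_pgmap *pgmap, struct toy_folio *folio)
{
	assert(folio->refcount > 0);
	if (--folio->refcount == 0)
		pgmap->live_pins--;
}
```

In this model a caller holds a valid page pointer only between
toy_request_folio() and the final toy_put_folio(), which mirrors the
rule that an idle page must not be accessed.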

For now this only converts the callers to look up the pgmap and generate
the pgmap offset, but it does not do the deeper cleanup of teaching
those call sites to generate those arguments without walking the page
metadata. For next steps it appears the DEVICE_PRIVATE implementations
could plumb the pgmap into the necessary call sites and switch to using
gen_pool_alloc() to track which offsets of a pgmap are allocated. For
DAX, dax_direct_access() could switch from returning pfns to returning
the associated @pgmap and @pgmap_offset. Those changes are saved for
follow-on work.
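
To make "generate the pgmap offset" concrete, here is a minimal
userspace sketch of translating a pfn into a page offset within a pgmap
that spans several physical ranges. The types and the helper name
(model_range, model_pgmap, model_pfn_to_pgmap_offset()) are assumptions
for illustration only, as is the 4 KiB page size; the sketch treats the
ranges as if they were laid end to end, which may differ from what the
patch itself does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for 'struct range': inclusive byte bounds. */
struct model_range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical stand-in for the range list of a 'struct dev_pagemap'. */
struct model_pgmap {
	size_t nr_range;
	const struct model_range *ranges;
};

/*
 * Convert @pfn to a page offset relative to the start of @pgmap.
 * Pages of earlier ranges are counted first, so offsets are contiguous
 * even when the physical ranges are not.
 * Returns UINT64_MAX if no range covers @pfn.
 */
uint64_t model_pfn_to_pgmap_offset(const struct model_pgmap *pgmap,
				   uint64_t pfn)
{
	uint64_t phys = pfn << MODEL_PAGE_SHIFT;
	uint64_t sum = 0;	/* bytes covered by the ranges already skipped */
	size_t i;

	assert(pgmap != NULL);
	for (i = 0; i < pgmap->nr_range; i++) {
		const struct model_range *r = &pgmap->ranges[i];

		if (phys >= r->start && phys <= r->end)
			return (phys - r->start + sum) >> MODEL_PAGE_SHIFT;
		sum += r->end - r->start + 1;
	}
	return UINT64_MAX;
}
```

With an offset in hand, a caller would no longer need the pfn or any
field of the idle page to ask the pgmap for that page.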

Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
This builds on the dax reference counting reworks in mm-unstable.

 arch/powerpc/kvm/book3s_hv_uvmem.c       |   11 ++--
 drivers/dax/mapping.c                    |   10 +++
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |   14 +++--
 drivers/gpu/drm/nouveau/nouveau_dmem.c   |   13 +++-
 include/linux/memremap.h                 |   35 ++++++++---
 lib/test_hmm.c                           |    9 +++
 mm/memremap.c                            |   92 ++++++++++++------------------
 7 files changed, 106 insertions(+), 78 deletions(-)

Message-ID: <166630293549.1017198.3833687373550679565.stgit@dwillia2-xfh.jf.intel.com>
Date: Thu, 20 Oct 2022 14:56:39 -0700
