[V6,0/4] mm/kvm/vfio/ppc64: Migrate compound pages out of CMA region

Message ID 20190108045110.28597-1-aneesh.kumar@linux.ibm.com (mailing list archive)

Message

Aneesh Kumar K.V Jan. 8, 2019, 4:51 a.m. UTC
ppc64 uses the CMA area to allocate the guest page table (hash page table). We won't
be able to start a guest if we fail to allocate the hash page table. We have observed
hash table allocation failures because we failed to migrate pages out of the CMA region,
as they were pinned. This happens when we are using VFIO. VFIO on ppc64 pins
the entire guest RAM. If the guest RAM pages get allocated out of the CMA region, we
won't be able to migrate those pages. The pages are also pinned for the lifetime of the
guest.

Currently we support migration of non-compound pages. With THP and with the addition of
hugetlb migration we can end up allocating compound pages from the CMA region. This
patch series adds support for migrating compound pages. The first patch adds the helper
get_user_pages_cma_migrate(), which pins the pages, making sure we migrate them out of
the CMA region before incrementing the reference count.
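
The ordering described above (migrate out of CMA first, then take the pin) can be
illustrated with a small userspace toy model. This is only a sketch under stated
assumptions: struct toy_page, toy_migrate_out_of_cma() and toy_pin_pages_cma_migrate()
are hypothetical names invented for illustration, and none of this is the code from
the series or a kernel API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a page; not the kernel's struct page. */
struct toy_page {
	bool in_cma;   /* page currently lives in the CMA region */
	int refcount;  /* long-term pins held on the page */
};

/*
 * Hypothetical migration step: move the page out of the CMA region.
 * It fails if the page is already pinned, which models why pinned
 * pages cannot be migrated later.
 */
bool toy_migrate_out_of_cma(struct toy_page *page)
{
	if (page->refcount > 0)
		return false;
	page->in_cma = false;
	return true;
}

/*
 * Pin nr pages, migrating CMA-backed ones before the reference count
 * is raised. Returns the number of pages pinned; stops at the first
 * page that cannot be migrated.
 */
size_t toy_pin_pages_cma_migrate(struct toy_page *pages, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (pages[i].in_cma && !toy_migrate_out_of_cma(&pages[i]))
			break;
		pages[i].refcount++;
	}
	return i;
}
```

In this model a page never holds a pin while it is still in the CMA region, unless
it was already pinned there before the call, which is the case the helper cannot fix.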

Changes from V5:
* Add PF_MEMALLOC_NOCMA
* remove __GFP_THISNODE when allocating target page for migration

Changes from V4:
* use __GFP_NOWARN when allocating pages to avoid page allocation failure warnings.

Changes from V3:
* Move the hugetlb check before transhuge check
* Use compound head page when isolating hugetlb page


Aneesh Kumar K.V (4):
  mm/cma: Add PF flag to force non cma alloc
  mm: Add get_user_pages_cma_migrate
  powerpc/mm/iommu: Allow migration of cma allocated pages during
    mm_iommu_get
  powerpc/mm/iommu: Allow large IOMMU page size only for hugetlb backing

 arch/powerpc/mm/mmu_context_iommu.c | 144 ++++++++-------------------
 include/linux/hugetlb.h             |   2 +
 include/linux/migrate.h             |   3 +
 include/linux/sched.h               |   1 +
 include/linux/sched/mm.h            |  36 +++++--
 mm/hugetlb.c                        |   4 +-
 mm/migrate.c                        | 149 ++++++++++++++++++++++++++++
 7 files changed, 227 insertions(+), 112 deletions(-)

Comments

Andrew Morton Jan. 8, 2019, 7:56 p.m. UTC | #1
On Tue,  8 Jan 2019 10:21:06 +0530 "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> wrote:

> ppc64 uses the CMA area to allocate the guest page table (hash page table). We won't
> be able to start a guest if we fail to allocate the hash page table. We have observed
> hash table allocation failures because we failed to migrate pages out of the CMA region,
> as they were pinned. This happens when we are using VFIO. VFIO on ppc64 pins
> the entire guest RAM. If the guest RAM pages get allocated out of the CMA region, we
> won't be able to migrate those pages. The pages are also pinned for the lifetime of the
> guest.
> 
> Currently we support migration of non-compound pages. With THP and with the addition of
> hugetlb migration we can end up allocating compound pages from the CMA region. This
> patch series adds support for migrating compound pages. The first patch adds the helper
> get_user_pages_cma_migrate(), which pins the pages, making sure we migrate them out of
> the CMA region before incrementing the reference count.

Does this code do anything for architectures other than powerpc?  If
not, should we be adding the ifdefs to avoid burdening other
architectures with unused code?
Andrea Arcangeli Jan. 9, 2019, 2:38 a.m. UTC | #2
On Tue, Jan 08, 2019 at 10:21:07AM +0530, Aneesh Kumar K.V wrote:
> This patch adds PF_MEMALLOC_NOCMA, which makes sure any allocation in that context
> is marked non-movable and hence cannot be satisfied by the CMA region.
> 
> This is useful with get_user_pages_cma_migrate, where we take a page pin by
> migrating pages from the CMA region. Marking the section PF_MEMALLOC_NOCMA ensures
> that we avoid unnecessary page migration later.
> 
> Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
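
The scoped-flag idea in the quoted commit message (mark a section so that
allocations inside it lose their movable attribute and so cannot be served from
CMA) can be modelled in userspace roughly as follows. This is a sketch only: the
TOY_* constants, their values, struct toy_task and the toy_* helpers are
hypothetical stand-ins, not the kernel's definitions or the code in the patch.

```c
#include <stdbool.h>

/* Hypothetical flag values, chosen only for this illustration. */
#define TOY_PF_MEMALLOC_NOCMA 0x1u  /* per-task "no CMA" scope flag */
#define TOY_GFP_MOVABLE       0x1u  /* allocation may be placed in CMA */

/* Toy stand-in for the current task. */
struct toy_task {
	unsigned int flags;
};

/* Enter a no-CMA section; returns the previous state so scopes can nest. */
unsigned int toy_nocma_save(struct toy_task *t)
{
	unsigned int old = t->flags & TOY_PF_MEMALLOC_NOCMA;

	t->flags |= TOY_PF_MEMALLOC_NOCMA;
	return old;
}

/* Leave a no-CMA section, restoring whatever state the caller saved. */
void toy_nocma_restore(struct toy_task *t, unsigned int old)
{
	t->flags = (t->flags & ~TOY_PF_MEMALLOC_NOCMA) | old;
}

/* Filter an allocation mask: inside the scope, movable is stripped. */
unsigned int toy_gfp_context(const struct toy_task *t, unsigned int gfp)
{
	if (t->flags & TOY_PF_MEMALLOC_NOCMA)
		gfp &= ~TOY_GFP_MOVABLE;
	return gfp;
}

/* In this model only movable allocations may be satisfied from CMA. */
bool toy_may_use_cma(unsigned int gfp)
{
	return (gfp & TOY_GFP_MOVABLE) != 0;
}
```

Returning the old state from the save helper is what lets an inner section end
without clearing the flag for an outer section that is still active.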
Aneesh Kumar K.V Jan. 9, 2019, 8:41 a.m. UTC | #3
Andrew Morton <akpm@linux-foundation.org> writes:

> On Tue,  8 Jan 2019 10:21:06 +0530 "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> wrote:
>
>> ppc64 uses the CMA area to allocate the guest page table (hash page table). We won't
>> be able to start a guest if we fail to allocate the hash page table. We have observed
>> hash table allocation failures because we failed to migrate pages out of the CMA region,
>> as they were pinned. This happens when we are using VFIO. VFIO on ppc64 pins
>> the entire guest RAM. If the guest RAM pages get allocated out of the CMA region, we
>> won't be able to migrate those pages. The pages are also pinned for the lifetime of the
>> guest.
>> 
>> Currently we support migration of non-compound pages. With THP and with the addition of
>> hugetlb migration we can end up allocating compound pages from the CMA region. This
>> patch series adds support for migrating compound pages. The first patch adds the helper
>> get_user_pages_cma_migrate(), which pins the pages, making sure we migrate them out of
>> the CMA region before incrementing the reference count.
>
> Does this code do anything for architectures other than powerpc?  If
> not, should we be adding the ifdefs to avoid burdening other
> architectures with unused code?

Any architecture enabling CMA may need this. I will move most of this below
CONFIG_CMA.

-aneesh
David Gibson Jan. 10, 2019, 4:11 a.m. UTC | #4
On Wed, Jan 09, 2019 at 02:11:25PM +0530, Aneesh Kumar K.V wrote:
> Andrew Morton <akpm@linux-foundation.org> writes:
> 
> > On Tue,  8 Jan 2019 10:21:06 +0530 "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> wrote:
> >
> >> ppc64 uses the CMA area to allocate the guest page table (hash page table). We won't
> >> be able to start a guest if we fail to allocate the hash page table. We have observed
> >> hash table allocation failures because we failed to migrate pages out of the CMA region,
> >> as they were pinned. This happens when we are using VFIO. VFIO on ppc64 pins
> >> the entire guest RAM. If the guest RAM pages get allocated out of the CMA region, we
> >> won't be able to migrate those pages. The pages are also pinned for the lifetime of the
> >> guest.
> >> 
> >> Currently we support migration of non-compound pages. With THP and with the addition of
> >> hugetlb migration we can end up allocating compound pages from the CMA region. This
> >> patch series adds support for migrating compound pages. The first patch adds the helper
> >> get_user_pages_cma_migrate(), which pins the pages, making sure we migrate them out of
> >> the CMA region before incrementing the reference count.
> >
> > Does this code do anything for architectures other than powerpc?  If
> > not, should we be adding the ifdefs to avoid burdening other
> > architectures with unused code?
> 
> Any architecture enabling CMA may need this. I will move most of this below
> CONFIG_CMA.

In theory it could affect any architecture using CMA.  I suspect it's
much less likely to bite in practice on architectures other than ppc.
IIUC the main use of CMA there is to allocate things like framebuffers
or other large contiguous blocks used for hardware devices.  That's
usually going to happen rarely and during boot up.  What makes ppc
different is that we need a substantial CMA allocation every time we
start a (POWER8) guest for the HPT.  It's the fact that running guests
on a system both means we need the CMA unfragmented and (with vfio added
in) can cause CMA fragmentation which makes this particularly
problematic.
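
To make the fragmentation point concrete, here is a userspace toy model of a
contiguous allocation from a region in which some pages are pinned. It is a sketch
under stated assumptions: enum toy_page_state and toy_find_contig_range() are
hypothetical names, and the model treats free and movable pages as usable (movable
ones can be migrated away) and pinned pages as immovable obstacles.

```c
#include <stddef.h>

/* Toy page states for one region; not kernel definitions. */
enum toy_page_state { TOY_FREE, TOY_MOVABLE, TOY_PINNED };

/*
 * Return the start index of the first run of `want` consecutive pages
 * that contains no pinned page, or -1 if no such run exists.
 */
long toy_find_contig_range(const enum toy_page_state *region, size_t n,
			   size_t want)
{
	size_t run = 0;
	size_t i;

	if (want == 0)
		return 0;

	for (i = 0; i < n; i++) {
		/* A pinned page breaks the current run of usable pages. */
		run = (region[i] == TOY_PINNED) ? 0 : run + 1;
		if (run == want)
			return (long)(i + 1 - want);
	}
	return -1;
}
```

A couple of pinned pages scattered through the region are enough to make a large
request fail even when most of the region is free or movable, which is the
situation a large HPT allocation would hit.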