[v15,3/7] mm: Add function __putback_isolated_page

Message ID	20191205162230.19548.70198.stgit@localhost.localdomain (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=8IBs=Z3=vger.kernel.org=kvm-owner@kernel.org> Subject: [PATCH v15 3/7] mm: Add function __putback_isolated_page From: Alexander Duyck <alexander.duyck@gmail.com> To: kvm@vger.kernel.org, mst@redhat.com, linux-kernel@vger.kernel.org, willy@infradead.org, mhocko@kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org, mgorman@techsingularity.net, vbabka@suse.cz Cc: yang.zhang.wz@gmail.com, nitesh@redhat.com, konrad.wilk@oracle.com, david@redhat.com, pagupta@redhat.com, riel@surriel.com, lcapitulino@redhat.com, dave.hansen@intel.com, wei.w.wang@intel.com, aarcange@redhat.com, pbonzini@redhat.com, dan.j.williams@intel.com, alexander.h.duyck@linux.intel.com, osalvador@suse.de Date: Thu, 05 Dec 2019 08:22:30 -0800 Message-ID: <20191205162230.19548.70198.stgit@localhost.localdomain> In-Reply-To: <20191205161928.19548.41654.stgit@localhost.localdomain> References: <20191205161928.19548.41654.stgit@localhost.localdomain> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: kvm-owner@vger.kernel.org Precedence: bulk
Series	mm / virtio: Provide support for free page reporting \| expand [v15,0/7] mm / virtio: Provide support for free page reporting [v15,1/7] mm: Adjust shuffle code to allow for future coalescing [v15,2/7] mm: Use zone and order instead of free area in free_list manipulators [v15,3/7] mm: Add function __putback_isolated_page [v15,4/7] mm: Introduce Reported pages [v15,5/7] virtio-balloon: Pull page poisoning config out of free page hinting [v15,6/7] virtio-balloon: Add support for providing free page reports to host [v15,7/7] mm: Add free page reporting documentation

Alexander Duyck Dec. 5, 2019, 4:22 p.m. UTC

From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

There are cases where we would benefit from avoiding having to go through
the allocation and free cycle to return an isolated page.

Examples for this might include page poisoning in which we isolate a page
and then put it back in the free list without ever having actually
allocated it.

This will enable us to also avoid notifiers for the future free page
reporting which will need to avoid retriggering page reporting when
returning pages that have been reported on.

Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 mm/internal.h       |    1 +
 mm/page_alloc.c     |   24 ++++++++++++++++++++++++
 mm/page_isolation.c |    6 ++----
 3 files changed, 27 insertions(+), 4 deletions(-)

David Hildenbrand Dec. 16, 2019, 11:36 a.m. UTC | #1

[...]
> +/**
> + * __putback_isolated_page - Return a now-isolated page back where we got it
> + * @page: Page that was isolated
> + * @order: Order of the isolated page
> + *
> + * This function is meant to return a page pulled from the free lists via
> + * __isolate_free_page back to the free lists they were pulled from.
> + */
> +void __putback_isolated_page(struct page *page, unsigned int order)
> +{
> +	struct zone *zone = page_zone(page);
> +	unsigned long pfn;
> +	unsigned int mt;
> +
> +	/* zone lock should be held when this function is called */
> +	lockdep_assert_held(&zone->lock);
> +
> +	pfn = page_to_pfn(page);
> +	mt = get_pfnblock_migratetype(page, pfn);

IMHO get_pageblock_migratetype() would be nicer - I guess the compiler
will optimize out the double page_to_pfn().

> +
> +	/* Return isolated page to tail of freelist. */
> +	__free_one_page(page, pfn, zone, order, mt);
> +}
> +
>  /*
>   * Update NUMA hit/miss statistics
>   *
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 04ee1663cdbe..d93d2be0070f 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -134,13 +134,11 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
>  		__mod_zone_freepage_state(zone, nr_pages, migratetype);
>  	}
>  	set_pageblock_migratetype(page, migratetype);
> +	if (isolated_page)
> +		__putback_isolated_page(page, order);
>  	zone->nr_isolate_pageblock--;
>  out:
>  	spin_unlock_irqrestore(&zone->lock, flags);
> -	if (isolated_page) {
> -		post_alloc_hook(page, order, __GFP_MOVABLE);
> -		__free_pages(page, order);
> -	}

So If I get it right:

post_alloc_hook() does quite some stuff like
- arch_alloc_page(page, order);
- kernel_map_pages(page, 1 << order, 1)
- kasan_alloc_pages()
- kernel_poison_pages(1)
- set_page_owner()

Which free_pages_prepare() will undo, like
- reset_page_owner()
- kernel_poison_pages(0)
- arch_free_page()
- kernel_map_pages()
- kasan_free_nondeferred_pages()

Both would be skipped now - which sounds like the right thing to do IMHO
(and smells like quite a performance improvement). I haven't verified if
actually everything we skip in free_pages_prepare() is safe (I think it
is, it seems to be mostly relevant for actually used/allocated pages).

Alexander Duyck Dec. 16, 2019, 4:22 p.m. UTC | #2

On Mon, 2019-12-16 at 12:36 +0100, David Hildenbrand wrote:
> [...]
> > +/**
> > + * __putback_isolated_page - Return a now-isolated page back where we got it
> > + * @page: Page that was isolated
> > + * @order: Order of the isolated page
> > + *
> > + * This function is meant to return a page pulled from the free lists via
> > + * __isolate_free_page back to the free lists they were pulled from.
> > + */
> > +void __putback_isolated_page(struct page *page, unsigned int order)
> > +{
> > +	struct zone *zone = page_zone(page);
> > +	unsigned long pfn;
> > +	unsigned int mt;
> > +
> > +	/* zone lock should be held when this function is called */
> > +	lockdep_assert_held(&zone->lock);
> > +
> > +	pfn = page_to_pfn(page);
> > +	mt = get_pfnblock_migratetype(page, pfn);
> 
> IMHO get_pageblock_migratetype() would be nicer - I guess the compiler
> will optimize out the double page_to_pfn().

The thing is I need the page_to_pfn call already in order to pass the
value to __free_one_page. With that being the case why not juse use
get_pfnblock_migratetype?

Also there are some scenarios where __page_to_pfn is not that simple a
call with us having to get the node ID so we can find the pgdat structure
to perform the calculation. I'm not sure the compiler would be ble to
figure out that the result is the same for both calls, so it is better to
make it explicit.

> > +
> > +	/* Return isolated page to tail of freelist. */
> > +	__free_one_page(page, pfn, zone, order, mt);
> > +}
> > +
> >  /*
> >   * Update NUMA hit/miss statistics
> >   *
> > diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> > index 04ee1663cdbe..d93d2be0070f 100644
> > --- a/mm/page_isolation.c
> > +++ b/mm/page_isolation.c
> > @@ -134,13 +134,11 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
> >  		__mod_zone_freepage_state(zone, nr_pages, migratetype);
> >  	}
> >  	set_pageblock_migratetype(page, migratetype);
> > +	if (isolated_page)
> > +		__putback_isolated_page(page, order);
> >  	zone->nr_isolate_pageblock--;
> >  out:
> >  	spin_unlock_irqrestore(&zone->lock, flags);
> > -	if (isolated_page) {
> > -		post_alloc_hook(page, order, __GFP_MOVABLE);
> > -		__free_pages(page, order);
> > -	}
> 
> So If I get it right:
> 
> post_alloc_hook() does quite some stuff like
> - arch_alloc_page(page, order);
> - kernel_map_pages(page, 1 << order, 1)
> - kasan_alloc_pages()
> - kernel_poison_pages(1)
> - set_page_owner()
> 
> Which free_pages_prepare() will undo, like
> - reset_page_owner()
> - kernel_poison_pages(0)
> - arch_free_page()
> - kernel_map_pages()
> - kasan_free_nondeferred_pages()
> 
> Both would be skipped now - which sounds like the right thing to do IMHO
> (and smells like quite a performance improvement). I haven't verified if
> actually everything we skip in free_pages_prepare() is safe (I think it
> is, it seems to be mostly relevant for actually used/allocated pages).

That was kind of my thought on this. Basically the logic I was following
was that the code path will call move_freepages_block that bypasses all of
the above mentioned calls if the pages it is moving will not be merged. If
it is safe in that case my assumption is that it should be safe to just
call __putback_isolated_page in such a case as it also bypasses the block
above, but it supports merging the page with other pages that are already
on the freelist.

David Hildenbrand Dec. 17, 2019, 10:58 a.m. UTC | #3

On 16.12.19 17:22, Alexander Duyck wrote:
> On Mon, 2019-12-16 at 12:36 +0100, David Hildenbrand wrote:
>> [...]
>>> +/**
>>> + * __putback_isolated_page - Return a now-isolated page back where we got it
>>> + * @page: Page that was isolated
>>> + * @order: Order of the isolated page
>>> + *
>>> + * This function is meant to return a page pulled from the free lists via
>>> + * __isolate_free_page back to the free lists they were pulled from.
>>> + */
>>> +void __putback_isolated_page(struct page *page, unsigned int order)
>>> +{
>>> +	struct zone *zone = page_zone(page);
>>> +	unsigned long pfn;
>>> +	unsigned int mt;
>>> +
>>> +	/* zone lock should be held when this function is called */
>>> +	lockdep_assert_held(&zone->lock);
>>> +
>>> +	pfn = page_to_pfn(page);
>>> +	mt = get_pfnblock_migratetype(page, pfn);
>>
>> IMHO get_pageblock_migratetype() would be nicer - I guess the compiler
>> will optimize out the double page_to_pfn().
> 
> The thing is I need the page_to_pfn call already in order to pass the
> value to __free_one_page. With that being the case why not juse use
> get_pfnblock_migratetype?

I was reading
	set_pageblock_migratetype(page, migratetype);
And wondered why we don't use straight forward
	get_pageblock_migratetype()
but instead something that looks like a micro-optimization

> 
> Also there are some scenarios where __page_to_pfn is not that simple a
> call with us having to get the node ID so we can find the pgdat structure
> to perform the calculation. I'm not sure the compiler would be ble to
> figure out that the result is the same for both calls, so it is better to
> make it explicit.

Only in case of CONFIG_SPARSEMEM we have to go via the section - but I
doubt this is really worth optimizing here.

But yeah, I'm fine with this change, only "IMHO
get_pageblock_migratetype() would be nicer" :)

> 
>>> +
>>> +	/* Return isolated page to tail of freelist. */
>>> +	__free_one_page(page, pfn, zone, order, mt);
>>> +}
>>> +
>>>  /*
>>>   * Update NUMA hit/miss statistics
>>>   *
>>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>>> index 04ee1663cdbe..d93d2be0070f 100644
>>> --- a/mm/page_isolation.c
>>> +++ b/mm/page_isolation.c
>>> @@ -134,13 +134,11 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
>>>  		__mod_zone_freepage_state(zone, nr_pages, migratetype);
>>>  	}
>>>  	set_pageblock_migratetype(page, migratetype);
>>> +	if (isolated_page)
>>> +		__putback_isolated_page(page, order);
>>>  	zone->nr_isolate_pageblock--;
>>>  out:
>>>  	spin_unlock_irqrestore(&zone->lock, flags);
>>> -	if (isolated_page) {
>>> -		post_alloc_hook(page, order, __GFP_MOVABLE);
>>> -		__free_pages(page, order);
>>> -	}
>>
>> So If I get it right:
>>
>> post_alloc_hook() does quite some stuff like
>> - arch_alloc_page(page, order);
>> - kernel_map_pages(page, 1 << order, 1)
>> - kasan_alloc_pages()
>> - kernel_poison_pages(1)
>> - set_page_owner()
>>
>> Which free_pages_prepare() will undo, like
>> - reset_page_owner()
>> - kernel_poison_pages(0)
>> - arch_free_page()
>> - kernel_map_pages()
>> - kasan_free_nondeferred_pages()
>>
>> Both would be skipped now - which sounds like the right thing to do IMHO
>> (and smells like quite a performance improvement). I haven't verified if
>> actually everything we skip in free_pages_prepare() is safe (I think it
>> is, it seems to be mostly relevant for actually used/allocated pages).
> 
> That was kind of my thought on this. Basically the logic I was following
> was that the code path will call move_freepages_block that bypasses all of
> the above mentioned calls if the pages it is moving will not be merged. If
> it is safe in that case my assumption is that it should be safe to just
> call __putback_isolated_page in such a case as it also bypasses the block
> above, but it supports merging the page with other pages that are already
> on the freelist.

Makes sense to me

Acked-by: David Hildenbrand <david@redhat.com>

Alexander Duyck Dec. 17, 2019, 4:26 p.m. UTC | #4

On Tue, 2019-12-17 at 11:58 +0100, David Hildenbrand wrote:
> On 16.12.19 17:22, Alexander Duyck wrote:
> > On Mon, 2019-12-16 at 12:36 +0100, David Hildenbrand wrote:
> > > [...]
> > > > +/**
> > > > + * __putback_isolated_page - Return a now-isolated page back where we got it
> > > > + * @page: Page that was isolated
> > > > + * @order: Order of the isolated page
> > > > + *
> > > > + * This function is meant to return a page pulled from the free lists via
> > > > + * __isolate_free_page back to the free lists they were pulled from.
> > > > + */
> > > > +void __putback_isolated_page(struct page *page, unsigned int order)
> > > > +{
> > > > +	struct zone *zone = page_zone(page);
> > > > +	unsigned long pfn;
> > > > +	unsigned int mt;
> > > > +
> > > > +	/* zone lock should be held when this function is called */
> > > > +	lockdep_assert_held(&zone->lock);
> > > > +
> > > > +	pfn = page_to_pfn(page);
> > > > +	mt = get_pfnblock_migratetype(page, pfn);
> > > 
> > > IMHO get_pageblock_migratetype() would be nicer - I guess the compiler
> > > will optimize out the double page_to_pfn().
> > 
> > The thing is I need the page_to_pfn call already in order to pass the
> > value to __free_one_page. With that being the case why not juse use
> > get_pfnblock_migratetype?
> 
> I was reading
> 	set_pageblock_migratetype(page, migratetype);
> And wondered why we don't use straight forward
> 	get_pageblock_migratetype()
> but instead something that looks like a micro-optimization

So there ends up being a some other optimizations you may not have
noticed.

For instance, the fact that get_pfnblock_migratetype is an inline
function, whereas get_pageblock_migratetype calls get_pfnblock_flags_mask
which is not an inline function. So you end up having to take the overhead
for a call/return.

I hadn't noticed that myself until taking a look at the code.

> > Also there are some scenarios where __page_to_pfn is not that simple a
> > call with us having to get the node ID so we can find the pgdat structure
> > to perform the calculation. I'm not sure the compiler would be ble to
> > figure out that the result is the same for both calls, so it is better to
> > make it explicit.
> 
> Only in case of CONFIG_SPARSEMEM we have to go via the section - but I
> doubt this is really worth optimizing here.
> 
> But yeah, I'm fine with this change, only "IMHO
> get_pageblock_migratetype() would be nicer" :)

Aren't most distros running with CONFIG_SPARSEMEM enabled? If that is the
case why not optimize for it?

As I stated earlier, in my case I already have to pull out the PFN as a
part of freeing the page anyway, so why not reuse the value instead of
having it be computed twice? It is in keeping with how the other handlers
are dealing with this such as free_one_page, __free_pages_ok, and
free_unref_page_prepare. I suspect it has to do with the fact that it is
an inline like I pointed out above.

> > > > +
> > > > +	/* Return isolated page to tail of freelist. */
> > > > +	__free_one_page(page, pfn, zone, order, mt);
> > > > +}
> > > > +
> > > >  /*
> > > >   * Update NUMA hit/miss statistics
> > > >   *
> > > > diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> > > > index 04ee1663cdbe..d93d2be0070f 100644
> > > > --- a/mm/page_isolation.c
> > > > +++ b/mm/page_isolation.c
> > > > @@ -134,13 +134,11 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
> > > >  		__mod_zone_freepage_state(zone, nr_pages, migratetype);
> > > >  	}
> > > >  	set_pageblock_migratetype(page, migratetype);
> > > > +	if (isolated_page)
> > > > +		__putback_isolated_page(page, order);
> > > >  	zone->nr_isolate_pageblock--;
> > > >  out:
> > > >  	spin_unlock_irqrestore(&zone->lock, flags);
> > > > -	if (isolated_page) {
> > > > -		post_alloc_hook(page, order, __GFP_MOVABLE);
> > > > -		__free_pages(page, order);
> > > > -	}
> > > 
> > > So If I get it right:
> > > 
> > > post_alloc_hook() does quite some stuff like
> > > - arch_alloc_page(page, order);
> > > - kernel_map_pages(page, 1 << order, 1)
> > > - kasan_alloc_pages()
> > > - kernel_poison_pages(1)
> > > - set_page_owner()
> > > 
> > > Which free_pages_prepare() will undo, like
> > > - reset_page_owner()
> > > - kernel_poison_pages(0)
> > > - arch_free_page()
> > > - kernel_map_pages()
> > > - kasan_free_nondeferred_pages()
> > > 
> > > Both would be skipped now - which sounds like the right thing to do IMHO
> > > (and smells like quite a performance improvement). I haven't verified if
> > > actually everything we skip in free_pages_prepare() is safe (I think it
> > > is, it seems to be mostly relevant for actually used/allocated pages).
> > 
> > That was kind of my thought on this. Basically the logic I was following
> > was that the code path will call move_freepages_block that bypasses all of
> > the above mentioned calls if the pages it is moving will not be merged. If
> > it is safe in that case my assumption is that it should be safe to just
> > call __putback_isolated_page in such a case as it also bypasses the block
> > above, but it supports merging the page with other pages that are already
> > on the freelist.
> 
> Makes sense to me
> 
> Acked-by: David Hildenbrand <david@redhat.com>

Thanks. I will add the Ack to the patch for v16.

David Hildenbrand Dec. 17, 2019, 5:24 p.m. UTC | #5

>>> Also there are some scenarios where __page_to_pfn is not that simple a
>>> call with us having to get the node ID so we can find the pgdat structure
>>> to perform the calculation. I'm not sure the compiler would be ble to
>>> figure out that the result is the same for both calls, so it is better to
>>> make it explicit.
>>
>> Only in case of CONFIG_SPARSEMEM we have to go via the section - but I
>> doubt this is really worth optimizing here.
>>
>> But yeah, I'm fine with this change, only "IMHO
>> get_pageblock_migratetype() would be nicer" :)
> 
> Aren't most distros running with CONFIG_SPARSEMEM enabled? If that is the
> case why not optimize for it?

Because I tend to dislike micro-optimizations without performance
numbers for code that is not on a hot path. But I mean in this case, as
you said, you need the pfn either way, so it's completely fine with.

I do wonder, however, if you should just pass in the migratetype from
the caller. That would be even faster ;)

Alexander Duyck Dec. 17, 2019, 6:24 p.m. UTC | #6

On Tue, 2019-12-17 at 18:24 +0100, David Hildenbrand wrote:
>  >>> Also there are some scenarios where __page_to_pfn is not that simple a
> > > > call with us having to get the node ID so we can find the pgdat structure
> > > > to perform the calculation. I'm not sure the compiler would be ble to
> > > > figure out that the result is the same for both calls, so it is better to
> > > > make it explicit.
> > > 
> > > Only in case of CONFIG_SPARSEMEM we have to go via the section - but I
> > > doubt this is really worth optimizing here.
> > > 
> > > But yeah, I'm fine with this change, only "IMHO
> > > get_pageblock_migratetype() would be nicer" :)
> > 
> > Aren't most distros running with CONFIG_SPARSEMEM enabled? If that is the
> > case why not optimize for it?
> 
> Because I tend to dislike micro-optimizations without performance
> numbers for code that is not on a hot path. But I mean in this case, as
> you said, you need the pfn either way, so it's completely fine with.
> 
> I do wonder, however, if you should just pass in the migratetype from
> the caller. That would be even faster ;)

The problem is page isolation. We can end up with a page being moved to an
isolate pageblock while we aren't holding the zone lock, and as such we
likely need to test it again anyway. So there isn't value in storing and
reusing the value for cases like page reporting.

In addition, the act of isolating the page can cause the migratetype to
change as __isolate_free_page will attempt to change the migratetype to
movable if it is one of the standard percpu types and we are pulling at
least half a pageblock out. So storing the value before we isolate it
would be problematic as well.

Undoing page isolation is the exception to the issues pointed out above,
but in that case we are overwriting the pageblock migratetype anyway so
the cache lines involved should all be warm from having just set the
value.

David Hildenbrand Dec. 17, 2019, 6:46 p.m. UTC | #7

> Am 17.12.2019 um 19:25 schrieb Alexander Duyck <alexander.h.duyck@linux.intel.com>:
> 
> On Tue, 2019-12-17 at 18:24 +0100, David Hildenbrand wrote:
>>>>> Also there are some scenarios where __page_to_pfn is not that simple a
>>>>> call with us having to get the node ID so we can find the pgdat structure
>>>>> to perform the calculation. I'm not sure the compiler would be ble to
>>>>> figure out that the result is the same for both calls, so it is better to
>>>>> make it explicit.
>>>> 
>>>> Only in case of CONFIG_SPARSEMEM we have to go via the section - but I
>>>> doubt this is really worth optimizing here.
>>>> 
>>>> But yeah, I'm fine with this change, only "IMHO
>>>> get_pageblock_migratetype() would be nicer" :)
>>> 
>>> Aren't most distros running with CONFIG_SPARSEMEM enabled? If that is the
>>> case why not optimize for it?
>> 
>> Because I tend to dislike micro-optimizations without performance
>> numbers for code that is not on a hot path. But I mean in this case, as
>> you said, you need the pfn either way, so it's completely fine with.
>> 
>> I do wonder, however, if you should just pass in the migratetype from
>> the caller. That would be even faster ;)
> 
> The problem is page isolation. We can end up with a page being moved to an
> isolate pageblock while we aren't holding the zone lock, and as such we
> likely need to test it again anyway. So there isn't value in storing and
> reusing the value for cases like page reporting.
> 
> In addition, the act of isolating the page can cause the migratetype to
> change as __isolate_free_page will attempt to change the migratetype to
> movable if it is one of the standard percpu types and we are pulling at
> least half a pageblock out. So storing the value before we isolate it
> would be problematic as well.
> 
> Undoing page isolation is the exception to the issues pointed out above,
> but in that case we are overwriting the pageblock migratetype anyway so
> the cache lines involved should all be warm from having just set the
> value.

Nothing would speak against querying the migratetype in the caller and passing it on. After all you‘re holding the zone lock, so it can‘t change.
>

Alexander Duyck Dec. 17, 2019, 9:50 p.m. UTC | #8

On Tue, 2019-12-17 at 19:46 +0100, David Hildenbrand wrote:
> > Am 17.12.2019 um 19:25 schrieb Alexander Duyck <alexander.h.duyck@linux.intel.com>:
> > 
> > On Tue, 2019-12-17 at 18:24 +0100, David Hildenbrand wrote:
> > > > > > Also there are some scenarios where __page_to_pfn is not that simple a
> > > > > > call with us having to get the node ID so we can find the pgdat structure
> > > > > > to perform the calculation. I'm not sure the compiler would be ble to
> > > > > > figure out that the result is the same for both calls, so it is better to
> > > > > > make it explicit.
> > > > > 
> > > > > Only in case of CONFIG_SPARSEMEM we have to go via the section - but I
> > > > > doubt this is really worth optimizing here.
> > > > > 
> > > > > But yeah, I'm fine with this change, only "IMHO
> > > > > get_pageblock_migratetype() would be nicer" :)
> > > > 
> > > > Aren't most distros running with CONFIG_SPARSEMEM enabled? If that is the
> > > > case why not optimize for it?
> > > 
> > > Because I tend to dislike micro-optimizations without performance
> > > numbers for code that is not on a hot path. But I mean in this case, as
> > > you said, you need the pfn either way, so it's completely fine with.
> > > 
> > > I do wonder, however, if you should just pass in the migratetype from
> > > the caller. That would be even faster ;)
> > 
> > The problem is page isolation. We can end up with a page being moved to an
> > isolate pageblock while we aren't holding the zone lock, and as such we
> > likely need to test it again anyway. So there isn't value in storing and
> > reusing the value for cases like page reporting.
> > 
> > In addition, the act of isolating the page can cause the migratetype to
> > change as __isolate_free_page will attempt to change the migratetype to
> > movable if it is one of the standard percpu types and we are pulling at
> > least half a pageblock out. So storing the value before we isolate it
> > would be problematic as well.
> > 
> > Undoing page isolation is the exception to the issues pointed out above,
> > but in that case we are overwriting the pageblock migratetype anyway so
> > the cache lines involved should all be warm from having just set the
> > value.
> 
> Nothing would speak against querying the migratetype in the caller and
> passing it on. After all you‘re holding the zone lock, so it can‘t
> change.

That's a fair argument. I will go ahead and make that change since it only
really adds one line to patch 4 and allows us to drop several lines from
patch 3.

[v15,3/7] mm: Add function __putback_isolated_page

Commit Message

Comments

Patch