[06/25] mm, compaction: Skip pageblocks with reserved pages

Message ID 20190104125011.16071-7-mgorman@techsingularity.net (mailing list archive)
State New, archived
Series Increase success rates and reduce latency of compaction v2

Commit Message

Mel Gorman Jan. 4, 2019, 12:49 p.m. UTC
Reserved pages are set at boot time, tend to be clustered and almost never
become unreserved. When isolating pages for either migration sources or
targets, skip the entire pageblock if one PageReserved page is encountered
on the grounds that it is highly probable the entire pageblock is reserved.

The performance impact is relative to the number of reserved pages in
the system and their location so it'll be variable but intuitively it
should make sense. If the memblock allocator was ever changed to spread
reserved pages throughout the address space then this patch would be
impaired, but such a change would also be considered a bug given that it
would worsen fragmentation.

On both 1-socket and 2-socket machines, scan rates are reduced slightly
on workloads that intensively allocate THP while the system is fragmented.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 mm/compaction.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Comments

Vlastimil Babka Jan. 15, 2019, 12:10 p.m. UTC | #1
On 1/4/19 1:49 PM, Mel Gorman wrote:
> Reserved pages are set at boot time, tend to be clustered and almost never
> become unreserved. When isolating pages for either migration sources or
> targets, skip the entire pageblock if one PageReserved page is encountered
> on the grounds that it is highly probable the entire pageblock is reserved.
> 
> The performance impact is relative to the number of reserved pages in
> the system and their location so it'll be variable but intuitively it
> should make sense. If the memblock allocator was ever changed to spread
> reserved pages throughout the address space then this patch would be
> impaired but it would also be considered a bug given that such a change
> would ruin fragmentation.
> 
> On both 1-socket and 2-socket machines, scan rates are reduced slightly
> on workloads that intensively allocate THP while the system is fragmented.
> 
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---
>  mm/compaction.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 3afa4e9188b6..94d1e5b062ea 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -484,6 +484,15 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>  			goto isolate_fail;
>  		}
>  
> +		/*
> +		 * A reserved page is never freed and tend to be clustered in
> +		 * the same pageblock. Skip the block.
> +		 */
> +		if (PageReserved(page)) {
> +			blockpfn = end_pfn;
> +			break;
> +		}
> +
>  		if (!PageBuddy(page))
>  			goto isolate_fail;
>  
> @@ -827,6 +836,13 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>  					goto isolate_success;
>  			}
>  
> +			/*
> +			 * A reserved page is never freed and tend to be
> +			 * clustered in the same pageblocks. Skip the block.

AFAICS the memory allocator is not the only user of PageReserved. There
seem to be some drivers as well, notably the DRM subsystem via
drm_pci_alloc(). There's an effort to clean those up [1] but until then,
there might be some false positives here.

[1] https://marc.info/?l=linux-mm&m=154747078617898&w=2

> +			 */
> +			if (PageReserved(page))
> +				low_pfn = end_pfn;
> +
>  			goto isolate_fail;
>  		}
>  
>

Mel Gorman Jan. 15, 2019, 12:50 p.m. UTC | #2
On Tue, Jan 15, 2019 at 01:10:57PM +0100, Vlastimil Babka wrote:
> On 1/4/19 1:49 PM, Mel Gorman wrote:
> > Reserved pages are set at boot time, tend to be clustered and almost never
> > become unreserved. When isolating pages for either migration sources or
> > targets, skip the entire pageblock if one PageReserved page is encountered
> > on the grounds that it is highly probable the entire pageblock is reserved.
> > 
> > The performance impact is relative to the number of reserved pages in
> > the system and their location so it'll be variable but intuitively it
> > should make sense. If the memblock allocator was ever changed to spread
> > reserved pages throughout the address space then this patch would be
> > impaired but it would also be considered a bug given that such a change
> > would ruin fragmentation.
> > 
> > On both 1-socket and 2-socket machines, scan rates are reduced slightly
> > on workloads that intensively allocate THP while the system is fragmented.
> > 
> > Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> > ---
> >  mm/compaction.c | 16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> > 
> > diff --git a/mm/compaction.c b/mm/compaction.c
> > index 3afa4e9188b6..94d1e5b062ea 100644
> > --- a/mm/compaction.c
> > +++ b/mm/compaction.c
> > @@ -484,6 +484,15 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
> >  			goto isolate_fail;
> >  		}
> >  
> > +		/*
> > +		 * A reserved page is never freed and tend to be clustered in
> > +		 * the same pageblock. Skip the block.
> > +		 */
> > +		if (PageReserved(page)) {
> > +			blockpfn = end_pfn;
> > +			break;
> > +		}
> > +
> >  		if (!PageBuddy(page))
> >  			goto isolate_fail;
> >  
> > @@ -827,6 +836,13 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> >  					goto isolate_success;
> >  			}
> >  
> > +			/*
> > +			 * A reserved page is never freed and tend to be
> > +			 * clustered in the same pageblocks. Skip the block.
> 
> AFAICS the memory allocator is not the only user of PageReserved. There
> seem to be some drivers as well, notably the DRM subsystem via
> drm_pci_alloc(). There's an effort to clean those up [1] but until then,
> there might be some false positives here.
> 
> [1] https://marc.info/?l=linux-mm&m=154747078617898&w=2
> 

Hmm, I'm tempted to leave this anyway. The reservations for PCI space are
likely to be persistent and I also do not expect them to grow much. While
I consider it to be partially abuse to use PageReserved like this, it
should get cleaned up slowly over time. If this turns out to be wrong,
I'll attempt to fix the responsible driver that is scattering
PageReserved around the place and at worst, revert this if it turns out
to be a major problem in practice. Any objections?

Mel Gorman Jan. 16, 2019, 9:42 a.m. UTC | #3
On Tue, Jan 15, 2019 at 12:50:45PM +0000, Mel Gorman wrote:
> > AFAICS the memory allocator is not the only user of PageReserved. There
> > seem to be some drivers as well, notably the DRM subsystem via
> > drm_pci_alloc(). There's an effort to clean those up [1] but until then,
> > there might be some false positives here.
> > 
> > [1] https://marc.info/?l=linux-mm&m=154747078617898&w=2
> > 
> 
> Hmm, I'm tempted to leave this anyway. The reservations for PCI space are
> likely to be persistent and I also do not expect them to grow much. While
> I consider it to be partially abuse to use PageReserved like this, it
> should get cleaned up slowly over time. If this turns out to be wrong,
> I'll attempt to fix the responsible driver that is scattering
> PageReserved around the place and at worst, revert this if it turns out
> to be a major problem in practice. Any objections?
> 

I decided to drop this anyway as the series does not hinge on it, it's a
relatively minor improvement overall and I don't want to halt the entire
series over it. I maintain that the system would recover even if the
driver released the pages as the check would eventually fail and then be
cleared after a reset. The only downside from the patch that I can see
really is that it's a small maintenance overhead due to an apparent
duplicated check. The CPU overhead of compaction will be slightly higher
due to the revert but there are other options on the horizon that would
bring down that overhead again.

Patch

diff --git a/mm/compaction.c b/mm/compaction.c
index 3afa4e9188b6..94d1e5b062ea 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -484,6 +484,15 @@  static unsigned long isolate_freepages_block(struct compact_control *cc,
 			goto isolate_fail;
 		}
 
+		/*
+		 * A reserved page is never freed and tend to be clustered in
+		 * the same pageblock. Skip the block.
+		 */
+		if (PageReserved(page)) {
+			blockpfn = end_pfn;
+			break;
+		}
+
 		if (!PageBuddy(page))
 			goto isolate_fail;
 
@@ -827,6 +836,13 @@  isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 					goto isolate_success;
 			}
 
+			/*
+			 * A reserved page is never freed and tend to be
+			 * clustered in the same pageblocks. Skip the block.
+			 */
+			if (PageReserved(page))
+				low_pfn = end_pfn;
+
 			goto isolate_fail;
 		}