From patchwork Tue Sep 25 08:13:27 2012
X-Patchwork-Submitter: Minchan Kim
X-Patchwork-Id: 1502671
Date: Tue, 25 Sep 2012 17:13:27 +0900
From: Minchan Kim
To: Mel Gorman
Cc: Andrew Morton, Richard Davies, Shaohua Li, Rik van Riel,
	Avi Kivity, QEMU-devel, KVM, Linux-MM, LKML
Subject: Re: [PATCH 5/9] mm: compaction: Acquire the zone->lru_lock as late as possible
Message-ID: <20120925081327.GA7759@bbox>
References: <1348224383-1499-1-git-send-email-mgorman@suse.de>
	<1348224383-1499-6-git-send-email-mgorman@suse.de>
	<20120925070517.GK13234@bbox>
	<20120925075105.GC11266@suse.de>
In-Reply-To: <20120925075105.GC11266@suse.de>

On Tue, Sep 25, 2012 at 08:51:05AM +0100, Mel Gorman wrote:
> On Tue, Sep 25, 2012 at 04:05:17PM +0900, Minchan Kim wrote:
> > Hi Mel,
> >
> > I have a question below.
> >
> > On Fri, Sep 21, 2012 at 11:46:19AM +0100, Mel Gorman wrote:
> > > Compactions migrate scanner acquires the zone->lru_lock when scanning a range
> > > of pages looking for LRU pages to acquire. It does this even if there are
> > > no LRU pages in the range. If multiple processes are compacting then this
> > > can cause severe locking contention. To make matters worse commit b2eef8c0
> > > (mm: compaction: minimise the time IRQs are disabled while isolating pages
> > > for migration) releases the lru_lock every SWAP_CLUSTER_MAX pages that are
> > > scanned.
> > >
> > > This patch makes two changes to how the migrate scanner acquires the LRU
> > > lock. First, it only releases the LRU lock every SWAP_CLUSTER_MAX pages if
> > > the lock is contended. This reduces the number of times it unnecessarily
> > > disables and re-enables IRQs. The second is that it defers acquiring the
> > > LRU lock for as long as possible. If there are no LRU pages or the only
> > > LRU pages are transhuge then the LRU lock will not be acquired at all
> > > which reduces contention on zone->lru_lock.
> > >
> > > Signed-off-by: Mel Gorman
> > > Acked-by: Rik van Riel
> > > ---
> > >  mm/compaction.c | 63 +++++++++++++++++++++++++++++++++++++------------------
> > >  1 file changed, 43 insertions(+), 20 deletions(-)
> > >
> > > diff --git a/mm/compaction.c b/mm/compaction.c
> > > index 6b55491..a6068ff 100644
> > > --- a/mm/compaction.c
> > > +++ b/mm/compaction.c
> > > @@ -50,6 +50,11 @@ static inline bool migrate_async_suitable(int migratetype)
> > >  	return is_migrate_cma(migratetype) || migratetype == MIGRATE_MOVABLE;
> > >  }
> > >
> > > +static inline bool should_release_lock(spinlock_t *lock)
> > > +{
> > > +	return need_resched() || spin_is_contended(lock);
> > > +}
> > > +
> > >  /*
> > >   * Compaction requires the taking of some coarse locks that are potentially
> > >   * very heavily contended.
Check if the process needs to be scheduled or
> > > @@ -62,7 +67,7 @@ static inline bool migrate_async_suitable(int migratetype)
> > >  static bool compact_checklock_irqsave(spinlock_t *lock, unsigned long *flags,
> > >  			bool locked, struct compact_control *cc)
> > >  {
> > > -	if (need_resched() || spin_is_contended(lock)) {
> > > +	if (should_release_lock(lock)) {
> > >  		if (locked) {
> > >  			spin_unlock_irqrestore(lock, *flags);
> > >  			locked = false;
> > > @@ -327,7 +332,7 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
> > >  	isolate_mode_t mode = 0;
> > >  	struct lruvec *lruvec;
> > >  	unsigned long flags;
> > > -	bool locked;
> > > +	bool locked = false;
> > >
> > >  	/*
> > >  	 * Ensure that there are not too many pages isolated from the LRU
> > > @@ -347,23 +352,17 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
> > >
> > >  	/* Time to isolate some pages for migration */
> > >  	cond_resched();
> > > -	spin_lock_irqsave(&zone->lru_lock, flags);
> > > -	locked = true;
> > >  	for (; low_pfn < end_pfn; low_pfn++) {
> > >  		struct page *page;
> > >
> > >  		/* give a chance to irqs before checking need_resched() */
> > > -		if (!((low_pfn+1) % SWAP_CLUSTER_MAX)) {
> > > -			spin_unlock_irqrestore(&zone->lru_lock, flags);
> > > -			locked = false;
> > > +		if (locked && !((low_pfn+1) % SWAP_CLUSTER_MAX)) {
> > > +			if (should_release_lock(&zone->lru_lock)) {
> > > +				spin_unlock_irqrestore(&zone->lru_lock, flags);
> > > +				locked = false;
> > > +			}
> > >  		}
> > >
> > > -		/* Check if it is ok to still hold the lock */
> > > -		locked = compact_checklock_irqsave(&zone->lru_lock, &flags,
> > > -								locked, cc);
> > > -		if (!locked || fatal_signal_pending(current))
> > > -			break;
> > > -
> > >  		/*
> > >  		 * migrate_pfn does not necessarily start aligned to a
> > >  		 * pageblock.
Ensure that pfn_valid is called when moving
> > > @@ -403,21 +402,38 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
> > >  		pageblock_nr = low_pfn >> pageblock_order;
> > >  		if (!cc->sync && last_pageblock_nr != pageblock_nr &&
> > >  		    !migrate_async_suitable(get_pageblock_migratetype(page))) {
> > > -			low_pfn += pageblock_nr_pages;
> > > -			low_pfn = ALIGN(low_pfn, pageblock_nr_pages) - 1;
> > > -			last_pageblock_nr = pageblock_nr;
> > > -			continue;
> > > +			goto next_pageblock;
> > >  		}
> > >
> > > +		/* Check may be lockless but that's ok as we recheck later */
> > >  		if (!PageLRU(page))
> > >  			continue;
> > >
> > >  		/*
> > > -		 * PageLRU is set, and lru_lock excludes isolation,
> > > -		 * splitting and collapsing (collapsing has already
> > > -		 * happened if PageLRU is set).
> > > +		 * PageLRU is set. lru_lock normally excludes isolation
> > > +		 * splitting and collapsing (collapsing has already happened
> > > +		 * if PageLRU is set) but the lock is not necessarily taken
> > > +		 * here and it is wasteful to take it just to check transhuge.
> > > +		 * Check transhuge without lock and skip if it's either a
> > > +		 * transhuge or hugetlbfs page.
> > >  		 */
> > >  		if (PageTransHuge(page)) {
> > > +			if (!locked)
> > > +				goto next_pageblock;
> >
> > Why skip all pages in a pageblock if !locked?
> > Shouldn't we add some comment?
> >
> 
> The comment is above the block already. The lru_lock normally excludes
> isolation and splitting. If we do not hold the lock, it's not safe to
> call compound_order so instead we skip the entire pageblock.

I see. Your explanation is clearer than the current comment, so I would
like the comment to be more explicit. It's a trivial change; if anyone
thinks it's valuable, please feel free to apply the patch below.

> 
> -- 
> Mel Gorman
> SUSE Labs
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: email@kvack.org

diff --git a/mm/compaction.c b/mm/compaction.c
index df01b4e..f1d2cc7 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -542,8 +542,9 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
 		 * splitting and collapsing (collapsing has already happened
 		 * if PageLRU is set) but the lock is not necessarily taken
 		 * here and it is wasteful to take it just to check transhuge.
-		 * Check transhuge without lock and skip if it's either a
-		 * transhuge or hugetlbfs page.
+		 * Check transhuge without lock and *skip* if it's either a
+		 * transhuge or hugetlbfs page because it's not safe to call
+		 * compound_order.
 		 */
 		if (PageTransHuge(page)) {
 			if (!locked)