From patchwork Tue Oct 16 13:09:27 2012
From: Minchan Kim
Date: Tue, 16 Oct 2012 22:09:27 +0900
To: Ming Lei
Cc: linux-kernel@vger.kernel.org, Greg Kroah-Hartman, linux-usb@vger.kernel.org,
    linux-pm@vger.kernel.org, Alan Stern, Oliver Neukum, Jiri Kosina,
    Andrew Morton, Mel Gorman, KAMEZAWA Hiroyuki, Michal Hocko, Ingo Molnar,
    Peter Zijlstra, "Rafael J. Wysocki", linux-mm
Subject: Re: [RFC PATCH 1/3] mm: teach mm by current context info to not do
    I/O during memory allocation
Message-ID: <20121016130927.GA5603@barrios>
References: <1350278059-14904-1-git-send-email-ming.lei@canonical.com>
    <1350278059-14904-2-git-send-email-ming.lei@canonical.com>
    <20121015154724.GA2840@barrios>
    <20121016054946.GA3934@barrios>

On Tue, Oct 16, 2012 at 03:08:41PM +0800, Ming Lei wrote:
> On Tue, Oct 16, 2012 at 1:49 PM, Minchan Kim wrote:
> >
> > Fair enough, but it wouldn't be a good idea to add a new unlikely branch
> > to the allocator's fast path. Please move the check into the slow path,
> > i.e. into __alloc_pages_slowpath.
>
> Thanks for your comment.
>
> I considered adding the branch to gfp_to_alloc_flags() before, but didn't
> do it because get_page_from_freelist() may still use the __GFP_IO or
> __GFP_FS flags, at least via the zone_reclaim() path.

Good point. You can check it in __zone_reclaim() and adjust the gfp_mask of
its scan_control there, because that is never a hot path.

>
> So could you make sure it is safe to move the branch into
> __alloc_pages_slowpath()? If so, I will add the check to
> gfp_to_alloc_flags().

How about this?
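(Note: the diff below relies on the tsk_memalloc_no_io() helper from your
patch 1/3, which I don't repeat here. I read that helper as essentially a
test of a per-task "no I/O" flag, roughly like the sketch below -- the flag
name PF_MEMALLOC_NOIO is only illustrative, not necessarily what the series
defines.)

	/*
	 * Rough sketch of what tsk_memalloc_no_io() from patch 1/3 is
	 * assumed to do: report whether the task has marked itself as
	 * "don't do I/O for my allocations".  PF_MEMALLOC_NOIO is an
	 * illustrative flag name, not the posted code.
	 */
	static inline bool tsk_memalloc_no_io(struct task_struct *tsk)
	{
		return tsk->flags & PF_MEMALLOC_NOIO;
	}
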
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d976957..b3607fa 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2614,10 +2614,16 @@ retry_cpuset:
 	page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, nodemask, order,
 			zonelist, high_zoneidx, alloc_flags,
 			preferred_zone, migratetype);
-	if (unlikely(!page))
+	if (unlikely(!page)) {
+		/*
+		 * Resume path can deadlock because block device
+		 * isn't active yet.
+		 */
+		if (unlikely(tsk_memalloc_no_io(current)))
+			gfp_mask &= ~GFP_IOFS;
 		page = __alloc_pages_slowpath(gfp_mask, order,
 				zonelist, high_zoneidx, nodemask,
 				preferred_zone, migratetype);
+	}
 
 	trace_mm_page_alloc(page, order, gfp_mask, migratetype);
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b5e45f4..6c2ccdd 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3290,6 +3290,16 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 	};
 	unsigned long nr_slab_pages0, nr_slab_pages1;
 
+	if (unlikely(tsk_memalloc_no_io(current))) {
+		sc.gfp_mask &= ~GFP_IOFS;
+		shrink.gfp_mask = sc.gfp_mask;
+		/*
+		 * We allow to reclaim only clean pages.
+		 * It can affect RECLAIM_SWAP and RECLAIM_WRITE mode
+		 * but this is really rare event and allocator can
+		 * fallback to other zones.
+		 */
+		sc.may_writepage = 0;
+		sc.may_swap = 0;
+	}
+
 	cond_resched();
 	/*
 	 * We need to be able to allocate from the reserves for RECLAIM_SWAP
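
As a side note, the way I'd expect a driver to use this is to set the
per-task flag only around the resume path that must not issue block I/O,
and restore it afterwards. Just a sketch of the caller side -- the helper
and function names below are made up for illustration, the real callers
are in patches 2/3 and 3/3 of the series:

	/*
	 * Illustrative caller only: bracket a runtime-resume path that must
	 * not issue block I/O with the per-task flag, so any allocation done
	 * inside has __GFP_IO/__GFP_FS stripped by the hunks above.
	 * PF_MEMALLOC_NOIO and do_actual_resume() are made-up names.
	 */
	static int example_runtime_resume(struct device *dev)
	{
		unsigned int saved = current->flags & PF_MEMALLOC_NOIO;
		int ret;

		current->flags |= PF_MEMALLOC_NOIO;
		ret = do_actual_resume(dev);
		current->flags = (current->flags & ~PF_MEMALLOC_NOIO) | saved;

		return ret;
	}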