From patchwork Wed Apr 13 02:23:52 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Linus Torvalds X-Patchwork-Id: 702591 Received: from mx4-phx2.redhat.com (mx4-phx2.redhat.com [209.132.183.25]) by demeter1.kernel.org (8.14.4/8.14.3) with ESMTP id p3D2RkhP021301 for ; Wed, 13 Apr 2011 02:28:07 GMT Received: from lists01.pubmisc.prod.ext.phx2.redhat.com (lists01.pubmisc.prod.ext.phx2.redhat.com [10.5.19.33]) by mx4-phx2.redhat.com (8.13.8/8.13.8) with ESMTP id p3D2Osvr007691; Tue, 12 Apr 2011 22:24:55 -0400 Received: from int-mx02.intmail.prod.int.phx2.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) by lists01.pubmisc.prod.ext.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id p3D2Orl2021497 for ; Tue, 12 Apr 2011 22:24:53 -0400 Received: from mx1.redhat.com (ext-mx12.extmail.prod.ext.phx2.redhat.com [10.5.110.17]) by int-mx02.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id p3D2OmfP008051; Tue, 12 Apr 2011 22:24:48 -0400 Received: from smtp1.linux-foundation.org (smtp1.linux-foundation.org [140.211.169.13]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id p3D2OiwF031991; Tue, 12 Apr 2011 22:24:45 -0400 Received: from mail-iw0-f174.google.com (mail-iw0-f174.google.com [209.85.214.174]) (authenticated bits=0) by smtp1.linux-foundation.org (8.14.2/8.13.5/Debian-3ubuntu1.1) with ESMTP id p3D2OESc015055 (version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=FAIL); Tue, 12 Apr 2011 19:24:14 -0700 Received: by iwn34 with SMTP id 34so246397iwn.33 for ; Tue, 12 Apr 2011 19:24:12 -0700 (PDT) Received: by 10.231.34.4 with SMTP id j4mr7621493ibd.83.1302661452058; Tue, 12 Apr 2011 19:24:12 -0700 (PDT) MIME-Version: 1.0 Received: by 10.231.33.199 with HTTP; Tue, 12 Apr 2011 19:23:52 -0700 (PDT) In-Reply-To: <20110413070827.11b2a71d@notabene.brown> References: <20110411223623.4278fad1@notabene.brown> <4DA2F8AD.1060605@fusionio.com> <20110412011255.GA29236@infradead.org> <4DA40F0E.1070903@fusionio.com> <20110412122248.GC31057@dastard> <4DA4456F.3070301@fusionio.com> <20110412124134.GD31057@dastard> <4DA44C86.3090305@fusionio.com> <20110412133117.GE31057@dastard> <4DA45790.2010109@fusionio.com> <20110412143452.GH31057@dastard> <20110413070827.11b2a71d@notabene.brown> From: Linus Torvalds Date: Tue, 12 Apr 2011 19:23:52 -0700 Message-ID: To: NeilBrown X-Spam-Status: No, hits=-103.482 required=5 tests=AWL, BAYES_00, OSDL_HEADER_SUBJECT_BRACKETED, USER_IN_WHITELIST X-Spam-Checker-Version: SpamAssassin 3.2.4-osdl_revision__1.47__ X-MIMEDefang-Filter: lf$Revision: 1.188 $ X-Scanned-By: MIMEDefang 2.67 on 10.5.11.12 X-Scanned-By: MIMEDefang 2.68 on 10.5.110.17 X-Scanned-By: MIMEDefang 2.63 on 140.211.169.13 X-RedHat-Spam-Score: -0.01 (T_RP_MATCHES_RCVD) X-loop: dm-devel@redhat.com Cc: "linux-raid@vger.kernel.org" , Mike Snitzer , Jens Axboe , Dave Chinner , "linux-kernel@vger.kernel.org" , "hch@infradead.org" , "dm-devel@redhat.com" Subject: Re: [dm-devel] [PATCH 05/10] block: remove per-queue plugging X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.12 Precedence: junk Reply-To: device-mapper development List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: dm-devel-bounces@redhat.com Errors-To: dm-devel-bounces@redhat.com X-Greylist: IP, sender and recipient auto-whitelisted, not delayed by milter-greylist-4.2.6 (demeter1.kernel.org [140.211.167.41]); Wed, 13 Apr 2011 02:28:07 +0000 (UTC) On Tue, Apr 12, 2011 at 2:08 PM, NeilBrown wrote: > On Wed, 13 Apr 2011 00:34:52 +1000 Dave Chinner wrote: >> >> Well, not really - now taking any sleeping lock or waiting on >> anything can trigger a plug flush where previously you had to >> explicitly issue them. I'm not saying what we had is better, just >> that there are implicit flushes with your changes that are >> inherently uncontrollable... > > It's not just sleeping locks - if preempt is enabled a schedule can happen at > any time - at any depth.  I've seen a spin_unlock do it. Hmm. I don't think we should flush IO in the preemption path. That smells wrong on many levels, just one of them being the "any time, any depth". It also sounds really wrong from an IO pattern standpoint. The process is actually still running, and the IO flushing _already_ does the "only if it's going to sleep" test, but it actually does it _wrong_. The "current->state" check doesn't make sense for a preemption event, because it's not actually going to sleep there. So a patch like the attached (UNTESTED!) sounds like the right thing to do. Whether it makes any difference for any MD issues, who knows.. But considering that the unplugging already used to test for "prev->state != TASK_RUNNING", this is absolutely the right thing to do - that old test was just broken. Linus --- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel kernel/sched.c | 20 ++++++++++---------- 1 files changed, 10 insertions(+), 10 deletions(-) diff --git a/kernel/sched.c b/kernel/sched.c index 48013633d792..a187c3fe027b 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -4111,20 +4111,20 @@ need_resched: try_to_wake_up_local(to_wakeup); } deactivate_task(rq, prev, DEQUEUE_SLEEP); + + /* + * If we are going to sleep and we have plugged IO queued, make + * sure to submit it to avoid deadlocks. + */ + if (blk_needs_flush_plug(prev)) { + raw_spin_unlock(&rq->lock); + blk_flush_plug(prev); + raw_spin_lock(&rq->lock); + } } switch_count = &prev->nvcsw; } - /* - * If we are going to sleep and we have plugged IO queued, make - * sure to submit it to avoid deadlocks. - */ - if (prev->state != TASK_RUNNING && blk_needs_flush_plug(prev)) { - raw_spin_unlock(&rq->lock); - blk_flush_plug(prev); - raw_spin_lock(&rq->lock); - } - pre_schedule(rq, prev); if (unlikely(!rq->nr_running))