From patchwork Thu Apr 20 21:41:21 2017
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 9691377
Subject: Re: [PATCH] blk-mq: fix schedule-while-atomic with scheduler attached
To: Omar Sandoval
References: <20170420213040.GA30116@vader.DHCP.thefacebook.com>
Cc: "linux-block@vger.kernel.org", Christoph Hellwig
From: Jens Axboe
Message-ID: <79636c0b-8201-2b68-ae08-f7c3b8e17aac@kernel.dk>
Date: Thu, 20 Apr 2017 15:41:21 -0600
In-Reply-To: <20170420213040.GA30116@vader.DHCP.thefacebook.com>

On 04/20/2017 03:30 PM, Omar Sandoval wrote:
> On Thu, Apr 20, 2017 at 03:13:43PM -0600, Jens Axboe wrote:
>> We must have dropped the ctx before we call
>> blk_mq_sched_insert_request() with can_block=true, otherwise we risk
>> that a flush request can block on insertion if we are currently out of
>> tags.
>>
>> [ 47.667190] BUG: scheduling while atomic: jbd2/sda2-8/2089/0x00000002
>> [ 47.674493] Modules linked in: x86_pkg_temp_thermal btrfs xor zlib_deflate raid6_pq sr_mod cdre
>> [ 47.690572] Preemption disabled at:
>> [ 47.690584]  [] blk_mq_sched_get_request+0x6c/0x280
>> [ 47.701764] CPU: 1 PID: 2089 Comm: jbd2/sda2-8 Not tainted 4.11.0-rc7+ #271
>> [ 47.709630] Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.3.4 11/09/2016
>> [ 47.718081] Call Trace:
>> [ 47.720903]  dump_stack+0x4f/0x73
>> [ 47.724694]  ? blk_mq_sched_get_request+0x6c/0x280
>> [ 47.730137]  __schedule_bug+0x6c/0xc0
>> [ 47.734314]  __schedule+0x559/0x780
>> [ 47.738302]  schedule+0x3b/0x90
>> [ 47.741899]  io_schedule+0x11/0x40
>> [ 47.745788]  blk_mq_get_tag+0x167/0x2a0
>> [ 47.750162]  ? remove_wait_queue+0x70/0x70
>> [ 47.754901]  blk_mq_get_driver_tag+0x92/0xf0
>> [ 47.759758]  blk_mq_sched_insert_request+0x134/0x170
>> [ 47.765398]  ? blk_account_io_start+0xd0/0x270
>> [ 47.770679]  blk_mq_make_request+0x1b2/0x850
>> [ 47.775766]  generic_make_request+0xf7/0x2d0
>> [ 47.780860]  submit_bio+0x5f/0x120
>> [ 47.784979]  ? submit_bio+0x5f/0x120
>> [ 47.789631]  submit_bh_wbc.isra.46+0x10d/0x130
>> [ 47.794902]  submit_bh+0xb/0x10
>> [ 47.798719]  journal_submit_commit_record+0x190/0x210
>> [ 47.804686]  ? _raw_spin_unlock+0x13/0x30
>> [ 47.809480]  jbd2_journal_commit_transaction+0x180a/0x1d00
>> [ 47.815925]  kjournald2+0xb6/0x250
>> [ 47.820022]  ? kjournald2+0xb6/0x250
>> [ 47.824328]  ? remove_wait_queue+0x70/0x70
>> [ 47.829223]  kthread+0x10e/0x140
>> [ 47.833147]  ? commit_timeout+0x10/0x10
>> [ 47.837742]  ? kthread_create_on_node+0x40/0x40
>> [ 47.843122]  ret_from_fork+0x29/0x40
>>
>> Fixes: a4d907b6a33b ("blk-mq: streamline blk_mq_make_request")
>> Signed-off-by: Jens Axboe
>>
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index 9d7645f24b05..323eed50d615 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -1634,8 +1634,10 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>>  		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
>>  		return cookie;
>>  	} else if (q->elevator) {
>> +		blk_mq_put_ctx(data.ctx);
>>  		blk_mq_bio_to_request(rq, bio);
>>  		blk_mq_sched_insert_request(rq, false, true, true, true);
>> +		return cookie;
>>  	} else if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio))
>>  		blk_mq_run_hw_queue(data.hctx, true);
>>
>
> I'm confused, the first thing we check in make_request is:
>
> 	if (unlikely(is_flush_fua)) {
> 		blk_mq_bio_to_request(rq, bio);
> 		if (q->elevator) {
> 			blk_mq_sched_insert_request(rq, false, true, true,
> 						    true);
> 		} else {
> 			blk_insert_flush(rq);
> 			blk_mq_run_hw_queue(data.hctx, true);
> 		}
> 	}
>
> and can_block doesn't do anything in the !flush case, so shouldn't it be
> changed in that one instead?

It should be changed in both cases.
But I do wonder why we're triggering the bottom one, and not the top
one, since this is clearly a flush.

> Alternatively, blk_mq_get_tag() knows how to drop the ctx before it
> sleeps, but blk_mq_get_driver_tag() doesn't set data.ctx. I'm not sure
> if that'll do what we want, and this is making my head hurt.

How about the below? Drop the ctx in all cases. We only really want to
hold on to it for the case where we insert into the software queues. But
as long as we guarantee that every case in that massive if/else drops
it, we should be fine, and we can remove the manual returns that existed
only to avoid hitting the put_ctx/return-cookie path at the end.

diff --git a/block/blk-mq.c b/block/blk-mq.c
index dd9e8d6d471d..992f09772f8a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1571,6 +1571,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 
 	plug = current->plug;
 	if (unlikely(is_flush_fua)) {
+		blk_mq_put_ctx(data.ctx);
 		blk_mq_bio_to_request(rq, bio);
 		if (q->elevator) {
 			blk_mq_sched_insert_request(rq, false, true, true,
@@ -1582,6 +1583,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 	} else if (plug && q->nr_hw_queues == 1) {
 		struct request *last = NULL;
 
+		blk_mq_put_ctx(data.ctx);
 		blk_mq_bio_to_request(rq, bio);
 
 		/*
@@ -1626,20 +1628,19 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 
 		if (same_queue_rq)
 			blk_mq_try_issue_directly(data.hctx, same_queue_rq,
 						  &cookie);
-
-		return cookie;
 	} else if (q->nr_hw_queues > 1 && is_sync) {
 		blk_mq_put_ctx(data.ctx);
 		blk_mq_bio_to_request(rq, bio);
 		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
-		return cookie;
 	} else if (q->elevator) {
+		blk_mq_put_ctx(data.ctx);
 		blk_mq_bio_to_request(rq, bio);
 		blk_mq_sched_insert_request(rq, false, true, true, true);
-	} else if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio))
+	} else if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio)) {
+		blk_mq_put_ctx(data.ctx);
 		blk_mq_run_hw_queue(data.hctx, true);
+	}
 
-	blk_mq_put_ctx(data.ctx);
 	return cookie;
 }
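
For context on why holding the ctx makes this path atomic in the first
place: blk_mq_get_ctx() looks up the per-CPU software queue and pins the
CPU with get_cpu(), so preemption stays disabled until blk_mq_put_ctx().
Roughly like this (a simplified sketch of the helpers in block/blk-mq.h,
not the verbatim source):

/* Mapping the current CPU to its software queue disables preemption... */
static inline struct blk_mq_ctx *blk_mq_get_ctx(struct request_queue *q)
{
	return __blk_mq_get_ctx(q, get_cpu());	/* get_cpu() disables preemption */
}

/* ...and it stays disabled until the ctx is dropped again */
static inline void blk_mq_put_ctx(struct blk_mq_ctx *ctx)
{
	put_cpu();	/* re-enables preemption */
}

That is why anything that may sleep waiting for a tag, like the
blk_mq_get_driver_tag() -> blk_mq_get_tag() path in the trace above, must
only run after blk_mq_put_ctx(data.ctx), which is what the hunks here
make sure of.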