From patchwork Wed Feb 7 17:06:12 2018
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 10205671
Date: Wed, 7 Feb 2018 09:06:12 -0800
From: Tejun Heo
To: Bart Van Assche
Cc: Jens Axboe, linux-block@vger.kernel.org, Christoph Hellwig
Subject: Re: [PATCH v2] blk-mq: Fix race between resetting the timer and completion handling
Message-ID: <20180207170612.GB695913@devbig577.frc2.facebook.com>
References: <20180207011133.25957-1-bart.vanassche@wdc.com>
In-Reply-To: <20180207011133.25957-1-bart.vanassche@wdc.com>
X-Mailing-List: linux-block@vger.kernel.org

Hello, Bart.

On Tue, Feb 06, 2018 at 05:11:33PM -0800, Bart Van Assche wrote:
> The following race can occur between the code that resets the timer
> and completion handling:
> - The code that handles BLK_EH_RESET_TIMER resets aborted_gstate.
> - A completion occurs and blk_mq_complete_request() calls
>   __blk_mq_complete_request().
> - The timeout code calls blk_add_timer() and that function sets the
>   request deadline and adjusts the timer.
> - __blk_mq_complete_request() frees the request tag.
> - The timer fires and the timeout handler gets called for a freed
>   request.

Can you see whether by any chance the following patch fixes the issue?
If not, can you share the repro case?

Thanks.

diff --git a/block/blk-mq.c b/block/blk-mq.c
index df93102..651d18c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -836,8 +836,8 @@ static void blk_mq_rq_timed_out(struct request *req, bool reserved)
 		 * ->aborted_gstate is set, this may lead to ignored
 		 * completions and further spurious timeouts.
 		 */
-		blk_mq_rq_update_aborted_gstate(req, 0);
 		blk_add_timer(req);
+		blk_mq_rq_update_aborted_gstate(req, 0);
 		break;
 	case BLK_EH_NOT_HANDLED:
 		break;
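
For illustration only, the ordering question can be sketched as a single-threaded toy model in userspace C. This is not kernel code: `struct toy_request`, its fields, and every `toy_*` function are hypothetical stand-ins, and the model assumes only what the quoted description and the comment in the hunk state, namely that a completion is ignored while the aborted marker is set and accepted (freeing the request) once it is cleared. It replays one completion arriving between the two steps of the reset-timer path, once in each order.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct request; every field and name here is hypothetical. */
struct toy_request {
	bool aborted;            /* models "aborted_gstate matches the current gstate" */
	bool freed;              /* models "the request tag has been freed" */
	bool timer_on_freed_rq;  /* set if the timer was re-armed after the free */
};

/* Models the completion path: ignored while the request is marked aborted. */
static bool toy_complete(struct toy_request *rq)
{
	if (rq->aborted)
		return false;    /* completion dropped */
	rq->freed = true;        /* completion accepted, tag released */
	return true;
}

/* Models re-arming the timeout; records the bug if the request is already freed. */
static void toy_add_timer(struct toy_request *rq)
{
	if (rq->freed)
		rq->timer_on_freed_rq = true;
}

/* Models clearing the aborted marker so completions are accepted again. */
static void toy_clear_aborted(struct toy_request *rq)
{
	rq->aborted = false;
}

enum toy_order { TOY_CLEAR_THEN_ARM, TOY_ARM_THEN_CLEAR };

/*
 * Run the reset-timer path with one completion arriving between its two
 * steps. Returns true if the timer ended up armed on a freed request.
 */
static bool toy_reset_timer_races(enum toy_order order)
{
	struct toy_request rq = { .aborted = true };

	if (order == TOY_CLEAR_THEN_ARM) {
		toy_clear_aborted(&rq);
		toy_complete(&rq);   /* accepted: frees the request */
		toy_add_timer(&rq);  /* timer armed on a freed request */
	} else {
		toy_add_timer(&rq);  /* request not yet freed when the timer is armed */
		toy_complete(&rq);   /* dropped: still marked aborted */
		toy_clear_aborted(&rq);
	}
	/* In both orders the marker ends up cleared. */
	assert(!rq.aborted);
	return rq.timer_on_freed_rq;
}
```

In this model, `toy_reset_timer_races(TOY_CLEAR_THEN_ARM)` reports the timer armed on a freed request, while `toy_reset_timer_races(TOY_ARM_THEN_CLEAR)` does not. The completion that lands inside the window is dropped in the second order, so the model relies on the re-armed timer to pick the request up again; whether that matches the real completion path under all interleavings is exactly what the repro case would need to confirm.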