From patchwork Wed Nov 9 02:02:52 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 9418355
Subject: Re: BUG: Hung task timeouts in for-4.10/dio
To: Damien Le Moal, Christoph Hellwig
References: <8eed541f-b4c2-37c9-6650-216f3d0aa92f@fb.com>
 <21f0cddd-28d9-790d-b4fc-4635cca77ea8@deltatee.com>
 <9fe151a1-2d9f-28ef-25a7-bbcb01d8d972@fb.com>
 <6997fb64-aaac-208f-fa89-b8b4567f7321@fb.com>
 <6d532d4a-8d35-f610-65ed-efef949a3686@fb.com>
 <84693d82-b8dc-42b3-03da-055def7499a0@deltatee.com>
 <1e0e590b-fde3-a86d-bc60-8e5465928533@fb.com>
 <84cf8089-7d6d-0146-49fb-91a2d08650b2@wdc.com>
 <20161109010959.GA32126@lst.de>
Cc: Logan Gunthorpe, linux-block@vger.kernel.org, Mike Snitzer
From: Jens Axboe
Message-ID: <03cda65e-3a82-6ccd-861b-67a55888eb19@kernel.dk>
Date: Tue, 8 Nov 2016 19:02:52 -0700
Sender: linux-block-owner@vger.kernel.org
X-Mailing-List: linux-block@vger.kernel.org

On 11/08/2016 06:28 PM, Jens Axboe wrote:
> On 11/08/2016 06:25 PM, Damien Le Moal wrote:
>>
>> On 11/9/16 10:09, Christoph Hellwig wrote:
>>> Ok, sounds like I'm really the one to blame. I'll see if I can
>>> find a reproducer. Damien, or you using device mapper on that
>>> system?
>>
>> No LVM/md/dm used on boot. Mount is direct to the block device (SSDs
>> with ext4). The devices are simple SSDs, so no polling involved.
>>
>> The hangs suspiciously look like they are either background write or
>> flush. So I was wondering if it is indeed related to FLUSH/FUA as Logan
>> suggested or the background write stuff, rather than the direct-IO
>> optimization & polling.
>
> The background write stuff is not in either of those branches, plus the
> backtrace would have looked different. Yours is showing us waiting for a
> request.
> I don't think it's the direct-io or polling code, it looks like
> a generic issue.
>
>> Will try again/bisect to see if I can get more info.
>
> Maybe try and revert the one that Logan pointed his finger at, if that
> is doable.

It smells like an accounting error. One thing that I don't like with the
current scheme is the implicit knowledge that certain flags imply sync
as well. If we clear any of those flags, then we screw up accounting at
the end. Does this make a difference?

diff --git a/block/blk-flush.c b/block/blk-flush.c
index c486b7aa62ee..d70983e28115 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -395,6 +395,8 @@ void blk_insert_flush(struct request *rq)
 	if (!(fflags & (1UL << QUEUE_FLAG_FUA)))
 		rq->cmd_flags &= ~REQ_FUA;
 
+	rq->cmd_flags |= REQ_SYNC;
+
 	/*
 	 * An empty flush handed down from a stacking driver may
 	 * translate into nothing if the underlying device does not