From patchwork Thu Apr 12 16:23:21 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alan Jenkins
X-Patchwork-Id: 10339003
From: Alan Jenkins
To: Jens Axboe, linux-block@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Alan Jenkins, stable@vger.kernel.org
Subject: [PATCH] block: do not use interruptible wait anywhere
Date: Thu, 12 Apr 2018 17:23:21 +0100
Message-Id: <20180412162321.3671-1-alan.christopher.jenkins@gmail.com>
X-Mailer: git-send-email 2.14.3
X-Mailing-List: linux-block@vger.kernel.org

When blk_queue_enter() waits for a queue to unfreeze, or unset the
PREEMPT_ONLY flag, do not allow it to be interrupted by a signal.

The PREEMPT_ONLY flag was introduced later than the interruptible wait,
in commit 3a0a529971ec ("block, scsi: Make SCSI quiesce and resume work
reliably").  Note that the SCSI device is resumed asynchronously, i.e.
after un-freezing userspace tasks.

So that commit exposed the bug as a regression in v4.15.  A mysterious
SIGBUS (or -EIO) sometimes happened during the time the device was being
resumed.  Most frequently, there was no kernel log message, and we saw Xorg
or Xwayland killed by SIGBUS.[1]

[1] E.g. https://bugzilla.redhat.com/show_bug.cgi?id=1553979

Without this fix, I get an I/O error in this test:

# dd if=/dev/sda of=/dev/null iflag=direct & \
  while killall -SIGUSR1 dd; do sleep 0.1; done & \
  echo mem > /sys/power/state ; \
  sleep 5; killall dd  # stop after 5 seconds

The interruptible wait was added to blk_queue_enter() in
commit 3ef28e83ab15 ("block: generic request_queue reference counting").
Before then, the interruptible wait was only in blk-mq, but I don't think
it could ever have been correct.

Cc: stable@vger.kernel.org
Signed-off-by: Alan Jenkins
Reviewed-by: Bart Van Assche
---
 block/blk-core.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index abcb8684ba67..5a6d20069364 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -915,7 +915,6 @@ int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags)
 
 	while (true) {
 		bool success = false;
-		int ret;
 
 		rcu_read_lock();
 		if (percpu_ref_tryget_live(&q->q_usage_counter)) {
@@ -947,14 +946,12 @@ int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags)
 		 */
 		smp_rmb();
 
-		ret = wait_event_interruptible(q->mq_freeze_wq,
+		wait_event(q->mq_freeze_wq,
 				(atomic_read(&q->mq_freeze_depth) == 0 &&
 				 (preempt || !blk_queue_preempt_only(q))) ||
 				blk_queue_dying(q));
 		if (blk_queue_dying(q))
 			return -ENODEV;
-		if (ret)
-			return ret;
 	}
 }
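
Illustration only, not part of the patch: a minimal userspace sketch of why
an interruptible wait can hand a spurious error to its caller.  It is an
analogy under stated assumptions, not kernel code: POSIX nanosleep() stands
in for the wait on q->mq_freeze_wq, SIGALRM stands in for the SIGUSR1 sent
by the test above, and the names wait_interruptible(), wait_uninterruptible()
and wait_with_signal() are hypothetical helpers invented for this example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* No-op handler: the signal is handled, so it interrupts a blocking call
 * instead of terminating the process. */
static void on_alarm(int sig)
{
	(void)sig;
}

/* Rough analogue of wait_event_interruptible(): a signal aborts the wait
 * and the error is passed back to the caller. */
static int wait_interruptible(struct timespec ts)
{
	return nanosleep(&ts, NULL) == -1 ? -errno : 0;
}

/* Rough analogue of wait_event(): a signal does not end the wait; keep
 * sleeping for the remaining time until the wait has completed. */
static int wait_uninterruptible(struct timespec ts)
{
	while (nanosleep(&ts, &ts) == -1) {
		if (errno != EINTR)
			return -errno;
	}
	return 0;
}

/* Hypothetical driver: deliver SIGALRM 50 ms into a 200 ms wait and
 * return whatever the chosen wait function reports. */
static int wait_with_signal(int (*wait_fn)(struct timespec))
{
	struct sigaction sa;
	struct itimerval timer = { .it_value = { .tv_sec = 0, .tv_usec = 50000 } };
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 200 * 1000 * 1000 };

	assert(wait_fn != NULL);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGALRM, &sa, NULL) == -1)
		return -errno;

	/* One-shot timer: it_interval is zero, so SIGALRM fires once. */
	if (setitimer(ITIMER_REAL, &timer, NULL) == -1)
		return -errno;

	return wait_fn(ts);
}
```

Calling wait_with_signal(wait_interruptible) should report -EINTR even
though nothing went wrong with the wait itself, while
wait_with_signal(wait_uninterruptible) should return 0 after the full
delay.  In the kernel the interrupted wait returns -ERESTARTSYS rather than
-EINTR, and the commit message above describes that error reaching
userspace as -EIO or SIGBUS.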