From patchwork Sat Dec 2 00:03:48 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Heinz Mauelshagen X-Patchwork-Id: 10088333 X-Patchwork-Delegate: snitzer@redhat.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 25B8360327 for ; Sat, 2 Dec 2017 00:04:34 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1734C2A0BA for ; Sat, 2 Dec 2017 00:04:34 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0BF112A88F; Sat, 2 Dec 2017 00:04:34 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 0EEB82A851 for ; Sat, 2 Dec 2017 00:04:32 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 38B6D80476; Sat, 2 Dec 2017 00:04:32 +0000 (UTC) Received: from colo-mx.corp.redhat.com (colo-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.20]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 1DC6460C97; Sat, 2 Dec 2017 00:04:32 +0000 (UTC) Received: from lists01.pubmisc.prod.ext.phx2.redhat.com (lists01.pubmisc.prod.ext.phx2.redhat.com [10.5.19.33]) by colo-mx.corp.redhat.com (Postfix) with ESMTP id D9AFD1801210; Sat, 2 Dec 2017 00:04:31 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) by lists01.pubmisc.prod.ext.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id vB2046wd015126 for ; Fri, 1 Dec 2017 19:04:06 -0500 Received: by smtp.corp.redhat.com (Postfix) id E0EFC5C578; Sat, 2 Dec 2017 00:04:06 +0000 (UTC) Delivered-To: dm-devel@redhat.com Received: from redhat.com.com (unknown [10.40.205.187]) by smtp.corp.redhat.com (Postfix) with ESMTP id 24FA55C544; Sat, 2 Dec 2017 00:04:05 +0000 (UTC) From: Heinz Mauelshagen To: dm-devel@redhat.com Date: Sat, 2 Dec 2017 01:03:48 +0100 Message-Id: In-Reply-To: References: In-Reply-To: References: X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-loop: dm-devel@redhat.com Cc: snitzer@redhat.com Subject: [dm-devel] [PATCH V2 01/12] dm raid: fix deadlock caused by stopped writes X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.12 Precedence: junk List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: dm-devel-bounces@redhat.com Errors-To: dm-devel-bounces@redhat.com X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Sat, 02 Dec 2017 00:04:32 +0000 (UTC) X-Virus-Scanned: ClamAV using ClamSMTP md_stop_writes() is called in presuspend causing deadlocks on bios submitted afterwards which happens on loaded raid sets with conversion requests. Fix by moving md_stop_writes to postsuspend. Hence the raid set is quiesced remove superfluous readonly setting too. Adjust target version to be able to recognize the fix. --- Documentation/device-mapper/dm-raid.txt | 3 ++- drivers/md/dm-raid.c | 20 +++++++++----------- 2 files changed, 11 insertions(+), 12 deletions(-) diff --git a/Documentation/device-mapper/dm-raid.txt b/Documentation/device-mapper/dm-raid.txt index 32df07e29f68..4d260fedcd8b 100644 --- a/Documentation/device-mapper/dm-raid.txt +++ b/Documentation/device-mapper/dm-raid.txt @@ -343,5 +343,6 @@ Version History 1.11.0 Fix table line argument order (wrong raid10_copies/raid10_format sequence) 1.11.1 Add raid4/5/6 journal write-back support via journal_mode option -1.12.1 fix for MD deadlock between mddev_suspend() and md_write_start() available +1.12.1 Fix for MD deadlock between mddev_suspend() and md_write_start() available 1.13.0 Fix dev_health status at end of "recover" (was 'a', now 'A') +1.13.1 Fix deadlock caused by early md_stop_writes() diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c index 6319d846e0ad..536666facbf1 100644 --- a/drivers/md/dm-raid.c +++ b/drivers/md/dm-raid.c @@ -3613,24 +3613,23 @@ static void raid_io_hints(struct dm_target *ti, struct queue_limits *limits) blk_limits_io_opt(limits, chunk_size * mddev_data_stripes(rs)); } -static void raid_presuspend(struct dm_target *ti) -{ - struct raid_set *rs = ti->private; - - md_stop_writes(&rs->md); -} - static void raid_postsuspend(struct dm_target *ti) { struct raid_set *rs = ti->private; if (!test_and_set_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags)) { + /* + * Writes have to be stopped before suspending to avoid deadlocks. + * + * https://bugzilla.redhat.com/show_bug.cgi?id=1514539 + */ + if (!test_bit(MD_RECOVERY_FROZEN, &rs->md.recovery)) + md_stop_writes(&rs->md); + mddev_lock_nointr(&rs->md); mddev_suspend(&rs->md); mddev_unlock(&rs->md); } - - rs->md.ro = 1; } static void attempt_restore_of_faulty_devices(struct raid_set *rs) @@ -3894,7 +3893,7 @@ static void raid_resume(struct dm_target *ti) static struct target_type raid_target = { .name = "raid", - .version = {1, 13, 0}, + .version = {1, 13, 1}, .module = THIS_MODULE, .ctr = raid_ctr, .dtr = raid_dtr, @@ -3903,7 +3902,6 @@ static struct target_type raid_target = { .message = raid_message, .iterate_devices = raid_iterate_devices, .io_hints = raid_io_hints, - .presuspend = raid_presuspend, .postsuspend = raid_postsuspend, .preresume = raid_preresume, .resume = raid_resume,