From patchwork Sun Aug 27 10:44:43 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amir Goldstein
X-Patchwork-Id: 9923771
From: Amir Goldstein
To: Theodore Ts'o
Cc: Eryu Guan, Josef Bacik, fstests@vger.kernel.org,
	linux-ext4@vger.kernel.org
Subject: [RFC][PATCH] fstest: regression test for ext4 crash consistency bug
Date: Sun, 27 Aug 2017 13:44:43 +0300
Message-Id: <1503830683-21455-1-git-send-email-amir73il@gmail.com>
X-Mailer: git-send-email 2.7.4
Sender: fstests-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: fstests@vger.kernel.org

This test is motivated by a bug found in ext4 during random crash
consistency tests.

This test uses device mapper flakey target to demonstrate the bug
found using device mapper log-writes target.

Signed-off-by: Amir Goldstein
---

Ted,

While working on crash consistency xfstests [1], I stumbled on what
appeared to be an ext4 crash consistency bug.

The tests I used rely on the log-writes dm target code written by
Josef Bacik, which has had little exposure to the wider community as
far as I know. I wanted to prove to myself that the inconsistency I
found was not due to a test bug, so I bisected the failed test down to
the minimal set of operations that trigger the failure and wrote a
small independent test to reproduce the issue using the dm flakey
target.

The following fsck error is reliably reproduced by replaying some fsx
ops on overlapping file regions, then emulating a crash, followed by
mount, umount and fsck -nf:

./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile
1 write 0x137dd thru 0x21445 (0xdc69 bytes)
2 falloc from 0xb531 to 0x16ade (0xb5ad bytes)
3 collapse from 0x1c000 to 0x20000, (0x4000 bytes)
4 write 0x3e5ec thru 0x3ffff (0x1a14 bytes)
5 zero from 0x20fac to 0x27d48, (0x6d9c bytes)
6 mapwrite 0x216ad thru 0x23dfb (0x274f bytes)
All 7 operations completed A-OK!
_check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is inconsistent
*** fsck.ext4 output ***
fsck from util-linux 2.27.1
e2fsck 1.42.13 (17-May-2015)
Pass 1: Checking inodes, blocks, and sizes
Inode 12, end of extent exceeds allowed value
	(logical block 33, physical block 33441, len 7)
Clear? no

Inode 12, i_blocks is 184, should be 128.  Fix? no

Note that the inconsistency is "applied" by journal replay during
mount. Running fsck -nf before the mount does not report any errors.

I did not intend for this test to be merged as is, but rather to be
used by ext4 developers to analyze the problem and then re-write the
test with more comments and less arbitrary offset/length values.

P.S.: Crash consistency tests also reliably reproduce a btrfs fsck
error. A detailed report with I/O recording was sent to Josef.

P.S.2: Crash consistency tests report file data checksum errors on xfs
after fsync+crash, but I still need to prove the reliability of these
reports.

[1] https://github.com/amir73il/xfstests/commits/dm-log-writes

 tests/generic/501     | 80 +++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/501.out |  2 ++
 tests/generic/group   |  1 +
 3 files changed, 83 insertions(+)
 create mode 100755 tests/generic/501
 create mode 100644 tests/generic/501.out

diff --git a/tests/generic/501 b/tests/generic/501
new file mode 100755
index 0000000..ccb513d
--- /dev/null
+++ b/tests/generic/501
@@ -0,0 +1,80 @@
+#! /bin/bash
+# FS QA Test No. 501
+#
+# This test is motivated by a bug found in ext4 during random crash
+# consistency tests.
+#
+#-----------------------------------------------------------------------
+# Copyright (C) 2017 CTERA Networks. All Rights Reserved.
+# Author: Amir Goldstein
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+
+_cleanup()
+{
+	_cleanup_flakey
+	cd /
+	rm -f $tmp.*
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/dmflakey
+
+# real QA test starts here
+_supported_fs generic
+_supported_os Linux
+_require_scratch
+_require_dm_target flakey
+_require_metadata_journaling $SCRATCH_DEV
+
+rm -f $seqres.full
+
+_scratch_mkfs >> $seqres.full 2>&1
+
+_init_flakey
+_mount_flakey
+
+fsxops=$tmp.fsxops
+cat <<EOF > $fsxops
+write 0x137dd 0xdc69 0x0
+fallocate 0xb531 0xb5ad 0x21446
+collapse_range 0x1c000 0x4000 0x21446
+write 0x3e5ec 0x1a14 0x21446
+zero_range 0x20fac 0x6d9c 0x40000 keep_size
+mapwrite 0x216ad 0x274f 0x40000
+EOF
+run_check $here/ltp/fsx -d --replay-ops $fsxops $SCRATCH_MNT/testfile
+
+_flakey_drop_and_remount
+_unmount_flakey
+_cleanup_flakey
+_check_scratch_fs
+
+echo "Silence is golden"
+
+status=0
+exit
diff --git a/tests/generic/501.out b/tests/generic/501.out
new file mode 100644
index 0000000..00133b6
--- /dev/null
+++ b/tests/generic/501.out
@@ -0,0 +1,2 @@
+QA output created by 501
+Silence is golden
diff --git a/tests/generic/group b/tests/generic/group
index 2396b72..bb870f2 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -454,3 +454,4 @@
 449 auto quick acl enospc
 450 auto quick rw
 500 auto log replay
+501 auto quick metadata
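
For anyone analyzing the offset/length values in the fsx ops, the following helper is a rough sketch and not part of the patch. It prints the inclusive byte range and the block range touched by an (offset, length) pair, so the ops can be compared against the logical block numbers in the fsck output. The function name op_range is made up for illustration, and the 4096-byte block size is an assumption; pass the real block size of the scratch filesystem as the third argument if it differs.

```shell
#!/bin/bash
# Hypothetical helper, not part of xfstests or fsx.
# Usage: op_range <offset> <length> [block size]
# Assumption: 4096-byte filesystem blocks unless a third argument is given.
op_range()
{
	local off=$(( $1 )) len=$(( $2 )) bsize=${3:-4096}
	local end=$(( off + len - 1 ))	# last byte touched (inclusive)

	# Byte range in hex, then first and last block index in decimal
	printf '0x%x-0x%x blocks %d-%d\n' \
		$off $end $(( off / bsize )) $(( end / bsize ))
}

# The six ops replayed by the test, in order (offset and length only)
op_range 0x137dd 0xdc69		# write
op_range 0xb531  0xb5ad		# fallocate
op_range 0x1c000 0x4000		# collapse_range
op_range 0x3e5ec 0x1a14		# write
op_range 0x20fac 0x6d9c		# zero_range keep_size
op_range 0x216ad 0x274f		# mapwrite
```

Note that the helper always prints an inclusive end offset, whereas the fsx log prints an inclusive end for write/mapwrite ("thru") and an exclusive end for falloc/collapse/zero ("to"). It also ignores the shift in file offsets caused by collapse_range, so the block numbers are those seen by each op at the time it is issued.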