From patchwork Wed Jul 7 00:21:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 12361339 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 70A88C07E96 for ; Wed, 7 Jul 2021 00:21:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5641161CB2 for ; Wed, 7 Jul 2021 00:21:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230006AbhGGAXw (ORCPT ); Tue, 6 Jul 2021 20:23:52 -0400 Received: from mail.kernel.org ([198.145.29.99]:52180 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229873AbhGGAXw (ORCPT ); Tue, 6 Jul 2021 20:23:52 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id C027761CAC; Wed, 7 Jul 2021 00:21:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1625617272; bh=LgYaZN4M2MNTfb/b/VqXPJGc/4nMrpa9oSJ5ivd44/c=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=pqX2RzXK6OlQcUx8pokSWw90s8F23HyjeKR0a3l/+knGqXlbfV5dyjlQx8xOFc/cz o5gnWE8tqHIKWCi6qPMhKwVUqU7xvBSAMnFLVkpTGA4c9f5+MFcdpbdWH3ybcHH53r STlc+rZ5ouWb7DECArOFs2R3a5jMovUJcxGkDkQgefZSfLk6+PCxqB4wsLah27yLee dynbNkXLWq4BOzRi+L+lm6xnln9uFewT4RyCC8xastOIky8PkxJYZpinFDyqLUPtJb FYhtL/Iop9KtGeaP5xob2oH6XGErHTNxAFr+5Fw+X1SGCIVbXCqezq33gPg7pJsUzK 5Dv19+j0K03ow== Subject: [PATCH 1/8] xfs/172: disable test when file writes don't use delayed allocation From: "Darrick J. 
Wong"
To: djwong@kernel.org, guaneryu@gmail.com
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me
Date: Tue, 06 Jul 2021 17:21:12 -0700
Message-ID: <162561727244.543423.13321546742830675478.stgit@locust>
In-Reply-To: <162561726690.543423.15033740972304281407.stgit@locust>
References: <162561726690.543423.15033740972304281407.stgit@locust>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: fstests@vger.kernel.org

From: Darrick J. Wong

This test tries to exploit an interaction between delayed allocation and
writeback on full filesystems to see if it can trip up the filestreams
allocator. These behaviors do not occur if the filesystem allocates
space at write time, so disable the test under these scenarios.

Signed-off-by: Darrick J. Wong
Reviewed-by: Allison Henderson
---
 tests/xfs/172 | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/tests/xfs/172 b/tests/xfs/172
index 0d1b441e..c0495305 100755
--- a/tests/xfs/172
+++ b/tests/xfs/172
@@ -16,9 +16,37 @@ _begin_fstest rw filestreams
 
 # real QA test starts here
 _supported_fs xfs
-
+_require_command "$FILEFRAG_PROG" filefrag
 _require_scratch
 
+# The first _test_streams call sets up the filestreams allocator to fail and
+# then checks that it actually failed. It does this by creating a very small
+# filesystem, writing a lot of data in parallel to separate streams, and then
+# flushing the dirty data, also in parallel. To trip the allocator, the test
+# relies on writeback combining adjacent dirty ranges into large allocation
+# requests which eventually bleed across AGs. This happens either because the
+# big writes are slow enough that filestreams contexts expire between
+# allocation requests, or because the AGs are so full at allocation time that
+# the bmapi allocator decides to scan for a less full AG. Either way, stream
+# directories share AGs, which is what the test considers a success.
+#
+# However, this only happens if writes use the delayed allocation code paths.
+# If the kernel allocates small amounts of space at the time of each write()
+# call, the successive small allocations never trip the bmapi allocator's
+# rescan thresholds and will keep pushing out the expiration time, with the
+# result that the filestreams allocator succeeds in maintaining the streams.
+# The test considers this a failure.
+#
+# Make sure that a regular buffered write produces delalloc reservations.
+# This effectively disables the test for files with extent size hints or DAX
+# mode set.
+_scratch_mkfs > $seqres.full
+_scratch_mount
+$XFS_IO_PROG -f -c 'pwrite 0 64k' $SCRATCH_MNT/testy &> /dev/null
+$FILEFRAG_PROG -v $SCRATCH_MNT/testy 2>&1 | grep -q delalloc || \
+	_notrun "test requires delayed allocation buffered writes"
+_scratch_unmount
+
 _check_filestreams_support || _notrun "filestreams not available"
 
 # test reaper works by setting timeout low. Expected to fail

From patchwork Wed Jul 7 00:21:17 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 12361341 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0CB43C07E96 for ; Wed, 7 Jul 2021 00:21:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EA7D061CB2 for ; Wed, 7 Jul 2021 00:21:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229891AbhGGAX5 (ORCPT ); Tue, 6 Jul 2021 20:23:57 -0400 Received: from mail.kernel.org ([198.145.29.99]:52322 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229873AbhGGAX5 (ORCPT ); Tue, 6 Jul 2021 20:23:57 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 3CEBF61CAC; Wed, 7 Jul 2021 00:21:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1625617278; bh=NO12Tw6QZY5DocZQ9diAFbXa7mcr1ltqynm3BdIKO2I=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=bBbnEUmU7hsYQNwcwAum9oBg75rMyU+bqrvsDXzgs14nbBdci9Ku8QA4inj85HfAw NE6DzD1OKBr/2HfXhIeHxgPTRVJgeVWQgZ4rhNblY2XUUH/+X694kU+mnW4gAkucmi /kMikwtyvHRqg1WnQ4U6efH+JB5a23mwAVbYHDhbJ4P7eT3NK6yLpT285YIMOH/3lX ohFqnWIgo4tZTzqlquU+r+LIooFwQgo/h/aFm97/+prBBavKAOHg06BtmqkTN0uesS uZQEv26MGsffmqJuMAUPXtknHgJSO/pJMY/1XwAJsHGa3PcHevHNQefQgrzjiL8958 QYa06VVuEh65w== Subject: [PATCH 2/8] generic/561: hide assertions when duperemove is killed From: "Darrick J. 
Wong"
To: djwong@kernel.org, guaneryu@gmail.com
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me
Date: Tue, 06 Jul 2021 17:21:17 -0700
Message-ID: <162561727795.543423.1496821526582808789.stgit@locust>
In-Reply-To: <162561726690.543423.15033740972304281407.stgit@locust>
References: <162561726690.543423.15033740972304281407.stgit@locust>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: fstests@vger.kernel.org

From: Darrick J. Wong

Use some bash redirection trickery to capture in $seqres.full all of
bash's warnings about duperemove being killed due to assertions
triggering.

Signed-off-by: Darrick J. Wong
Reviewed-by: Allison Henderson
---
 tests/generic/561 | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tests/generic/561 b/tests/generic/561
index bfd4443d..85037e50 100755
--- a/tests/generic/561
+++ b/tests/generic/561
@@ -62,8 +62,13 @@ dupe_run=$TEST_DIR/${seq}-running
 touch $dupe_run
 for ((i = 0; i < $((2 * LOAD_FACTOR)); i++)); do
 	while [ -e $dupe_run ]; do
-		$DUPEREMOVE_PROG -dr --dedupe-options=same $testdir \
-			>>$seqres.full 2>&1
+		# Employ shell trickery here so that the golden output does not
+		# capture assertions that trigger when killall shoots down
+		# duperemove processes in an arbitrary order, which leaves the
+		# memory in an inconsistent state long enough for the assert
+		# to trip.
+		cmd="$DUPEREMOVE_PROG -dr --dedupe-options=same $testdir"
+		bash -c "$cmd" >> $seqres.full 2>&1
 	done 2>&1 | sed -e '/Terminated/d' &
 	dedup_pids="$! $dedup_pids"
 done

From patchwork Wed Jul 7 00:21:23 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 12361343 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 89096C07E96 for ; Wed, 7 Jul 2021 00:21:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 71FD761CAD for ; Wed, 7 Jul 2021 00:21:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230033AbhGGAYD (ORCPT ); Tue, 6 Jul 2021 20:24:03 -0400 Received: from mail.kernel.org ([198.145.29.99]:52480 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229873AbhGGAYD (ORCPT ); Tue, 6 Jul 2021 20:24:03 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id BC5CF61CAC; Wed, 7 Jul 2021 00:21:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1625617283; bh=84NI+vdNCAjgNt1u6M0ETco3h9hnPRmdnm7wHHgqLPI=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=tNHwe3zmzmCbkX5Ph1ib4GgjvpMYpEOR2peO1NKU4vA6wpAqSnSZWElbDrNXk/MJG fSKZ+V8CcYGXGkuO5i9aVv9KZXrWLKqS16PjuJ93o33nr3TuxGfI5d5AvUpLMrtd2W l1u1diQjlM0MzqIbVcB6/iG8qx18uMbLy/bGVHziyK0sKSzBwVsrtCE6/piFNzC6MK +kSLlulv/S4dBsKolmpG8u6jT0725k/1vDFcKPQFtSXp6/xM8Y4YyPbv87pbKf3mvG 5TA7mxp/iCmo2SPN9SF/qP7Zr2qyeIcXeYxt+BhOtIYiK6eEpg8PTsJEzbnEBKHHL7 257mLtaMHkQpA== Subject: [PATCH 3/8] shared/298: fix random deletion when filenames contain spaces From: "Darrick J. 
Wong"
To: djwong@kernel.org, guaneryu@gmail.com
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me
Date: Tue, 06 Jul 2021 17:21:23 -0700
Message-ID: <162561728342.543423.12599584091972556414.stgit@locust>
In-Reply-To: <162561726690.543423.15033740972304281407.stgit@locust>
References: <162561726690.543423.15033740972304281407.stgit@locust>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: fstests@vger.kernel.org

From: Darrick J. Wong

Correct the deletion loop in this test to work properly when there are
files in $here that have spaces in their names.

Signed-off-by: Darrick J. Wong
Reviewed-by: Allison Henderson
---
 tests/shared/298 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/shared/298 b/tests/shared/298
index 981a4dfc..bd52b6a0 100755
--- a/tests/shared/298
+++ b/tests/shared/298
@@ -163,7 +163,7 @@ get_holes $img_file > $fiemap_ref
 
 # Delete some files
 find $loop_mnt -type f -print | $AWK_PROG \
-	'BEGIN {srand()}; {if(rand() > 0.7) print $1;}' | xargs rm
+	'BEGIN {srand()}; {if(rand() > 0.7) printf("%s\0", $0);}' | xargs -0 rm
 echo "done."
 
 echo -n "Running fstrim..."

From patchwork Wed Jul 7 00:21:28 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 12361345 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AD097C07E96 for ; Wed, 7 Jul 2021 00:21:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 95B1A61CAC for ; Wed, 7 Jul 2021 00:21:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229876AbhGGAYJ (ORCPT ); Tue, 6 Jul 2021 20:24:09 -0400 Received: from mail.kernel.org ([198.145.29.99]:52572 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229873AbhGGAYI (ORCPT ); Tue, 6 Jul 2021 20:24:08 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 4DEEA619B9; Wed, 7 Jul 2021 00:21:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1625617289; bh=wGU90Unr0g1HSxXhPYJwFr3DMSyFOBLmzQzyrlTUh00=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=igRqwbYF1sbyM1fhVJKt1T+vMkPsmLC5OT5+s7YFGEcNT6APetTRscVrL99y+lJAz 0333gJ8TsrTvXWcTtOd1mnZP5rtP/FJoJHzVleE2W3mjufj2wgkMFftiWbCLnNjB1p DehjbqXmAN7Luit6PYC7cmFibneUb17DY3ddTnrzC+Qq2BbZP/xNxtr+yfh2s5T++Y khwTFLJogNpDtKTV2eN5TTU5ol/jm1OLz413RkENiDkXziQ0aU0uEdM20HMGNbsQvy Q8uNnATjbj6HBBaMzNkJiz/H9Z9OetOXAkEs9phJkY5npYsGLZnQD9n1xRItrUcpTg 1wUxgQGQU8XyA== Subject: [PATCH 4/8] dmthin: erase the metadata device properly before starting From: "Darrick J. 
Wong"
To: djwong@kernel.org, guaneryu@gmail.com
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me
Date: Tue, 06 Jul 2021 17:21:28 -0700
Message-ID: <162561728893.543423.5093723938379703860.stgit@locust>
In-Reply-To: <162561726690.543423.15033740972304281407.stgit@locust>
References: <162561726690.543423.15033740972304281407.stgit@locust>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: fstests@vger.kernel.org

From: Darrick J. Wong

Every now and then I see the following failure when running generic/347:

    --- generic/347.out
    +++ generic/347.out.bad
    @@ -1,2 +1,2 @@
     QA output created by 347
    -=== completed
    +failed to create dm thin pool device

Accompanied by the following dmesg spew:

    device-mapper: thin metadata: sb_check failed: blocknr 7016996765293437281: wanted 0
    device-mapper: block manager: superblock validator check failed for block 0
    device-mapper: thin metadata: couldn't read superblock
    device-mapper: table: 253:2: thin-pool: Error creating metadata object
    device-mapper: ioctl: error adding target to table

7016996765293437281 is of course the magic number 0x6161616161616161,
which is stale ondisk content left behind by previous tests that wrote
known test patterns to files on the scratch device. This is a bit
surprising, since _dmthin_init supposedly zeroes the first 4k of the
thin pool metadata device before initializing the pool. Or does it?

    dd if=/dev/zero of=$DMTHIN_META_DEV bs=4096 count=1 &>/dev/null

Herein lies the problem: the dd process writes zeroes into the page
cache and exits. Normally the block layer will flush the page cache
after the last file descriptor is closed, but once in a while the
terminating dd process won't be the only process in the system with an
open file descriptor!

The other process is of course udev. The write() call from dd triggers
a kernel uevent, which starts udev. If udev is running particularly
slowly, it'll still be running an instant later when dd terminates,
thereby preventing the page cache flush. If udev is still running a
moment later when we call dmsetup to set up the thin pool, the pool
creation will issue a bio to read the ondisk superblock. This read
isn't coherent with the page cache, so it sees old disk contents and the
test fails even though we supposedly formatted the metadata device.

Fix this by explicitly flushing the page cache after writing the zeroes.

Fixes: 4b52fffb ("dm-thinp helpers in common/dmthin")
Signed-off-by: Darrick J. Wong
Reviewed-by: Allison Henderson
---
 common/dmthin | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/common/dmthin b/common/dmthin
index 3b1c7d45..91147e47 100644
--- a/common/dmthin
+++ b/common/dmthin
@@ -113,8 +113,12 @@ _dmthin_init()
 	_dmsetup_create $DMTHIN_DATA_NAME --table "$DMTHIN_DATA_TABLE" || \
 		_fatal "failed to create dm thin data device"
 
-	# Zap the pool metadata dev
-	dd if=/dev/zero of=$DMTHIN_META_DEV bs=4096 count=1 &>/dev/null
+	# Zap the pool metadata dev. Explicitly fsync the zeroes to disk
+	# because a slow-running udev running concurrently with dd can maintain
+	# an open file descriptor. The block layer only flushes the page cache
+	# on last close, which means that the thin pool creation below will
+	# see the (stale) ondisk contents and fail.
+	dd if=/dev/zero of=$DMTHIN_META_DEV bs=4096 count=1 conv=fsync &>/dev/null
 
 	# Thin pool
 	# "start length thin-pool metadata_dev data_dev data_block_size low_water_mark"

From patchwork Wed Jul 7 00:21:34 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 12361347 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AB799C07E96 for ; Wed, 7 Jul 2021 00:21:35 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 94F3F61CAC for ; Wed, 7 Jul 2021 00:21:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230048AbhGGAYO (ORCPT ); Tue, 6 Jul 2021 20:24:14 -0400 Received: from mail.kernel.org ([198.145.29.99]:52724 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229919AbhGGAYO (ORCPT ); Tue, 6 Jul 2021 20:24:14 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id C550C61C91; Wed, 7 Jul 2021 00:21:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1625617294; bh=nr9OVZYxHAbfh9voBUxv1+S94pQ3NcNKVzG/jNmUiyk=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=A0tXrkzWbDeAOJU25EClO3bfq0yjdZUuuNh8LNWUqggY2vff+OJWC6TFseqwCDtr7 +61+06TqTCR/5/yVGKiqJgRb0tdsb8crexO1wFB4FjfrHuETG1BCF77c9xdmqibmTS qvSdkr288v/YIe16q+5NipCDRDXzBW1LOxxDTWh0ZoOabobdzQQjOVOZhFMK3dY5BS zGiAV3nmqlYT2SQQ7julFp1tmGNXd/iBCjmKhlsMn5wdirpjAQEkASYEvY1s7edn5c GFIk1Q/3u91ITN02EvPdK/5sWOVBoKTVNn4TISITVXkG//HctmYkauYXa6N3fj59UE GTzV9tNPxwI+A== Subject: [PATCH 5/8] check: run _check_filesystems in an OOM-happy subshell From: "Darrick J. 
Wong"
To: djwong@kernel.org, guaneryu@gmail.com
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me
Date: Tue, 06 Jul 2021 17:21:34 -0700
Message-ID: <162561729448.543423.13588309966120368094.stgit@locust>
In-Reply-To: <162561726690.543423.15033740972304281407.stgit@locust>
References: <162561726690.543423.15033740972304281407.stgit@locust>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: fstests@vger.kernel.org

From: Darrick J. Wong

While running fstests one night, I observed that fstests stopped
abruptly because ./check ran _check_filesystems to run xfs_repair. In
turn, repair (which inherited oom_score_adj=-1000 from ./check) consumed
so much memory that the OOM killer ran around killing other daemons,
rendering the system nonfunctional.

This is silly -- we set an OOM score adjustment of -1000 on the ./check
process so that the test framework itself wouldn't get OOM-killed,
because that aborts the entire run. Everything else is fair game for
that, including subprocesses started by _check_filesystems.

Therefore, adapt _check_filesystems (and its children) to run in a
subshell with a much higher oom score adjustment.

Signed-off-by: Darrick J. Wong
---
 check | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/check b/check
index de8104d0..bb7e030c 100755
--- a/check
+++ b/check
@@ -525,17 +525,20 @@ _summary()
 
 _check_filesystems()
 {
+	local ret=0
+
 	if [ -f ${RESULT_DIR}/require_test ]; then
-		_check_test_fs || err=true
+		_check_test_fs || ret=1
 		rm -f ${RESULT_DIR}/require_test*
 	else
 		_test_unmount 2> /dev/null
 	fi
 	if [ -f ${RESULT_DIR}/require_scratch ]; then
-		_check_scratch_fs || err=true
+		_check_scratch_fs || ret=1
 		rm -f ${RESULT_DIR}/require_scratch*
 	fi
 	_scratch_unmount 2> /dev/null
+	return $ret
 }
 
 _expunge_test()
@@ -558,11 +561,15 @@ test $? -eq 77 && HAVE_SYSTEMD_SCOPES=yes
 
 # Make the check script unattractive to the OOM killer...
 OOM_SCORE_ADJ="/proc/self/oom_score_adj"
-test -w ${OOM_SCORE_ADJ} && echo -1000 > ${OOM_SCORE_ADJ}
+function _adjust_oom_score() {
+	test -w "${OOM_SCORE_ADJ}" && echo "$1" > "${OOM_SCORE_ADJ}"
+}
+_adjust_oom_score -1000
 
 # ...and make the tests themselves somewhat more attractive to it, so that if
 # the system runs out of memory it'll be the test that gets killed and not the
-# test framework.
+# test framework. The test is run in a separate process without any of our
+# functions, so we open-code adjusting the OOM score.
 #
 # If systemd is available, run the entire test script in a scope so that we can
 # kill all subprocesses of the test if it fails to clean up after itself. This
@@ -875,9 +882,12 @@ function run_section()
 			rm -f ${RESULT_DIR}/require_scratch*
 			err=true
 		else
-			# the test apparently passed, so check for corruption
-			# and log messages that shouldn't be there.
-			_check_filesystems
+			# The test apparently passed, so check for corruption
+			# and log messages that shouldn't be there. Run the
+			# checking tools from a subshell with adjusted OOM
+			# score so that the OOM killer will target them instead
+			# of the check script itself.
+			(_adjust_oom_score 250; _check_filesystems) || err=true
 			_check_dmesg || err=true
 		fi
 

From patchwork Wed Jul 7 00:21:39 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 12361349 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2FD8AC07E96 for ; Wed, 7 Jul 2021 00:21:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 184A861CAD for ; Wed, 7 Jul 2021 00:21:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230019AbhGGAYT (ORCPT ); Tue, 6 Jul 2021 20:24:19 -0400 Received: from mail.kernel.org ([198.145.29.99]:52910 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229919AbhGGAYT (ORCPT ); Tue, 6 Jul 2021 20:24:19 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 47BF161C91; Wed, 7 Jul 2021 00:21:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1625617300; bh=c785Xk/13yhAfMGqHE1CXfFeAX/U1nwgbLROr3HhVjU=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=JHO7iy5GhGW2kYtWDMqWLFuemIQytBk3c1nnRCtZpoDIvuls0zbQRzblYyNyimxyo aE34P+ny7NHymmH7bscw/aq6199tgHzcmXG+lKNcVcGv25jk59m46nmO/flgv4cKTr RvNQtKXANWcZ1SrW//fA4xegD1FBXs0AEubHpdJYGheQci2TdQ4SdvQ2apULfeFIyx ZxTstdn1bZK1AGoNNInVA4wLvkJ2lQdzwXGc38nCi75DjKBF2g8RUtwmTAoU3jwE6k pImI00OZEYusHj8MyCfGt+/QT3gJvJqi7x6bl/S+p6DSKKkA1vE+BHjFgr38/aLiQz jdeBCIZ9zk2Bw== Subject: [PATCH 6/8] xfs/084: fix test program status collection and processing From: "Darrick J. 
Wong"
To: djwong@kernel.org, guaneryu@gmail.com
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me
Date: Tue, 06 Jul 2021 17:21:39 -0700
Message-ID: <162561729997.543423.18037428142167687667.stgit@locust>
In-Reply-To: <162561726690.543423.15033740972304281407.stgit@locust>
References: <162561726690.543423.15033740972304281407.stgit@locust>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: fstests@vger.kernel.org

From: Darrick J. Wong

On a test VM with 1.2GB memory, I noticed that the test will sometimes
fail because resvtest leaks too much memory and gets OOM killed. It
would be useful to _notrun the test when this happens so that it doesn't
appear as an intermittent regression.

The exit code processing in this test is incorrect, since "$?" will get
us the exit status of _filter_resv, not $here/src/resvtest. Fix that as
part of learning to detect a SIGKILL and skip the test.

Signed-off-by: Darrick J. Wong
---
 tests/xfs/084 | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tests/xfs/084 b/tests/xfs/084
index 5967fe12..e796fec4 100755
--- a/tests/xfs/084
+++ b/tests/xfs/084
@@ -33,13 +33,17 @@ _require_test
 echo
 echo "*** First case - I/O blocksize same as pagesize"
 $here/src/resvtest -i 20 -b $pgsize "$TEST_DIR/resv" | _filter_resv
-[ $? -eq 0 ] && echo done
+res=${PIPESTATUS[0]}
+[ $res -eq 137 ] && _notrun "resvtest -i 20 -b $pgsize was SIGKILLed (OOM?)"
+[ $res -eq 0 ] && echo done
 rm -f "$TEST_DIR/mumble"
 
 echo
 echo "*** Second case - 512 byte I/O blocksize"
 $here/src/resvtest -i 40 -b 512 "$TEST_DIR/resv" | _filter_resv
-[ $? -eq 0 ] && echo done
+res=${PIPESTATUS[0]}
+[ $res -eq 137 ] && _notrun "resvtest -i 40 -b 512 was SIGKILLed (OOM?)"
+[ $res -eq 0 ] && echo done
 rm -f "$TEST_DIR/grumble"
 
 # success, all done

From patchwork Wed Jul 7 00:21:45 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 12361351 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A3255C07E96 for ; Wed, 7 Jul 2021 00:21:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8C29D61CAD for ; Wed, 7 Jul 2021 00:21:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229919AbhGGAYZ (ORCPT ); Tue, 6 Jul 2021 20:24:25 -0400 Received: from mail.kernel.org ([198.145.29.99]:52998 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229873AbhGGAYZ (ORCPT ); Tue, 6 Jul 2021 20:24:25 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id B6BB161C91; Wed, 7 Jul 2021 00:21:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1625617305; bh=8+V/k9laNcuwuCM0X8S6pZTJufyL0kJq/7Qgp0n2Mec=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=WM58MUdJdZIOpjbSD41LaGIkvKwjd0sQ8LYWRZnHprXYiC+rby0i+RntEviEB8NvC TIBs6mPvkZEZ9G2qYfgJ/8nZEFAv1iPXadCqy3kpTiJXAjExy8DT5VpM6+owVmC3kR it4l+JosXvA4Jtt67t/P2dmNNcif1KTrEANEXadX7LF/Y/y5Uc8mYDJ6JhjtyWycUc pb+XF+DT9Zb26D+xxLEu3xOK9fE4MeLN2kKC9O78wav1sROK55lp4C4g06w19CygWr tYqwgLT1lVi0a5z/dtsFPo9eMMg9XKDxx6KeymGQFSgMsR9UuXxEHTmAyVWCvYGYKC 86bCG9NIyVbFw== Subject: [PATCH 7/8] generic/371: disable speculative preallocation regressions on XFS From: "Darrick J. 
Wong"
To: djwong@kernel.org, guaneryu@gmail.com
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me
Date: Tue, 06 Jul 2021 17:21:45 -0700
Message-ID: <162561730547.543423.5029188797370208051.stgit@locust>
In-Reply-To: <162561726690.543423.15033740972304281407.stgit@locust>
References: <162561726690.543423.15033740972304281407.stgit@locust>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: fstests@vger.kernel.org

From: Darrick J. Wong

Once in a very long while, the fallocate calls in this test will fail
due to ENOSPC conditions. While in theory this test is careful only to
allocate at most 160M of space from a 256M filesystem, there's a twist
on XFS: speculative preallocation. The first loop in this test is an
80M appending write done in units of 4k. Once the file size hits 64k,
XFS will begin speculatively preallocating blocks past the end of the
file; as the file grows larger, so will the speculative preallocation.

Since the pwrite/rm loop races with the fallocate/rm loop, it's possible
that the fallocate loop will free that file just before the buffered
write extends the speculative preallocation out to 160MB. With fs and
log overhead, that doesn't leave enough free space to start the 80MB
fallocate request, which tries to avoid disappointing the caller by
freeing all speculative preallocations. That fails if the pwriter
thread owns the IOLOCK on $testfile1, so fallocate returns ENOSPC and
the test fails.

The simple solution here is to disable speculative preallocation by
setting an extent size hint if the fs is XFS.

Signed-off-by: Darrick J. Wong
Reviewed-by: Allison Henderson
---
 tests/generic/371 | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tests/generic/371 b/tests/generic/371
index c94fa85e..a2fdaf7b 100755
--- a/tests/generic/371
+++ b/tests/generic/371
@@ -18,10 +18,18 @@ _begin_fstest auto quick enospc prealloc
 _supported_fs generic
 _require_scratch
 _require_xfs_io_command "falloc"
+test "$FSTYP" = "xfs" && _require_xfs_io_command "extsize"
 
 _scratch_mkfs_sized $((256 * 1024 * 1024)) >> $seqres.full 2>&1
 _scratch_mount
 
+# Disable speculative post-EOF preallocation on XFS, which can grow fast enough
+# that a racing fallocate then fails.
+if [ "$FSTYP" = "xfs" ]; then
+	alloc_sz="$(_get_file_block_size $SCRATCH_MNT)"
+	$XFS_IO_PROG -c "extsize $alloc_sz" $SCRATCH_MNT >> $seqres.full
+fi
+
 testfile1=$SCRATCH_MNT/testfile1
 testfile2=$SCRATCH_MNT/testfile2
 

From patchwork Wed Jul 7 00:21:50 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 12361353 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 028F6C07E96 for ; Wed, 7 Jul 2021 00:21:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DAE00619B9 for ; Wed, 7 Jul 2021 00:21:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230071AbhGGAYa (ORCPT ); Tue, 6 Jul 2021 20:24:30 -0400 Received: from mail.kernel.org ([198.145.29.99]:53102 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229834AbhGGAYa (ORCPT ); Tue, 6 Jul 2021 20:24:30 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 3076261C91; Wed, 7 Jul 2021 00:21:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1625617311; bh=wsR7O4YF+XUkT361BzP6TGQSgYg5RrVBcsDkHEmSIkw=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=uL2LRtkPHGJVV2Mwb4oqjtOS8UpWvQ7Ukp/RdQAsAY0aRmgyQ3+oOfK3lVXOPaFFU Q+etkScTvgpW2bxY9+tfhu91bfi5sCSU4WBQGcYea8SNnZwSMpz+n58WP/pDAHhdqP RVbhfXt9sybooIpK7gR47P4apVfP7MjbxbhUMCloYpEml7WwCV5oJUCpbs0o1SCNDX 3LdenGECazuQhOEbC4AdM9o9W7tV3kar6g4WJ9rutRMW7VfB5EAITHLKlrv/IBtwva JZnzx8GQZ87gr4O2nUHn4Tg/U4nLVL/t9pyD4GmKiC3s6PygbkhvNEWBtws4XalZ1t 0RLxOn7ODHVvQ== Subject: [PATCH 8/8] generic/019: don't dump cores when fio/fsstress hit io errors From: "Darrick J. 
Wong"
To: djwong@kernel.org, guaneryu@gmail.com
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me
Date: Tue, 06 Jul 2021 17:21:50 -0700
Message-ID: <162561731092.543423.12382027169225482171.stgit@locust>
In-Reply-To: <162561726690.543423.15033740972304281407.stgit@locust>
References: <162561726690.543423.15033740972304281407.stgit@locust>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: fstests@vger.kernel.org

From: Darrick J. Wong

Disable coredumps so that fstests won't mark the test failed when the
EIO injector causes an mmap write to abort with SIGBUS.

Signed-off-by: Darrick J. Wong
Reviewed-by: Allison Henderson
---
 tests/generic/019 | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/generic/019 b/tests/generic/019
index bd234815..b8d025d6 100755
--- a/tests/generic/019
+++ b/tests/generic/019
@@ -62,6 +62,9 @@ NUM_JOBS=$((4*LOAD_FACTOR))
 BLK_DEV_SIZE=`blockdev --getsz $SCRATCH_DEV`
 FILE_SIZE=$((BLK_DEV_SIZE * 512))
 
+# Don't fail the test just because fio or fsstress dump cores
+ulimit -c 0
+
 cat >$fio_config <