From patchwork Wed Mar 3 08:12:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12114831 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A9E86C15506 for ; Thu, 4 Mar 2021 00:27:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 52DA564E62 for ; Thu, 4 Mar 2021 00:27:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1382094AbhCDAYE (ORCPT ); Wed, 3 Mar 2021 19:24:04 -0500 Received: from mx2.suse.de ([195.135.220.15]:40942 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1842785AbhCCINM (ORCPT ); Wed, 3 Mar 2021 03:13:12 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1614759144; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding; bh=I8MgoPWi0A/DEEHhPBXEgxuH6kqVoUyfUIdsEn236AY=; b=pONt1fdU991V35EOQAVrYaE/KnVnkkuNVpZ4ZEP8BWHB84/oDcChdP06vvoru2S3kdHSgd m3yTSEReuligYKvrn4AgdUpRsr55S8ad+JLDuDH7rJg3wq4sP0RsZ7tC9h7e1DOBUn3SY6 pIBD3UX8mH/UAzpRo+1HBfxQK0ZDcFk= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id D751DAE88; Wed, 3 Mar 2021 08:12:24 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org, fstests@vger.kernel.org Subject: [PATCH 1/2] btrfs: use a dedicated variable for qgroup released data rsv for insert_prealloc_file_extent() Date: Wed, 3 Mar 2021 16:12:19 +0800 Message-Id: <20210303081220.80913-1-wqu@suse.com> X-Mailer: git-send-email 2.30.1 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: fstests@vger.kernel.org There is a piece of weird code in insert_prealloc_file_extent(), which looks like: ret = btrfs_qgroup_release_data(inode, file_offset, len); if (ret < 0) return ERR_PTR(ret); if (trans) { ret = insert_reserved_file_extent(trans, inode, file_offset, &stack_fi, true, ret); ... } extent_info.is_new_extent = true; extent_info.qgroup_reserved = ret; ... Note how the variable @ret is abused here, and if anyone is adding code just after btrfs_qgroup_release_data() call, it's super easy to overwrite the @ret and cause tons of qgroup related bugs. Fix such abuse by introducing new variable @qgroup_released, so that we won't reuse the existing variable @ret. Signed-off-by: Qu Wenruo --- I really want to go back time and slap my face. That damn naming, especially that "extent_info.qgroup_reserved = ret", wasted too long time. What the heck I was doing in the past?? --- fs/btrfs/inode.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 4f2f1e932751..4e9717c29451 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -9873,6 +9873,7 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent( struct btrfs_path *path; u64 start = ins->objectid; u64 len = ins->offset; + int qgroup_released; int ret; memset(&stack_fi, 0, sizeof(stack_fi)); @@ -9885,14 +9886,14 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent( btrfs_set_stack_file_extent_compression(&stack_fi, BTRFS_COMPRESS_NONE); /* Encryption and other encoding is reserved and all 0 */ - ret = btrfs_qgroup_release_data(inode, file_offset, len); - if (ret < 0) - return ERR_PTR(ret); + qgroup_released = btrfs_qgroup_release_data(inode, file_offset, len); + if (qgroup_released < 0) + return ERR_PTR(qgroup_released); if (trans) { ret = insert_reserved_file_extent(trans, inode, file_offset, &stack_fi, - true, ret); + true, qgroup_released); if (ret) return ERR_PTR(ret); return trans; @@ -9905,7 +9906,7 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent( extent_info.file_offset = file_offset; extent_info.extent_buf = (char *)&stack_fi; extent_info.is_new_extent = true; - extent_info.qgroup_reserved = ret; + extent_info.qgroup_reserved = qgroup_released; extent_info.insertions = 0; path = btrfs_alloc_path(); From patchwork Wed Mar 3 08:12:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12114829 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6CE79C43142 for ; Thu, 4 Mar 2021 00:26:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 426A964EBA for ; Thu, 4 Mar 2021 00:26:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1382100AbhCDAYF (ORCPT ); Wed, 3 Mar 2021 19:24:05 -0500 Received: from mx2.suse.de ([195.135.220.15]:40968 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1842793AbhCCINR (ORCPT ); Wed, 3 Mar 2021 03:13:17 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1614759149; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FIeRUC4Xz5nUe+cn2fSGODwwohV/zKhXXchpgESZLFg=; b=f1Dy6JBwmyyoujNTD1U+88Z99EF8tQLvcfuPBx8Ywn8fguNsts7LkgNQp3MZqyeZWMKk61 gfoCJR6L3OM0qOWqEDdyJoC2YHt3aemqg4P8iG6sAky++QX9/ZRnmX8daE7m6SXirUexrH 7KgrpJrjJ7OMAyYyjv5rBJStGKjNBVw= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 53602B030; Wed, 3 Mar 2021 08:12:29 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org, fstests@vger.kernel.org Cc: Nikolay Borisov , David Sterba Subject: [PATCH 2/2] btrfs: fix qgroup data rsv leak caused by falloc failure Date: Wed, 3 Mar 2021 16:12:20 +0800 Message-Id: <20210303081220.80913-2-wqu@suse.com> X-Mailer: git-send-email 2.30.1 In-Reply-To: <20210303081220.80913-1-wqu@suse.com> References: <20210303081220.80913-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: fstests@vger.kernel.org [BUG] When running fsstress with only falloc workload, and a very low qgroup limit set, we can get qgroup data rsv leak at unmount time. BTRFS warning (device dm-0): qgroup 0/5 has unreleased space, type 0 rsv 20480 BTRFS error (device dm-0): qgroup reserved space leaked The minimal reproducer looks like: #!/bin/bash dev=/dev/test/test mnt="/mnt/btrfs" fsstress=~/xfstests-dev/ltp/fsstress runtime=8 workload() { umount $dev &> /dev/null umount $mnt &> /dev/null mkfs.btrfs -f $dev > /dev/null mount $dev $mnt btrfs quota en $mnt btrfs quota rescan -w $mnt btrfs qgroup limit 16m 0/5 $mnt $fsstress -w -z -f creat=10 -f fallocate=10 -p 2 -n 100 \ -d $mnt -v > /tmp/fsstress umount $mnt if dmesg | grep leak ; then echo "!!! FAILED !!!" exit 1 fi } for (( i=0; i < $runtime; i++)); do echo "=== $i/$runtime===" workload done Normally it would fail before round 4. [CAUSE] In function insert_prealloc_file_extent(), we first call btrfs_qgroup_release_data() to know how many bytes are reserved for qgroup data rsv. Then use that @qgroup_released number to continue our work. But after we call btrfs_qgroup_release_data(), we should either queue @qgroup_released to delayed ref or free them manually in error path. Unfortunately, we lack the error handling to free the released bytes, leaking qgroup data rsv. All the error handling function outside won't help at all, as we have released the range, meaning in inode io tree, the EXTENT_QGROUP_RESERVED bit is already cleared, thus all btrfs_qgroup_free_data() call won't free any data rsv. [FIX] Add free_qgroup tag to manually free the released qgroup data rsv. Fixes: 9729f10a608f ("btrfs: inode: move qgroup reserved space release to the callers of insert_reserved_file_extent()") Reported-by: Nikolay Borisov Reported-by: David Sterba Signed-off-by: Qu Wenruo --- fs/btrfs/inode.c | 21 +++++++++++++++++---- 1 file changed, 17 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 4e9717c29451..9ffa1b1f3b82 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -9910,17 +9910,30 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent( extent_info.insertions = 0; path = btrfs_alloc_path(); - if (!path) - return ERR_PTR(-ENOMEM); + if (!path) { + ret = -ENOMEM; + goto free_qgroup; + } ret = btrfs_replace_file_extents(&inode->vfs_inode, path, file_offset, file_offset + len - 1, &extent_info, &trans); btrfs_free_path(path); if (ret) - return ERR_PTR(ret); - + goto free_qgroup; return trans; +free_qgroup: + /* + * We have released qgroup data range at the beginning of the function, + * and normally @qgroup_released bytes will be freed when committing + * transaction. + * But if we error out early, we have to free what we have released + * or we leak qgroup data rsv. + */ + btrfs_qgroup_free_refroot(inode->root->fs_info, + inode->root->root_key.objectid, qgroup_released, + BTRFS_QGROUP_RSV_DATA); + return ERR_PTR(ret); } static int __btrfs_prealloc_file_range(struct inode *inode, int mode,