From: Alexandre Oliva
To: linux-btrfs@vger.kernel.org
Subject: fixes for btrfs check --repair
Date: Sat, 30 Aug 2014 13:00:40 -0300
Organization: Free thinker, not speaking for the GNU Project
X-Patchwork-Id: 4813741

I got a faulty memory module a while ago, and it stayed in service for
a while, corrupting a number of filesystems on that server.  Most of
the corruption is long gone, as the filesystems (ceph osds) were
reconstructed, but I tried really hard to avoid rebuilding one 4TB
filesystem from scratch, since it was still fully operational.  I
failed, but in the process I ran into and fixed two btrfs check
--repair bugs.

I gave up when removing an old snapshot caused delayed refs processing
to abort because it couldn't find a ref to delete, whereas btrfs check
--repair completed successfully without fixing anything.  Mounting the
apparently clean filesystem would still run into the same delayed refs
error, and trying to map the logical extent back to a file produced an
error.  Since the filesystem was far too big to preserve, even
metadata only, I didn't, and proceeded to mkfs.btrfs right away.

Here are the patches.


check: do not dereference tree_refs as data_refs

From: Alexandre Oliva

In a filesystem corrupted by a faulty memory module, btrfsck would get
very confused attempting to access backrefs that weren't data backrefs
as if they were.  Besides invoking undefined behavior by accessing
potentially uninitialized data past the end of objects, or memory
whose dynamic type is unrelated to the static type used to access it,
it used offsets and lengths from such fields that did not correspond
to anything in the filesystem proper.

Moving the test for full backrefs earlier, and checking that the
backrefs are data backrefs, avoided the crash I was running into, but
that was not enough for the repair of the filesystem to succeed.

Signed-off-by: Alexandre Oliva
---
 cmds-check.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index 66c982f..319dd2b 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -4781,15 +4781,17 @@ static int verify_backrefs(struct btrfs_trans_handle *trans,
 		return 0;
 
 	list_for_each_entry(back, &rec->backrefs, list) {
+		if (back->full_backref || !back->is_data)
+			continue;
+
 		dback = (struct data_backref *)back;
+
 		/*
 		 * We only pay attention to backrefs that we found a real
 		 * backref for.
 		 */
 		if (dback->found_ref == 0)
 			continue;
-		if (back->full_backref)
-			continue;
 
 		/*
 		 * For now we only catch when the bytes don't match, not the
@@ -4905,6 +4907,9 @@ static int verify_backrefs(struct btrfs_trans_handle *trans,
 	 * references and fix up the ones that don't match.
 	 */
 	list_for_each_entry(back, &rec->backrefs, list) {
+		if (back->full_backref || !back->is_data)
+			continue;
+
 		dback = (struct data_backref *)back;
 
 		/*
@@ -4913,8 +4918,6 @@ static int verify_backrefs(struct btrfs_trans_handle *trans,
 		 */
 		if (dback->found_ref == 0)
 			continue;
-		if (back->full_backref)
-			continue;
 
 		if (dback->bytes == best->bytes &&
 		    dback->disk_bytenr == best->bytenr)
@@ -5134,14 +5137,16 @@ static int find_possible_backrefs(struct btrfs_trans_handle *trans,
 	int ret;
 
 	list_for_each_entry(back, &rec->backrefs, list) {
+		/* Don't care about full backrefs (poor unloved backrefs) */
+		if (back->full_backref || !back->is_data)
+			continue;
+
 		dback = (struct data_backref *)back;
 
 		/* We found this one, we don't need to do a lookup */
 		if (dback->found_ref)
 			continue;
-		/* Don't care about full backrefs (poor unloved backrefs) */
-		if (back->full_backref)
-			continue;
+
 		key.objectid = dback->root;
 		key.type = BTRFS_ROOT_ITEM_KEY;
 		key.offset = (u64)-1;
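
For readers who want the pattern outside of cmds-check.c, here is a
minimal standalone sketch of the ordering the patch enforces: test the
type tag and the full_backref flag on the common header first, and only
then downcast and read data_backref fields.  The structures below are
simplified, hypothetical stand-ins (the real ones carry more fields),
and to_data_backref() and count_verifiable() are names made up for
this illustration, not functions from btrfs-progs.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified, hypothetical stand-ins for the backref records in
 * cmds-check.c; the real structures carry more fields.
 */
struct extent_backref {
	unsigned int is_data:1;      /* set only for data backrefs */
	unsigned int full_backref:1; /* mirrors the flag tested in the patch */
};

struct tree_backref {
	struct extent_backref node;  /* must be first: enables the downcast */
	unsigned long long root;
};

struct data_backref {
	struct extent_backref node;  /* must be first: enables the downcast */
	unsigned long long root;
	unsigned long long bytes;
	unsigned int found_ref;
};

/* Checked downcast: only valid once the tag says this is a data backref. */
static struct data_backref *to_data_backref(struct extent_backref *back)
{
	assert(back->is_data);
	return (struct data_backref *)back;
}

/*
 * Count the backrefs a verify pass would look at.  The type and
 * full_backref tests run before the cast, so data_backref fields are
 * never read from a tree_backref.
 */
static size_t count_verifiable(struct extent_backref **refs, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		struct extent_backref *back = refs[i];
		struct data_backref *dback;

		if (back->full_backref || !back->is_data)
			continue;

		dback = to_data_backref(back);
		if (dback->found_ref == 0)
			continue;
		count++;
	}
	return count;
}
```

Passing a mixed array of tree and data backrefs to count_verifiable()
only ever touches found_ref on entries whose header says is_data.  The
assert in the helper is an addition of this sketch, meant to make a
misordered check fail loudly in debug builds rather than read unrelated
memory; the patch itself relies on the ordering of the tests alone.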