From patchwork Fri Jul 27 14:37:10 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Lyakas X-Patchwork-Id: 1249941 Return-Path: X-Original-To: patchwork-linux-btrfs@patchwork.kernel.org Delivered-To: patchwork-process-083081@patchwork1.kernel.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by patchwork1.kernel.org (Postfix) with ESMTP id A682A3FC5A for ; Fri, 27 Jul 2012 14:37:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752259Ab2G0OhO (ORCPT ); Fri, 27 Jul 2012 10:37:14 -0400 Received: from mail-ob0-f174.google.com ([209.85.214.174]:46752 "EHLO mail-ob0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752074Ab2G0OhM (ORCPT ); Fri, 27 Jul 2012 10:37:12 -0400 Received: by obbuo13 with SMTP id uo13so4351789obb.19 for ; Fri, 27 Jul 2012 07:37:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=c/owgBq6fer4cNny9v3ewAKpFWTpfgqABlFbshnsWSg=; b=rzXLIOajD91yAUAmKiQwmRIODztu1w0VyzBn8gc1EwQczLVqvRDc6h8jEtCHANnFTH q/F3n79RrvDLu4ZclkKUThn8PnUYz3WkkpZ9IyeBpw216oT0KYs5fiaFILf6e6tyQ49P 7ujmHk62FRRmyRiiMiRt5meZv5Hg5viCqWFhsL7FbDwQSU3+Jn11+sEkXHVlwdIOLsQL HL64bLK6LFW2tAehKm+8tajvBHgrxwdgokYCxGYdg54ZPfayPZ8YHIvYOt/i5bme5WrL Dgp5+g1SXTXGwwGJbU/WMMhZ3gCjEtu65lv5kXrMQK3gIWdGMQ/2y4M4kqWt9e9QpjFu QF/A== MIME-Version: 1.0 Received: by 10.182.216.99 with SMTP id op3mr3906272obc.30.1343399831813; Fri, 27 Jul 2012 07:37:11 -0700 (PDT) Received: by 10.76.5.147 with HTTP; Fri, 27 Jul 2012 07:37:10 -0700 (PDT) In-Reply-To: References: Date: Fri, 27 Jul 2012 17:37:10 +0300 Message-ID: Subject: Re: btrfs send/receive: if new inode ino is less than its new directory ino, incorrect path is sent From: Alex Lyakas To: Alexander Block Cc: linux-btrfs@vger.kernel.org Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Hi Alexander, your solution is simple and elegant. I this this issue is solved now. Thanks! Two minor issues: 1) /* * We need some special handling for inodes that get processed before the parent * directory got created. See process_all_refs for details. * This function does the check if we already created the dir out of order. */ /* * Only creates the inode if it is: * 1. Not a directory * 2. Or a directory which was not created already due to out of order * directories. See did_create_dir and process_all_refs for details. */ These comments tell to look at process_all_refs(), while we should look at process_recorded_refs(). 2) * We can however not delete the orphan in case the inode relies * in a directory that was not created yet (see * __record_new_ref) */ This part should be removed, because your new solution does not work this way. If you find, time, pls look at the two attached scripts. btrfs_test_1.sh: it tries to explore the is_first_ref() issue and founds a problem. Proposed patch - compare also the (dir,gen) tuple and only the name: if (ret < 0) goto out; btrfs_test_2.sh The last test exposes an interesting issue: when a directory has a deleted reference recorded, this deleted reference is not added to the 'check_dirs' list. As a result, the upper directory (already orphanized) is not rmdir'd. Thanks, Alex. On Fri, Jul 27, 2012 at 12:42 AM, Alexander Block wrote: > I have pushed a for-alex branch to github with a new approach for the > whole problem. Can you test this? > > On Thu, Jul 26, 2012 at 4:07 PM, Alexander Block > wrote: >> I'm currently working on another solution for the initial problem. I >> will create a for-alex branch for you to test later. >> >> On Thu, Jul 26, 2012 at 4:04 PM, Alex Lyakas >> wrote: >>> Alexander, >>> (pls let me know when this gets annoying:). >>> >>> Parent: >>> /mnt/src/v2_snap0/ >>> ??? [ 257] file1 >>> >>> Send: >>> /mnt/src/v2_snap1 >>> ??? [ 259] dir1 >>> ??? [ 258] dir2 >>> ??? [ 257] file1 >>> >>> I encountered two problems: >>> 1) process_recorded_refs_if_needed() if needed does not call >>> process_recorded_refs() if both new_refs and deleted_refs() are empty. >>> But in this case, we need to get to finish_outoforder_dir() by dir2 to >>> move file1 under it (this is before dir1 is created). >>> >>> @@ -4199,8 +4227,25 @@ static int >>> process_recorded_refs_if_needed(struct send_ctx *sctx, int at_end) >>> if (!at_end && sctx->cur_ino == sctx->cmp_key->objectid && >>> sctx->cmp_key->type <= BTRFS_INODE_REF_KEY) >>> goto out; >>> - if (list_empty(&sctx->new_refs) && list_empty(&sctx->deleted_refs)) >>> - goto out; >>> + if (list_empty(&sctx->new_refs) && list_empty(&sctx->deleted_refs) && >>> + /* >>> + * If this is a new directory, still do the >>> finish_outoforder_dir() thing, >>> + * even though it has no references recorded. This >>> means that the directory's >>> + * parent has higher inode and was not created yet >>> (thus we should have >>> + * sctx->cur_inode_first_ref_orphan flag set). >>> + * Note that after a call to process_recorded_refs(), >>> new_refs and deleted_refs >>> + * become empty, which prevents further calls to >>> process_recorded_refs(), >>> + * but here we need something else to prevent it, so >>> look at send_progress too. >>> + */ >>> + !(S_ISDIR(sctx->cur_inode_mode) && sctx->cur_inode_new && >>> + sctx->cur_inode_first_ref_orphan && >>> sctx->send_progress == sctx->cur_ino)) >>> + goto out; >>> >>> ret = process_recorded_refs(sctx); >>> >>> Then I encountered another problem that finish_outoforder_dir() does >>> not check for itself the cur_inode_first_ref_orphan flag: >>> @@ -2736,7 +2754,17 @@ static int finish_outoforder_dir(struct >>> send_ctx *sctx, u64 dir, u64 dir_gen) >>> } >>> fctx.dir_ino = dir; >>> >>> - ret = get_cur_path(sctx, dir, dir_gen, fctx.dir_path, 1/*do_print*/); >>> + /* >>> + * If the current directory itself has a parent, which was not >>> + * created yet, we need to use gen_unique_name(). >>> + */ >>> + BUG_ON(sctx->cur_ino != dir || sctx->cur_inode_gen != dir_gen); >>> + if (sctx->cur_inode_first_ref_orphan) >>> + ret = gen_unique_name(sctx, dir, dir_gen, fctx.dir_path); >>> + else >>> + ret = get_cur_path(sctx, dir, dir_gen, fctx.dir_path); >>> >>> Finally, the send_truncate(), send_chmod(), send_chown(),send_utimes() >>> need the following check: >>> >>> if (sctx->cur_ino == ino && sctx->cur_inode_first_ref_orphan) { >>> WARN_ON(sctx->cur_inode_gen != gen); >>> ret = gen_unique_name(sctx, ino, gen, p); >>> } else { >>> ret = get_cur_path(sctx, ino, gen, p); >>> } >>> >>> All of them except utimes() are used with cur_ino only, so for those >>> this check is redundant (and probably makes sense to drop ino/gen >>> parameters of them?). >>> >>> Thanks, >>> Alex. >>> >>> >>> On Thu, Jul 26, 2012 at 1:52 PM, Alex Lyakas >>> wrote: >>>> Hi Alexander, >>>> I did some very initial testing, and there is still an issue. >>>> The logic of finish_outoforder_dir works as expected. But then problem >>>> is that later, when we process xattr/extents or finish the inode, the >>>> code still uses get_cur_path(), which brings an incorrect name. >>>> >>>> Consider the following simple scenario: >>>> >>>> Parent tree: >>>> /mnt/src/v2 >>>> ??? [ 260] file1 >>>> >>>> Send tree: >>>> /mnt/src/v2 >>>> ??? [ 262] dir1 >>>> ??? [ 260] file1 >>>> >>>> So when file1 is being processed, it is first renamed, as expected: >>>> C_RENAME: A_PATH=file1, A_PATH_TO=o260-511-0 >>>> But then, when we finish it, we do: >>>> C_TRUNCATE: A_PATH=o262-517-0/file1, A_SIZE=16 >>>> >>>> So in some functions like send_truncate(), send_write(), send_utimes() >>>> etc, we need: >>>> >>>> - ret = get_cur_path(sctx, ino, gen, p, 0/*do_print*/); >>>> + if (sctx->cur_inode_first_ref_orphan) >>>> + ret = gen_unique_name(sctx, ino, gen, p); >>>> + else >>>> + ret = get_cur_path(sctx, ino, gen, p, 0/*do_print*/); >>>> if (ret < 0) >>>> goto out; >>>> >>>> I will continue testing more complicated cases now. >>>> >>>> Thanks, >>>> Alex. >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> On Tue, Jul 24, 2012 at 11:26 PM, Alexander Block >>>> wrote: >>>>> On Wed, Jul 18, 2012 at 7:45 PM, Alex Lyakas >>>>> wrote: >>>>>> Hi Alexander, >>>>>> I am testing different scenarios in order to better understand the >>>>>> non-trivial magic of >>>>>> get_cur_path()/will_overwrite_ref()/did_overwrite_ref()/did_overwrite_first_ref(). >>>>>> I hit the following issue, when testing full-send: >>>>>> >>>>>> This is my source subvolume (inode numbers are written): >>>>>> tree -A --inodes --noreport /mnt/src/tmp/ >>>>>> /mnt/src/tmp/ >>>>>> ??? [ 270] dir2 >>>>>> ??? [ 268] file1_nod >>>>>> >>>>>> As you see, the ino(file1_nod) < ino(dir2). It is very easy to >>>>>> achieve: first create the file, then the dir, and then move the file >>>>>> to dir. >>>>>> >>>>>> During send the following happens (I augmented the send code with many prints): >>>>>> >>>>>> file1_nod is sent first. Since its a new inode, it is sent as an >>>>>> orphan. When recording its reference, __record_new_ref() calls >>>>>> get_cur_path() for its parent (270). Then __get_cur_name_and_parent() >>>>>> is called on 270, which calls is_inode_existent(), which calls >>>>>> get_cur_inode_state(), and the state of the parent is "will_create". >>>>>> So __get_cur_name_and_parent() creates an orphan name for it, and >>>>>> finally the new reference for 268 is recorded as: >>>>>> o270-136-0/file1_nod: >>>>>> >>>>>> [changed_cb:4102] key(256 INODE_ITEM 0) : NEW >>>>>> [changed_cb:4102] key(256 INODE_REF 256) : NEW >>>>>> [changed_cb:4102] key(268 INODE_ITEM 0) : NEW >>>>>> [send_create_inode:2407] NEW ino(268,135) type=0100000, path=[o268-135-0] >>>>>> [changed_cb:4102] key(268 INODE_REF 270) : NEW >>>>>> [get_cur_inode_state:1475] (270,136): L(EX,136) >>>>>> R(NE,18446744072099047770) sp=268 ==> will_create >>>>>> [is_inode_existent:1498] (270,136): NOT existent >>>>>> [__get_cur_name_and_parent:1918] ino(270,136) not existent => unique >>>>>> name [o270-136-0] >>>>>> [get_cur_path:2051] ino(0,0) cur_path=[o270-136-0] >>>>>> [__record_new_ref:2911] record new ref [o270-136-0/file1_nod] >>>>>> >>>>>> Then process_recorded_refs() sees that 268 is still orphan, so it >>>>>> sends "rename" to its valid place, but the problem is that its parent >>>>>> dir was not sent yet (and its parent dir is also an orphan): >>>>>> [process_recorded_refs:2601] ino(268,135): start with refs >>>>>> [28118.347602] [process_recorded_refs:2651] ino(268,135): new=1, >>>>>> did_overwrite_first_ref=0, is_orphan=1, valid_path=[o268-135-0] >>>>>> [28118.347605] [process_recorded_refs:2701] ino(268,135): is orphan, >>>>>> move it: [o268-135-0]=>[o270-136-0/file1_nod] >>>>>> [28118.347610] [process_recorded_refs:2837] checking dir(270,136) >>>>>> [28118.347612] [process_recorded_refs:2869] ino(268,135) done with refs >>>>>> >>>>>> Now the parent dir is processed: >>>>>> [changed_cb:4102] key(270 INODE_ITEM 0) : NEW >>>>>> [send_create_inode:2407] NEW ino(270,136) type=040000, path=[o270-136-0] >>>>>> [changed_cb:4102] key(270 INODE_REF 256) : NEW >>>>>> [get_cur_path:2051] ino(256,133) cur_path=[] >>>>>> [__record_new_ref:2911] record new ref [dir2] >>>>>> [process_recorded_refs:2601] ino(270,136): start with refs >>>>>> [process_recorded_refs:2651] ino(270,136): new=1, >>>>>> did_overwrite_first_ref=0, is_orphan=1, valid_path=[o270-136-0] >>>>>> [process_recorded_refs:2701] ino(270,136): is orphan, move it: >>>>>> [o270-136-0]=>[dir2] >>>>>> [process_recorded_refs:2837] checking dir(256,133) >>>>>> [get_cur_inode_state:1475] (256,133): L(EX,133) >>>>>> R(NE,18446612135413283512) sp=270 ==> did_create >>>>>> [process_recorded_refs:2869] ino(270,136) done with refs >>>>>> >>>>>> Nothing special here, the parent is first sent as an orphan, and then >>>>>> renamed to its valid name, but it's too late. >>>>>> >>>>>> During receive: >>>>>> ERROR: rename o268-135-0 -> o270-136-0/file1_nod failed. No such file >>>>>> or directory >>>>>> >>>>>> I am not yet sure where is the proper place to fix this, I just wanted >>>>>> to report it first. Basically, I think that when sending any kind of >>>>>> A_PATH, it is needed to ensure that path components exist, either as >>>>>> orphan or real path (by sending them out-of-order if needed?). But I >>>>>> am not yet sure where is the core place that should ensure this. >>>>>> >>>>>> Thanks, >>>>>> Alex. >>>>> >>>>> I have pushed a fix for this case. Basically, the solution is to >>>>> postpone the processing of refs in not created dirs until the dir is >>>>> created. Big thanks for investigating this one. diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 68e504c..b83ec5f 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -1597,7 +1597,7 @@ out: static int is_first_ref(struct send_ctx *sctx, struct btrfs_root *root, - u64 ino, u64 dir, + u64 ino, u64 dir, u64 dir_gen, const char *name, int name_len) { int ret; @@ -1613,6 +1613,11 @@ static int is_first_ref(struct send_ctx *sctx, if (ret < 0) goto out; + if (dir != tmp_dir || dir_gen != tmp_dir_gen) { + ret = 0; + goto out; + } + if (name_len != fs_path_len(tmp_name)) { ret = 0; goto out; @@ -2834,7 +2839,7 @@ verbose_printk("btrfs: process_recorded_refs %llu\n", sctx->cur_ino); goto out; if (ret) { ret = is_first_ref(sctx, sctx->parent_root, - ow_inode, cur->dir, cur->name, + ow_inode, cur->dir, cur->dir_gen, cur->name, cur->name_len);