From patchwork Sat Mar 10 18:19:02 2018
X-Patchwork-Submitter: Andiry Xu
X-Patchwork-Id: 10274001
From: Andiry Xu
To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-nvdimm@lists.01.org
Subject: [RFC v2 81/83] Failure recovery: Inode pages recovery routines.
Date: Sat, 10 Mar 2018 10:19:02 -0800
Message-Id: <1520705944-6723-82-git-send-email-jix024@eng.ucsd.edu>
In-Reply-To: <1520705944-6723-1-git-send-email-jix024@eng.ucsd.edu>
References: <1520705944-6723-1-git-send-email-jix024@eng.ucsd.edu>
Cc: coughlan@redhat.com, miklos@szeredi.hu, Andiry Xu,
 david@fromorbit.com, jack@suse.com, swanson@cs.ucsd.edu,
 swhiteho@redhat.com, andiry.xu@gmail.com

From: Andiry Xu

For each inode, NOVA traverses the inode log and records the allocated
pages in the bitmap. For a directory inode, NOVA only sets the log pages.
For file and symlink inodes, NOVA also needs to set the data pages.
NOVA divides the file into 1GB zones and records the pages that fall into
the current zone, one zone at a time, until all the pages have been
recorded.
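As a rough illustration of the log page marking described above, here is a
user-space sketch of following a chain of log pages and setting one bitmap
bit per page. It is not NOVA code: the next_page[] table stands in for
next_log_page(), the byte array stands in for the scan bitmap, and
mark_log_pages(), LOG_PAGE_SHIFT and NPAGES are hypothetical names chosen
for the example.

```c
#include <assert.h>
#include <stddef.h>

#define LOG_PAGE_SHIFT	12			/* 4 KB pages, as with BM_4K */
#define LOG_PAGE_SIZE	(1UL << LOG_PAGE_SHIFT)
#define NPAGES		16UL			/* size of the toy device */

/*
 * Hypothetical model of a log chain: next_page[i] holds the byte address
 * of the log page that follows page i, or 0 at the end of the chain.
 * The chain is assumed to be free of cycles.
 */
void mark_log_pages(const unsigned long *next_page, unsigned long head,
		    unsigned char *bm)
{
	unsigned long curr = head;

	assert(next_page != NULL && bm != NULL);

	while (curr != 0) {
		/* Log pages are expected to be page aligned. */
		assert((curr & (LOG_PAGE_SIZE - 1)) == 0);
		assert((curr >> LOG_PAGE_SHIFT) < NPAGES);

		/* Record the page, then follow the link to the next one. */
		bm[curr >> LOG_PAGE_SHIFT] = 1;
		curr = next_page[curr >> LOG_PAGE_SHIFT];
	}
}
```

An empty log (head == 0) marks nothing, which matches the early return in
nova_traverse_inode_log().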
Signed-off-by: Andiry Xu
---
 fs/nova/bbuild.c | 307 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 307 insertions(+)

diff --git a/fs/nova/bbuild.c b/fs/nova/bbuild.c
index 35c661a..75dfcba 100644
--- a/fs/nova/bbuild.c
+++ b/fs/nova/bbuild.c
@@ -665,6 +665,313 @@ static int alloc_bm(struct super_block *sb, unsigned long initsize)
 	return 0;
 }
+/************************** NOVA recovery ****************************/
+
+#define MAX_PGOFF	262144
+
+struct task_ring {
+	u64 addr0[512];
+	int num;
+	int inodes_used_count;
+	u64 *entry_array;
+	u64 *nvmm_array;
+};
+
+static int nova_traverse_inode_log(struct super_block *sb,
+	struct nova_inode *pi, struct scan_bitmap *bm, u64 head)
+{
+	u64 curr_p;
+	u64 next;
+
+	curr_p = head;
+
+	if (curr_p == 0)
+		return 0;
+
+	WARN_ON(curr_p & (PAGE_SIZE - 1));
+	set_bm(curr_p >> PAGE_SHIFT, bm, BM_4K);
+
+	next = next_log_page(sb, curr_p);
+	while (next > 0) {
+		curr_p = next;
+		WARN_ON(curr_p & (PAGE_SIZE - 1));
+		set_bm(curr_p >> PAGE_SHIFT, bm, BM_4K);
+		next = next_log_page(sb, curr_p);
+	}
+
+	return 0;
+}
+
+static void nova_traverse_dir_inode_log(struct super_block *sb,
+	struct nova_inode *pi, struct scan_bitmap *bm)
+{
+	nova_traverse_inode_log(sb, pi, bm, pi->log_head);
+}
+
+static int nova_set_ring_array(struct super_block *sb,
+	struct nova_inode_info_header *sih, struct nova_file_write_entry *entry,
+	struct task_ring *ring,
+	unsigned long base, struct scan_bitmap *bm)
+{
+	unsigned long start, end;
+	unsigned long pgoff, old_pgoff = 0;
+	unsigned long index;
+	unsigned int num_free = 0;
+	u64 old_entry = 0;
+
+	start = entry->pgoff;
+	if (start < base)
+		start = base;
+
+	end = entry->pgoff + entry->num_pages;
+	if (end > base + MAX_PGOFF)
+		end = base + MAX_PGOFF;
+
+	for (pgoff = start; pgoff < end; pgoff++) {
+		index = pgoff - base;
+		if (ring->nvmm_array[index]) {
+			if (ring->entry_array[index] != old_entry) {
+				old_entry = ring->entry_array[index];
+				old_pgoff = pgoff;
+				num_free = 1;
+			} else {
+				num_free++;
+			}
+		}
+	}
+
+	for (pgoff = start; pgoff < end; pgoff++) {
+		index = pgoff - base;
+		ring->entry_array[index] = (u64)entry;
+		ring->nvmm_array[index] = (u64)(entry->block >> PAGE_SHIFT) +
+						pgoff - entry->pgoff;
+	}
+
+	return 0;
+}
+
+static int nova_set_file_bm(struct super_block *sb,
+	struct nova_inode_info_header *sih, struct task_ring *ring,
+	struct scan_bitmap *bm, unsigned long base, unsigned long last_blocknr)
+{
+	unsigned long nvmm, pgoff;
+
+	if (last_blocknr >= base + MAX_PGOFF)
+		last_blocknr = MAX_PGOFF - 1;
+	else
+		last_blocknr -= base;
+
+	for (pgoff = 0; pgoff <= last_blocknr; pgoff++) {
+		nvmm = ring->nvmm_array[pgoff];
+		if (nvmm) {
+			set_bm(nvmm, bm, BM_4K);
+			ring->nvmm_array[pgoff] = 0;
+			ring->entry_array[pgoff] = 0;
+		}
+	}
+
+	return 0;
+}
+
+/* entry given to this function is a copy in dram */
+static void nova_ring_setattr_entry(struct super_block *sb,
+	struct nova_inode_info_header *sih,
+	struct nova_setattr_logentry *entry, struct task_ring *ring,
+	unsigned long base, unsigned int data_bits, struct scan_bitmap *bm)
+{
+	unsigned long first_blocknr, last_blocknr;
+	unsigned long pgoff, old_pgoff = 0;
+	unsigned long index;
+	unsigned int num_free = 0;
+	u64 old_entry = 0;
+	loff_t start, end;
+
+	if (sih->i_size <= entry->size)
+		goto out;
+
+	start = entry->size;
+	end = sih->i_size;
+
+	first_blocknr = (start + (1UL << data_bits) - 1) >> data_bits;
+
+	if (end > 0)
+		last_blocknr = (end - 1) >> data_bits;
+	else
+		last_blocknr = 0;
+
+	if (first_blocknr > last_blocknr)
+		goto out;
+
+	if (first_blocknr < base)
+		first_blocknr = base;
+
+	if (last_blocknr > base + MAX_PGOFF - 1)
+		last_blocknr = base + MAX_PGOFF - 1;
+
+	for (pgoff = first_blocknr; pgoff <= last_blocknr; pgoff++) {
+		index = pgoff - base;
+		if (ring->nvmm_array[index]) {
+			if (ring->entry_array[index] != old_entry) {
+				old_entry = ring->entry_array[index];
+				old_pgoff = pgoff;
+				num_free = 1;
+			} else {
+				num_free++;
+			}
+		}
+	}
+
+	for (pgoff = first_blocknr; pgoff <= last_blocknr; pgoff++) {
+		index = pgoff - base;
+		ring->nvmm_array[index] = 0;
+		ring->entry_array[index] = 0;
+	}
+
+out:
+	sih->i_size = entry->size;
+}
+
+static unsigned long nova_traverse_file_write_entry(struct super_block *sb,
+	struct nova_inode_info_header *sih, struct nova_file_write_entry *entry,
+	struct task_ring *ring,
+	unsigned long base, struct scan_bitmap *bm)
+{
+	unsigned long max_blocknr = 0;
+	sih->i_size = entry->size;
+
+	if (entry->num_pages != entry->invalid_pages) {
+		max_blocknr = entry->pgoff + entry->num_pages - 1;
+		if (entry->pgoff < base + MAX_PGOFF &&
+				entry->pgoff + entry->num_pages > base)
+			nova_set_ring_array(sb, sih, entry,
+						ring, base, bm);
+	}
+
+	return max_blocknr;
+}
+
+static int nova_traverse_file_inode_log(struct super_block *sb,
+	struct nova_inode *pi, struct nova_inode_info_header *sih,
+	struct task_ring *ring, struct scan_bitmap *bm)
+{
+	unsigned long base = 0;
+	unsigned long last_blocknr = 0, curr_last;
+	void *entry;
+	unsigned int btype;
+	unsigned int data_bits;
+	u64 curr_p;
+	u64 next;
+	u8 type;
+
+	btype = pi->i_blk_type;
+	data_bits = blk_type_to_shift[btype];
+
+again:
+	curr_p = pi->log_head;
+	nova_dbg_verbose("Log head 0x%llx, tail 0x%llx\n",
+				curr_p, pi->log_tail);
+	if (curr_p == 0 && pi->log_tail == 0)
+		return 0;
+
+	if (base == 0) {
+		WARN_ON(curr_p & (PAGE_SIZE - 1));
+		set_bm(curr_p >> PAGE_SHIFT, bm, BM_4K);
+	}
+
+	while (curr_p != pi->log_tail) {
+		if (goto_next_page(sb, curr_p)) {
+			curr_p = next_log_page(sb, curr_p);
+			if (base == 0) {
+				WARN_ON(curr_p & (PAGE_SIZE - 1));
+				set_bm(curr_p >> PAGE_SHIFT, bm, BM_4K);
+			}
+		}
+
+		entry = (void *)nova_get_block(sb, curr_p);
+
+		type = nova_get_entry_type(entry);
+		switch (type) {
+		case SET_ATTR:
+			nova_ring_setattr_entry(sb, sih, SENTRY(entry),
+						ring, base, data_bits,
+						bm);
+			curr_p += sizeof(struct nova_setattr_logentry);
+			break;
+		case LINK_CHANGE:
+			curr_p += sizeof(struct nova_link_change_entry);
+			break;
+		case FILE_WRITE:
+			curr_last = nova_traverse_file_write_entry(sb, sih,
+					WENTRY(entry), ring, base, bm);
+			curr_p += sizeof(struct nova_file_write_entry);
+			if (last_blocknr < curr_last)
+				last_blocknr = curr_last;
+			break;
+		default:
+			nova_dbg("%s: unknown type %d, 0x%llx\n",
+						__func__, type, curr_p);
+			NOVA_ASSERT(0);
+		}
+
+	}
+
+	if (base == 0) {
+		/* Keep traversing until log ends */
+		curr_p &= PAGE_MASK;
+		next = next_log_page(sb, curr_p);
+		while (next > 0) {
+			curr_p = next;
+			WARN_ON(curr_p & (PAGE_SIZE - 1));
+			set_bm(curr_p >> PAGE_SHIFT, bm, BM_4K);
+			next = next_log_page(sb, curr_p);
+		}
+	}
+
+	nova_set_file_bm(sb, sih, ring, bm, base, last_blocknr);
+	if (last_blocknr >= base + MAX_PGOFF) {
+		base += MAX_PGOFF;
+		goto again;
+	}
+
+	return 0;
+}
+
+static int nova_recover_inode_pages(struct super_block *sb,
+	struct nova_inode_info_header *sih, struct task_ring *ring,
+	struct nova_inode *pi, struct scan_bitmap *bm)
+{
+	unsigned long nova_ino;
+
+	if (pi->deleted == 1)
+		return 0;
+
+	nova_ino = pi->nova_ino;
+	ring->inodes_used_count++;
+
+	sih->i_mode = __le16_to_cpu(pi->i_mode);
+	sih->ino = nova_ino;
+
+	nova_dbgv("%s: inode %lu, head 0x%llx, tail 0x%llx\n",
+			__func__, nova_ino, pi->log_head, pi->log_tail);
+
+	switch (__le16_to_cpu(pi->i_mode) & S_IFMT) {
+	case S_IFDIR:
+		nova_traverse_dir_inode_log(sb, pi, bm);
+		break;
+	case S_IFLNK:
+		/* Treat symlink files as normal files */
+		/* Fall through */
+	case S_IFREG:
+		/* Fall through */
+	default:
+		/* In case of special inode, walk the log */
+		nova_traverse_file_inode_log(sb, pi, sih, ring, bm);
+		break;
+	}
+
+	return 0;
+}
+
 /*********************** Recovery entrance *************************/
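
The per-zone replay done by nova_traverse_file_inode_log() can be modeled
in user space roughly as follows. This is only a sketch under simplifying
assumptions, not the patch's implementation: struct wentry,
mark_live_pages(), ZONE_PAGES and NBLOCKS are hypothetical, the zone is
shrunk to 8 pages (the patch uses MAX_PGOFF = 262144), set-attribute
entries are ignored, and a byte array replaces the scan bitmap. It shows
why the log is replayed once per zone and how a later write to the same
file page replaces the earlier mapping before anything reaches the bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ZONE_PAGES	8UL	/* tiny zone for illustration only */
#define NBLOCKS		64UL	/* size of the toy device in pages */

struct wentry {				/* hypothetical, simplified write entry */
	unsigned long pgoff;		/* first file page covered */
	unsigned long num_pages;	/* number of file pages covered */
	unsigned long block;		/* first device page backing the entry */
};

/*
 * Replay the whole log once per zone. Within a zone, later entries
 * overwrite earlier ones in the window, so only the newest mapping of each
 * file page is flushed to the bitmap. As in the patch, device page 0 is
 * treated as "no mapping".
 */
void mark_live_pages(const struct wentry *entries, size_t n, unsigned char *bm)
{
	unsigned long window[ZONE_PAGES];
	unsigned long base = 0, last = 0;
	size_t i;

	assert(entries != NULL && bm != NULL);

	for (;;) {
		memset(window, 0, sizeof(window));

		for (i = 0; i < n; i++) {
			unsigned long s = entries[i].pgoff;
			unsigned long e = s + entries[i].num_pages;
			unsigned long p;

			if (entries[i].num_pages == 0)
				continue;
			assert(entries[i].block + entries[i].num_pages <= NBLOCKS);

			/* Track the highest file page seen in the log. */
			if (e - 1 > last)
				last = e - 1;

			/* Clamp the entry to [base, base + ZONE_PAGES). */
			if (s < base)
				s = base;
			if (e > base + ZONE_PAGES)
				e = base + ZONE_PAGES;

			for (p = s; p < e; p++)
				window[p - base] = entries[i].block +
						   (p - entries[i].pgoff);
		}

		/* Flush the surviving mappings of this zone to the bitmap. */
		for (i = 0; i < ZONE_PAGES; i++)
			if (window[i])
				bm[window[i]] = 1;

		if (last < base + ZONE_PAGES)
			break;
		base += ZONE_PAGES;	/* more zones to cover: replay again */
	}
}
```

The window only needs one zone's worth of slots regardless of file size,
at the cost of one extra pass over the log for every additional zone.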