From patchwork Fri May 20 20:50:33 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Omar Sandoval
X-Patchwork-Id: 9130051
From: Omar Sandoval
To: linux-btrfs@vger.kernel.org
Cc: Chris Mason, Al Viro, kernel-team@fb.com, Omar Sandoval
Subject: [PATCH] Btrfs: fix ->iterate_shared() by upgrading i_rwsem for delayed nodes
Date: Fri, 20 May 2016 13:50:33 -0700
Message-Id: <1e937bd41beebb68b1bf1201205059cb8f614dad.1463777299.git.osandov@fb.com>
X-Mailer: git-send-email 2.8.2
X-Mailing-List: linux-btrfs@vger.kernel.org

From: Omar Sandoval

Commit fe742fd4f90f ("Revert "btrfs: switch to ->iterate_shared()"")
backed out the conversion to ->iterate_shared() for Btrfs because the
delayed inode handling in btrfs_real_readdir() is racy. However, we can
still do readdir in parallel if there are no delayed nodes.
Until we come up with a more complete solution, this temporary fix
upgrades the shared inode lock to an exclusive lock only when we have
delayed items.

While we're here, rename the btrfs_{get,put}_delayed_items functions to
make it very clear that they're just for readdir.

Tested with xfstests and by doing a parallel kernel build:

	while make tinyconfig && make -j4 && git clean -dqfx; do
		:
	done

along with a bunch of parallel finds in another shell:

	while true; do
		for ((i=0; i<4; i++)); do
			find . >/dev/null &
		done
		wait
	done

Signed-off-by: Omar Sandoval
---
 fs/btrfs/delayed-inode.c | 27 ++++++++++++++++++++++-----
 fs/btrfs/delayed-inode.h | 10 ++++++----
 fs/btrfs/inode.c         | 10 ++++++----
 3 files changed, 34 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index 6cef0062f929..d60cd17ea66b 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -1606,15 +1606,23 @@ int btrfs_inode_delayed_dir_index_count(struct inode *inode)
 	return 0;
 }
 
-void btrfs_get_delayed_items(struct inode *inode, struct list_head *ins_list,
-			     struct list_head *del_list)
+bool btrfs_readdir_get_delayed_items(struct inode *inode,
+				     struct list_head *ins_list,
+				     struct list_head *del_list)
 {
 	struct btrfs_delayed_node *delayed_node;
 	struct btrfs_delayed_item *item;
 
 	delayed_node = btrfs_get_delayed_node(inode);
 	if (!delayed_node)
-		return;
+		return false;
+
+	/*
+	 * We can only do one readdir with delayed items at a time because of
+	 * item->readdir_list.
+	 */
+	inode_unlock_shared(inode);
+	inode_lock(inode);
 
 	mutex_lock(&delayed_node->mutex);
 	item = __btrfs_first_delayed_insertion_item(delayed_node);
@@ -1641,10 +1649,13 @@ void btrfs_get_delayed_items(struct inode *inode, struct list_head *ins_list,
 	 * requeue or dequeue this delayed node.
 	 */
 	atomic_dec(&delayed_node->refs);
+
+	return true;
 }
 
-void btrfs_put_delayed_items(struct list_head *ins_list,
-			     struct list_head *del_list)
+void btrfs_readdir_put_delayed_items(struct inode *inode,
+				     struct list_head *ins_list,
+				     struct list_head *del_list)
 {
 	struct btrfs_delayed_item *curr, *next;
 
@@ -1659,6 +1670,12 @@ void btrfs_put_delayed_items(struct list_head *ins_list,
 		if (atomic_dec_and_test(&curr->refs))
 			kfree(curr);
 	}
+
+	/*
+	 * The VFS is going to do up_read(), so we need to downgrade back to a
+	 * read lock.
+	 */
+	downgrade_write(&inode->i_rwsem);
 }
 
 int btrfs_should_delete_dir_index(struct list_head *del_list,
diff --git a/fs/btrfs/delayed-inode.h b/fs/btrfs/delayed-inode.h
index 0167853c84ae..2495b3d4075f 100644
--- a/fs/btrfs/delayed-inode.h
+++ b/fs/btrfs/delayed-inode.h
@@ -137,10 +137,12 @@ void btrfs_kill_all_delayed_nodes(struct btrfs_root *root);
 void btrfs_destroy_delayed_inodes(struct btrfs_root *root);
 
 /* Used for readdir() */
-void btrfs_get_delayed_items(struct inode *inode, struct list_head *ins_list,
-			     struct list_head *del_list);
-void btrfs_put_delayed_items(struct list_head *ins_list,
-			     struct list_head *del_list);
+bool btrfs_readdir_get_delayed_items(struct inode *inode,
+				     struct list_head *ins_list,
+				     struct list_head *del_list);
+void btrfs_readdir_put_delayed_items(struct inode *inode,
+				     struct list_head *ins_list,
+				     struct list_head *del_list);
 int btrfs_should_delete_dir_index(struct list_head *del_list,
 				  u64 index);
 int btrfs_readdir_delayed_dir_index(struct dir_context *ctx,
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 6b7fe291a174..6ab6ca195f2f 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5733,6 +5733,7 @@ static int btrfs_real_readdir(struct file *file, struct dir_context *ctx)
 	int name_len;
 	int is_curr = 0;	/* ctx->pos points to the current index? */
 	bool emitted;
+	bool put = false;
 
 	/* FIXME, use a real flag for deciding about the key type */
 	if (root->fs_info->tree_root == root)
@@ -5750,7 +5751,8 @@ static int btrfs_real_readdir(struct file *file, struct dir_context *ctx)
 	if (key_type == BTRFS_DIR_INDEX_KEY) {
 		INIT_LIST_HEAD(&ins_list);
 		INIT_LIST_HEAD(&del_list);
-		btrfs_get_delayed_items(inode, &ins_list, &del_list);
+		put = btrfs_readdir_get_delayed_items(inode, &ins_list,
+						      &del_list);
 	}
 
 	key.type = key_type;
@@ -5897,8 +5899,8 @@ next:
 nopos:
 	ret = 0;
 err:
-	if (key_type == BTRFS_DIR_INDEX_KEY)
-		btrfs_put_delayed_items(&ins_list, &del_list);
+	if (put)
+		btrfs_readdir_put_delayed_items(inode, &ins_list, &del_list);
 	btrfs_free_path(path);
 	return ret;
 }
@@ -10181,7 +10183,7 @@ static const struct inode_operations btrfs_dir_ro_inode_operations = {
 static const struct file_operations btrfs_dir_file_operations = {
 	.llseek		= generic_file_llseek,
 	.read		= generic_read_dir,
-	.iterate	= btrfs_real_readdir,
+	.iterate_shared	= btrfs_real_readdir,
 	.unlocked_ioctl	= btrfs_ioctl,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= btrfs_ioctl,