From patchwork Wed Jul 11 10:19:54 2018
X-Patchwork-Submitter: Tetsuo Handa
X-Patchwork-Id: 10519267
Subject: Re: INFO: task hung in iterate_supers
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: viro@zeniv.linux.org.uk
Cc: syzbot, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    syzkaller-bugs@googlegroups.com
Date: Wed, 11 Jul 2018 19:19:54 +0900
Message-ID: <7dc64116-f43d-3739-7b35-4eef7a80e343@I-love.SAKURA.ne.jp>
In-Reply-To: <0145d376-7ef8-3e17-5a24-94de946a01e5@I-love.SAKURA.ne.jp>
References: <000000000000da8a9b0570a29c01@google.com>
 <0145d376-7ef8-3e17-5a24-94de946a01e5@I-love.SAKURA.ne.jp>

On 2018/07/10 19:34, Tetsuo Handa wrote:
> It turned out that, although the reason v9fs_mount() is stalling is
> currently unknown, the reason many processes are stuck in
> iterate_supers() is that they are unable to take the s->s_umount lock
> due to the down_write_nested() below.
>
>	/*
>	 * sget() can have s_umount recursion.
>	 *
>	 * When it cannot find a suitable sb, it allocates a new
>	 * one (this one), and tries again to find a suitable old
>	 * one.
>	 *
>	 * In case that succeeds, it will acquire the s_umount
>	 * lock of the old one. Since these are clearly distinct
>	 * locks, and this object isn't exposed yet, there's no
>	 * risk of deadlocks.
>	 *
>	 * Annotate this by putting this lock in a different
>	 * subclass.
>	 */
>	down_write_nested(&s->s_umount, SINGLE_DEPTH_NESTING);
>
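As an aside, the annotation quoted above only affects lockdep
bookkeeping: the two rwsems are distinct objects, so taking both is
harmless, and the subclass merely stops the validator from reporting
false recursion. A rough userspace model of that "recursion" (plain
pthread rwlocks, no lockdep; all names below are made up for
illustration, not kernel API):

#include <pthread.h>
#include <stdio.h>

struct fake_sb {
	pthread_rwlock_t s_umount;	/* stands in for sb->s_umount */
};

int main(void)
{
	struct fake_sb new_sb, old_sb;

	pthread_rwlock_init(&new_sb.s_umount, NULL);
	pthread_rwlock_init(&old_sb.s_umount, NULL);

	/* alloc_super(): the new, not yet exposed sb is locked for write */
	pthread_rwlock_wrlock(&new_sb.s_umount);
	/* sget() retries, finds a suitable old sb, and locks that one too */
	pthread_rwlock_wrlock(&old_sb.s_umount);

	puts("two distinct s_umount locks held; no deadlock possible");

	pthread_rwlock_unlock(&old_sb.s_umount);
	pthread_rwlock_unlock(&new_sb.s_umount);
	return 0;
}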
Judging from commit dabe0dc194d5d56d ("vfs: fix the rest of sget()
races"), this appears to be an overlooked sget() race. iterate_supers()
calls down_read(&sb->s_umount) on every superblock it finds while
traversing the super_blocks list. But sget_userns() adds a newly
created superblock to the super_blocks list, with ->s_umount held for
write by alloc_super(), before mount_fs() sets the SB_BORN flag upon
return from type->mount(). Therefore, if type->mount() (e.g.
v9fs_mount()) takes a long time after sget() for some reason, processes
will block on a superblock whose SB_BORN flag is not yet set.

Doing the test (which is currently done only after ->s_umount has been
taken) before taking ->s_umount (patch shown below) works for me.
Although there seem to be many bugs in the v9fs code, I think that
making sure other threads do not wait on a superblock whose SB_BORN
flag is not yet set is the better fix.
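To make the window concrete, here is a rough standalone model of the
race (plain pthreads; a sleep() stands in for the stalled v9fs_mount(),
and an atomic flag stands in for SB_BORN because this model has no
sb_lock; all names are made up for illustration):

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

/* One fake superblock, already on the (implied) super_blocks list. */
struct fake_sb {
	pthread_rwlock_t s_umount;
	atomic_int born;		/* stands in for SB_BORN */
};

static struct fake_sb sb;

static void *mounter(void *arg)
{
	pthread_rwlock_wrlock(&sb.s_umount);	/* alloc_super()           */
	sleep(2);				/* type->mount() stalls... */
	atomic_store(&sb.born, 1);		/* mount_fs() sets SB_BORN */
	pthread_rwlock_unlock(&sb.s_umount);
	return NULL;
}

static void *iterator(void *arg)
{
	sleep(1);	/* let the mounter take s_umount first */

	/* The pre-check the patch adds: do not even queue on the
	 * s_umount of a superblock that is not fully born. */
	if (!atomic_load(&sb.born)) {
		puts("iterator: sb not born yet, skipped (no hang)");
		return NULL;
	}
	/* Without the pre-check we would block here until the stalled
	 * mount completes -- the hang syzbot reported. */
	pthread_rwlock_rdlock(&sb.s_umount);
	pthread_rwlock_unlock(&sb.s_umount);
	return NULL;
}

int main(void)
{
	pthread_t m, i;

	pthread_rwlock_init(&sb.s_umount, NULL);
	pthread_create(&m, NULL, mounter, NULL);
	pthread_create(&i, NULL, iterator, NULL);
	pthread_join(i, NULL);
	pthread_join(m, NULL);
	return 0;
}

With the pre-check in place the iterator skips the half-born superblock
instead of queueing behind the stalled mount.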
---
 fs/super.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index 50728d9..03c3b3f 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -37,6 +37,11 @@
 #include <linux/user_namespace.h>
 #include "internal.h"
 
+static inline bool sb_is_alive(struct super_block *sb)
+{
+	return sb->s_root && (sb->s_flags & SB_BORN);
+}
+
 static int thaw_super_locked(struct super_block *sb);
 
 static LIST_HEAD(super_blocks);
@@ -407,8 +412,7 @@ static int grab_super(struct super_block *s) __releases(sb_lock)
 bool trylock_super(struct super_block *sb)
 {
 	if (down_read_trylock(&sb->s_umount)) {
-		if (!hlist_unhashed(&sb->s_instances) &&
-		    sb->s_root && (sb->s_flags & SB_BORN))
+		if (!hlist_unhashed(&sb->s_instances) && sb_is_alive(sb))
 			return true;
 		up_read(&sb->s_umount);
 	}
@@ -590,6 +594,8 @@ static void __iterate_supers(void (*f)(struct super_block *))
 
 	spin_lock(&sb_lock);
 	list_for_each_entry(sb, &super_blocks, s_list) {
+		if (!sb_is_alive(sb))
+			continue;
 		if (hlist_unhashed(&sb->s_instances))
 			continue;
 		sb->s_count++;
@@ -620,13 +626,15 @@ void iterate_supers(void (*f)(struct super_block *, void *), void *arg)
 
 	spin_lock(&sb_lock);
 	list_for_each_entry(sb, &super_blocks, s_list) {
+		if (!sb_is_alive(sb))
+			continue;
 		if (hlist_unhashed(&sb->s_instances))
 			continue;
 		sb->s_count++;
 		spin_unlock(&sb_lock);
 
 		down_read(&sb->s_umount);
-		if (sb->s_root && (sb->s_flags & SB_BORN))
+		if (sb_is_alive(sb))
 			f(sb, arg);
 		up_read(&sb->s_umount);
 
@@ -656,11 +664,13 @@ void iterate_supers_type(struct file_system_type *type,
 
 	spin_lock(&sb_lock);
 	hlist_for_each_entry(sb, &type->fs_supers, s_instances) {
+		if (!sb_is_alive(sb))
+			continue;
 		sb->s_count++;
 		spin_unlock(&sb_lock);
 
 		down_read(&sb->s_umount);
-		if (sb->s_root && (sb->s_flags & SB_BORN))
+		if (sb_is_alive(sb))
 			f(sb, arg);
 		up_read(&sb->s_umount);
 
@@ -689,6 +699,8 @@ static struct super_block *__get_super(struct block_device *bdev, bool excl)
 		if (hlist_unhashed(&sb->s_instances))
 			continue;
 		if (sb->s_bdev == bdev) {
+			if (!sb_is_alive(sb))
+				continue;
 			sb->s_count++;
 			spin_unlock(&sb_lock);
 			if (!excl)
@@ -696,7 +708,7 @@ static struct super_block *__get_super(struct block_device *bdev, bool excl)
 			else
 				down_write(&sb->s_umount);
 			/* still alive? */
-			if (sb->s_root && (sb->s_flags & SB_BORN))
+			if (sb_is_alive(sb))
 				return sb;
 			if (!excl)
 				up_read(&sb->s_umount);
@@ -813,11 +825,13 @@ struct super_block *user_get_super(dev_t dev)
 		if (hlist_unhashed(&sb->s_instances))
 			continue;
 		if (sb->s_dev == dev) {
+			if (!sb_is_alive(sb))
+				continue;
 			sb->s_count++;
 			spin_unlock(&sb_lock);
 			down_read(&sb->s_umount);
 			/* still alive? */
-			if (sb->s_root && (sb->s_flags & SB_BORN))
+			if (sb_is_alive(sb))
 				return sb;
 			up_read(&sb->s_umount);
 			/* nope, got unmounted */
@@ -916,8 +930,7 @@ int do_remount_sb(struct super_block *sb, int sb_flags, void *data, int force)
 static void do_emergency_remount_callback(struct super_block *sb)
 {
 	down_write(&sb->s_umount);
-	if (sb->s_root && sb->s_bdev && (sb->s_flags & SB_BORN) &&
-	    !sb_rdonly(sb)) {
+	if (sb_is_alive(sb) && sb->s_bdev && !sb_rdonly(sb)) {
 		/*
 		 * What lock protects sb->s_flags??
 		 */
@@ -947,7 +960,7 @@ void emergency_remount(void)
 static void do_thaw_all_callback(struct super_block *sb)
 {
 	down_write(&sb->s_umount);
-	if (sb->s_root && sb->s_flags & SB_BORN) {
+	if (sb_is_alive(sb)) {
 		emergency_thaw_bdev(sb);
 		thaw_super_locked(sb);
 	} else {
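Note that the patch keeps the existing test after ->s_umount has been
acquired and only adds a second instance of it before blocking: the
check done under sb_lock is merely an optimization that stops threads
from queueing on a half-born superblock, while the re-check under
->s_umount stays authoritative, since the superblock can still be
unmounted between dropping sb_lock and obtaining ->s_umount.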