Message ID | 20210414134737.2366971-7-yi.zhang@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ext4, jbd2: fix 3 issues about bdev_try_to_free_page() | expand |
On Wed, Apr 14, 2021 at 09:47:36PM +0800, Zhang Yi wrote: > Commit <87d8fe1ee6b8> ("add releasepage hooks to block devices which can > be used by file systems") introduce a hook that used by ext4 filesystem > to release journal buffers, but it doesn't add corresponding concurrency > protection that ->bdev_try_to_free_page() could be raced by umount > filesystem concurrently. This patch add a usage count on superblock that > filesystem can use it to prevent above race and make invoke > ->bdev_try_to_free_page() safe. We already have two refcounts in the superblock: s_active which counts the active refernce, and s_count which prevents the data structures from beeing freed. I don't think we need yet another one.
Hi, Christoph On 2021/4/15 22:40, Christoph Hellwig wrote: > On Wed, Apr 14, 2021 at 09:47:36PM +0800, Zhang Yi wrote: >> Commit <87d8fe1ee6b8> ("add releasepage hooks to block devices which can >> be used by file systems") introduce a hook that used by ext4 filesystem >> to release journal buffers, but it doesn't add corresponding concurrency >> protection that ->bdev_try_to_free_page() could be raced by umount >> filesystem concurrently. This patch add a usage count on superblock that >> filesystem can use it to prevent above race and make invoke >> ->bdev_try_to_free_page() safe. > > We already have two refcounts in the superblock: s_active which counts > the active refernce, and s_count which prevents the data structures > from beeing freed. I don't think we need yet another one. > . > Thanks you for your response. I checked the s_count and s_active refcounts, but it seems that we could not use these two refcounts directly. For the s_active. If we get s_active refcount in blkdev_releasepage(), we may put the final refcount when doing umount concurrently and have to do resource recycling, but we got page locked here and lead to deadlock. Maybe we could do async resource recycling through kworker, but I think it's not a good way. For the s_count, it is operated under the global sb_lock now, so get this refcount will serialize page release and affect performance. Besides, It's semantics are different from the 'usage count' by private fileststem, and we have to cooperate with sb->s_umount mutex lock to close the above race. So I introduce another refcount. Am I missing something or any suggestions? Thanks, Yi.
diff --git a/include/linux/fs.h b/include/linux/fs.h index ec8f3ddf4a6a..3c6a5c08c2df 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -41,6 +41,7 @@ #include <linux/stddef.h> #include <linux/mount.h> #include <linux/cred.h> +#include <linux/percpu-refcount.h> #include <asm/byteorder.h> #include <uapi/linux/fs.h> @@ -1547,6 +1548,13 @@ struct super_block { spinlock_t s_inode_wblist_lock; struct list_head s_inodes_wb; /* writeback inodes */ + + /* + * Count users who are using the super_block, used to protect + * umount filesystem concurrently with others. + */ + struct percpu_ref s_usage_counter; + wait_queue_head_t s_usage_waitq; } __randomize_layout; /* Helper functions so that in most cases filesystems will @@ -1765,6 +1773,27 @@ static inline bool sb_start_intwrite_trylock(struct super_block *sb) bool inode_owner_or_capable(struct user_namespace *mnt_userns, const struct inode *inode); +static inline void sb_usage_counter_release(struct percpu_ref *ref) +{ + struct super_block *sb; + + sb = container_of(ref, struct super_block, s_usage_counter); + wake_up(&sb->s_usage_waitq); +} + +static inline int sb_usage_counter_init(struct super_block *sb) +{ + init_waitqueue_head(&sb->s_usage_waitq); + return percpu_ref_init(&sb->s_usage_counter, sb_usage_counter_release, + 0, GFP_KERNEL); +} + +static inline void sb_usage_counter_wait(struct super_block *sb) +{ + percpu_ref_kill(&sb->s_usage_counter); + wait_event(sb->s_usage_waitq, percpu_ref_is_zero(&sb->s_usage_counter)); +} + /* * VFS helper functions.. */
Commit <87d8fe1ee6b8> ("add releasepage hooks to block devices which can be used by file systems") introduce a hook that used by ext4 filesystem to release journal buffers, but it doesn't add corresponding concurrency protection that ->bdev_try_to_free_page() could be raced by umount filesystem concurrently. This patch add a usage count on superblock that filesystem can use it to prevent above race and make invoke ->bdev_try_to_free_page() safe. Signed-off-by: Zhang Yi <yi.zhang@huawei.com> --- include/linux/fs.h | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+)