From patchwork Wed Jun 28 13:21:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Baokun Li X-Patchwork-Id: 13295792 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A393AEB64DA for ; Wed, 28 Jun 2023 13:24:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231890AbjF1NYW (ORCPT ); Wed, 28 Jun 2023 09:24:22 -0400 Received: from szxga08-in.huawei.com ([45.249.212.255]:19164 "EHLO szxga08-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231847AbjF1NYU (ORCPT ); Wed, 28 Jun 2023 09:24:20 -0400 Received: from dggpeml500021.china.huawei.com (unknown [172.30.72.57]) by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4Qrj345pnmz1HC5r; Wed, 28 Jun 2023 21:24:00 +0800 (CST) Received: from huawei.com (10.175.127.227) by dggpeml500021.china.huawei.com (7.185.36.21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Wed, 28 Jun 2023 21:24:16 +0800 From: Baokun Li To: CC: , , , , , , , Subject: [PATCH v2 1/7] quota: factor out dquot_write_dquot() Date: Wed, 28 Jun 2023 21:21:49 +0800 Message-ID: <20230628132155.1560425-2-libaokun1@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230628132155.1560425-1-libaokun1@huawei.com> References: <20230628132155.1560425-1-libaokun1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To dggpeml500021.china.huawei.com (7.185.36.21) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Refactor out dquot_write_dquot() to reduce duplicate code. Signed-off-by: Baokun Li --- fs/quota/dquot.c | 39 ++++++++++++++++----------------------- 1 file changed, 16 insertions(+), 23 deletions(-) diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c index e3e4f4047657..108ba9f1e420 100644 --- a/fs/quota/dquot.c +++ b/fs/quota/dquot.c @@ -628,6 +628,18 @@ int dquot_scan_active(struct super_block *sb, } EXPORT_SYMBOL(dquot_scan_active); +static inline int dquot_write_dquot(struct dquot *dquot) +{ + int ret = dquot->dq_sb->dq_op->write_dquot(dquot); + if (ret < 0) { + quota_error(dquot->dq_sb, "Can't write quota structure " + "(error %d). Quota may get out of sync!", ret); + /* Clear dirty bit anyway to avoid infinite loop. */ + clear_dquot_dirty(dquot); + } + return ret; +} + /* Write all dquot structures to quota files */ int dquot_writeback_dquots(struct super_block *sb, int type) { @@ -658,16 +670,9 @@ int dquot_writeback_dquots(struct super_block *sb, int type) * use count */ dqgrab(dquot); spin_unlock(&dq_list_lock); - err = sb->dq_op->write_dquot(dquot); - if (err) { - /* - * Clear dirty bit anyway to avoid infinite - * loop here. - */ - clear_dquot_dirty(dquot); - if (!ret) - ret = err; - } + err = dquot_write_dquot(dquot); + if (err && !ret) + ret = err; dqput(dquot); spin_lock(&dq_list_lock); } @@ -765,8 +770,6 @@ static struct shrinker dqcache_shrinker = { */ void dqput(struct dquot *dquot) { - int ret; - if (!dquot) return; #ifdef CONFIG_QUOTA_DEBUG @@ -794,17 +797,7 @@ void dqput(struct dquot *dquot) if (dquot_dirty(dquot)) { spin_unlock(&dq_list_lock); /* Commit dquot before releasing */ - ret = dquot->dq_sb->dq_op->write_dquot(dquot); - if (ret < 0) { - quota_error(dquot->dq_sb, "Can't write quota structure" - " (error %d). Quota may get out of sync!", - ret); - /* - * We clear dirty bit anyway, so that we avoid - * infinite loop here - */ - clear_dquot_dirty(dquot); - } + dquot_write_dquot(dquot); goto we_slept; } if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) { From patchwork Wed Jun 28 13:21:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Baokun Li X-Patchwork-Id: 13295798 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 476BEEB64DA for ; Wed, 28 Jun 2023 13:24:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231974AbjF1NYe (ORCPT ); Wed, 28 Jun 2023 09:24:34 -0400 Received: from szxga08-in.huawei.com ([45.249.212.255]:19165 "EHLO szxga08-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231835AbjF1NYX (ORCPT ); Wed, 28 Jun 2023 09:24:23 -0400 Received: from dggpeml500021.china.huawei.com (unknown [172.30.72.54]) by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4Qrj3529Chz1HC8N; Wed, 28 Jun 2023 21:24:01 +0800 (CST) Received: from huawei.com (10.175.127.227) by dggpeml500021.china.huawei.com (7.185.36.21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Wed, 28 Jun 2023 21:24:17 +0800 From: Baokun Li To: CC: , , , , , , , Subject: [PATCH v2 2/7] quota: add new global dquot list releasing_dquots Date: Wed, 28 Jun 2023 21:21:50 +0800 Message-ID: <20230628132155.1560425-3-libaokun1@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230628132155.1560425-1-libaokun1@huawei.com> References: <20230628132155.1560425-1-libaokun1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To dggpeml500021.china.huawei.com (7.185.36.21) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Add a new global dquot list that obeys the following rules: 1). A dquot is added to this list when its last reference count is about to be dropped. 2). The reference count of the dquot in the list is greater than or equal to 1 ( due to possible race with dqget()). 3). When a dquot is removed from this list, a reference count is always subtracted, and if the reference count is then 0, the dquot is added to the free_dquots list. This list is used to safely perform the final cleanup before releasing the last reference count, to avoid various contention issues caused by performing cleanup directly in dqput(), and to avoid the performance impact caused by calling synchronize_srcu(&dquot_srcu) directly in dqput(). Here it is just defining the list and implementing the corresponding operation function, which we will use later. Suggested-by: Jan Kara Signed-off-by: Baokun Li --- fs/quota/dquot.c | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c index 108ba9f1e420..a8b43b5b5623 100644 --- a/fs/quota/dquot.c +++ b/fs/quota/dquot.c @@ -226,12 +226,21 @@ static void put_quota_format(struct quota_format_type *fmt) /* * Dquot List Management: * The quota code uses four lists for dquot management: the inuse_list, - * free_dquots, dqi_dirty_list, and dquot_hash[] array. A single dquot - * structure may be on some of those lists, depending on its current state. + * releasing_dquots, free_dquots, dqi_dirty_list, and dquot_hash[] array. + * A single dquot structure may be on some of those lists, depending on + * its current state. * * All dquots are placed to the end of inuse_list when first created, and this * list is used for invalidate operation, which must look at every dquot. * + * When the last reference of a dquot will be dropped, the dquot will be + * added to releasing_dquots. We'd then queue work item which would call + * synchronize_srcu() and after that perform the final cleanup of all the + * dquots on the list. Both releasing_dquots and free_dquots use the + * dq_free list_head in the dquot struct. when a dquot is removed from + * releasing_dquots, a reference count is always subtracted, and if + * dq_count == 0 at that point, the dquot will be added to the free_dquots. + * * Unused dquots (dq_count == 0) are added to the free_dquots list when freed, * and this list is searched whenever we need an available dquot. Dquots are * removed from the list as soon as they are used again, and @@ -250,6 +259,7 @@ static void put_quota_format(struct quota_format_type *fmt) static LIST_HEAD(inuse_list); static LIST_HEAD(free_dquots); +static LIST_HEAD(releasing_dquots); static unsigned int dq_hash_bits, dq_hash_mask; static struct hlist_head *dquot_hash; @@ -305,6 +315,13 @@ static inline void put_dquot_last(struct dquot *dquot) dqstats_inc(DQST_FREE_DQUOTS); } +static inline void put_releasing_dquots(struct dquot *dquot) +{ + list_add_tail(&dquot->dq_free, &releasing_dquots); + /* dquot will be moved to free_dquots during shrink. */ + dqstats_inc(DQST_FREE_DQUOTS); +} + static inline void remove_free_dquot(struct dquot *dquot) { if (list_empty(&dquot->dq_free)) From patchwork Wed Jun 28 13:21:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Baokun Li X-Patchwork-Id: 13295794 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 583D6EB64DA for ; Wed, 28 Jun 2023 13:24:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231899AbjF1NYY (ORCPT ); Wed, 28 Jun 2023 09:24:24 -0400 Received: from szxga02-in.huawei.com ([45.249.212.188]:15439 "EHLO szxga02-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231834AbjF1NYU (ORCPT ); Wed, 28 Jun 2023 09:24:20 -0400 Received: from dggpeml500021.china.huawei.com (unknown [172.30.72.56]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4Qrj2P6VwqzTkxd; Wed, 28 Jun 2023 21:23:25 +0800 (CST) Received: from huawei.com (10.175.127.227) by dggpeml500021.china.huawei.com (7.185.36.21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Wed, 28 Jun 2023 21:24:17 +0800 From: Baokun Li To: CC: , , , , , , , Subject: [PATCH v2 3/7] quota: rename dquot_active() to inode_dquot_active() Date: Wed, 28 Jun 2023 21:21:51 +0800 Message-ID: <20230628132155.1560425-4-libaokun1@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230628132155.1560425-1-libaokun1@huawei.com> References: <20230628132155.1560425-1-libaokun1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To dggpeml500021.china.huawei.com (7.185.36.21) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Now we have a helper function dquot_dirty() to determine if dquot has DQ_MOD_B bit. dquot_active() can easily be misunderstood as a helper function to determine if dquot has DQ_ACTIVE_B bit. So we avoid this by adding the "inode_" prefix and later on we will add the helper function dquot_active() to determine if dquot has DQ_ACTIVE_B bit. Signed-off-by: Baokun Li --- fs/quota/dquot.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c index a8b43b5b5623..b21f5e888482 100644 --- a/fs/quota/dquot.c +++ b/fs/quota/dquot.c @@ -1435,7 +1435,7 @@ static int info_bdq_free(struct dquot *dquot, qsize_t space) return QUOTA_NL_NOWARN; } -static int dquot_active(const struct inode *inode) +static int inode_dquot_active(const struct inode *inode) { struct super_block *sb = inode->i_sb; @@ -1458,7 +1458,7 @@ static int __dquot_initialize(struct inode *inode, int type) qsize_t rsv; int ret = 0; - if (!dquot_active(inode)) + if (!inode_dquot_active(inode)) return 0; dquots = i_dquot(inode); @@ -1566,7 +1566,7 @@ bool dquot_initialize_needed(struct inode *inode) struct dquot **dquots; int i; - if (!dquot_active(inode)) + if (!inode_dquot_active(inode)) return false; dquots = i_dquot(inode); @@ -1677,7 +1677,7 @@ int __dquot_alloc_space(struct inode *inode, qsize_t number, int flags) int reserve = flags & DQUOT_SPACE_RESERVE; struct dquot **dquots; - if (!dquot_active(inode)) { + if (!inode_dquot_active(inode)) { if (reserve) { spin_lock(&inode->i_lock); *inode_reserved_space(inode) += number; @@ -1747,7 +1747,7 @@ int dquot_alloc_inode(struct inode *inode) struct dquot_warn warn[MAXQUOTAS]; struct dquot * const *dquots; - if (!dquot_active(inode)) + if (!inode_dquot_active(inode)) return 0; for (cnt = 0; cnt < MAXQUOTAS; cnt++) warn[cnt].w_type = QUOTA_NL_NOWARN; @@ -1790,7 +1790,7 @@ int dquot_claim_space_nodirty(struct inode *inode, qsize_t number) struct dquot **dquots; int cnt, index; - if (!dquot_active(inode)) { + if (!inode_dquot_active(inode)) { spin_lock(&inode->i_lock); *inode_reserved_space(inode) -= number; __inode_add_bytes(inode, number); @@ -1832,7 +1832,7 @@ void dquot_reclaim_space_nodirty(struct inode *inode, qsize_t number) struct dquot **dquots; int cnt, index; - if (!dquot_active(inode)) { + if (!inode_dquot_active(inode)) { spin_lock(&inode->i_lock); *inode_reserved_space(inode) += number; __inode_sub_bytes(inode, number); @@ -1876,7 +1876,7 @@ void __dquot_free_space(struct inode *inode, qsize_t number, int flags) struct dquot **dquots; int reserve = flags & DQUOT_SPACE_RESERVE, index; - if (!dquot_active(inode)) { + if (!inode_dquot_active(inode)) { if (reserve) { spin_lock(&inode->i_lock); *inode_reserved_space(inode) -= number; @@ -1931,7 +1931,7 @@ void dquot_free_inode(struct inode *inode) struct dquot * const *dquots; int index; - if (!dquot_active(inode)) + if (!inode_dquot_active(inode)) return; dquots = i_dquot(inode); @@ -2103,7 +2103,7 @@ int dquot_transfer(struct mnt_idmap *idmap, struct inode *inode, struct super_block *sb = inode->i_sb; int ret; - if (!dquot_active(inode)) + if (!inode_dquot_active(inode)) return 0; if (i_uid_needs_update(idmap, iattr, inode)) { From patchwork Wed Jun 28 13:21:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Baokun Li X-Patchwork-Id: 13295799 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9DB3FEB64DC for ; Wed, 28 Jun 2023 13:24:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232026AbjF1NYk (ORCPT ); Wed, 28 Jun 2023 09:24:40 -0400 Received: from szxga01-in.huawei.com ([45.249.212.187]:21998 "EHLO szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231879AbjF1NY2 (ORCPT ); Wed, 28 Jun 2023 09:24:28 -0400 Received: from dggpeml500021.china.huawei.com (unknown [172.30.72.56]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4Qrj0G0vmjzlW5m; Wed, 28 Jun 2023 21:21:34 +0800 (CST) Received: from huawei.com (10.175.127.227) by dggpeml500021.china.huawei.com (7.185.36.21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Wed, 28 Jun 2023 21:24:18 +0800 From: Baokun Li To: CC: , , , , , , , Subject: [PATCH v2 4/7] quota: add new helper dquot_active() Date: Wed, 28 Jun 2023 21:21:52 +0800 Message-ID: <20230628132155.1560425-5-libaokun1@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230628132155.1560425-1-libaokun1@huawei.com> References: <20230628132155.1560425-1-libaokun1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To dggpeml500021.china.huawei.com (7.185.36.21) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Add new helper function dquot_active() to make the code more concise. Signed-off-by: Baokun Li --- fs/quota/dquot.c | 23 ++++++++++++++--------- 1 file changed, 14 insertions(+), 9 deletions(-) diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c index b21f5e888482..09787e4f6a00 100644 --- a/fs/quota/dquot.c +++ b/fs/quota/dquot.c @@ -353,6 +353,11 @@ static void wait_on_dquot(struct dquot *dquot) mutex_unlock(&dquot->dq_lock); } +static inline int dquot_active(struct dquot *dquot) +{ + return test_bit(DQ_ACTIVE_B, &dquot->dq_flags); +} + static inline int dquot_dirty(struct dquot *dquot) { return test_bit(DQ_MOD_B, &dquot->dq_flags); @@ -368,14 +373,14 @@ int dquot_mark_dquot_dirty(struct dquot *dquot) { int ret = 1; - if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) + if (!dquot_active(dquot)) return 0; if (sb_dqopt(dquot->dq_sb)->flags & DQUOT_NOLIST_DIRTY) return test_and_set_bit(DQ_MOD_B, &dquot->dq_flags); /* If quota is dirty already, we don't have to acquire dq_list_lock */ - if (test_bit(DQ_MOD_B, &dquot->dq_flags)) + if (dquot_dirty(dquot)) return 1; spin_lock(&dq_list_lock); @@ -457,7 +462,7 @@ int dquot_acquire(struct dquot *dquot) smp_mb__before_atomic(); set_bit(DQ_READ_B, &dquot->dq_flags); /* Instantiate dquot if needed */ - if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags) && !dquot->dq_off) { + if (!dquot_active(dquot) && !dquot->dq_off) { ret = dqopt->ops[dquot->dq_id.type]->commit_dqblk(dquot); /* Write the info if needed */ if (info_dirty(&dqopt->info[dquot->dq_id.type])) { @@ -499,7 +504,7 @@ int dquot_commit(struct dquot *dquot) goto out_lock; /* Inactive dquot can be only if there was error during read/init * => we have better not writing it */ - if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) + if (dquot_active(dquot)) ret = dqopt->ops[dquot->dq_id.type]->commit_dqblk(dquot); else ret = -EIO; @@ -614,7 +619,7 @@ int dquot_scan_active(struct super_block *sb, spin_lock(&dq_list_lock); list_for_each_entry(dquot, &inuse_list, dq_inuse) { - if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) + if (!dquot_active(dquot)) continue; if (dquot->dq_sb != sb) continue; @@ -629,7 +634,7 @@ int dquot_scan_active(struct super_block *sb, * outstanding call and recheck the DQ_ACTIVE_B after that. */ wait_on_dquot(dquot); - if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) { + if (dquot_active(dquot)) { ret = fn(dquot, priv); if (ret < 0) goto out; @@ -680,7 +685,7 @@ int dquot_writeback_dquots(struct super_block *sb, int type) dquot = list_first_entry(&dirty, struct dquot, dq_dirty); - WARN_ON(!test_bit(DQ_ACTIVE_B, &dquot->dq_flags)); + WARN_ON(!dquot_active(dquot)); /* Now we have active dquot from which someone is * holding reference so we can safely just increase @@ -817,7 +822,7 @@ void dqput(struct dquot *dquot) dquot_write_dquot(dquot); goto we_slept; } - if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) { + if (dquot_active(dquot)) { spin_unlock(&dq_list_lock); dquot->dq_sb->dq_op->release_dquot(dquot); goto we_slept; @@ -918,7 +923,7 @@ struct dquot *dqget(struct super_block *sb, struct kqid qid) * already finished or it will be canceled due to dq_count > 1 test */ wait_on_dquot(dquot); /* Read the dquot / allocate space in quota file */ - if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) { + if (!dquot_active(dquot)) { int err; err = sb->dq_op->acquire_dquot(dquot); From patchwork Wed Jun 28 13:21:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Baokun Li X-Patchwork-Id: 13295795 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D909AEB64D7 for ; Wed, 28 Jun 2023 13:24:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231926AbjF1NY2 (ORCPT ); Wed, 28 Jun 2023 09:24:28 -0400 Received: from szxga03-in.huawei.com ([45.249.212.189]:16307 "EHLO szxga03-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231678AbjF1NYV (ORCPT ); Wed, 28 Jun 2023 09:24:21 -0400 Received: from dggpeml500021.china.huawei.com (unknown [172.30.72.53]) by szxga03-in.huawei.com (SkyGuard) with ESMTP id 4Qrj115qDXzLmyW; Wed, 28 Jun 2023 21:22:13 +0800 (CST) Received: from huawei.com (10.175.127.227) by dggpeml500021.china.huawei.com (7.185.36.21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Wed, 28 Jun 2023 21:24:18 +0800 From: Baokun Li To: CC: , , , , , , , Subject: [PATCH v2 5/7] quota: fix dqput() to follow the guarantees dquot_srcu should provide Date: Wed, 28 Jun 2023 21:21:53 +0800 Message-ID: <20230628132155.1560425-6-libaokun1@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230628132155.1560425-1-libaokun1@huawei.com> References: <20230628132155.1560425-1-libaokun1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To dggpeml500021.china.huawei.com (7.185.36.21) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org The dquot_mark_dquot_dirty() using dquot references from the inode should be protected by dquot_srcu. quota_off code takes care to call synchronize_srcu(&dquot_srcu) to not drop dquot references while they are used by other users. But dquot_transfer() breaks this assumption. We call dquot_transfer() to drop the last reference of dquot and add it to free_dquots, but there may still be other users using the dquot at this time, as shown in the function graph below: cpu1 cpu2 _________________|_________________ wb_do_writeback CHOWN(1) ... ext4_da_update_reserve_space dquot_claim_block ... dquot_mark_dquot_dirty // try to dirty old quota test_bit(DQ_ACTIVE_B, &dquot->dq_flags) // still ACTIVE if (test_bit(DQ_MOD_B, &dquot->dq_flags)) // test no dirty, wait dq_list_lock ... dquot_transfer __dquot_transfer dqput_all(transfer_from) // rls old dquot dqput // last dqput dquot_release clear_bit(DQ_ACTIVE_B, &dquot->dq_flags) atomic_dec(&dquot->dq_count) put_dquot_last(dquot) list_add_tail(&dquot->dq_free, &free_dquots) // add the dquot to free_dquots if (!test_and_set_bit(DQ_MOD_B, &dquot->dq_flags)) add dqi_dirty_list // add released dquot to dirty_list This can cause various issues, such as dquot being destroyed by dqcache_shrink_scan() after being added to free_dquots, which can trigger a UAF in dquot_mark_dquot_dirty(); or after dquot is added to free_dquots and then to dirty_list, it is added to free_dquots again after dquot_writeback_dquots() is executed, which causes the free_dquots list to be corrupted and triggers a UAF when dqcache_shrink_scan() is called for freeing dquot twice. As Honza said, we need to fix dquot_transfer() to follow the guarantees dquot_srcu should provide. But calling synchronize_srcu() directly from dquot_transfer() is too expensive (and mostly unnecessary). So we add dquot whose last reference should be dropped to the new global dquot list releasing_dquots, and then queue work item which would call synchronize_srcu() and after that perform the final cleanup of all the dquots on releasing_dquots. Fixes: 4580b30ea887 ("quota: Do not dirty bad dquots") Suggested-by: Jan Kara Signed-off-by: Baokun Li --- fs/quota/dquot.c | 85 ++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 71 insertions(+), 14 deletions(-) diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c index 09787e4f6a00..e8e702ac64e5 100644 --- a/fs/quota/dquot.c +++ b/fs/quota/dquot.c @@ -270,6 +270,9 @@ static qsize_t inode_get_rsv_space(struct inode *inode); static qsize_t __inode_get_rsv_space(struct inode *inode); static int __dquot_initialize(struct inode *inode, int type); +static void quota_release_workfn(struct work_struct *work); +static DECLARE_DELAYED_WORK(quota_release_work, quota_release_workfn); + static inline unsigned int hashfn(const struct super_block *sb, struct kqid qid) { @@ -569,6 +572,8 @@ static void invalidate_dquots(struct super_block *sb, int type) struct dquot *dquot, *tmp; restart: + flush_delayed_work("a_release_work); + spin_lock(&dq_list_lock); list_for_each_entry_safe(dquot, tmp, &inuse_list, dq_inuse) { if (dquot->dq_sb != sb) @@ -577,6 +582,12 @@ static void invalidate_dquots(struct super_block *sb, int type) continue; /* Wait for dquot users */ if (atomic_read(&dquot->dq_count)) { + /* dquot in releasing_dquots, flush and retry */ + if (!list_empty(&dquot->dq_free)) { + spin_unlock(&dq_list_lock); + goto restart; + } + atomic_inc(&dquot->dq_count); spin_unlock(&dq_list_lock); /* @@ -760,6 +771,8 @@ dqcache_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) struct dquot *dquot; unsigned long freed = 0; + flush_delayed_work("a_release_work); + spin_lock(&dq_list_lock); while (!list_empty(&free_dquots) && sc->nr_to_scan) { dquot = list_first_entry(&free_dquots, struct dquot, dq_free); @@ -787,6 +800,60 @@ static struct shrinker dqcache_shrinker = { .seeks = DEFAULT_SEEKS, }; +/* + * Safely release dquot and put reference to dquot. + */ +static void quota_release_workfn(struct work_struct *work) +{ + struct dquot *dquot; + struct list_head rls_head; + + spin_lock(&dq_list_lock); + /* Exchange the list head to avoid livelock. */ + list_replace_init(&releasing_dquots, &rls_head); + spin_unlock(&dq_list_lock); + +restart: + synchronize_srcu(&dquot_srcu); + spin_lock(&dq_list_lock); + while (!list_empty(&rls_head)) { + dquot = list_first_entry(&rls_head, struct dquot, dq_free); + if (dquot_dirty(dquot)) { + spin_unlock(&dq_list_lock); + /* Commit dquot before releasing */ + dquot_write_dquot(dquot); + goto restart; + } + /* Always clear DQ_ACTIVE_B, unless racing with dqget() */ + if (dquot_active(dquot)) { + spin_unlock(&dq_list_lock); + dquot->dq_sb->dq_op->release_dquot(dquot); + spin_lock(&dq_list_lock); + } + /* + * During the execution of dquot_release() outside the + * dq_list_lock, another process may have completed + * dqget()/dqput()/mark_dirty(). + */ + if (atomic_read(&dquot->dq_count) == 1 && + (dquot_active(dquot) || dquot_dirty(dquot))) { + spin_unlock(&dq_list_lock); + goto restart; + } + /* + * Now it is safe to remove this dquot from releasing_dquots + * and reduce its reference count. + */ + remove_free_dquot(dquot); + atomic_dec(&dquot->dq_count); + + /* We may be racing with some other dqget(). */ + if (!atomic_read(&dquot->dq_count)) + put_dquot_last(dquot); + } + spin_unlock(&dq_list_lock); +} + /* * Put reference to dquot */ @@ -803,7 +870,7 @@ void dqput(struct dquot *dquot) } #endif dqstats_inc(DQST_DROPS); -we_slept: + spin_lock(&dq_list_lock); if (atomic_read(&dquot->dq_count) > 1) { /* We have more than one user... nothing to do */ @@ -815,25 +882,15 @@ void dqput(struct dquot *dquot) spin_unlock(&dq_list_lock); return; } + /* Need to release dquot? */ - if (dquot_dirty(dquot)) { - spin_unlock(&dq_list_lock); - /* Commit dquot before releasing */ - dquot_write_dquot(dquot); - goto we_slept; - } - if (dquot_active(dquot)) { - spin_unlock(&dq_list_lock); - dquot->dq_sb->dq_op->release_dquot(dquot); - goto we_slept; - } - atomic_dec(&dquot->dq_count); #ifdef CONFIG_QUOTA_DEBUG /* sanity check */ BUG_ON(!list_empty(&dquot->dq_free)); #endif - put_dquot_last(dquot); + put_releasing_dquots(dquot); spin_unlock(&dq_list_lock); + queue_delayed_work(system_unbound_wq, "a_release_work, 1); } EXPORT_SYMBOL(dqput); From patchwork Wed Jun 28 13:21:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Baokun Li X-Patchwork-Id: 13295796 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4034EEB64DA for ; Wed, 28 Jun 2023 13:24:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231944AbjF1NYa (ORCPT ); Wed, 28 Jun 2023 09:24:30 -0400 Received: from szxga02-in.huawei.com ([45.249.212.188]:44130 "EHLO szxga02-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231825AbjF1NYW (ORCPT ); Wed, 28 Jun 2023 09:24:22 -0400 Received: from dggpeml500021.china.huawei.com (unknown [172.30.72.57]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4Qrhzm3pmnzMpbx; Wed, 28 Jun 2023 21:21:08 +0800 (CST) Received: from huawei.com (10.175.127.227) by dggpeml500021.china.huawei.com (7.185.36.21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Wed, 28 Jun 2023 21:24:19 +0800 From: Baokun Li To: CC: , , , , , , , Subject: [PATCH v2 6/7] quota: simplify drop_dquot_ref() Date: Wed, 28 Jun 2023 21:21:54 +0800 Message-ID: <20230628132155.1560425-7-libaokun1@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230628132155.1560425-1-libaokun1@huawei.com> References: <20230628132155.1560425-1-libaokun1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To dggpeml500021.china.huawei.com (7.185.36.21) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Now when dqput() drops the last reference count, it will call synchronize_srcu(&dquot_srcu) in quota_release_workfn() to ensure that no other user will use the dquot after the last reference count is dropped, so we don't need to call synchronize_srcu(&dquot_srcu) in drop_dquot_ref() and remove the corresponding logic directly to simplify the code. Signed-off-by: Baokun Li --- fs/quota/dquot.c | 33 ++++++--------------------------- 1 file changed, 6 insertions(+), 27 deletions(-) diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c index e8e702ac64e5..df028fb2ce72 100644 --- a/fs/quota/dquot.c +++ b/fs/quota/dquot.c @@ -1090,8 +1090,7 @@ static int add_dquot_ref(struct super_block *sb, int type) * Remove references to dquots from inode and add dquot to list for freeing * if we have the last reference to dquot */ -static void remove_inode_dquot_ref(struct inode *inode, int type, - struct list_head *tofree_head) +static void remove_inode_dquot_ref(struct inode *inode, int type) { struct dquot **dquots = i_dquot(inode); struct dquot *dquot = dquots[type]; @@ -1100,21 +1099,7 @@ static void remove_inode_dquot_ref(struct inode *inode, int type, return; dquots[type] = NULL; - if (list_empty(&dquot->dq_free)) { - /* - * The inode still has reference to dquot so it can't be in the - * free list - */ - spin_lock(&dq_list_lock); - list_add(&dquot->dq_free, tofree_head); - spin_unlock(&dq_list_lock); - } else { - /* - * Dquot is already in a list to put so we won't drop the last - * reference here. - */ - dqput(dquot); - } + dqput(dquot); } /* @@ -1137,8 +1122,7 @@ static void put_dquot_list(struct list_head *tofree_head) } } -static void remove_dquot_ref(struct super_block *sb, int type, - struct list_head *tofree_head) +static void remove_dquot_ref(struct super_block *sb, int type) { struct inode *inode; #ifdef CONFIG_QUOTA_DEBUG @@ -1159,7 +1143,7 @@ static void remove_dquot_ref(struct super_block *sb, int type, if (unlikely(inode_get_rsv_space(inode) > 0)) reserved = 1; #endif - remove_inode_dquot_ref(inode, type, tofree_head); + remove_inode_dquot_ref(inode, type); } spin_unlock(&dq_data_lock); } @@ -1176,13 +1160,8 @@ static void remove_dquot_ref(struct super_block *sb, int type, /* Gather all references from inodes and drop them */ static void drop_dquot_ref(struct super_block *sb, int type) { - LIST_HEAD(tofree_head); - - if (sb->dq_op) { - remove_dquot_ref(sb, type, &tofree_head); - synchronize_srcu(&dquot_srcu); - put_dquot_list(&tofree_head); - } + if (sb->dq_op) + remove_dquot_ref(sb, type); } static inline From patchwork Wed Jun 28 13:21:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Baokun Li X-Patchwork-Id: 13295797 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C7AC6C0015E for ; Wed, 28 Jun 2023 13:24:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231924AbjF1NYd (ORCPT ); Wed, 28 Jun 2023 09:24:33 -0400 Received: from szxga01-in.huawei.com ([45.249.212.187]:21999 "EHLO szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231869AbjF1NYW (ORCPT ); Wed, 28 Jun 2023 09:24:22 -0400 Received: from dggpeml500021.china.huawei.com (unknown [172.30.72.56]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4Qrj0H48qjzlWNs; Wed, 28 Jun 2023 21:21:35 +0800 (CST) Received: from huawei.com (10.175.127.227) by dggpeml500021.china.huawei.com (7.185.36.21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Wed, 28 Jun 2023 21:24:19 +0800 From: Baokun Li To: CC: , , , , , , , Subject: [PATCH v2 7/7] quota: remove unused function put_dquot_list() Date: Wed, 28 Jun 2023 21:21:55 +0800 Message-ID: <20230628132155.1560425-8-libaokun1@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230628132155.1560425-1-libaokun1@huawei.com> References: <20230628132155.1560425-1-libaokun1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To dggpeml500021.china.huawei.com (7.185.36.21) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Now that no one is calling put_dquot_list(), remove it. Signed-off-by: Baokun Li --- fs/quota/dquot.c | 20 -------------------- 1 file changed, 20 deletions(-) diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c index df028fb2ce72..ba0125473be3 100644 --- a/fs/quota/dquot.c +++ b/fs/quota/dquot.c @@ -1102,26 +1102,6 @@ static void remove_inode_dquot_ref(struct inode *inode, int type) dqput(dquot); } -/* - * Free list of dquots - * Dquots are removed from inodes and no new references can be got so we are - * the only ones holding reference - */ -static void put_dquot_list(struct list_head *tofree_head) -{ - struct list_head *act_head; - struct dquot *dquot; - - act_head = tofree_head->next; - while (act_head != tofree_head) { - dquot = list_entry(act_head, struct dquot, dq_free); - act_head = act_head->next; - /* Remove dquot from the list so we won't have problems... */ - list_del_init(&dquot->dq_free); - dqput(dquot); - } -} - static void remove_dquot_ref(struct super_block *sb, int type) { struct inode *inode;