From patchwork Thu Mar 27 14:06:10 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Bottomley X-Patchwork-Id: 14031185 Received: from lamorak.hansenpartnership.com (lamorak.hansenpartnership.com [198.37.111.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3118620CCFD; Thu, 27 Mar 2025 14:15:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.37.111.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743084944; cv=none; b=DfV18fqPQOMlDdHD6+kdt82E7PNcz8kX1XtieP+ivuI8/XMdHPnmxRsiiNxcXb1tKVqem1yCvppOZ79n6QTkZBXoyMYHLWYCyUX+4sZRfQvLxKh/CAb//xwl02UhfE/0t3YLQcsayBh9ExvYWEdX/oQZXjYwaVgslzIZdg/l8xs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743084944; c=relaxed/simple; bh=aK0BXGeEBRrg9wG5GiYW/OtSFnBWJxKTOuBDkNguWr0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=u6fYp4UejD6eB63wTF9w79wMISyqroGmPRDcvSGGX72CcjVdDi9v2eixPnKkjujYXfZ6xeziCZP9MZGQwDxJ9WUxJW6ZV+G5PkZypm2H4tejPbX+QSynexHq6ZeVcL1xVI8E1ni6gNmSVYmBrHnFTMQsB5JAR1tVgUTWOYmN1t0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=HansenPartnership.com; spf=pass smtp.mailfrom=HansenPartnership.com; dkim=pass (1024-bit key) header.d=hansenpartnership.com header.i=@hansenpartnership.com header.b=shRbftR1; arc=none smtp.client-ip=198.37.111.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=HansenPartnership.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=HansenPartnership.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=hansenpartnership.com header.i=@hansenpartnership.com header.b="shRbftR1" DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/simple; d=hansenpartnership.com; s=20151216; t=1743084942; bh=aK0BXGeEBRrg9wG5GiYW/OtSFnBWJxKTOuBDkNguWr0=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References:From; b=shRbftR19ePwWslzn5X95N7WRTkAI72WybEGPuBkqjQX70WrkKGfEnAPG2H4dzCkR iSpIXv3AH1SqJGIiGvbjg0F/zlowfaydppjbb2oeFfGGxpZgWfAuhgp6tuw2IrnJuh NB64Sp8BimDM1/6VUMNQQeqsZixRL38jDzNVnTbA= Received: from lingrow.int.hansenpartnership.com (unknown [153.66.160.227]) by lamorak.hansenpartnership.com (Postfix) with ESMTP id 8F3DD1C0015; Thu, 27 Mar 2025 10:15:41 -0400 (EDT) From: James Bottomley To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Cc: mcgrof@kernel.org, jack@suse.cz, hch@infradead.org, david@fromorbit.com, rafael@kernel.org, djwong@kernel.org, pavel@kernel.org, peterz@infradead.org, mingo@redhat.com, will@kernel.org, boqun.feng@gmail.com Subject: [RFC PATCH 1/4] locking/percpu-rwsem: add freezable alternative to down_read Date: Thu, 27 Mar 2025 10:06:10 -0400 Message-ID: <20250327140613.25178-2-James.Bottomley@HansenPartnership.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250327140613.25178-1-James.Bottomley@HansenPartnership.com> References: <20250327140613.25178-1-James.Bottomley@HansenPartnership.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Percpu-rwsems are used for superblock locking. However, the read percpu-rwsem we take for sb_start_write() on a frozen filesystem must not inhibit the system from suspending or hibernating. That means it needs to wait with TASK_UNINTERRUPTIBLE | TASK_FREEZABLE. Introduce a new percpu_down_read_freezable() that allows us to control whether TASK_FREEZABLE is added to the wait flags. Signed-off-by: James Bottomley --- Since this is an RFC, I've added the percpu-rwsem maintainers for information and guidance, to check whether we're on the right track or whether they would prefer an alternative API. 
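For reference, a minimal user-space sketch of how the wait state is meant to be composed: always TASK_UNINTERRUPTIBLE, with TASK_FREEZABLE or'd in only on request. The two constants are illustrative stand-ins (the real definitions live in include/linux/sched.h) and both function names are hypothetical. The point of the second function is that `|` binds tighter than `?:` in C, so the conditional needs its own parentheses.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the kernel's task-state bits; do not rely
 * on these values, the real ones are in include/linux/sched.h. */
#define TASK_UNINTERRUPTIBLE	0x0002u
#define TASK_FREEZABLE		0x2000u

/* Wait state for a sleeping reader (hypothetical helper):
 * always uninterruptible, freezable only when asked for. */
unsigned int reader_wait_state(bool freeze)
{
	/* Parenthesised because | has higher precedence than ?: */
	return TASK_UNINTERRUPTIBLE | (freeze ? TASK_FREEZABLE : 0);
}

/* How "TASK_UNINTERRUPTIBLE | freeze ? TASK_FREEZABLE : 0" groups when
 * the parentheses are left out; kept only for comparison. The condition
 * is always non-zero, so the result is TASK_FREEZABLE on its own. */
unsigned int reader_wait_state_ungrouped(bool freeze)
{
	return (TASK_UNINTERRUPTIBLE | freeze) ? TASK_FREEZABLE : 0;
}
```

With these stand-in values the first helper yields 0x0002 or 0x2002 depending on the flag, while the ungrouped form yields 0x2000 in both cases and loses the TASK_UNINTERRUPTIBLE bit.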
--- include/linux/percpu-rwsem.h | 20 ++++++++++++++++---- kernel/locking/percpu-rwsem.c | 13 ++++++++----- 2 files changed, 24 insertions(+), 9 deletions(-) diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h index c012df33a9f0..a55fe709b832 100644 --- a/include/linux/percpu-rwsem.h +++ b/include/linux/percpu-rwsem.h @@ -42,9 +42,10 @@ is_static struct percpu_rw_semaphore name = { \ #define DEFINE_STATIC_PERCPU_RWSEM(name) \ __DEFINE_PERCPU_RWSEM(name, static) -extern bool __percpu_down_read(struct percpu_rw_semaphore *, bool); +extern bool __percpu_down_read(struct percpu_rw_semaphore *, bool, bool); -static inline void percpu_down_read(struct percpu_rw_semaphore *sem) +static inline void percpu_down_read_internal(struct percpu_rw_semaphore *sem, + bool freezable) { might_sleep(); @@ -62,7 +63,7 @@ static inline void percpu_down_read(struct percpu_rw_semaphore *sem) if (likely(rcu_sync_is_idle(&sem->rss))) this_cpu_inc(*sem->read_count); else - __percpu_down_read(sem, false); /* Unconditional memory barrier */ + __percpu_down_read(sem, false, freezable); /* Unconditional memory barrier */ /* * The preempt_enable() prevents the compiler from * bleeding the critical section out. 
@@ -70,6 +71,17 @@ static inline void percpu_down_read(struct percpu_rw_semaphore *sem) preempt_enable(); } +static inline void percpu_down_read(struct percpu_rw_semaphore *sem) +{ + percpu_down_read_internal(sem, false); +} + +static inline void percpu_down_read_freezable(struct percpu_rw_semaphore *sem, + bool freeze) +{ + percpu_down_read_internal(sem, freeze); +} + static inline bool percpu_down_read_trylock(struct percpu_rw_semaphore *sem) { bool ret = true; @@ -81,7 +93,7 @@ static inline bool percpu_down_read_trylock(struct percpu_rw_semaphore *sem) if (likely(rcu_sync_is_idle(&sem->rss))) this_cpu_inc(*sem->read_count); else - ret = __percpu_down_read(sem, true); /* Unconditional memory barrier */ + ret = __percpu_down_read(sem, true, false); /* Unconditional memory barrier */ preempt_enable(); /* * The barrier() from preempt_enable() prevents the compiler from diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c index 6083883c4fe0..890837b73476 100644 --- a/kernel/locking/percpu-rwsem.c +++ b/kernel/locking/percpu-rwsem.c @@ -138,7 +138,8 @@ static int percpu_rwsem_wake_function(struct wait_queue_entry *wq_entry, return !reader; /* wake (readers until) 1 writer */ } -static void percpu_rwsem_wait(struct percpu_rw_semaphore *sem, bool reader) +static void percpu_rwsem_wait(struct percpu_rw_semaphore *sem, bool reader, + bool freeze) { DEFINE_WAIT_FUNC(wq_entry, percpu_rwsem_wake_function); bool wait; @@ -156,7 +157,8 @@ static void percpu_rwsem_wait(struct percpu_rw_semaphore *sem, bool reader) spin_unlock_irq(&sem->waiters.lock); while (wait) { - set_current_state(TASK_UNINTERRUPTIBLE); + set_current_state(TASK_UNINTERRUPTIBLE | + (freeze ? 
TASK_FREEZABLE : 0)); if (!smp_load_acquire(&wq_entry.private)) break; schedule(); @@ -164,7 +166,8 @@ static void percpu_rwsem_wait(struct percpu_rw_semaphore *sem, bool reader) __set_current_state(TASK_RUNNING); } -bool __sched __percpu_down_read(struct percpu_rw_semaphore *sem, bool try) +bool __sched __percpu_down_read(struct percpu_rw_semaphore *sem, bool try, + bool freeze) { if (__percpu_down_read_trylock(sem)) return true; @@ -174,7 +177,7 @@ bool __sched __percpu_down_read(struct percpu_rw_semaphore *sem, bool try) trace_contention_begin(sem, LCB_F_PERCPU | LCB_F_READ); preempt_enable(); - percpu_rwsem_wait(sem, /* .reader = */ true); + percpu_rwsem_wait(sem, /* .reader = */ true, freeze); preempt_disable(); trace_contention_end(sem, 0); @@ -237,7 +240,7 @@ void __sched percpu_down_write(struct percpu_rw_semaphore *sem) */ if (!__percpu_down_write_trylock(sem)) { trace_contention_begin(sem, LCB_F_PERCPU | LCB_F_WRITE); - percpu_rwsem_wait(sem, /* .reader = */ false); + percpu_rwsem_wait(sem, /* .reader = */ false, false); contended = true; } From patchwork Thu Mar 27 14:06:11 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Bottomley X-Patchwork-Id: 14031186 Received: from lamorak.hansenpartnership.com (lamorak.hansenpartnership.com [198.37.111.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9051021767D; Thu, 27 Mar 2025 14:16:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.37.111.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743084990; cv=none; b=jhmpCR21sVzUotJXCWXk4KMQs4B4+ggiPASMyJ2hknJubCtHHs+dvF7nxQ8ml3ZTHPDnKYDpzowMcXoW9x+UUBi56wg1p6fnDscXmYfT2LjHBmDdOH5g2PfnmbPyohb81hwEuNXlhq5QVCN64mZ6hvtCnubSyZvcYvXWgW1rZhU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; 
s=arc-20240116; t=1743084990; c=relaxed/simple; bh=EKnWR1cNbnttkTTkCD6l2485+E+pvLEWZlJfUMU3Ysc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ONdTHZhieQJwZVOdMnIGVF0LjUuifEiJmer5zGGuhq1xBzkx0W+Eh1ali/6/xcwYSvSHzgFIlyCqnM6n9mRORdEFdh/gFhHiXOFk0iFsrBmQrlVTiluRqIeAOGzPdMOGGnTENvTYH3k/PRu3L+SV6GtXx43SC0Ym51XhAn8zALw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=HansenPartnership.com; spf=pass smtp.mailfrom=HansenPartnership.com; dkim=pass (1024-bit key) header.d=hansenpartnership.com header.i=@hansenpartnership.com header.b=uDGesOdd; arc=none smtp.client-ip=198.37.111.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=HansenPartnership.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=HansenPartnership.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=hansenpartnership.com header.i=@hansenpartnership.com header.b="uDGesOdd" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=hansenpartnership.com; s=20151216; t=1743084987; bh=EKnWR1cNbnttkTTkCD6l2485+E+pvLEWZlJfUMU3Ysc=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References:From; b=uDGesOddWvKl2/NnfMSRx4I/Ntgt1v/NDKLi+9L75cc5iCdB3NtAxnAHMjqrVXbuc JTo/9v+6S417x1XnLtEHWGOKTjednMmGZNf31RuhCUR85WxJTZQL0ztVTNgeK8leh7 jsOY83r3nM11NfwE/fbkojP7R9Q1gn6FGth8W6ow= Received: from lingrow.int.hansenpartnership.com (unknown [153.66.160.227]) by lamorak.hansenpartnership.com (Postfix) with ESMTP id 0C9DF1C0015; Thu, 27 Mar 2025 10:16:27 -0400 (EDT) From: James Bottomley To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Cc: mcgrof@kernel.org, jack@suse.cz, hch@infradead.org, david@fromorbit.com, rafael@kernel.org, djwong@kernel.org, pavel@kernel.org, peterz@infradead.org, mingo@redhat.com, will@kernel.org, boqun.feng@gmail.com Subject: [RFC PATCH 2/4] vfs: make sb_start_write freezable Date: Thu, 27 Mar 2025 
10:06:11 -0400 Message-ID: <20250327140613.25178-3-James.Bottomley@HansenPartnership.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250327140613.25178-1-James.Bottomley@HansenPartnership.com> References: <20250327140613.25178-1-James.Bottomley@HansenPartnership.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 If a write happens on a frozen filesystem, the s_writers.rw_sem gets stuck in TASK_UNINTERRUPTIBLE and inhibits suspending or hibernating the system. Since we want to freeze filesystems first then tasks, we need this condition not to inhibit suspend/hibernate, which means the wait has to have the TASK_FREEZABLE flag as well. Use the freezable version of percpu-rwsem to ensure this. Signed-off-by: James Bottomley Reviewed-by: Jan Kara --- include/linux/fs.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/include/linux/fs.h b/include/linux/fs.h index dd84d1c3b8af..cbbb704eff74 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1782,7 +1782,8 @@ static inline void __sb_end_write(struct super_block *sb, int level) static inline void __sb_start_write(struct super_block *sb, int level) { - percpu_down_read(sb->s_writers.rw_sem + level - 1); + percpu_down_read_freezable(sb->s_writers.rw_sem + level - 1, + level == SB_FREEZE_WRITE); } static inline bool __sb_start_write_trylock(struct super_block *sb, int level) From patchwork Thu Mar 27 14:06:12 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Bottomley X-Patchwork-Id: 14031187 Received: from lamorak.hansenpartnership.com (lamorak.hansenpartnership.com [198.37.111.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 471D2216399; Thu, 27 Mar 2025 14:17:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; 
arc=none smtp.client-ip=198.37.111.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743085050; cv=none; b=K0QdiujSi8aUqdiaLTAQawkS/Nh20eVWYuXUODXC/zojDa3nB5l0gwmcnKUvWELYCdEOwKnMs+Dz8vOYTZsOOwuWxhg8OUvneT7dETFd4pj2HGhJMBJE7ctqzjLJNXN50c8bK8el+ByiCrSVff/MhitW/vYn7Q52EcXKVi1SkSs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743085050; c=relaxed/simple; bh=yxoa31pGOxJ2FohF6xJlYSCyW0JGAGDAJTDpStaj7Ds=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=UyVHBhaGOIkKWkwt1esb0oUTI3uaRq0FFmd+t6xDarphz+z2TT0C20KwmQJmYe9MBHc8Mj6Q9ST2FajnpkncdsQ/2HzZJVNRjzGTVdLmzKYMloLYYgmk/mFXojnJzYfwhoqN9bWVNllpnA/Zwebvu9njYw5fvZDfrmwimQi4lgg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=HansenPartnership.com; spf=pass smtp.mailfrom=HansenPartnership.com; dkim=pass (1024-bit key) header.d=hansenpartnership.com header.i=@hansenpartnership.com header.b=XvuiNMPn; arc=none smtp.client-ip=198.37.111.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=HansenPartnership.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=HansenPartnership.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=hansenpartnership.com header.i=@hansenpartnership.com header.b="XvuiNMPn" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=hansenpartnership.com; s=20151216; t=1743085048; bh=yxoa31pGOxJ2FohF6xJlYSCyW0JGAGDAJTDpStaj7Ds=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References:From; b=XvuiNMPnkRkL6jE7B5vEs8Jp0ypn4qVIKbrwq+U/W7hbCMcUwHdEGAyK2ongJeVmp 58Cfv4Ui7BD+rRdKniCP4uMnPAmXPKXplQwsTP/q9yA/8or6RTROp5VG/gyxewVL1f CaNWH1v8fEe6NX2uwU5cVlA8CoXAArJUDJCECn18= Received: from lingrow.int.hansenpartnership.com (unknown [153.66.160.227]) by lamorak.hansenpartnership.com (Postfix) with ESMTP id 882421C0015; Thu, 27 Mar 2025 10:17:27 -0400 (EDT) From: James 
Bottomley To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Cc: mcgrof@kernel.org, jack@suse.cz, hch@infradead.org, david@fromorbit.com, rafael@kernel.org, djwong@kernel.org, pavel@kernel.org, peterz@infradead.org, mingo@redhat.com, will@kernel.org, boqun.feng@gmail.com Subject: [RFC PATCH 3/4] fs/super.c: introduce reverse superblock iterator and use it in emergency remount Date: Thu, 27 Mar 2025 10:06:12 -0400 Message-ID: <20250327140613.25178-4-James.Bottomley@HansenPartnership.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250327140613.25178-1-James.Bottomley@HansenPartnership.com> References: <20250327140613.25178-1-James.Bottomley@HansenPartnership.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Originally proposed by Amir as an extract from the android kernel: https://lore.kernel.org/linux-fsdevel/CAA2m6vfatWKS1CQFpaRbii2AXiZFvQUjVvYhGxWTSpz+2rxDyg@mail.gmail.com/ Since suspend/resume requires a reverse iterator, I'm dusting it off. 
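As a rough illustration of the technique (one loop body shared by both directions, with the direction chosen by pasting a suffix onto the iterator's name), here is a user-space sketch. Everything in it is hypothetical: for_each_item and for_each_item_reverse stand in for list_for_each_entry and list_for_each_entry_reverse and walk a plain array rather than a kernel list, and the visit_* helpers exist only for the demonstration.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the forward and reverse list iterators. */
#define for_each_item(i, n)		for ((i) = 0; (i) < (n); (i)++)
#define for_each_item_reverse(i, n)	for ((i) = (n); (i)-- > 0; )

/* One loop body, two directions: "rev" is pasted onto the iterator name.
 * An empty "rev" argument leaves the forward iterator's name unchanged. */
#define VISIT_ITEMS(in, n, out, rev)			\
	do {						\
		size_t i_, k_ = 0;			\
		for_each_item##rev(i_, n)		\
			(out)[k_++] = (in)[i_];		\
	} while (0)

/* Copy in[] to out[] in forward visiting order. */
void visit_forward(const int *in, size_t n, int *out)
{
	VISIT_ITEMS(in, n, out,);
}

/* Copy in[] to out[] in reverse visiting order. */
void visit_reverse(const int *in, size_t n, int *out)
{
	VISIT_ITEMS(in, n, out, _reverse);
}
```

The empty trailing macro argument is valid from C99 onwards; the pasted name is rescanned after concatenation, which is why it still expands as an iterator macro.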
Signed-off-by: James Bottomley --- fs/super.c | 48 +++++++++++++++++++++++++++++------------------- 1 file changed, 29 insertions(+), 19 deletions(-) diff --git a/fs/super.c b/fs/super.c index 5a7db4a556e3..76785509d906 100644 --- a/fs/super.c +++ b/fs/super.c @@ -887,28 +887,38 @@ void drop_super_exclusive(struct super_block *sb) } EXPORT_SYMBOL(drop_super_exclusive); +#define ITERATE_SUPERS(f, rev) \ + struct super_block *sb, *p = NULL; \ + \ + spin_lock(&sb_lock); \ + \ + list_for_each_entry##rev(sb, &super_blocks, s_list) { \ + if (super_flags(sb, SB_DYING)) \ + continue; \ + sb->s_count++; \ + spin_unlock(&sb_lock); \ + \ + f(sb); \ + \ + spin_lock(&sb_lock); \ + if (p) \ + __put_super(p); \ + p = sb; \ + } \ + if (p) \ + __put_super(p); \ + spin_unlock(&sb_lock); + static void __iterate_supers(void (*f)(struct super_block *)) { - struct super_block *sb, *p = NULL; - - spin_lock(&sb_lock); - list_for_each_entry(sb, &super_blocks, s_list) { - if (super_flags(sb, SB_DYING)) - continue; - sb->s_count++; - spin_unlock(&sb_lock); - - f(sb); + ITERATE_SUPERS(f,) +} - spin_lock(&sb_lock); - if (p) - __put_super(p); - p = sb; - } - if (p) - __put_super(p); - spin_unlock(&sb_lock); +static void __iterate_supers_rev(void (*f)(struct super_block *)) +{ + ITERATE_SUPERS(f, _reverse) } + /** * iterate_supers - call function for all active superblocks * @f: function to call @@ -1132,7 +1142,7 @@ static void do_emergency_remount_callback(struct super_block *sb) static void do_emergency_remount(struct work_struct *work) { - __iterate_supers(do_emergency_remount_callback); + __iterate_supers_rev(do_emergency_remount_callback); kfree(work); printk("Emergency Remount complete\n"); } From patchwork Thu Mar 27 14:06:13 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Bottomley X-Patchwork-Id: 14031189 Received: from lamorak.hansenpartnership.com (lamorak.hansenpartnership.com [198.37.111.173]) (using 
TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DD46120F071; Thu, 27 Mar 2025 14:18:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.37.111.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743085102; cv=none; b=a+HY1eDhdeMkd/srD1PDlJieUdGzjyH8YRnajxLYyyRufgvtSR92+OLj2tXAPhPpWFNCraXuV+y/AhMnunZKru+re1P1zIPlFSxivQmEy5VL3hD4cpVPHVJhLI73ZolOrvaJ3Iua8kzcFNy++sxH3/5b/3kfJSx0vd266fZ+ACA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743085102; c=relaxed/simple; bh=HFAUvP0qKmHTS37uLlxwq4DaGeKS/405aLiixg0iZa0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=JMLixA42fx+9e0PkhPm3ABcUAQV53vTf9CoMnbnSBnhpCrt+SeQrhZj5gPQFnBHc11zIE6S8DIQPlxXTqwKXex+1ljvJFvas6kEp61PqKnaKFkpJwbaduNiDce4T2D4GN3q/U6fMmj0jaTpjop7y96tjDsNOs5RDZBKTwhEalWI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=HansenPartnership.com; spf=pass smtp.mailfrom=HansenPartnership.com; dkim=pass (1024-bit key) header.d=hansenpartnership.com header.i=@hansenpartnership.com header.b=My6WhwEC; arc=none smtp.client-ip=198.37.111.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=HansenPartnership.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=HansenPartnership.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=hansenpartnership.com header.i=@hansenpartnership.com header.b="My6WhwEC" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=hansenpartnership.com; s=20151216; t=1743085099; bh=HFAUvP0qKmHTS37uLlxwq4DaGeKS/405aLiixg0iZa0=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References:From; b=My6WhwECfAc9RZve0UUGsNiydEvSgEkuKLSlqjpnxs3rqEsLvnWGzJ0JN/fuoXm74 
jTnH1Fju+h9DBZKap/pjaJaY6c9fakSUfih+DprHEfxVAgIzJEBc/PReU1CfxpynJO ctB4YaNRkLnbtPWa/LJDrQzuccYWhrcvNMIn/iqY= Received: from lingrow.int.hansenpartnership.com (unknown [153.66.160.227]) by lamorak.hansenpartnership.com (Postfix) with ESMTP id 485001C0078; Thu, 27 Mar 2025 10:18:19 -0400 (EDT) From: James Bottomley To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Cc: mcgrof@kernel.org, jack@suse.cz, hch@infradead.org, david@fromorbit.com, rafael@kernel.org, djwong@kernel.org, pavel@kernel.org, peterz@infradead.org, mingo@redhat.com, will@kernel.org, boqun.feng@gmail.com Subject: [RFC PATCH 4/4] vfs: add filesystem freeze/thaw callbacks for power management Date: Thu, 27 Mar 2025 10:06:13 -0400 Message-ID: <20250327140613.25178-5-James.Bottomley@HansenPartnership.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250327140613.25178-1-James.Bottomley@HansenPartnership.com> References: <20250327140613.25178-1-James.Bottomley@HansenPartnership.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Introduce a freeze function that iterates over the superblocks in reverse order, freezing each filesystem. A filesystem is treated as freezable if it has either an s_bdev or a freeze_super method. So that this can be used in efivarfs, whether the freeze is for hibernate is also passed in via the new FREEZE_FOR_HIBERNATE flag. Thawing is done in the opposite order to freezing (so superblock traversal in regular order) and the whole thing is plumbed into power management. The original ksys_sync() is preserved, so the freezing step is optional: if it fails we're no worse off than we are today, and a failure doesn't inhibit suspend/hibernate. 
Signed-off-by: James Bottomley --- fs/super.c | 61 ++++++++++++++++++++++++++++++++++++++++ include/linux/fs.h | 5 ++++ kernel/power/hibernate.c | 12 ++++++++ kernel/power/suspend.c | 4 +++ 4 files changed, 82 insertions(+) diff --git a/fs/super.c b/fs/super.c index 76785509d906..b4b0986414b0 100644 --- a/fs/super.c +++ b/fs/super.c @@ -1461,6 +1461,67 @@ static struct super_block *get_bdev_super(struct block_device *bdev) return sb; } +/* + * Kernel freezing and thawing is only done in the power management + * subsystem and is thus single threaded (so we don't have to worry + * here about multiple calls to filesystems_freeze/thaw()). + */ + +static int freeze_flags; + +static void filesystems_freeze_callback(struct super_block *sb) +{ + /* errors don't fail suspend so ignore them */ + if (sb->s_op->freeze_super) + sb->s_op->freeze_super(sb, FREEZE_MAY_NEST + | FREEZE_HOLDER_KERNEL + | freeze_flags); + else if (sb->s_bdev) + freeze_super(sb, FREEZE_MAY_NEST | FREEZE_HOLDER_KERNEL + | freeze_flags); + else { + pr_info("Ignoring filesystem %s\n", sb->s_type->name); + return; + } + + pr_info("frozen %s, now syncing block ...\n", sb->s_type->name); + sync_blockdev(sb->s_bdev); + pr_info("done.\n"); +} + +/** + * filesystems_freeze - freeze callback for power management + * + * Freeze all active filesystems (in reverse superblock order) + */ +void filesystems_freeze(bool for_hibernate) +{ + freeze_flags = for_hibernate ? 
FREEZE_FOR_HIBERNATE : 0; + __iterate_supers_rev(filesystems_freeze_callback); +} + +static void filesystems_thaw_callback(struct super_block *sb) +{ + if (sb->s_op->thaw_super) + sb->s_op->thaw_super(sb, FREEZE_MAY_NEST + | FREEZE_HOLDER_KERNEL + | freeze_flags); + else if (sb->s_bdev) + thaw_super(sb, FREEZE_MAY_NEST | FREEZE_HOLDER_KERNEL + | freeze_flags); +} + +/** + * filesystems_thaw - thaw callback for power management + * + * Thaw all active filesystems (in forward superblock order) + */ +void filesystems_thaw(bool for_hibernate) +{ + freeze_flags = for_hibernate ? FREEZE_FOR_HIBERNATE : 0; + __iterate_supers(filesystems_thaw_callback); +} + /** * fs_bdev_freeze - freeze owning filesystem of block device * @bdev: block device diff --git a/include/linux/fs.h b/include/linux/fs.h index cbbb704eff74..de154e9379ec 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2272,6 +2272,7 @@ extern loff_t vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos, * @FREEZE_HOLDER_KERNEL: kernel wants to freeze or thaw filesystem * @FREEZE_HOLDER_USERSPACE: userspace wants to freeze or thaw filesystem * @FREEZE_MAY_NEST: whether nesting freeze and thaw requests is allowed + * @FREEZE_FOR_HIBERNATE: set if freeze is from power management hibernate * * Indicate who the owner of the freeze or thaw request is and whether * the freeze needs to be exclusive or can nest. 
@@ -2285,6 +2286,7 @@ enum freeze_holder { FREEZE_HOLDER_KERNEL = (1U << 0), FREEZE_HOLDER_USERSPACE = (1U << 1), FREEZE_MAY_NEST = (1U << 2), + FREEZE_FOR_HIBERNATE = (1U << 3), }; struct super_operations { @@ -3919,4 +3921,7 @@ static inline bool vfs_empty_path(int dfd, const char __user *path) int generic_atomic_write_valid(struct kiocb *iocb, struct iov_iter *iter); +void filesystems_freeze(bool for_hibernate); +void filesystems_thaw(bool for_hibernate); + #endif /* _LINUX_FS_H */ diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c index 10a01af63a80..fc2106e6685a 100644 --- a/kernel/power/hibernate.c +++ b/kernel/power/hibernate.c @@ -778,7 +778,12 @@ int hibernate(void) ksys_sync_helper(); + pr_info("about to freeze filesystems\n"); + filesystems_freeze(true); + pr_info("filesystem freeze done\n"); + error = freeze_processes(); + pr_info("process freeze done\n"); if (error) goto Exit; @@ -788,7 +793,9 @@ int hibernate(void) if (error) goto Thaw; + pr_info("About to create snapshot\n"); error = hibernation_snapshot(hibernation_mode == HIBERNATION_PLATFORM); + pr_info("snapshot done\n"); if (error || freezer_test_done) goto Free_bitmaps; @@ -842,6 +849,8 @@ int hibernate(void) } thaw_processes(); + filesystems_thaw(true); + /* Don't bother checking whether freezer_test_done is true */ freezer_test_done = false; Exit: @@ -939,6 +948,8 @@ int hibernate_quiet_exec(int (*func)(void *data), void *data) thaw_processes(); + filesystems_thaw(true); + exit: pm_notifier_call_chain(PM_POST_HIBERNATION); @@ -1041,6 +1052,7 @@ static int software_resume(void) error = load_image_and_restore(); thaw_processes(); + filesystems_thaw(true); Finish: pm_notifier_call_chain(PM_POST_RESTORE); Restore: diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c index 09f8397bae15..34cc5b0c408c 100644 --- a/kernel/power/suspend.c +++ b/kernel/power/suspend.c @@ -544,6 +544,7 @@ int suspend_devices_and_enter(suspend_state_t state) static void suspend_finish(void) { 
suspend_thaw_processes(); + filesystems_thaw(false); pm_notifier_call_chain(PM_POST_SUSPEND); pm_restore_console(); } @@ -581,6 +582,7 @@ static int enter_state(suspend_state_t state) trace_suspend_resume(TPS("sync_filesystems"), 0, true); ksys_sync_helper(); trace_suspend_resume(TPS("sync_filesystems"), 0, false); + filesystems_freeze(false); } pm_pr_dbg("Preparing system for sleep (%s)\n", mem_sleep_labels[state]); @@ -603,6 +605,8 @@ static int enter_state(suspend_state_t state) pm_pr_dbg("Finishing wakeup.\n"); suspend_finish(); Unlock: + if (sync_on_suspend_enabled) + filesystems_thaw(false); mutex_unlock(&system_transition_mutex); return error; }
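
To make the ordering described for patch 4/4 concrete, here is a small user-space model: freeze walks the superblocks from last to first and skips any that have neither a freeze method nor a block device, and thaw walks them from first to last. It is a sketch under stated assumptions, not kernel code. struct model_sb and all three functions are hypothetical, and a plain array stands in for the super_blocks list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much-reduced model of a superblock (not the kernel's). */
struct model_sb {
	bool has_freeze_op;	/* stands in for sb->s_op->freeze_super */
	bool has_bdev;		/* stands in for sb->s_bdev */
	bool frozen;
};

/* Freezable if either indicator is present. */
bool model_sb_freezable(const struct model_sb *sb)
{
	return sb->has_freeze_op || sb->has_bdev;
}

/* Freeze from the last superblock to the first, recording the indices
 * visited in order[]; returns how many filesystems were frozen. */
size_t model_filesystems_freeze(struct model_sb *sbs, size_t n, size_t *order)
{
	size_t i, count = 0;

	for (i = n; i-- > 0; ) {
		if (!model_sb_freezable(&sbs[i]))
			continue;	/* neither indicator: ignored */
		sbs[i].frozen = true;
		order[count++] = i;
	}
	return count;
}

/* Thaw from the first superblock to the last, recording the indices
 * visited in order[]; returns how many filesystems were thawed. */
size_t model_filesystems_thaw(struct model_sb *sbs, size_t n, size_t *order)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		if (!model_sb_freezable(&sbs[i]))
			continue;
		sbs[i].frozen = false;
		order[count++] = i;
	}
	return count;
}
```

For three superblocks where index 0 has a block device, index 1 has neither indicator and index 2 has only a freeze method, the model freezes in the order 2, 0 and thaws in the order 0, 2; index 1 is never touched.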