Message ID | 20230531125535.676098-5-hch@lst.de (mailing list archive) |
---|---|
State | Handled Elsewhere, archived |
Series | [01/24] driver core: return bool from driver_probe_done |
On 5/31/23 14:55, Christoph Hellwig wrote:
> software_resume can be called either from an init call in the boot code,
> or from sysfs once the system has finished booting, and the two
> invocation methods this can't race with each other.
>
> For the latter case we did just parse the suspend device manually, while
> the former might not have one. Split software_resume so that the search
> only happens for the boot case, which also means the special lockdep
> nesting annotation can go away as the system transition mutex can be
> taken a little later and doesn't have the sysfs locking nest inside it.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Acked-by: Rafael J. Wysocki <rafael@kernel.org>

This caused a regression for me in 6.5-rc1+, fix below.

----8<----
From 95a310ae6cfae9b3cab61e54a1bce488c3ab93a1 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@suse.cz>
Date: Wed, 2 Aug 2023 15:46:18 +0200
Subject: [PATCH] PM: hibernate: fix resume_store() return value when hibernation not available

On a laptop with hibernation set up but not actively used, and with
secure boot and lockdown enabled kernel, 6.5-rc1 gets stuck on boot with
the following repeated messages:

  A start job is running for Resume from hibernation using device /dev/system/swap (24s / no limit)
  lockdown_is_locked_down: 25311154 callbacks suppressed
  Lockdown: systemd-hiberna: hibernation is restricted; see man kernel_lockdown.7
  ...

Checking the resume code leads to commit cc89c63e2fe3 ("PM: hibernate:
move finding the resume device out of software_resume") which
inadvertently changed the return value from resume_store() to 0 when
!hibernation_available(). This apparently translates to userspace
write() returning 0 as in number of bytes written, and userspace looping
indefinitely in the attempt to write the intended value.

Fix this by returning the full number of bytes that were to be written,
as that's what was done before the commit.

Fixes: cc89c63e2fe3 ("PM: hibernate: move finding the resume device out of software_resume")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 kernel/power/hibernate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index e1b4bfa938dd..2b4a946a6ff5 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -1166,7 +1166,7 @@ static ssize_t resume_store(struct kobject *kobj, struct kobj_attribute *attr,
 	int error;
 
 	if (!hibernation_available())
-		return 0;
+		return n;
 
 	if (len && buf[len-1] == '\n')
 		len--;
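
The hang described in the commit message can be illustrated with a small userspace sketch. It is not taken from systemd or from the kernel: `write_all()`, `store_returns_zero()`, `store_returns_n()` and the `max_attempts` cap are hypothetical names introduced here, and the loop only assumes the common pattern of retrying a write until every byte has been accepted. The cap exists solely so the demonstration terminates; a real caller without one would spin, which would match the "no limit" start job shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for a sysfs store handler: takes a buffer of
 * n bytes and reports how many bytes it consumed. Not kernel code. */
typedef ssize_t (*store_fn)(const char *buf, size_t n);

/* Mimics the regressed behaviour: bail out early and report 0 bytes. */
static ssize_t store_returns_zero(const char *buf, size_t n)
{
	(void)buf;
	(void)n;
	return 0;
}

/* Mimics the fixed behaviour: bail out early but report all n bytes. */
static ssize_t store_returns_n(const char *buf, size_t n)
{
	(void)buf;
	return (ssize_t)n;
}

/*
 * Generic "write everything" loop, assumed here as the caller's pattern.
 * Returns the number of attempts needed, or -1 on error or when
 * max_attempts is exhausted without all bytes being accepted.
 */
static int write_all(store_fn store, const char *buf, size_t len,
		     int max_attempts)
{
	size_t done = 0;
	int attempts = 0;

	assert(store != NULL);

	while (done < len) {
		ssize_t ret;

		if (attempts == max_attempts)
			return -1;	/* would loop forever without the cap */

		ret = store(buf + done, len - done);
		attempts++;
		if (ret < 0)
			return -1;
		done += (size_t)ret;	/* a 0 return makes no progress */
	}
	return attempts;
}
```

With these stand-ins, `write_all(store_returns_n, "254:1\n", 6, 1000)` completes after a single attempt, while the same call with `store_returns_zero` makes no progress and gives up only because of the artificial cap. The device string is an arbitrary example value.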
Looks good, thanks!
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Fri, Aug 04, 2023 at 12:31:01PM +0200, Christoph Hellwig wrote:
> Looks good, thanks!
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
>

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few templates
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 03.08.23 10:27, Vlastimil Babka wrote:
> On 5/31/23 14:55, Christoph Hellwig wrote:
>> software_resume can be called either from an init call in the boot code,
>> or from sysfs once the system has finished booting, and the two
>> invocation methods this can't race with each other.
>>
>> For the latter case we did just parse the suspend device manually, while
>> the former might not have one. Split software_resume so that the search
>> only happens for the boot case, which also means the special lockdep
>> nesting annotation can go away as the system transition mutex can be
>> taken a little later and doesn't have the sysfs locking nest inside it.
>>
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>> Acked-by: Rafael J. Wysocki <rafael@kernel.org>
>
> This caused a regression for me in 6.5-rc1+, fix below.
>
> ----8<----
> From 95a310ae6cfae9b3cab61e54a1bce488c3ab93a1 Mon Sep 17 00:00:00 2001
> From: Vlastimil Babka <vbabka@suse.cz>
> Date: Wed, 2 Aug 2023 15:46:18 +0200
> Subject: [PATCH] PM: hibernate: fix resume_store() return value when
>  hibernation not available
>
> On a laptop with hibernation set up but not actively used, and with
> secure boot and lockdown enabled kernel, 6.5-rc1 gets stuck on boot with
> the following repeated messages:
>
>   A start job is running for Resume from hibernation using device /dev/system/swap (24s / no limit)
>   lockdown_is_locked_down: 25311154 callbacks suppressed
>   Lockdown: systemd-hiberna: hibernation is restricted; see man kernel_lockdown.7
>   ...
>
> Checking the resume code leads to commit cc89c63e2fe3 ("PM: hibernate:
> move finding the resume device out of software_resume") which
> inadvertently changed the return value from resume_store() to 0 when
> !hibernation_available(). This apparently translates to userspace
> write() returning 0 as in number of bytes written, and userspace looping
> indefinitely in the attempt to write the intended value.
>
> Fix this by returning the full number of bytes that were to be written,
> as that's what was done before the commit.
>
> Fixes: cc89c63e2fe3 ("PM: hibernate: move finding the resume device out of software_resume")
> [...]

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced cc89c63e2fe3
#regzbot title pm: boot problems when hibernate is configured and kernel locked down
#regzbot fix: PM: hibernate: fix resume_store() return value when hibernation not available
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index 78696aa04f5ca3..45e24b02cd50b6 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -907,7 +907,7 @@ int hibernate_quiet_exec(int (*func)(void *data), void *data)
 }
 EXPORT_SYMBOL_GPL(hibernate_quiet_exec);
 
-static int find_resume_device(void)
+static int __init find_resume_device(void)
 {
 	if (!strlen(resume_file))
 		return -ENOENT;
@@ -942,53 +942,16 @@ static int find_resume_device(void)
 	return 0;
 }
 
-/**
- * software_resume - Resume from a saved hibernation image.
- *
- * This routine is called as a late initcall, when all devices have been
- * discovered and initialized already.
- *
- * The image reading code is called to see if there is a hibernation image
- * available for reading. If that is the case, devices are quiesced and the
- * contents of memory is restored from the saved image.
- *
- * If this is successful, control reappears in the restored target kernel in
- * hibernation_snapshot() which returns to hibernate(). Otherwise, the routine
- * attempts to recover gracefully and make the kernel return to the normal mode
- * of operation.
- */
 static int software_resume(void)
 {
 	int error;
 
-	/*
-	 * If the user said "noresume".. bail out early.
-	 */
-	if (noresume || !hibernation_available())
-		return 0;
-
-	/*
-	 * name_to_dev_t() below takes a sysfs buffer mutex when sysfs
-	 * is configured into the kernel. Since the regular hibernate
-	 * trigger path is via sysfs which takes a buffer mutex before
-	 * calling hibernate functions (which take system_transition_mutex)
-	 * this can cause lockdep to complain about a possible ABBA deadlock
-	 * which cannot happen since we're in the boot code here and
-	 * sysfs can't be invoked yet. Therefore, we use a subclass
-	 * here to avoid lockdep complaining.
-	 */
-	mutex_lock_nested(&system_transition_mutex, SINGLE_DEPTH_NESTING);
-
-	if (!swsusp_resume_device) {
-		error = find_resume_device();
-		if (error)
-			goto Unlock;
-	}
-
 	pm_pr_dbg("Hibernation image partition %d:%d present\n",
 		MAJOR(swsusp_resume_device), MINOR(swsusp_resume_device));
 
 	pm_pr_dbg("Looking for hibernation image.\n");
+
+	mutex_lock(&system_transition_mutex);
 	error = swsusp_check(false);
 	if (error)
 		goto Unlock;
@@ -1035,7 +998,39 @@ static int software_resume(void)
 	goto Finish;
 }
 
-late_initcall_sync(software_resume);
+/**
+ * software_resume_initcall - Resume from a saved hibernation image.
+ *
+ * This routine is called as a late initcall, when all devices have been
+ * discovered and initialized already.
+ *
+ * The image reading code is called to see if there is a hibernation image
+ * available for reading. If that is the case, devices are quiesced and the
+ * contents of memory is restored from the saved image.
+ *
+ * If this is successful, control reappears in the restored target kernel in
+ * hibernation_snapshot() which returns to hibernate(). Otherwise, the routine
+ * attempts to recover gracefully and make the kernel return to the normal mode
+ * of operation.
+ */
+static int __init software_resume_initcall(void)
+{
+	/*
+	 * If the user said "noresume".. bail out early.
+	 */
+	if (noresume || !hibernation_available())
+		return 0;
+
+	if (!swsusp_resume_device) {
+		int error = find_resume_device();
+
+		if (error)
+			return error;
+	}
+
+	return software_resume();
+}
+late_initcall_sync(software_resume_initcall);
 
 
 static const char * const hibernation_modes[] = {
@@ -1176,6 +1171,9 @@ static ssize_t resume_store(struct kobject *kobj, struct kobj_attribute *attr,
 	char *name;
 	dev_t res;
 
+	if (!hibernation_available())
+		return 0;
+
 	if (len && buf[len-1] == '\n')
 		len--;
 	name = kstrndup(buf, len, GFP_KERNEL);