Message ID | 1440429203-4039-1-git-send-email-pmladek@suse.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
On Mon, 24 Aug 2015 17:13:23 +0200 Petr Mladek <pmladek@suse.com> wrote: > The commit 1dfb059b9438633b0546 ("thp: reduce khugepaged freezing > latency") fixed khugepaged to do not block a system suspend. But > the result is that it could not get interrupted before the given > timeout because the condition for the wait event is "false". What are the userspace-visible effects of this bug? > This patch puts back the original approach but it uses > freezable_schedule_timeout_interruptible() instead of > schedule_timeout_interruptible(). It does the right thing. > I am pretty sure that the freezable variant was not used in > the original fix only because it was not available at that time. > > The regression has been there for ages. It was not critical. It just > did the allocation throttling a little bit more aggressively. > > I found this problem when converting the kthread to kthread worker API > and trying to understand the code. > > ... -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon 2015-08-24 13:30:43, Andrew Morton wrote: > On Mon, 24 Aug 2015 17:13:23 +0200 Petr Mladek <pmladek@suse.com> wrote: > > > The commit 1dfb059b9438633b0546 ("thp: reduce khugepaged freezing > > latency") fixed khugepaged to do not block a system suspend. But > > the result is that it could not get interrupted before the given > > timeout because the condition for the wait event is "false". > > What are the userspace-visible effects of this bug? I believe that the change will not make any visible difference. It is just a bit cleaner code. If I get it correctly. This function is called when the daemon is not able to allocate any new huge page. It is used to throttle the attempts. Then the thread is waken in the following situations: + when user modifies "alloc_sleep" or "scan_sleep" from sysfs; this is rare + in __khugepaged_enter() when there is a new page to scan and the list was empty before. This is because the same waitqueue is used to wait between scans. IMHO, it is kind of bug to mix these two things. But I guess that this wake is rare as well. Also I guess that it will be solved by Vlastimil's rework. + when the kthread is stopped; this is the only place when it could make a visible difference if the sleep is longer; but this is rare situation as well Best Regards, Petr > > This patch puts back the original approach but it uses > > freezable_schedule_timeout_interruptible() instead of > > schedule_timeout_interruptible(). It does the right thing. > > I am pretty sure that the freezable variant was not used in > > the original fix only because it was not available at that time. > > > > The regression has been there for ages. It was not critical. It just > > did the allocation throttling a little bit more aggressively. > > > > I found this problem when converting the kthread to kthread worker API > > and trying to understand the code. > > > > ... -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/24/2015 05:13 PM, Petr Mladek wrote: > The commit 1dfb059b9438633b0546 ("thp: reduce khugepaged freezing > latency") fixed khugepaged to do not block a system suspend. But > the result is that it could not get interrupted before the given > timeout because the condition for the wait event is "false". > > This patch puts back the original approach but it uses > freezable_schedule_timeout_interruptible() instead of > schedule_timeout_interruptible(). It does the right thing. > I am pretty sure that the freezable variant was not used in > the original fix only because it was not available at that time. > > The regression has been there for ages. It was not critical. It just > did the allocation throttling a little bit more aggressively. > > I found this problem when converting the kthread to kthread worker API > and trying to understand the code. > > Signed-off-by: Petr Mladek <pmladek@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> > --- > mm/huge_memory.c | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index 7109330c5911..eb115aaa429c 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -2368,8 +2368,12 @@ static void __collapse_huge_page_copy(pte_t *pte, struct page *page, > > static void khugepaged_alloc_sleep(void) > { > - wait_event_freezable_timeout(khugepaged_wait, false, > - msecs_to_jiffies(khugepaged_alloc_sleep_millisecs)); > + DEFINE_WAIT(wait); > + > + add_wait_queue(&khugepaged_wait, &wait); > + freezable_schedule_timeout_interruptible( > + msecs_to_jiffies(khugepaged_alloc_sleep_millisecs)); > + remove_wait_queue(&khugepaged_wait, &wait); > } > > static int khugepaged_node_load[MAX_NUMNODES]; > -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/25/2015 10:42 AM, Petr Mladek wrote: > On Mon 2015-08-24 13:30:43, Andrew Morton wrote: >> On Mon, 24 Aug 2015 17:13:23 +0200 Petr Mladek <pmladek@suse.com> wrote: >> >>> The commit 1dfb059b9438633b0546 ("thp: reduce khugepaged freezing >>> latency") fixed khugepaged to do not block a system suspend. But >>> the result is that it could not get interrupted before the given >>> timeout because the condition for the wait event is "false". >> >> What are the userspace-visible effects of this bug? > > I believe that the change will not make any visible difference. It > is just a bit cleaner code. > > If I get it correctly. This function is called when the daemon > is not able to allocate any new huge page. It is used to throttle the > attempts. Then the thread is waken in the following situations: > > + when user modifies "alloc_sleep" or "scan_sleep" from sysfs; > this is rare I guess somebody could set a high alloc_sleep value by mistake, and then try to fix it back, but khugepaged would keep sleeping until the high value expires. > + in __khugepaged_enter() when there is a new page to scan and > the list was empty before. This is because the same waitqueue > is used to wait between scans. IMHO, it is kind of bug to mix > these two things. But I guess that this wake is rare as well. > Also I guess that it will be solved by Vlastimil's rework. Yeah this shouldn't matter much. > + when the kthread is stopped; this is the only place when it could > make a visible difference if the sleep is longer; but this is > rare situation as well If this is what happens on shutdown, it could be indeed annoying to wait 30 seconds (but I don't know if that's how shutdown works?). > > Best Regards, > Petr > >>> This patch puts back the original approach but it uses >>> freezable_schedule_timeout_interruptible() instead of >>> schedule_timeout_interruptible(). It does the right thing. >>> I am pretty sure that the freezable variant was not used in >>> the original fix only because it was not available at that time. >>> >>> The regression has been there for ages. It was not critical. It just >>> did the allocation throttling a little bit more aggressively. >>> >>> I found this problem when converting the kthread to kthread worker API >>> and trying to understand the code. >>> >>> ... -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 7109330c5911..eb115aaa429c 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2368,8 +2368,12 @@ static void __collapse_huge_page_copy(pte_t *pte, struct page *page, static void khugepaged_alloc_sleep(void) { - wait_event_freezable_timeout(khugepaged_wait, false, - msecs_to_jiffies(khugepaged_alloc_sleep_millisecs)); + DEFINE_WAIT(wait); + + add_wait_queue(&khugepaged_wait, &wait); + freezable_schedule_timeout_interruptible( + msecs_to_jiffies(khugepaged_alloc_sleep_millisecs)); + remove_wait_queue(&khugepaged_wait, &wait); } static int khugepaged_node_load[MAX_NUMNODES];
The commit 1dfb059b9438633b0546 ("thp: reduce khugepaged freezing latency") fixed khugepaged to do not block a system suspend. But the result is that it could not get interrupted before the given timeout because the condition for the wait event is "false". This patch puts back the original approach but it uses freezable_schedule_timeout_interruptible() instead of schedule_timeout_interruptible(). It does the right thing. I am pretty sure that the freezable variant was not used in the original fix only because it was not available at that time. The regression has been there for ages. It was not critical. It just did the allocation throttling a little bit more aggressively. I found this problem when converting the kthread to kthread worker API and trying to understand the code. Signed-off-by: Petr Mladek <pmladek@suse.com> --- mm/huge_memory.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)