mm/khugepaged: Allow to interrupt allocation sleep again

Message ID 1440429203-4039-1-git-send-email-pmladek@suse.com (mailing list archive)
State Not Applicable, archived

Commit Message

Petr Mladek Aug. 24, 2015, 3:13 p.m. UTC
Commit 1dfb059b9438633b0546 ("thp: reduce khugepaged freezing
latency") fixed khugepaged so that it does not block system suspend.
As a side effect, the sleep can no longer be interrupted before the
given timeout expires, because the condition passed to the wait event
is the constant "false".

This patch restores the original approach but uses
freezable_schedule_timeout_interruptible() instead of
schedule_timeout_interruptible(). The freezable variant still lets
the freezer proceed during suspend, while a wake-up on the wait queue
can end the sleep early. I am pretty sure that the freezable variant
was not used in the original fix only because it was not available at
that time.

The regression has been there for ages. It was not critical. It just
did the allocation throttling a little bit more aggressively.

I found this problem while converting the kthread to the kthread worker
API and trying to understand the code.

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 mm/huge_memory.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)
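
The difference between the two sleeping styles described in the commit
message can be modelled in userspace. The sketch below is not kernel code
and is only an illustration under simplifying assumptions: time is an
integer tick counter, wake-ups are given as a sorted array of ticks, and
signals and freezing are ignored. The function names
sleep_with_false_condition() and sleep_until_first_wakeup() are
hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model only, not kernel code. Time is an integer tick counter
 * starting at 0. wakeups[] holds the ticks (in ascending order) at which
 * something wakes the wait queue. Both functions return the tick at which
 * the sleeper resumes. All names are hypothetical.
 */

/* Models wait_event_freezable_timeout(wq, false, timeout). */
static long sleep_with_false_condition(const long *wakeups, size_t n,
				       long timeout)
{
	long remaining = timeout;
	long now = 0;

	assert(timeout >= 0);
	for (size_t i = 0; i < n && wakeups[i] < timeout; i++) {
		/* Woken early: account for the time slept so far. */
		remaining -= wakeups[i] - now;
		now = wakeups[i];
		/* The condition is the constant "false", so sleep again. */
	}
	/* Only the expiry of the remaining timeout ends the wait. */
	return now + remaining;
}

/* Models add_wait_queue() + freezable_schedule_timeout_interruptible(). */
static long sleep_until_first_wakeup(const long *wakeups, size_t n,
				     long timeout)
{
	assert(timeout >= 0);
	if (n > 0 && wakeups[0] < timeout)
		return wakeups[0];	/* the first wake-up ends the sleep */
	return timeout;			/* nobody woke us: full timeout */
}
```

For example, with wake-ups at ticks 5 and 20 and a timeout of 60 ticks,
the first function returns 60 and the second returns 5.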

Comments

Andrew Morton Aug. 24, 2015, 8:30 p.m. UTC | #1
On Mon, 24 Aug 2015 17:13:23 +0200 Petr Mladek <pmladek@suse.com> wrote:

> Commit 1dfb059b9438633b0546 ("thp: reduce khugepaged freezing
> latency") fixed khugepaged so that it does not block system suspend.
> As a side effect, the sleep can no longer be interrupted before the
> given timeout expires, because the condition passed to the wait event
> is the constant "false".

What are the userspace-visible effects of this bug?

> This patch puts back the original approach but it uses
> freezable_schedule_timeout_interruptible() instead of
> schedule_timeout_interruptible(). It does the right thing.
> I am pretty sure that the freezable variant was not used in
> the original fix only because it was not available at that time.
> 
> The regression has been there for ages. It was not critical. It just
> did the allocation throttling a little bit more aggressively.
> 
> I found this problem when converting the kthread to kthread worker API
> and trying to understand the code.
> 
> ...
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Petr Mladek Aug. 25, 2015, 8:42 a.m. UTC | #2
On Mon 2015-08-24 13:30:43, Andrew Morton wrote:
> On Mon, 24 Aug 2015 17:13:23 +0200 Petr Mladek <pmladek@suse.com> wrote:
> 
> > Commit 1dfb059b9438633b0546 ("thp: reduce khugepaged freezing
> > latency") fixed khugepaged so that it does not block system suspend.
> > As a side effect, the sleep can no longer be interrupted before the
> > given timeout expires, because the condition passed to the wait event
> > is the constant "false".
> 
> What are the userspace-visible effects of this bug?

I believe that the change will not make any visible difference. It
is just a bit cleaner code.

If I understand it correctly, this function is called when the daemon
is not able to allocate a new huge page. It is used to throttle the
attempts. The thread is then woken up in the following situations:

   + when the user modifies "alloc_sleep" or "scan_sleep" via sysfs;
     this is rare

   + in __khugepaged_enter() when there is a new page to scan and
     the list was empty before. This is because the same waitqueue
     is used to wait between scans. IMHO, it is a kind of bug to mix
     these two things. But I guess that this wake-up is rare as well.
     Also I guess that it will be solved by Vlastimil's rework.

   + when the kthread is stopped; this is the only place where a
     longer sleep could make a visible difference; but this is
     a rare situation as well

Best Regards,
Petr

> > This patch puts back the original approach but it uses
> > freezable_schedule_timeout_interruptible() instead of
> > schedule_timeout_interruptible(). It does the right thing.
> > I am pretty sure that the freezable variant was not used in
> > the original fix only because it was not available at that time.
> > 
> > The regression has been there for ages. It was not critical. It just
> > did the allocation throttling a little bit more aggressively.
> > 
> > I found this problem when converting the kthread to kthread worker API
> > and trying to understand the code.
> > 
> > ...
Vlastimil Babka Aug. 25, 2015, 9:08 a.m. UTC | #3
On 08/24/2015 05:13 PM, Petr Mladek wrote:
> Commit 1dfb059b9438633b0546 ("thp: reduce khugepaged freezing
> latency") fixed khugepaged so that it does not block system suspend.
> As a side effect, the sleep can no longer be interrupted before the
> given timeout expires, because the condition passed to the wait event
> is the constant "false".
>
> This patch puts back the original approach but it uses
> freezable_schedule_timeout_interruptible() instead of
> schedule_timeout_interruptible(). It does the right thing.
> I am pretty sure that the freezable variant was not used in
> the original fix only because it was not available at that time.
>
> The regression has been there for ages. It was not critical. It just
> did the allocation throttling a little bit more aggressively.
>
> I found this problem when converting the kthread to kthread worker API
> and trying to understand the code.
>
> Signed-off-by: Petr Mladek <pmladek@suse.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>   mm/huge_memory.c | 8 ++++++--
>   1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 7109330c5911..eb115aaa429c 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2368,8 +2368,12 @@ static void __collapse_huge_page_copy(pte_t *pte, struct page *page,
>
>   static void khugepaged_alloc_sleep(void)
>   {
> -	wait_event_freezable_timeout(khugepaged_wait, false,
> -			msecs_to_jiffies(khugepaged_alloc_sleep_millisecs));
> +	DEFINE_WAIT(wait);
> +
> +	add_wait_queue(&khugepaged_wait, &wait);
> +	freezable_schedule_timeout_interruptible(
> +		msecs_to_jiffies(khugepaged_alloc_sleep_millisecs));
> +	remove_wait_queue(&khugepaged_wait, &wait);
>   }
>
>   static int khugepaged_node_load[MAX_NUMNODES];
>

Vlastimil Babka Aug. 25, 2015, 9:12 a.m. UTC | #4
On 08/25/2015 10:42 AM, Petr Mladek wrote:
> On Mon 2015-08-24 13:30:43, Andrew Morton wrote:
>> On Mon, 24 Aug 2015 17:13:23 +0200 Petr Mladek <pmladek@suse.com> wrote:
>>
>>> Commit 1dfb059b9438633b0546 ("thp: reduce khugepaged freezing
>>> latency") fixed khugepaged so that it does not block system suspend.
>>> As a side effect, the sleep can no longer be interrupted before the
>>> given timeout expires, because the condition passed to the wait event
>>> is the constant "false".
>>
>> What are the userspace-visible effects of this bug?
>
> I believe that the change will not make any visible difference. It
> is just a bit cleaner code.
>
> If I understand it correctly, this function is called when the daemon
> is not able to allocate a new huge page. It is used to throttle the
> attempts. The thread is then woken up in the following situations:
>
>     + when user modifies "alloc_sleep" or "scan_sleep" from sysfs;
>       this is rare

I guess somebody could set a high alloc_sleep value by mistake, and then 
try to fix it back, but khugepaged would keep sleeping until the high 
value expires.

>     + in __khugepaged_enter() when there is a new page to scan and
>       the list was empty before. This is because the same waitqueue
>       is used to wait between scans. IMHO, it is kind of bug to mix
>       these two things. But I guess that this wake is rare as well.
>       Also I guess that it will be solved by Vlastimil's rework.

Yeah this shouldn't matter much.

>     + when the kthread is stopped; this is the only place when it could
>       make a visible difference if the sleep is longer; but this is
>       rare situation as well

If this is what happens on shutdown, it could indeed be annoying to wait 
30 seconds (but I don't know if that's how shutdown works?).

>
> Best Regards,
> Petr
>
>>> This patch puts back the original approach but it uses
>>> freezable_schedule_timeout_interruptible() instead of
>>> schedule_timeout_interruptible(). It does the right thing.
>>> I am pretty sure that the freezable variant was not used in
>>> the original fix only because it was not available at that time.
>>>
>>> The regression has been there for ages. It was not critical. It just
>>> did the allocation throttling a little bit more aggressively.
>>>
>>> I found this problem when converting the kthread to kthread worker API
>>> and trying to understand the code.
>>>
>>> ...

Patch

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 7109330c5911..eb115aaa429c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2368,8 +2368,12 @@  static void __collapse_huge_page_copy(pte_t *pte, struct page *page,
 
 static void khugepaged_alloc_sleep(void)
 {
-	wait_event_freezable_timeout(khugepaged_wait, false,
-			msecs_to_jiffies(khugepaged_alloc_sleep_millisecs));
+	DEFINE_WAIT(wait);
+
+	add_wait_queue(&khugepaged_wait, &wait);
+	freezable_schedule_timeout_interruptible(
+		msecs_to_jiffies(khugepaged_alloc_sleep_millisecs));
+	remove_wait_queue(&khugepaged_wait, &wait);
 }
 
 static int khugepaged_node_load[MAX_NUMNODES];