| Message ID | 20180528124313.GC27180@dhcp22.suse.cz (mailing list archive) |
|---|---|
| State | New, archived |
Michal Hocko wrote:
> On Fri 25-05-18 20:46:21, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> > > On Fri 25-05-18 19:57:32, Tetsuo Handa wrote:
> > > > Michal Hocko wrote:
> > > > > What is wrong with the following? should_reclaim_retry should be a
> > > > > natural reschedule point. PF_WQ_WORKER is a special case which needs a
> > > > > stronger rescheduling policy. Doing that unconditionally seems more
> > > > > straightforward than depending on a zone being a good candidate for a
> > > > > further reclaim.
> > > >
> > > > Where is schedule_timeout_uninterruptible(1) for !PF_KTHREAD threads?
> > >
> > > Re-read what I've said.
> >
> > Please show me as a complete patch. Then, I will test your patch.
>
> So how about we start as simple as the following? If we _really_ need to
> touch should_reclaim_retry then it should be done in a separate patch
> with some numbers/tracing data backing that story.

This patch is incorrect in that it ignores the bug in Roman's
"mm, oom: cgroup-aware OOM killer" patch in linux-next. I suggest applying
this patch first, and then fix "mm, oom: cgroup-aware OOM killer" patch.
On Tue 29-05-18 05:57:16, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Fri 25-05-18 20:46:21, Tetsuo Handa wrote:
> > > Michal Hocko wrote:
> > > > On Fri 25-05-18 19:57:32, Tetsuo Handa wrote:
> > > > > Michal Hocko wrote:
> > > > > > What is wrong with the following? should_reclaim_retry should be a
> > > > > > natural reschedule point. PF_WQ_WORKER is a special case which needs a
> > > > > > stronger rescheduling policy. Doing that unconditionally seems more
> > > > > > straightforward than depending on a zone being a good candidate for a
> > > > > > further reclaim.
> > > > >
> > > > > Where is schedule_timeout_uninterruptible(1) for !PF_KTHREAD threads?
> > > >
> > > > Re-read what I've said.
> > >
> > > Please show me as a complete patch. Then, I will test your patch.
> >
> > So how about we start as simple as the following? If we _really_ need to
> > touch should_reclaim_retry then it should be done in a separate patch
> > with some numbers/tracing data backing that story.
>
> This patch is incorrect in that it ignores the bug in Roman's
> "mm, oom: cgroup-aware OOM killer" patch in linux-next.

I expected Roman to comment on that. The fix should be trivial. But does
that prevent further testing of this patch? Are you actually using the
cgroup OOM killer? If not, the bug should be a non-issue, right?

> I suggest applying
> this patch first, and then fix "mm, oom: cgroup-aware OOM killer" patch.

Well, I hope the whole pile gets merged in the upcoming merge window
rather than stall even more. This patch, however, can wait some more.
There is no hurry to merge it right away.
On Tue, 29 May 2018 09:17:41 +0200 Michal Hocko <mhocko@kernel.org> wrote:

> > I suggest applying
> > this patch first, and then fix "mm, oom: cgroup-aware OOM killer" patch.
>
> Well, I hope the whole pile gets merged in the upcoming merge window
> rather than stall even more.

I'm more inclined to drop it all. David has identified significant
shortcomings and I'm not seeing a way of addressing those shortcomings
in a backward-compatible fashion. Therefore there is no way forward
at present.
On Tue 29-05-18 16:07:00, Andrew Morton wrote:
> On Tue, 29 May 2018 09:17:41 +0200 Michal Hocko <mhocko@kernel.org> wrote:
>
> > > I suggest applying
> > > this patch first, and then fix "mm, oom: cgroup-aware OOM killer" patch.
> >
> > Well, I hope the whole pile gets merged in the upcoming merge window
> > rather than stall even more.
>
> I'm more inclined to drop it all. David has identified significant
> shortcomings and I'm not seeing a way of addressing those shortcomings
> in a backward-compatible fashion. Therefore there is no way forward
> at present.

Well, I thought we have argued about those "shortcomings" back and forth
and expressed that they are not really a problem for workloads which are
going to use the feature. The backward compatibility has been explained
as well AFAICT. Anyway, if this is your position on the matter then I
just give up. I've tried to do my best to review the feature (as neither
the author nor the end user) and I cannot really do much more. I find it
quite sad, though, to be honest.
On Fri, 1 Jun 2018 17:28:01 +0200 Michal Hocko <mhocko@kernel.org> wrote:

> On Tue 29-05-18 16:07:00, Andrew Morton wrote:
> > On Tue, 29 May 2018 09:17:41 +0200 Michal Hocko <mhocko@kernel.org> wrote:
> >
> > > > I suggest applying
> > > > this patch first, and then fix "mm, oom: cgroup-aware OOM killer" patch.
> > >
> > > Well, I hope the whole pile gets merged in the upcoming merge window
> > > rather than stall even more.
> >
> > I'm more inclined to drop it all. David has identified significant
> > shortcomings and I'm not seeing a way of addressing those shortcomings
> > in a backward-compatible fashion. Therefore there is no way forward
> > at present.
>
> Well, I thought we have argued about those "shortcomings" back and forth
> and expressed that they are not really a problem for workloads which are
> going to use the feature. The backward compatibility has been explained
> as well AFAICT.

Feel free to re-explain. It's the only way we'll get there.

David has proposed an alternative patchset. IIRC Roman gave that a
one-line positive response but I don't think it has seen a lot of
attention?
On Fri 01-06-18 14:11:10, Andrew Morton wrote:
> On Fri, 1 Jun 2018 17:28:01 +0200 Michal Hocko <mhocko@kernel.org> wrote:
>
> > On Tue 29-05-18 16:07:00, Andrew Morton wrote:
> > > On Tue, 29 May 2018 09:17:41 +0200 Michal Hocko <mhocko@kernel.org> wrote:
> > >
> > > > > I suggest applying
> > > > > this patch first, and then fix "mm, oom: cgroup-aware OOM killer" patch.
> > > >
> > > > Well, I hope the whole pile gets merged in the upcoming merge window
> > > > rather than stall even more.
> > >
> > > I'm more inclined to drop it all. David has identified significant
> > > shortcomings and I'm not seeing a way of addressing those shortcomings
> > > in a backward-compatible fashion. Therefore there is no way forward
> > > at present.
> >
> > Well, I thought we have argued about those "shortcomings" back and forth
> > and expressed that they are not really a problem for workloads which are
> > going to use the feature. The backward compatibility has been explained
> > as well AFAICT.
>
> Feel free to re-explain. It's the only way we'll get there.

OK, I will go and add my points to the last version of the patchset.

> David has proposed an alternative patchset. IIRC Roman gave that a
> one-line positive response but I don't think it has seen a lot of
> attention?

I plan to go and revisit that. My preliminary feedback is that a more
generic policy API is really tricky and the patchset has many holes
there. But I will come with more specific feedback in the respective
thread.
On 2018/06/04 16:04, Michal Hocko wrote:
> On Fri 01-06-18 14:11:10, Andrew Morton wrote:
>> On Fri, 1 Jun 2018 17:28:01 +0200 Michal Hocko <mhocko@kernel.org> wrote:
>>
>>> On Tue 29-05-18 16:07:00, Andrew Morton wrote:
>>>> On Tue, 29 May 2018 09:17:41 +0200 Michal Hocko <mhocko@kernel.org> wrote:
>>>>
>>>>>> I suggest applying
>>>>>> this patch first, and then fix "mm, oom: cgroup-aware OOM killer" patch.
>>>>>
>>>>> Well, I hope the whole pile gets merged in the upcoming merge window
>>>>> rather than stall even more.
>>>>
>>>> I'm more inclined to drop it all. David has identified significant
>>>> shortcomings and I'm not seeing a way of addressing those shortcomings
>>>> in a backward-compatible fashion. Therefore there is no way forward
>>>> at present.
>>>
>>> Well, I thought we have argued about those "shortcomings" back and forth
>>> and expressed that they are not really a problem for workloads which are
>>> going to use the feature. The backward compatibility has been explained
>>> as well AFAICT.
>>
>> Feel free to re-explain. It's the only way we'll get there.
>
> OK, I will go and add my points to the last version of the patchset.
>
>> David has proposed an alternative patchset. IIRC Roman gave that a
>> one-line positive response but I don't think it has seen a lot of
>> attention?
>
> I plan to go and revisit that. My preliminary feedback is that a more
> generic policy API is really tricky and the patchset has many holes
> there. But I will come with more specific feedback in the respective
> thread.
>
Is the current version of the "mm, oom: cgroup-aware OOM killer" patchset
going to be dropped for now? I want to know which state I should use as
the baseline for my patch.
On Mon 04-06-18 19:41:01, Tetsuo Handa wrote:
> On 2018/06/04 16:04, Michal Hocko wrote:
> > On Fri 01-06-18 14:11:10, Andrew Morton wrote:
> >> On Fri, 1 Jun 2018 17:28:01 +0200 Michal Hocko <mhocko@kernel.org> wrote:
> >>
> >>> On Tue 29-05-18 16:07:00, Andrew Morton wrote:
> >>>> On Tue, 29 May 2018 09:17:41 +0200 Michal Hocko <mhocko@kernel.org> wrote:
> >>>>
> >>>>>> I suggest applying
> >>>>>> this patch first, and then fix "mm, oom: cgroup-aware OOM killer" patch.
> >>>>>
> >>>>> Well, I hope the whole pile gets merged in the upcoming merge window
> >>>>> rather than stall even more.
> >>>>
> >>>> I'm more inclined to drop it all. David has identified significant
> >>>> shortcomings and I'm not seeing a way of addressing those shortcomings
> >>>> in a backward-compatible fashion. Therefore there is no way forward
> >>>> at present.
> >>>
> >>> Well, I thought we have argued about those "shortcomings" back and forth
> >>> and expressed that they are not really a problem for workloads which are
> >>> going to use the feature. The backward compatibility has been explained
> >>> as well AFAICT.
> >>
> >> Feel free to re-explain. It's the only way we'll get there.
> >
> > OK, I will go and add my points to the last version of the patchset.
> >
> >> David has proposed an alternative patchset. IIRC Roman gave that a
> >> one-line positive response but I don't think it has seen a lot of
> >> attention?
> >
> > I plan to go and revisit that. My preliminary feedback is that a more
> > generic policy API is really tricky and the patchset has many holes
> > there. But I will come with more specific feedback in the respective
> > thread.
> >
> Is the current version of the "mm, oom: cgroup-aware OOM killer" patchset
> going to be dropped for now? I want to know which state I should use as
> the baseline for my patch.

Is it that urgent that it cannot wait until after the merge window, when
things should settle down?
On 2018/06/04 20:22, Michal Hocko wrote:
> On Mon 04-06-18 19:41:01, Tetsuo Handa wrote:
>> On 2018/06/04 16:04, Michal Hocko wrote:
>>> On Fri 01-06-18 14:11:10, Andrew Morton wrote:
>>>> On Fri, 1 Jun 2018 17:28:01 +0200 Michal Hocko <mhocko@kernel.org> wrote:
>>>>
>>>>> On Tue 29-05-18 16:07:00, Andrew Morton wrote:
>>>>>> On Tue, 29 May 2018 09:17:41 +0200 Michal Hocko <mhocko@kernel.org> wrote:
>>>>>>
>>>>>>>> I suggest applying
>>>>>>>> this patch first, and then fix "mm, oom: cgroup-aware OOM killer" patch.
>>>>>>>
>>>>>>> Well, I hope the whole pile gets merged in the upcoming merge window
>>>>>>> rather than stall even more.
>>>>>>
>>>>>> I'm more inclined to drop it all. David has identified significant
>>>>>> shortcomings and I'm not seeing a way of addressing those shortcomings
>>>>>> in a backward-compatible fashion. Therefore there is no way forward
>>>>>> at present.
>>>>>
>>>>> Well, I thought we have argued about those "shortcomings" back and forth
>>>>> and expressed that they are not really a problem for workloads which are
>>>>> going to use the feature. The backward compatibility has been explained
>>>>> as well AFAICT.
>>>>
>>>> Feel free to re-explain. It's the only way we'll get there.
>>>
>>> OK, I will go and add my points to the last version of the patchset.
>>>
>>>> David has proposed an alternative patchset. IIRC Roman gave that a
>>>> one-line positive response but I don't think it has seen a lot of
>>>> attention?
>>>
>>> I plan to go and revisit that. My preliminary feedback is that a more
>>> generic policy API is really tricky and the patchset has many holes
>>> there. But I will come with more specific feedback in the respective
>>> thread.
>>>
>> Is the current version of the "mm, oom: cgroup-aware OOM killer" patchset
>> going to be dropped for now? I want to know which state I should use as
>> the baseline for my patch.
>
> Is it that urgent that it cannot wait until after the merge window, when
> things should settle down?

Yes, for it is a regression fix which I expected to be in time for 4.17.
I want to apply it before the OOM killer code gets complicated by the
cgroup-aware OOM killer.
On Mon, 4 Jun 2018, Tetsuo Handa wrote:

> Is the current version of the "mm, oom: cgroup-aware OOM killer" patchset
> going to be dropped for now? I want to know which state I should use as
> the baseline for my patch.

My patchset to fix the issues with the cgroup-aware oom killer was posted
on March 22 at https://marc.info/?l=linux-kernel&m=152175564104466. It
fixes its calculations (the current version in -mm is completely buggy for
oom_score_adj, fixed in my patch 4/6), fixes its context based errors
(discounting mempolicy oom kills, fixed in my patch 6/6), and makes it
generally useful beyond highly specialized usecases in a backwards
compatible way.

The base patchset is seemingly abandoned in -mm, unfortunately, so I think
all oom killer patches should be based on Linus's tree.
On 2018/06/06 18:02, David Rientjes wrote:
> On Mon, 4 Jun 2018, Tetsuo Handa wrote:
>
>> Is the current version of the "mm, oom: cgroup-aware OOM killer" patchset
>> going to be dropped for now? I want to know which state I should use as
>> the baseline for my patch.
>
> My patchset to fix the issues with the cgroup-aware oom killer was posted
> on March 22 at https://marc.info/?l=linux-kernel&m=152175564104466. It
> fixes its calculations (the current version in -mm is completely buggy for
> oom_score_adj, fixed in my patch 4/6), fixes its context based errors
> (discounting mempolicy oom kills, fixed in my patch 6/6), and makes it
> generally useful beyond highly specialized usecases in a backwards
> compatible way.
>
> The base patchset is seemingly abandoned in -mm, unfortunately, so I think
> all oom killer patches should be based on Linus's tree.

OK. I will use linux.git as a base.

By the way, does "[RFC] Getting rid of INFLIGHT_VICTIM" simplify or break
your cgroup-aware oom killer? If it simplifies your work, I'd like to
apply it as well.
On Wed, 6 Jun 2018, Tetsuo Handa wrote:

> OK. I will use linux.git as a base.
>
> By the way, does "[RFC] Getting rid of INFLIGHT_VICTIM" simplify or break
> your cgroup-aware oom killer? If it simplifies your work, I'd like to
> apply it as well.

I think it impacts the proposal to allow the oom reaper to operate over
several different mm's in its list without processing one, waiting to give
up, removing it, and moving on to the next one. It doesn't impact the
cgroup-aware oom killer extension that I made, other than trivial patch
conflicts. I think it's simpler if we can iterate over the oom reaper list
to determine inflight victims.
```diff
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 565e7da55318..7f12e346e477 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -1099,7 +1099,6 @@ bool out_of_memory(struct oom_control *oc)
 {
 	unsigned long freed = 0;
 	enum oom_constraint constraint = CONSTRAINT_NONE;
-	bool delay = false; /* if set, delay next allocation attempt */
 
 	if (oom_killer_disabled)
 		return false;
@@ -1149,10 +1148,8 @@ bool out_of_memory(struct oom_control *oc)
 		return true;
 	}
 
-	if (mem_cgroup_select_oom_victim(oc) && oom_kill_memcg_victim(oc)) {
-		delay = true;
+	if (mem_cgroup_select_oom_victim(oc) && oom_kill_memcg_victim(oc))
 		goto out;
-	}
 
 	select_bad_process(oc);
 	/* Found nothing?!?! Either we hang forever, or we panic. */
@@ -1160,20 +1157,10 @@ bool out_of_memory(struct oom_control *oc)
 		dump_header(oc, NULL);
 		panic("Out of memory and no killable processes...\n");
 	}
-	if (oc->chosen_task && oc->chosen_task != INFLIGHT_VICTIM) {
+	if (oc->chosen_task && oc->chosen_task != INFLIGHT_VICTIM)
 		oom_kill_process(oc, !is_memcg_oom(oc) ? "Out of memory" :
 				 "Memory cgroup out of memory");
-		delay = true;
-	}
-
 out:
-	/*
-	 * Give the killed process a good chance to exit before trying
-	 * to allocate memory again.
-	 */
-	if (delay)
-		schedule_timeout_killable(1);
-
 	return !!oc->chosen_task;
 }
```