Message ID | 7de14c6cac4a486c04149f37948e3a76028f3fa5.1530461087.git.rfreire@redhat.com (mailing list archive) |
---|---|
State | New, archived |
On Sun 01-07-18 13:09:40, Rodrigo Freire wrote:
> The default page memory unit of OOM task dump events might not be
> intuitive for the non-initiated when debugging OOM events. Add
> a small printk prior to the task dump informing that the memory
> units are actually memory _pages_.

Does this really help? I understand the OOM report might not be the
easiest thing to grasp, but wouldn't it be much better to actually add
documentation clarifying each part of it?

> Signed-off-by: Rodrigo Freire <rfreire@redhat.com>
> ---
>  mm/oom_kill.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 84081e7..b4d9557 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -392,6 +392,7 @@ static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
>  	struct task_struct *p;
>  	struct task_struct *task;
>
> +	pr_info("Tasks state (memory values in pages):\n");
>  	pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name\n");
>  	rcu_read_lock();
>  	for_each_process(p) {
> --
> 1.8.3.1
Hello Michal,

----- Original Message -----
> From: "Michal Hocko" <mhocko@kernel.org>
> To: "Rodrigo Freire" <rfreire@redhat.com>
> Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
> Sent: Monday, July 2, 2018 6:30:43 AM
> Subject: Re: [PATCH] mm: be more informative in OOM task list
>
> On Sun 01-07-18 13:09:40, Rodrigo Freire wrote:
> > The default page memory unit of OOM task dump events might not be
> > intuitive for the non-initiated when debugging OOM events. Add
> > a small printk prior to the task dump informing that the memory
> > units are actually memory _pages_.
>
> Does this really help? I understand the OOM report might not be the
> easiest thing to grasp, but wouldn't it be much better to actually add
> documentation clarifying each part of it?

That would be great: after a quick grep -ri for oom in Documentation,
I found several other files containing their own OOM behaviour modifier
configurations. But it indeed lacks a central and canonical Doc file
which documents the OOM Killer behavior and workflows.

However, I still stand by my proposed patch: it is unobtrusive, incurs
no performance cost, and clarifies the output. I recently worked on a
case (for full disclosure: I am a far cry from an MM expert) where the
sum of the RSS pages made sense when interpreted as kB rather than
pages. Reason: there were processes sharing (a good amount of) memory
regions, which skewed the interpretation and misled not only me but
some other colleagues as well. The unit was only sorted out after
actually inspecting the source code.

This patch is user-friendly and can be a great time saver to others in
the community.

I kindly request the Acked-by ;-)

Have a great week,

- RF.
On Mon 02-07-18 07:22:13, Rodrigo Freire wrote:
> Hello Michal,
>
> ----- Original Message -----
> > From: "Michal Hocko" <mhocko@kernel.org>
> > To: "Rodrigo Freire" <rfreire@redhat.com>
> > Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
> > Sent: Monday, July 2, 2018 6:30:43 AM
> > Subject: Re: [PATCH] mm: be more informative in OOM task list
> >
> > On Sun 01-07-18 13:09:40, Rodrigo Freire wrote:
> > > The default page memory unit of OOM task dump events might not be
> > > intuitive for the non-initiated when debugging OOM events. Add
> > > a small printk prior to the task dump informing that the memory
> > > units are actually memory _pages_.
> >
> > Does this really help? I understand the OOM report might not be the
> > easiest thing to grasp, but wouldn't it be much better to actually add
> > documentation clarifying each part of it?
>
> That would be great: after a quick grep -ri for oom in Documentation,
> I found several other files containing their own OOM behaviour modifier
> configurations. But it indeed lacks a central and canonical Doc file
> which documents the OOM Killer behavior and workflows.
>
> However, I still stand by my proposed patch: it is unobtrusive, incurs
> no performance cost, and clarifies the output. I recently worked on a
> case (for full disclosure: I am a far cry from an MM expert) where the
> sum of the RSS pages made sense when interpreted as kB rather than
> pages. Reason: there were processes sharing (a good amount of) memory
> regions, which skewed the interpretation and misled not only me but
> some other colleagues as well. The unit was only sorted out after
> actually inspecting the source code.
>
> This patch is user-friendly and can be a great time saver to others in
> the community.

Well, all other counters we print are in page units unless explicitly
kB. So I am not sure we really need to do anything but document the
output better. Maybe others will find it more important though.
Hello Michal!

----- Original Message -----
> From: "Michal Hocko" <mhocko@kernel.org>
> To: "Rodrigo Freire" <rfreire@redhat.com>
> Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
> Sent: Monday, July 2, 2018 8:29:06 AM
> Subject: Re: [PATCH] mm: be more informative in OOM task list
>
> On Mon 02-07-18 07:22:13, Rodrigo Freire wrote:
> > Hello Michal,
> >
> > ----- Original Message -----
> > > From: "Michal Hocko" <mhocko@kernel.org>
> > > To: "Rodrigo Freire" <rfreire@redhat.com>
> > > Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
> > > Sent: Monday, July 2, 2018 6:30:43 AM
> > > Subject: Re: [PATCH] mm: be more informative in OOM task list
> > >
> > > On Sun 01-07-18 13:09:40, Rodrigo Freire wrote:
> > > > The default page memory unit of OOM task dump events might not be
> > > > intuitive for the non-initiated when debugging OOM events. Add
> > > > a small printk prior to the task dump informing that the memory
> > > > units are actually memory _pages_.
> > >
> > > Does this really help? I understand the OOM report might not be the
> > > easiest thing to grasp, but wouldn't it be much better to actually add
> > > documentation clarifying each part of it?
> >
> > That would be great: after a quick grep -ri for oom in Documentation,
> > I found several other files containing their own OOM behaviour modifier
> > configurations. But it indeed lacks a central and canonical Doc file
> > which documents the OOM Killer behavior and workflows.
> >
> > However, I still stand by my proposed patch: it is unobtrusive, incurs
> > no performance cost, and clarifies the output. I recently worked on a
> > case (for full disclosure: I am a far cry from an MM expert) where the
> > sum of the RSS pages made sense when interpreted as kB rather than
> > pages. Reason: there were processes sharing (a good amount of) memory
> > regions, which skewed the interpretation and misled not only me but
> > some other colleagues as well. The unit was only sorted out after
> > actually inspecting the source code.
> >
> > This patch is user-friendly and can be a great time saver to others in
> > the community.
>
> Well, all other counters we print are in page units unless explicitly
> kB.

Your statement is correct, and I thought about that too. But then came
the doubt:

* Maybe someone forgot to state that these values are in kB?

> So I am not sure we really need to do anything but document the
> output better. Maybe others will find it more important though.

The thing is, it also led some other colleagues (a few!) to think the
very same as me. That raised the flag and made me write the patch: the
output was indeed misleading. And you may not have an MM- and OOM-versed
specialist available all the time! ;-)

I still ask you to reconsider.

My best regards,

- RF.
On Sun, 1 Jul 2018, Rodrigo Freire wrote:
> The default page memory unit of OOM task dump events might not be
> intuitive for the non-initiated when debugging OOM events. Add
> a small printk prior to the task dump informing that the memory
> units are actually memory _pages_.
>
> Signed-off-by: Rodrigo Freire <rfreire@redhat.com>
> ---
>  mm/oom_kill.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 84081e7..b4d9557 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -392,6 +392,7 @@ static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
>  	struct task_struct *p;
>  	struct task_struct *task;
>
> +	pr_info("Tasks state (memory values in pages):\n");
>  	pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name\n");
>  	rcu_read_lock();
>  	for_each_process(p) {

As the author of dump_tasks(), and having seen these values misinterpreted
on more than one occasion, I think this is a valuable addition.

Could you also expand out the "pid" field to allow for seven digits
instead of five? I think everything else is aligned.

Feel free to add

Acked-by: David Rientjes <rientjes@google.com>

to a v2.
On Tue, Jul 03, 2018 at 06:34:48PM -0700, David Rientjes wrote:
> On Sun, 1 Jul 2018, Rodrigo Freire wrote:
>
> > The default page memory unit of OOM task dump events might not be
> > intuitive for the non-initiated when debugging OOM events. Add
> > a small printk prior to the task dump informing that the memory
> > units are actually memory _pages_.
> >
> > Signed-off-by: Rodrigo Freire <rfreire@redhat.com>
> > ---
> >  mm/oom_kill.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index 84081e7..b4d9557 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -392,6 +392,7 @@ static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
> >  	struct task_struct *p;
> >  	struct task_struct *task;
> >
> > +	pr_info("Tasks state (memory values in pages):\n");
> >  	pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name\n");
> >  	rcu_read_lock();
> >  	for_each_process(p) {
>
> As the author of dump_tasks(), and having seen these values misinterpreted
> on more than one occasion, I think this is a valuable addition.
>
> Could you also expand out the "pid" field to allow for seven digits
> instead of five? I think everything else is aligned.
>
> Feel free to add
>
> Acked-by: David Rientjes <rientjes@google.com>
>
> to a v2.
>

Same here, for a v2:

Acked-by: Rafael Aquini <aquini@redhat.com>
The default page memory unit of OOM task dump events might not be
intuitive for the non-initiated when debugging OOM events. Add
a small printk prior to the task dump informing that the memory
units are actually memory _pages_.

Signed-off-by: Rodrigo Freire <rfreire@redhat.com>
---
 mm/oom_kill.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 84081e7..b4d9557 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -392,6 +392,7 @@ static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
 	struct task_struct *p;
 	struct task_struct *task;
 
+	pr_info("Tasks state (memory values in pages):\n");
 	pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name\n");
 	rcu_read_lock();
 	for_each_process(p) {