mbox series

[0/2] mm, memory_hotplug: fix uninitialized pages fallouts.

Message ID 20190128144506.15603-1-mhocko@kernel.org (mailing list archive)
Headers show
Series mm, memory_hotplug: fix uninitialized pages fallouts. | expand

Message

Michal Hocko Jan. 28, 2019, 2:45 p.m. UTC
Hi,
Mikhail has posted fixes for the two bugs quite some time ago [1]. I
have pushed back on those fixes because I believed that it is much
better to plug the problem at the initialization time rather than play
whack-a-mole all over the hotplug code and find all the places which
expect the full memory section to be initialized. We have ended up with
2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
memory section") merged and cause a regression [2][3]. The reason is
that there might be memory layouts when two NUMA nodes share the same
memory section so the merged fix is simply incorrect.

In order to plug this hole we really have to be zone range aware in
those handlers. I have split up the original patch into two. One is
unchanged (patch 2) and I took a different approach for `removable'
crash. It would be great if Mikhail could test it still works for his
memory layout.

[1] http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
[3] http://lkml.kernel.org/r/20190125163938.GA20411@dhcp22.suse.cz

Comments

Andrew Morton Jan. 28, 2019, 5:50 p.m. UTC | #1
On Mon, 28 Jan 2019 15:45:04 +0100 Michal Hocko <mhocko@kernel.org> wrote:

> Mikhail has posted fixes for the two bugs quite some time ago [1]. I
> have pushed back on those fixes because I believed that it is much
> better to plug the problem at the initialization time rather than play
> whack-a-mole all over the hotplug code and find all the places which
> expect the full memory section to be initialized. We have ended up with
> 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
> memory section") merged and cause a regression [2][3]. The reason is
> that there might be memory layouts when two NUMA nodes share the same
> memory section so the merged fix is simply incorrect.
> 
> In order to plug this hole we really have to be zone range aware in
> those handlers. I have split up the original patch into two. One is
> unchanged (patch 2) and I took a different approach for `removable'
> crash. It would be great if Mikhail could test it still works for his
> memory layout.
> 
> [1] http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
> [3] http://lkml.kernel.org/r/20190125163938.GA20411@dhcp22.suse.cz

Any thoughts on which kernel version(s) need these patches?
Michal Hocko Jan. 28, 2019, 6:41 p.m. UTC | #2
On Mon 28-01-19 09:50:54, Andrew Morton wrote:
> On Mon, 28 Jan 2019 15:45:04 +0100 Michal Hocko <mhocko@kernel.org> wrote:
> 
> > Mikhail has posted fixes for the two bugs quite some time ago [1]. I
> > have pushed back on those fixes because I believed that it is much
> > better to plug the problem at the initialization time rather than play
> > whack-a-mole all over the hotplug code and find all the places which
> > expect the full memory section to be initialized. We have ended up with
> > 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
> > memory section") merged and cause a regression [2][3]. The reason is
> > that there might be memory layouts when two NUMA nodes share the same
> > memory section so the merged fix is simply incorrect.
> > 
> > In order to plug this hole we really have to be zone range aware in
> > those handlers. I have split up the original patch into two. One is
> > unchanged (patch 2) and I took a different approach for `removable'
> > crash. It would be great if Mikhail could test it still works for his
> > memory layout.
> > 
> > [1] http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com
> > [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
> > [3] http://lkml.kernel.org/r/20190125163938.GA20411@dhcp22.suse.cz
> 
> Any thoughts on which kernel version(s) need these patches?

My remark in 2830bf6f05fb still holds
    : This has alwways been problem AFAIU.  It just went unnoticed because we
    : have zeroed memmaps during allocation before f7f99100d8d9 ("mm: stop
    : zeroing memory during allocation in vmemmap") and so the above test
    : would simply skip these ranges as belonging to zone 0 or provided a
    : garbage.
    :
    : So I guess we do care for post f7f99100d8d9 kernels mostly and
    : therefore Fixes: f7f99100d8d9 ("mm: stop zeroing memory during
    : allocation in vmemmap")

But, please let's wait for the patch 1 to be confirmed to fix the issue.
Michal Hocko Jan. 28, 2019, 6:45 p.m. UTC | #3
On Mon 28-01-19 19:41:39, Michal Hocko wrote:
> On Mon 28-01-19 09:50:54, Andrew Morton wrote:
> > On Mon, 28 Jan 2019 15:45:04 +0100 Michal Hocko <mhocko@kernel.org> wrote:
> > 
> > > Mikhail has posted fixes for the two bugs quite some time ago [1]. I
> > > have pushed back on those fixes because I believed that it is much
> > > better to plug the problem at the initialization time rather than play
> > > whack-a-mole all over the hotplug code and find all the places which
> > > expect the full memory section to be initialized. We have ended up with
> > > 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
> > > memory section") merged and cause a regression [2][3]. The reason is
> > > that there might be memory layouts when two NUMA nodes share the same
> > > memory section so the merged fix is simply incorrect.
> > > 
> > > In order to plug this hole we really have to be zone range aware in
> > > those handlers. I have split up the original patch into two. One is
> > > unchanged (patch 2) and I took a different approach for `removable'
> > > crash. It would be great if Mikhail could test it still works for his
> > > memory layout.
> > > 
> > > [1] http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com
> > > [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
> > > [3] http://lkml.kernel.org/r/20190125163938.GA20411@dhcp22.suse.cz
> > 
> > Any thoughts on which kernel version(s) need these patches?
> 
> My remark in 2830bf6f05fb still holds
>     : This has alwways been problem AFAIU.  It just went unnoticed because we
>     : have zeroed memmaps during allocation before f7f99100d8d9 ("mm: stop
>     : zeroing memory during allocation in vmemmap") and so the above test
>     : would simply skip these ranges as belonging to zone 0 or provided a
>     : garbage.
>     :
>     : So I guess we do care for post f7f99100d8d9 kernels mostly and
>     : therefore Fixes: f7f99100d8d9 ("mm: stop zeroing memory during
>     : allocation in vmemmap")
> 
> But, please let's wait for the patch 1 to be confirmed to fix the issue.

Also the revert [1] should be applied first. I thought Linus would pick
it up but he hasn't done so yet.

[1] http://lkml.kernel.org/r/20190125181549.GE20411@dhcp22.suse.cz
Gerald Schaefer Jan. 29, 2019, 1:14 p.m. UTC | #4
On Mon, 28 Jan 2019 15:45:04 +0100
Michal Hocko <mhocko@kernel.org> wrote:

> Hi,
> Mikhail has posted fixes for the two bugs quite some time ago [1]. I
> have pushed back on those fixes because I believed that it is much
> better to plug the problem at the initialization time rather than play
> whack-a-mole all over the hotplug code and find all the places which
> expect the full memory section to be initialized. We have ended up with
> 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
> memory section") merged and cause a regression [2][3]. The reason is
> that there might be memory layouts when two NUMA nodes share the same
> memory section so the merged fix is simply incorrect.
> 
> In order to plug this hole we really have to be zone range aware in
> those handlers. I have split up the original patch into two. One is
> unchanged (patch 2) and I took a different approach for `removable'
> crash. It would be great if Mikhail could test it still works for his
> memory layout.
> 
> [1] http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
> [3] http://lkml.kernel.org/r/20190125163938.GA20411@dhcp22.suse.cz

I verified that both patches fix the issues we had with valid_zones
(with mem=2050M) and removable (with mem=3075M).

However, the call trace in the description of your patch 1 is wrong.
You basically have the same call trace for test_pages_in_a_zone in
both patches. The "removable" patch should have the call trace for
is_mem_section_removable from Mikhails original patches:

 CONFIG_DEBUG_VM_PGFLAGS=y
 kernel parameter mem=3075M
 --------------------------
 page:000003d08300c000 is uninitialized and poisoned
 page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
 Call Trace:
 ([<000000000038596c>] is_mem_section_removable+0xb4/0x190)
  [<00000000008f12fa>] show_mem_removable+0x9a/0xd8
  [<00000000008cf9c4>] dev_attr_show+0x34/0x70
  [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
  [<00000000003e4194>] seq_read+0x204/0x480
  [<00000000003b53ea>] __vfs_read+0x32/0x178
  [<00000000003b55b2>] vfs_read+0x82/0x138
  [<00000000003b5be2>] ksys_read+0x5a/0xb0
  [<0000000000b86ba0>] system_call+0xdc/0x2d8
 Last Breaking-Event-Address:
  [<000000000038596c>] is_mem_section_removable+0xb4/0x190
 Kernel panic - not syncing: Fatal exception: panic_on_oops
Michal Hocko Jan. 29, 2019, 1:49 p.m. UTC | #5
On Tue 29-01-19 14:14:47, Gerald Schaefer wrote:
> On Mon, 28 Jan 2019 15:45:04 +0100
> Michal Hocko <mhocko@kernel.org> wrote:
> 
> > Hi,
> > Mikhail has posted fixes for the two bugs quite some time ago [1]. I
> > have pushed back on those fixes because I believed that it is much
> > better to plug the problem at the initialization time rather than play
> > whack-a-mole all over the hotplug code and find all the places which
> > expect the full memory section to be initialized. We have ended up with
> > 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
> > memory section") merged and cause a regression [2][3]. The reason is
> > that there might be memory layouts when two NUMA nodes share the same
> > memory section so the merged fix is simply incorrect.
> > 
> > In order to plug this hole we really have to be zone range aware in
> > those handlers. I have split up the original patch into two. One is
> > unchanged (patch 2) and I took a different approach for `removable'
> > crash. It would be great if Mikhail could test it still works for his
> > memory layout.
> > 
> > [1] http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com
> > [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
> > [3] http://lkml.kernel.org/r/20190125163938.GA20411@dhcp22.suse.cz
> 
> I verified that both patches fix the issues we had with valid_zones
> (with mem=2050M) and removable (with mem=3075M).
> 
> However, the call trace in the description of your patch 1 is wrong.
> You basically have the same call trace for test_pages_in_a_zone in
> both patches. The "removable" patch should have the call trace for
> is_mem_section_removable from Mikhails original patches:

Thanks for testing. Can I use you Tested-by?

>  CONFIG_DEBUG_VM_PGFLAGS=y
>  kernel parameter mem=3075M
>  --------------------------
>  page:000003d08300c000 is uninitialized and poisoned
>  page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
>  Call Trace:
>  ([<000000000038596c>] is_mem_section_removable+0xb4/0x190)
>   [<00000000008f12fa>] show_mem_removable+0x9a/0xd8
>   [<00000000008cf9c4>] dev_attr_show+0x34/0x70
>   [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
>   [<00000000003e4194>] seq_read+0x204/0x480
>   [<00000000003b53ea>] __vfs_read+0x32/0x178
>   [<00000000003b55b2>] vfs_read+0x82/0x138
>   [<00000000003b5be2>] ksys_read+0x5a/0xb0
>   [<0000000000b86ba0>] system_call+0xdc/0x2d8
>  Last Breaking-Event-Address:
>   [<000000000038596c>] is_mem_section_removable+0xb4/0x190
>  Kernel panic - not syncing: Fatal exception: panic_on_oops

Yeah, this is c&p mistake on my end. I will use this trace instead.
Thanks for spotting.
Gerald Schaefer Jan. 29, 2019, 2:08 p.m. UTC | #6
On Tue, 29 Jan 2019 14:49:20 +0100
Michal Hocko <mhocko@kernel.org> wrote:

> On Tue 29-01-19 14:14:47, Gerald Schaefer wrote:
> > On Mon, 28 Jan 2019 15:45:04 +0100
> > Michal Hocko <mhocko@kernel.org> wrote:
> > 
> > > Hi,
> > > Mikhail has posted fixes for the two bugs quite some time ago [1]. I
> > > have pushed back on those fixes because I believed that it is much
> > > better to plug the problem at the initialization time rather than play
> > > whack-a-mole all over the hotplug code and find all the places which
> > > expect the full memory section to be initialized. We have ended up with
> > > 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
> > > memory section") merged and cause a regression [2][3]. The reason is
> > > that there might be memory layouts when two NUMA nodes share the same
> > > memory section so the merged fix is simply incorrect.
> > > 
> > > In order to plug this hole we really have to be zone range aware in
> > > those handlers. I have split up the original patch into two. One is
> > > unchanged (patch 2) and I took a different approach for `removable'
> > > crash. It would be great if Mikhail could test it still works for his
> > > memory layout.
> > > 
> > > [1] http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com
> > > [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
> > > [3] http://lkml.kernel.org/r/20190125163938.GA20411@dhcp22.suse.cz
> > 
> > I verified that both patches fix the issues we had with valid_zones
> > (with mem=2050M) and removable (with mem=3075M).
> > 
> > However, the call trace in the description of your patch 1 is wrong.
> > You basically have the same call trace for test_pages_in_a_zone in
> > both patches. The "removable" patch should have the call trace for
> > is_mem_section_removable from Mikhails original patches:
> 
> Thanks for testing. Can I use you Tested-by?

Sure, forgot to add this:
Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>

> 
> >  CONFIG_DEBUG_VM_PGFLAGS=y
> >  kernel parameter mem=3075M
> >  --------------------------
> >  page:000003d08300c000 is uninitialized and poisoned
> >  page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> >  Call Trace:
> >  ([<000000000038596c>] is_mem_section_removable+0xb4/0x190)
> >   [<00000000008f12fa>] show_mem_removable+0x9a/0xd8
> >   [<00000000008cf9c4>] dev_attr_show+0x34/0x70
> >   [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
> >   [<00000000003e4194>] seq_read+0x204/0x480
> >   [<00000000003b53ea>] __vfs_read+0x32/0x178
> >   [<00000000003b55b2>] vfs_read+0x82/0x138
> >   [<00000000003b5be2>] ksys_read+0x5a/0xb0
> >   [<0000000000b86ba0>] system_call+0xdc/0x2d8
> >  Last Breaking-Event-Address:
> >   [<000000000038596c>] is_mem_section_removable+0xb4/0x190
> >  Kernel panic - not syncing: Fatal exception: panic_on_oops
> 
> Yeah, this is c&p mistake on my end. I will use this trace instead.
> Thanks for spotting.
Mikhail Gavrilov Jan. 29, 2019, 5:38 p.m. UTC | #7
On Tue, 29 Jan 2019 at 18:49, Michal Hocko <mhocko@kernel.org> wrote:
>
> On Tue 29-01-19 14:14:47, Gerald Schaefer wrote:
> > On Mon, 28 Jan 2019 15:45:04 +0100
> > Michal Hocko <mhocko@kernel.org> wrote:
> >
> > > Hi,
> > > Mikhail has posted fixes for the two bugs quite some time ago [1]. I
> > > have pushed back on those fixes because I believed that it is much
> > > better to plug the problem at the initialization time rather than play
> > > whack-a-mole all over the hotplug code and find all the places which
> > > expect the full memory section to be initialized. We have ended up with
> > > 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
> > > memory section") merged and cause a regression [2][3]. The reason is
> > > that there might be memory layouts when two NUMA nodes share the same
> > > memory section so the merged fix is simply incorrect.
> > >
> > > In order to plug this hole we really have to be zone range aware in
> > > those handlers. I have split up the original patch into two. One is
> > > unchanged (patch 2) and I took a different approach for `removable'
> > > crash. It would be great if Mikhail could test it still works for his
> > > memory layout.
> > >
> > > [1] http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com
> > > [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
> > > [3] http://lkml.kernel.org/r/20190125163938.GA20411@dhcp22.suse.cz
> >
> > I verified that both patches fix the issues we had with valid_zones
> > (with mem=2050M) and removable (with mem=3075M).
> >
> > However, the call trace in the description of your patch 1 is wrong.
> > You basically have the same call trace for test_pages_in_a_zone in
> > both patches. The "removable" patch should have the call trace for
> > is_mem_section_removable from Mikhails original patches:
>
> Thanks for testing. Can I use you Tested-by?
>
> >  CONFIG_DEBUG_VM_PGFLAGS=y
> >  kernel parameter mem=3075M
> >  --------------------------
> >  page:000003d08300c000 is uninitialized and poisoned
> >  page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> >  Call Trace:
> >  ([<000000000038596c>] is_mem_section_removable+0xb4/0x190)
> >   [<00000000008f12fa>] show_mem_removable+0x9a/0xd8
> >   [<00000000008cf9c4>] dev_attr_show+0x34/0x70
> >   [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
> >   [<00000000003e4194>] seq_read+0x204/0x480
> >   [<00000000003b53ea>] __vfs_read+0x32/0x178
> >   [<00000000003b55b2>] vfs_read+0x82/0x138
> >   [<00000000003b5be2>] ksys_read+0x5a/0xb0
> >   [<0000000000b86ba0>] system_call+0xdc/0x2d8
> >  Last Breaking-Event-Address:
> >   [<000000000038596c>] is_mem_section_removable+0xb4/0x190
> >  Kernel panic - not syncing: Fatal exception: panic_on_oops
>
> Yeah, this is c&p mistake on my end. I will use this trace instead.
> Thanks for spotting.


Michal, I am late?
I am also tested these patches and can confirm that issue fixed again
with new approach.
I also attach two dmesg first when issue was reproduced and second
with applied patch (problem not reproduced).

--
Best Regards,
Mike Gavrilov.
Michal Hocko Jan. 29, 2019, 8:24 p.m. UTC | #8
On Tue 29-01-19 22:38:19, Mikhail Gavrilov wrote:
> On Tue, 29 Jan 2019 at 18:49, Michal Hocko <mhocko@kernel.org> wrote:
> >
> > On Tue 29-01-19 14:14:47, Gerald Schaefer wrote:
> > > On Mon, 28 Jan 2019 15:45:04 +0100
> > > Michal Hocko <mhocko@kernel.org> wrote:
> > >
> > > > Hi,
> > > > Mikhail has posted fixes for the two bugs quite some time ago [1]. I
> > > > have pushed back on those fixes because I believed that it is much
> > > > better to plug the problem at the initialization time rather than play
> > > > whack-a-mole all over the hotplug code and find all the places which
> > > > expect the full memory section to be initialized. We have ended up with
> > > > 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
> > > > memory section") merged and cause a regression [2][3]. The reason is
> > > > that there might be memory layouts when two NUMA nodes share the same
> > > > memory section so the merged fix is simply incorrect.
> > > >
> > > > In order to plug this hole we really have to be zone range aware in
> > > > those handlers. I have split up the original patch into two. One is
> > > > unchanged (patch 2) and I took a different approach for `removable'
> > > > crash. It would be great if Mikhail could test it still works for his
> > > > memory layout.
> > > >
> > > > [1] http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com
> > > > [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
> > > > [3] http://lkml.kernel.org/r/20190125163938.GA20411@dhcp22.suse.cz
> > >
> > > I verified that both patches fix the issues we had with valid_zones
> > > (with mem=2050M) and removable (with mem=3075M).
> > >
> > > However, the call trace in the description of your patch 1 is wrong.
> > > You basically have the same call trace for test_pages_in_a_zone in
> > > both patches. The "removable" patch should have the call trace for
> > > is_mem_section_removable from Mikhails original patches:
> >
> > Thanks for testing. Can I use you Tested-by?
> >
> > >  CONFIG_DEBUG_VM_PGFLAGS=y
> > >  kernel parameter mem=3075M
> > >  --------------------------
> > >  page:000003d08300c000 is uninitialized and poisoned
> > >  page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> > >  Call Trace:
> > >  ([<000000000038596c>] is_mem_section_removable+0xb4/0x190)
> > >   [<00000000008f12fa>] show_mem_removable+0x9a/0xd8
> > >   [<00000000008cf9c4>] dev_attr_show+0x34/0x70
> > >   [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
> > >   [<00000000003e4194>] seq_read+0x204/0x480
> > >   [<00000000003b53ea>] __vfs_read+0x32/0x178
> > >   [<00000000003b55b2>] vfs_read+0x82/0x138
> > >   [<00000000003b5be2>] ksys_read+0x5a/0xb0
> > >   [<0000000000b86ba0>] system_call+0xdc/0x2d8
> > >  Last Breaking-Event-Address:
> > >   [<000000000038596c>] is_mem_section_removable+0xb4/0x190
> > >  Kernel panic - not syncing: Fatal exception: panic_on_oops
> >
> > Yeah, this is c&p mistake on my end. I will use this trace instead.
> > Thanks for spotting.
> 
> 
> Michal, I am late?

I do not think so. I plan to repost tomorrow with the updated changelog
and gathered review and tested-by tags. Can I assume yours as well?

> I am also tested these patches and can confirm that issue fixed again
> with new approach.
> I also attach two dmesg first when issue was reproduced and second
> with applied patch (problem not reproduced).

Thanks!
Mikhail Gavrilov Jan. 29, 2019, 8:56 p.m. UTC | #9
On Wed, 30 Jan 2019 at 01:24, Michal Hocko <mhocko@kernel.org> wrote:
> I do not think so. I plan to repost tomorrow with the updated changelog
> and gathered review and tested-by tags. Can I assume yours as well?

Sure

--
Best Regards,
Mike Gavrilov.