[v2,RESEND] mm/sparsemem: pfn_to_page is not valid yet on SPARSEMEM

Message ID 20200219030454.4844-1-bhe@redhat.com (mailing list archive)
State New, archived

Commit Message

Baoquan He Feb. 19, 2020, 3:04 a.m. UTC
From: Wei Yang <richardw.yang@linux.intel.com>

When we use SPARSEMEM instead of SPARSEMEM_VMEMMAP, pfn_to_page()
does not work before sparse_init_one_section() is called. This leads to a
crash when hotplugging memory:

[   41.839170] BUG: unable to handle page fault for address: 0000000006400000
[   41.840663] #PF: supervisor write access in kernel mode
[   41.841822] #PF: error_code(0x0002) - not-present page
[   41.842970] PGD 0 P4D 0
[   41.843538] Oops: 0002 [#1] SMP PTI
[   41.844125] CPU: 3 PID: 221 Comm: kworker/u16:1 Tainted: G        W         5.5.0-next-20200205+ #343
[   41.845659] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[   41.846977] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[   41.847904] RIP: 0010:__memset+0x24/0x30
[   41.848660] Code: cc cc cc cc cc cc 0f 1f 44 00 00 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3
[   41.851836] RSP: 0018:ffffb43ac0373c80 EFLAGS: 00010a87
[   41.852686] RAX: ffffffffffffffff RBX: ffff8a1518800000 RCX: 0000000000050000
[   41.853824] RDX: 0000000000000000 RSI: 00000000000000ff RDI: 0000000006400000
[   41.854967] RBP: 0000000000140000 R08: 0000000000100000 R09: 0000000006400000
[   41.856107] R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000000000
[   41.857255] R13: 0000000000000028 R14: 0000000000000000 R15: ffff8a153ffd9280
[   41.858414] FS:  0000000000000000(0000) GS:ffff8a153ab00000(0000) knlGS:0000000000000000
[   41.859703] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   41.860627] CR2: 0000000006400000 CR3: 0000000136fca000 CR4: 00000000000006e0
[   41.861716] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   41.862680] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   41.863628] Call Trace:
[   41.863983]  sparse_add_section+0x1c9/0x26a
[   41.864570]  __add_pages+0xbf/0x150
[   41.865057]  add_pages+0x12/0x60
[   41.865489]  add_memory_resource+0xc8/0x210
[   41.866017]  ? wake_up_q+0xa0/0xa0
[   41.866416]  __add_memory+0x62/0xb0
[   41.866825]  acpi_memory_device_add+0x13f/0x300
[   41.867410]  acpi_bus_attach+0xf6/0x200
[   41.867890]  acpi_bus_scan+0x43/0x90
[   41.868448]  acpi_device_hotplug+0x275/0x3d0
[   41.868972]  acpi_hotplug_work_fn+0x1a/0x30
[   41.869473]  process_one_work+0x1a7/0x370
[   41.869953]  worker_thread+0x30/0x380
[   41.870396]  ? flush_rcu_work+0x30/0x30
[   41.870846]  kthread+0x112/0x130
[   41.871236]  ? kthread_create_on_node+0x60/0x60
[   41.871770]  ret_from_fork+0x35/0x40
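
The register dump above appears consistent with pfn_to_page() having
computed a struct page pointer from a section whose memmap was still unset.
The sketch below is only a rough cross-check of that reading. It assumes
that RBP holds start_pfn and that struct page is 80 bytes in the crashing
build; neither is stated in the trace, and the helper names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the oops above. */
#define OOPS_CR2 0x0000000006400000ULL /* faulting address (also RDI) */
#define OOPS_RBP 0x0000000000140000ULL /* assumption: holds start_pfn */
#define OOPS_RCX 0x0000000000050000ULL /* remaining 8-byte stores in __memset */

/* Assumption: sizeof(struct page) in the crashing build. */
#define ASSUMED_STRUCT_PAGE_SIZE 80ULL

/*
 * Address obtained when the section's memmap base decodes to zero,
 * i.e. the equivalent of (struct page *)0 + pfn.
 */
static uint64_t bogus_page_addr(uint64_t pfn, uint64_t page_size)
{
	return pfn * page_size;
}

/* Number of struct pages covered by `qwords` 8-byte stores. */
static uint64_t pages_covered(uint64_t qwords, uint64_t page_size)
{
	return qwords * 8 / page_size;
}
```

Under those assumptions, bogus_page_addr(OOPS_RBP, ASSUMED_STRUCT_PAGE_SIZE)
equals OOPS_CR2, and pages_covered(OOPS_RCX, ASSUMED_STRUCT_PAGE_SIZE) gives
32768 pages, which matches one 128 MiB section of 4 KiB pages on x86_64.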

We should use memmap directly, as the code did before sub-section hotplug
support was added.

Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
CC: Dan Williams <dan.j.williams@intel.com>
---
v1->v2:
  Add the trace printed by the triggered bug to the log.

v1 can be found here:
http://lkml.kernel.org/r/20200206125343.9070-1-richardw.yang@linux.intel.com

 mm/sparse.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Dan Williams Feb. 19, 2020, 3:25 a.m. UTC | #1
On Tue, Feb 18, 2020 at 7:05 PM Baoquan He <bhe@redhat.com> wrote:
>
> From: Wei Yang <richardw.yang@linux.intel.com>
>
> When we use SPARSEMEM instead of SPARSEMEM_VMEMMAP, pfn_to_page()
> doesn't work before sparse_init_one_section() is called. This leads to a
> crash when hotplug memory:

I'd also add:

"On x86 the impact is limited to x86_32 builds, or x86_64
configurations that override the default setting for
SPARSEMEM_VMEMMAP".

Other than that:

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Baoquan He Feb. 19, 2020, 3:46 a.m. UTC | #2
On 02/18/20 at 07:25pm, Dan Williams wrote:
> On Tue, Feb 18, 2020 at 7:05 PM Baoquan He <bhe@redhat.com> wrote:
> >
> > From: Wei Yang <richardw.yang@linux.intel.com>
> >
> > When we use SPARSEMEM instead of SPARSEMEM_VMEMMAP, pfn_to_page()
> > doesn't work before sparse_init_one_section() is called. This leads to a
> > crash when hotplug memory:
> 
> I'd also add:
> 
> "On x86 the impact is limited to x86_32 builds, or x86_64
> configurations that override the default setting for
> SPARSEMEM_VMEMMAP".


Doesn't 'When we use SPARSEMEM instead of SPARSEMEM_VMEMMAP' in the current
log already cover the point about overriding the default?
Mike Rapoport Feb. 19, 2020, 8:57 a.m. UTC | #3
On Tue, Feb 18, 2020 at 07:25:15PM -0800, Dan Williams wrote:
> On Tue, Feb 18, 2020 at 7:05 PM Baoquan He <bhe@redhat.com> wrote:
> >
> > From: Wei Yang <richardw.yang@linux.intel.com>
> >
> > When we use SPARSEMEM instead of SPARSEMEM_VMEMMAP, pfn_to_page()
> > doesn't work before sparse_init_one_section() is called. This leads to a
> > crash when hotplug memory:
> 
> I'd also add:
> 
> "On x86 the impact is limited to x86_32 builds, or x86_64
> configurations that override the default setting for
> SPARSEMEM_VMEMMAP".

Do we also want to check how it affects, say, arm64, ia64 and ppc? ;-)
 
> Other than that:
> 
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
>
Dan Williams Feb. 19, 2020, 4:37 p.m. UTC | #4
On Wed, Feb 19, 2020 at 12:57 AM Mike Rapoport <rppt@linux.ibm.com> wrote:
>
> On Tue, Feb 18, 2020 at 07:25:15PM -0800, Dan Williams wrote:
> > On Tue, Feb 18, 2020 at 7:05 PM Baoquan He <bhe@redhat.com> wrote:
> > >
> > > From: Wei Yang <richardw.yang@linux.intel.com>
> > >
> > > When we use SPARSEMEM instead of SPARSEMEM_VMEMMAP, pfn_to_page()
> > > doesn't work before sparse_init_one_section() is called. This leads to a
> > > crash when hotplug memory:
> >
> > I'd also add:
> >
> > "On x86 the impact is limited to x86_32 builds, or x86_64
> > configurations that override the default setting for
> > SPARSEMEM_VMEMMAP".
>
> Do we also want to check how it affects, say, arm64, ia64 and ppc? ;-)

Sure, I just did not take the time to look up their respective default
stances on SPARSEMEM_VMEMMAP. For a distro looking to backport this
commit I think it's helpful for them to understand if they are exposed
or not.
Mike Rapoport Feb. 19, 2020, 6:31 p.m. UTC | #5
On Wed, Feb 19, 2020 at 08:37:25AM -0800, Dan Williams wrote:
> On Wed, Feb 19, 2020 at 12:57 AM Mike Rapoport <rppt@linux.ibm.com> wrote:
> >
> > On Tue, Feb 18, 2020 at 07:25:15PM -0800, Dan Williams wrote:
> > > On Tue, Feb 18, 2020 at 7:05 PM Baoquan He <bhe@redhat.com> wrote:
> > > >
> > > > From: Wei Yang <richardw.yang@linux.intel.com>
> > > >
> > > > When we use SPARSEMEM instead of SPARSEMEM_VMEMMAP, pfn_to_page()
> > > > doesn't work before sparse_init_one_section() is called. This leads to a
> > > > crash when hotplug memory:
> > >
> > > I'd also add:
> > >
> > > "On x86 the impact is limited to x86_32 builds, or x86_64
> > > configurations that override the default setting for
> > > SPARSEMEM_VMEMMAP".
> >
> > Do we also want to check how it affects, say, arm64, ia64 and ppc? ;-)
> 
> Sure, I just did not take the time to look up their respective default
> stances on SPARSEMEM_VMEMMAP. For a distro looking to backport this
> commit I think it's helpful for them to understand if they are exposed
> or not.

Looks like only i386_defconfig does not enable SPARSEMEM_VMEMMAP. On all the
rest it can be disabled only with a manual override.
Andrew Morton Feb. 19, 2020, 7:50 p.m. UTC | #6
On Wed, 19 Feb 2020 11:04:54 +0800 Baoquan He <bhe@redhat.com> wrote:

> From: Wei Yang <richardw.yang@linux.intel.com>
> 
> When we use SPARSEMEM instead of SPARSEMEM_VMEMMAP, pfn_to_page()
> doesn't work before sparse_init_one_section() is called. This leads to a
> crash when hotplug memory:
> 
> [   41.839170] BUG: unable to handle page fault for address: 0000000006400000
> [   41.840663] #PF: supervisor write access in kernel mode
> [   41.841822] #PF: error_code(0x0002) - not-present page
> [   41.842970] PGD 0 P4D 0
> [   41.843538] Oops: 0002 [#1] SMP PTI
> [   41.844125] CPU: 3 PID: 221 Comm: kworker/u16:1 Tainted: G        W         5.5.0-next-20200205+ #343
> [   41.845659] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
> [   41.846977] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
> [   41.847904] RIP: 0010:__memset+0x24/0x30
> [   41.848660] Code: cc cc cc cc cc cc 0f 1f 44 00 00 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3
> [   41.851836] RSP: 0018:ffffb43ac0373c80 EFLAGS: 00010a87
> [   41.852686] RAX: ffffffffffffffff RBX: ffff8a1518800000 RCX: 0000000000050000
> [   41.853824] RDX: 0000000000000000 RSI: 00000000000000ff RDI: 0000000006400000
> [   41.854967] RBP: 0000000000140000 R08: 0000000000100000 R09: 0000000006400000
> [   41.856107] R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000000000
> [   41.857255] R13: 0000000000000028 R14: 0000000000000000 R15: ffff8a153ffd9280
> [   41.858414] FS:  0000000000000000(0000) GS:ffff8a153ab00000(0000) knlGS:0000000000000000
> [   41.859703] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   41.860627] CR2: 0000000006400000 CR3: 0000000136fca000 CR4: 00000000000006e0
> [   41.861716] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   41.862680] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   41.863628] Call Trace:
> [   41.863983]  sparse_add_section+0x1c9/0x26a
> [   41.864570]  __add_pages+0xbf/0x150
> [   41.865057]  add_pages+0x12/0x60
> [   41.865489]  add_memory_resource+0xc8/0x210
> [   41.866017]  ? wake_up_q+0xa0/0xa0
> [   41.866416]  __add_memory+0x62/0xb0
> [   41.866825]  acpi_memory_device_add+0x13f/0x300
> [   41.867410]  acpi_bus_attach+0xf6/0x200
> [   41.867890]  acpi_bus_scan+0x43/0x90
> [   41.868448]  acpi_device_hotplug+0x275/0x3d0
> [   41.868972]  acpi_hotplug_work_fn+0x1a/0x30
> [   41.869473]  process_one_work+0x1a7/0x370
> [   41.869953]  worker_thread+0x30/0x380
> [   41.870396]  ? flush_rcu_work+0x30/0x30
> [   41.870846]  kthread+0x112/0x130
> [   41.871236]  ? kthread_create_on_node+0x60/0x60
> [   41.871770]  ret_from_fork+0x35/0x40
> 
> We should use memmap as it did.
> 
> Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> Acked-by: David Hildenbrand <david@redhat.com>
> Reviewed-by: Baoquan He <bhe@redhat.com>
> CC: Dan Williams <dan.j.williams@intel.com>

This should have included your signed-off-by, as you were on the patch
delivery path.  I have made that change to my copy of the patch - is
that OK?

I also added a cc:stable.  Do we agree this is appropriate?

I added Dan's "On x86 the impact is limited to x86_32 builds, or x86_64
configurations that override the default setting for
SPARSEMEM_VMEMMAP." to the changelog.
Dan Williams Feb. 19, 2020, 8:13 p.m. UTC | #7
On Wed, Feb 19, 2020 at 11:50 AM Andrew Morton
<akpm@linux-foundation.org> wrote:
>
> On Wed, 19 Feb 2020 11:04:54 +0800 Baoquan He <bhe@redhat.com> wrote:
>
> > From: Wei Yang <richardw.yang@linux.intel.com>
> >
> > When we use SPARSEMEM instead of SPARSEMEM_VMEMMAP, pfn_to_page()
> > doesn't work before sparse_init_one_section() is called. This leads to a
> > crash when hotplug memory:
> >
> > [   41.839170] BUG: unable to handle page fault for address: 0000000006400000
> > [   41.840663] #PF: supervisor write access in kernel mode
> > [   41.841822] #PF: error_code(0x0002) - not-present page
> > [   41.842970] PGD 0 P4D 0
> > [   41.843538] Oops: 0002 [#1] SMP PTI
> > [   41.844125] CPU: 3 PID: 221 Comm: kworker/u16:1 Tainted: G        W         5.5.0-next-20200205+ #343
> > [   41.845659] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
> > [   41.846977] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
> > [   41.847904] RIP: 0010:__memset+0x24/0x30
> > [   41.848660] Code: cc cc cc cc cc cc 0f 1f 44 00 00 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3
> > [   41.851836] RSP: 0018:ffffb43ac0373c80 EFLAGS: 00010a87
> > [   41.852686] RAX: ffffffffffffffff RBX: ffff8a1518800000 RCX: 0000000000050000
> > [   41.853824] RDX: 0000000000000000 RSI: 00000000000000ff RDI: 0000000006400000
> > [   41.854967] RBP: 0000000000140000 R08: 0000000000100000 R09: 0000000006400000
> > [   41.856107] R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000000000
> > [   41.857255] R13: 0000000000000028 R14: 0000000000000000 R15: ffff8a153ffd9280
> > [   41.858414] FS:  0000000000000000(0000) GS:ffff8a153ab00000(0000) knlGS:0000000000000000
> > [   41.859703] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   41.860627] CR2: 0000000006400000 CR3: 0000000136fca000 CR4: 00000000000006e0
> > [   41.861716] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [   41.862680] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [   41.863628] Call Trace:
> > [   41.863983]  sparse_add_section+0x1c9/0x26a
> > [   41.864570]  __add_pages+0xbf/0x150
> > [   41.865057]  add_pages+0x12/0x60
> > [   41.865489]  add_memory_resource+0xc8/0x210
> > [   41.866017]  ? wake_up_q+0xa0/0xa0
> > [   41.866416]  __add_memory+0x62/0xb0
> > [   41.866825]  acpi_memory_device_add+0x13f/0x300
> > [   41.867410]  acpi_bus_attach+0xf6/0x200
> > [   41.867890]  acpi_bus_scan+0x43/0x90
> > [   41.868448]  acpi_device_hotplug+0x275/0x3d0
> > [   41.868972]  acpi_hotplug_work_fn+0x1a/0x30
> > [   41.869473]  process_one_work+0x1a7/0x370
> > [   41.869953]  worker_thread+0x30/0x380
> > [   41.870396]  ? flush_rcu_work+0x30/0x30
> > [   41.870846]  kthread+0x112/0x130
> > [   41.871236]  ? kthread_create_on_node+0x60/0x60
> > [   41.871770]  ret_from_fork+0x35/0x40
> >
> > We should use memmap as it did.
> >
> > Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
> > Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> > Acked-by: David Hildenbrand <david@redhat.com>
> > Reviewed-by: Baoquan He <bhe@redhat.com>
> > CC: Dan Williams <dan.j.williams@intel.com>
>
> This should have included your signed-off-by, as you were on the patch
> delivery path.  I have made that change to my copy of the patch - is
> that OK?
>
> I also added a cc:stable.  Do we agree this is appropriate?

I see no harm.

> I added Dan's "On x86 the impact is limited to x86_32 builds, or x86_64
> configurations that override the default setting for
> SPARSEMEM_VMEMMAP." to the changelog.

You can also add, based on Mike's sleuthing: "Other memory hotplug archs
(arm64, ia64, and ppc) also default to SPARSEMEM_VMEMMAP=y."
Wei Yang Feb. 19, 2020, 9:45 p.m. UTC | #8
On Wed, Feb 19, 2020 at 11:50:42AM -0800, Andrew Morton wrote:
>On Wed, 19 Feb 2020 11:04:54 +0800 Baoquan He <bhe@redhat.com> wrote:
>
>> From: Wei Yang <richardw.yang@linux.intel.com>
>> 
>> When we use SPARSEMEM instead of SPARSEMEM_VMEMMAP, pfn_to_page()
>> doesn't work before sparse_init_one_section() is called. This leads to a
>> crash when hotplug memory:
>> 
>> [   41.839170] BUG: unable to handle page fault for address: 0000000006400000
>> [   41.840663] #PF: supervisor write access in kernel mode
>> [   41.841822] #PF: error_code(0x0002) - not-present page
>> [   41.842970] PGD 0 P4D 0
>> [   41.843538] Oops: 0002 [#1] SMP PTI
>> [   41.844125] CPU: 3 PID: 221 Comm: kworker/u16:1 Tainted: G        W         5.5.0-next-20200205+ #343
>> [   41.845659] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
>> [   41.846977] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
>> [   41.847904] RIP: 0010:__memset+0x24/0x30
>> [   41.848660] Code: cc cc cc cc cc cc 0f 1f 44 00 00 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3
>> [   41.851836] RSP: 0018:ffffb43ac0373c80 EFLAGS: 00010a87
>> [   41.852686] RAX: ffffffffffffffff RBX: ffff8a1518800000 RCX: 0000000000050000
>> [   41.853824] RDX: 0000000000000000 RSI: 00000000000000ff RDI: 0000000006400000
>> [   41.854967] RBP: 0000000000140000 R08: 0000000000100000 R09: 0000000006400000
>> [   41.856107] R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000000000
>> [   41.857255] R13: 0000000000000028 R14: 0000000000000000 R15: ffff8a153ffd9280
>> [   41.858414] FS:  0000000000000000(0000) GS:ffff8a153ab00000(0000) knlGS:0000000000000000
>> [   41.859703] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [   41.860627] CR2: 0000000006400000 CR3: 0000000136fca000 CR4: 00000000000006e0
>> [   41.861716] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [   41.862680] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [   41.863628] Call Trace:
>> [   41.863983]  sparse_add_section+0x1c9/0x26a
>> [   41.864570]  __add_pages+0xbf/0x150
>> [   41.865057]  add_pages+0x12/0x60
>> [   41.865489]  add_memory_resource+0xc8/0x210
>> [   41.866017]  ? wake_up_q+0xa0/0xa0
>> [   41.866416]  __add_memory+0x62/0xb0
>> [   41.866825]  acpi_memory_device_add+0x13f/0x300
>> [   41.867410]  acpi_bus_attach+0xf6/0x200
>> [   41.867890]  acpi_bus_scan+0x43/0x90
>> [   41.868448]  acpi_device_hotplug+0x275/0x3d0
>> [   41.868972]  acpi_hotplug_work_fn+0x1a/0x30
>> [   41.869473]  process_one_work+0x1a7/0x370
>> [   41.869953]  worker_thread+0x30/0x380
>> [   41.870396]  ? flush_rcu_work+0x30/0x30
>> [   41.870846]  kthread+0x112/0x130
>> [   41.871236]  ? kthread_create_on_node+0x60/0x60
>> [   41.871770]  ret_from_fork+0x35/0x40
>> 
>> We should use memmap as it did.
>> 
>> Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
>> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>> Acked-by: David Hildenbrand <david@redhat.com>
>> Reviewed-by: Baoquan He <bhe@redhat.com>
>> CC: Dan Williams <dan.j.williams@intel.com>
>
>This should have included your signed-off-by, as you were on the patch
>delivery path.  I have made that change to my copy of the patch - is
>that OK?
>
>I also added a cc:stable.  Do we agree this is appropriate?

Agree with this.

>
>I added Dan's "On x86 the impact is limited to x86_32 builds, or x86_64
>configurations that override the default setting for
>SPARSEMEM_VMEMMAP." to the changelog.
Baoquan He Feb. 19, 2020, 11:55 p.m. UTC | #9
On 02/19/20 at 11:50am, Andrew Morton wrote:
> On Wed, 19 Feb 2020 11:04:54 +0800 Baoquan He <bhe@redhat.com> wrote:
> 
> > From: Wei Yang <richardw.yang@linux.intel.com>
> > 
> > When we use SPARSEMEM instead of SPARSEMEM_VMEMMAP, pfn_to_page()
> > doesn't work before sparse_init_one_section() is called. This leads to a
> > crash when hotplug memory:
> > 
> > [   41.839170] BUG: unable to handle page fault for address: 0000000006400000
> > [   41.840663] #PF: supervisor write access in kernel mode
> > [   41.841822] #PF: error_code(0x0002) - not-present page
> > [   41.842970] PGD 0 P4D 0
> > [   41.843538] Oops: 0002 [#1] SMP PTI
> > [   41.844125] CPU: 3 PID: 221 Comm: kworker/u16:1 Tainted: G        W         5.5.0-next-20200205+ #343
> > [   41.845659] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
> > [   41.846977] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
> > [   41.847904] RIP: 0010:__memset+0x24/0x30
> > [   41.848660] Code: cc cc cc cc cc cc 0f 1f 44 00 00 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3
> > [   41.851836] RSP: 0018:ffffb43ac0373c80 EFLAGS: 00010a87
> > [   41.852686] RAX: ffffffffffffffff RBX: ffff8a1518800000 RCX: 0000000000050000
> > [   41.853824] RDX: 0000000000000000 RSI: 00000000000000ff RDI: 0000000006400000
> > [   41.854967] RBP: 0000000000140000 R08: 0000000000100000 R09: 0000000006400000
> > [   41.856107] R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000000000
> > [   41.857255] R13: 0000000000000028 R14: 0000000000000000 R15: ffff8a153ffd9280
> > [   41.858414] FS:  0000000000000000(0000) GS:ffff8a153ab00000(0000) knlGS:0000000000000000
> > [   41.859703] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   41.860627] CR2: 0000000006400000 CR3: 0000000136fca000 CR4: 00000000000006e0
> > [   41.861716] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [   41.862680] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [   41.863628] Call Trace:
> > [   41.863983]  sparse_add_section+0x1c9/0x26a
> > [   41.864570]  __add_pages+0xbf/0x150
> > [   41.865057]  add_pages+0x12/0x60
> > [   41.865489]  add_memory_resource+0xc8/0x210
> > [   41.866017]  ? wake_up_q+0xa0/0xa0
> > [   41.866416]  __add_memory+0x62/0xb0
> > [   41.866825]  acpi_memory_device_add+0x13f/0x300
> > [   41.867410]  acpi_bus_attach+0xf6/0x200
> > [   41.867890]  acpi_bus_scan+0x43/0x90
> > [   41.868448]  acpi_device_hotplug+0x275/0x3d0
> > [   41.868972]  acpi_hotplug_work_fn+0x1a/0x30
> > [   41.869473]  process_one_work+0x1a7/0x370
> > [   41.869953]  worker_thread+0x30/0x380
> > [   41.870396]  ? flush_rcu_work+0x30/0x30
> > [   41.870846]  kthread+0x112/0x130
> > [   41.871236]  ? kthread_create_on_node+0x60/0x60
> > [   41.871770]  ret_from_fork+0x35/0x40
> > 
> > We should use memmap as it did.
> > 
> > Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
> > Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> > Acked-by: David Hildenbrand <david@redhat.com>
> > Reviewed-by: Baoquan He <bhe@redhat.com>
> > CC: Dan Williams <dan.j.williams@intel.com>
> 
> This should have included your signed-off-by, as you were on the patch
> delivery path.  I have made that change to my copy of the patch - is
> that OK?

OK, thanks.

> 
> I also added a cc:stable.  Do we agree this is appropriate?

Agreed, this breakage has existed since sub-section hotplug support was added.

> 
> I added Dan's "On x86 the impact is limited to x86_32 builds, or x86_64
> configurations that override the default setting for
> SPARSEMEM_VMEMMAP." to the changelog.

Looks good, thanks.
Michal Hocko Feb. 20, 2020, 10:41 a.m. UTC | #10
On Wed 19-02-20 11:04:54, Baoquan He wrote:
> From: Wei Yang <richardw.yang@linux.intel.com>
> 
> When we use SPARSEMEM instead of SPARSEMEM_VMEMMAP, pfn_to_page()
> doesn't work before sparse_init_one_section() is called. This leads to a
> crash when hotplug memory:
> 
> [   41.839170] BUG: unable to handle page fault for address: 0000000006400000
> [   41.840663] #PF: supervisor write access in kernel mode
> [   41.841822] #PF: error_code(0x0002) - not-present page
> [   41.842970] PGD 0 P4D 0
> [   41.843538] Oops: 0002 [#1] SMP PTI
> [   41.844125] CPU: 3 PID: 221 Comm: kworker/u16:1 Tainted: G        W         5.5.0-next-20200205+ #343
> [   41.845659] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
> [   41.846977] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
> [   41.847904] RIP: 0010:__memset+0x24/0x30
> [   41.848660] Code: cc cc cc cc cc cc 0f 1f 44 00 00 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3
> [   41.851836] RSP: 0018:ffffb43ac0373c80 EFLAGS: 00010a87
> [   41.852686] RAX: ffffffffffffffff RBX: ffff8a1518800000 RCX: 0000000000050000
> [   41.853824] RDX: 0000000000000000 RSI: 00000000000000ff RDI: 0000000006400000
> [   41.854967] RBP: 0000000000140000 R08: 0000000000100000 R09: 0000000006400000
> [   41.856107] R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000000000
> [   41.857255] R13: 0000000000000028 R14: 0000000000000000 R15: ffff8a153ffd9280
> [   41.858414] FS:  0000000000000000(0000) GS:ffff8a153ab00000(0000) knlGS:0000000000000000
> [   41.859703] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   41.860627] CR2: 0000000006400000 CR3: 0000000136fca000 CR4: 00000000000006e0
> [   41.861716] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   41.862680] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   41.863628] Call Trace:
> [   41.863983]  sparse_add_section+0x1c9/0x26a
> [   41.864570]  __add_pages+0xbf/0x150
> [   41.865057]  add_pages+0x12/0x60
> [   41.865489]  add_memory_resource+0xc8/0x210
> [   41.866017]  ? wake_up_q+0xa0/0xa0
> [   41.866416]  __add_memory+0x62/0xb0
> [   41.866825]  acpi_memory_device_add+0x13f/0x300
> [   41.867410]  acpi_bus_attach+0xf6/0x200
> [   41.867890]  acpi_bus_scan+0x43/0x90
> [   41.868448]  acpi_device_hotplug+0x275/0x3d0
> [   41.868972]  acpi_hotplug_work_fn+0x1a/0x30
> [   41.869473]  process_one_work+0x1a7/0x370
> [   41.869953]  worker_thread+0x30/0x380
> [   41.870396]  ? flush_rcu_work+0x30/0x30
> [   41.870846]  kthread+0x112/0x130
> [   41.871236]  ? kthread_create_on_node+0x60/0x60
> [   41.871770]  ret_from_fork+0x35/0x40
> 
> We should use memmap as it did.
> 
> Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> Acked-by: David Hildenbrand <david@redhat.com>
> Reviewed-by: Baoquan He <bhe@redhat.com>
> CC: Dan Williams <dan.j.williams@intel.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
> v1->v2:
>   Add trace printed by triggered bug to log.
> 
> v1 can be found here:
> http://lkml.kernel.org/r/20200206125343.9070-1-richardw.yang@linux.intel.com
> 
>  mm/sparse.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/sparse.c b/mm/sparse.c
> index c184b69460b7..596b2a45b100 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -876,7 +876,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
>  	 * Poison uninitialized struct pages in order to catch invalid flags
>  	 * combinations.
>  	 */
> -	page_init_poison(pfn_to_page(start_pfn), sizeof(struct page) * nr_pages);
> +	page_init_poison(memmap, sizeof(struct page) * nr_pages);
>  
>  	ms = __nr_to_section(section_nr);
>  	set_section_nid(section_nr, nid);
> -- 
> 2.17.2
>
Patch

diff --git a/mm/sparse.c b/mm/sparse.c
index c184b69460b7..596b2a45b100 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -876,7 +876,7 @@  int __meminit sparse_add_section(int nid, unsigned long start_pfn,
 	 * Poison uninitialized struct pages in order to catch invalid flags
 	 * combinations.
 	 */
-	page_init_poison(pfn_to_page(start_pfn), sizeof(struct page) * nr_pages);
+	page_init_poison(memmap, sizeof(struct page) * nr_pages);
 
 	ms = __nr_to_section(section_nr);
 	set_section_nid(section_nr, nid);
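
The ordering issue addressed by the one-line change above can be modelled
in user space. The following is a minimal sketch, not kernel code: every
toy_* name, the section size and the NULL check are illustrative
assumptions. It contrasts a VMEMMAP-style lookup, which is plain array
indexing, with a classic-SPARSEMEM-style lookup, which depends on a
per-section pointer that is only published later (the role played by
sparse_init_one_section()).

```c
#include <assert.h>
#include <stddef.h>

/* Toy sizes, chosen only to keep the model small. */
#define TOY_PAGES_PER_SECTION 8UL
#define TOY_NR_SECTIONS       4UL
#define TOY_NR_PFNS           (TOY_PAGES_PER_SECTION * TOY_NR_SECTIONS)
#define TOY_POISON            (~0UL)

struct toy_page {
	unsigned long flags;
};

/* VMEMMAP-style model: one array indexed directly by pfn. */
static struct toy_page toy_vmemmap[TOY_NR_PFNS];

static struct toy_page *toy_vmemmap_pfn_to_page(unsigned long pfn)
{
	/* Pure arithmetic; no per-section state is consulted. */
	return &toy_vmemmap[pfn];
}

/* Classic-SPARSEMEM-style model: per-section memmap, NULL until published. */
static struct toy_page *toy_section_memmap[TOY_NR_SECTIONS];

static struct toy_page *toy_sparse_pfn_to_page(unsigned long pfn)
{
	struct toy_page *base = toy_section_memmap[pfn / TOY_PAGES_PER_SECTION];

	/* The toy returns NULL here; the kernel lookup has no such check. */
	if (!base)
		return NULL;
	return base + pfn % TOY_PAGES_PER_SECTION;
}

/* Memmap backing the section that gets "hotplugged" in this model. */
static struct toy_page toy_hotplug_memmap[TOY_PAGES_PER_SECTION];

/*
 * Mirrors the fixed ordering: poison through the memmap pointer we
 * already hold, then publish it for later pfn lookups.
 */
static void toy_add_section(unsigned long start_pfn, struct toy_page *memmap)
{
	unsigned long i;

	for (i = 0; i < TOY_PAGES_PER_SECTION; i++)
		memmap[i].flags = TOY_POISON;

	/* Stand-in for sparse_init_one_section(). */
	toy_section_memmap[start_pfn / TOY_PAGES_PER_SECTION] = memmap;
}
```

In this model, toy_sparse_pfn_to_page(8) returns NULL until
toy_add_section(8, toy_hotplug_memmap) has run, so poisoning through the
lookup before that point would write to an invalid address. Poisoning
through memmap does not depend on the lookup at all, which is what the
patch relies on.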