Message ID | 20190926122552.17905-1-aneesh.kumar@linux.ibm.com (mailing list archive)
---|---
State | Superseded
Series | [1/2] mm/memunmap: Use the correct start and end pfn when removing pages from zone
On 26.09.19 14:25, Aneesh Kumar K.V wrote:
> With altmap, all the resource pfns are not initialized. While initializing
> pfn, altmap reserve space is skipped. Hence when removing pfn from zone skip
> pfns that were never initialized.
>
> Update memunmap_pages to calculate start and end pfn based on altmap
> values. This fixes a kernel crash that is observed when destroying namespace.
>
> [   74.745056] BUG: Unable to handle kernel data access at 0xc00c000001400000
> [   74.745256] Faulting instruction address: 0xc0000000000b58b0
> cpu 0x2: Vector: 300 (Data Access) at [c00000026ea93580]
>     pc: c0000000000b58b0: memset+0x68/0x104
>     lr: c0000000003eb008: page_init_poison+0x38/0x50
> ...
> current = 0xc000000271c67d80
> paca    = 0xc00000003fffd680   irqmask: 0x03   irq_happened: 0x01
> pid   = 3665, comm = ndctl
> [link register   ] c0000000003eb008 page_init_poison+0x38/0x50
> [c00000026ea93830] c0000000004754d4 remove_pfn_range_from_zone+0x64/0x3e0
> [c00000026ea938a0] c0000000004b8a60 memunmap_pages+0x300/0x400
> [c00000026ea93930] c0000000009e32a0 devm_action_release+0x30/0x50
> ...
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  mm/memremap.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/mm/memremap.c b/mm/memremap.c
> index 390bb3544589..76b98110031e 100644
> --- a/mm/memremap.c
> +++ b/mm/memremap.c
> @@ -113,7 +113,8 @@ static void dev_pagemap_cleanup(struct dev_pagemap *pgmap)
>  void memunmap_pages(struct dev_pagemap *pgmap)
>  {
>  	struct resource *res = &pgmap->res;
> -	unsigned long pfn = PHYS_PFN(res->start);
> +	unsigned long start_pfn, end_pfn;
> +	unsigned long pfn, nr_pages;
>  	int nid;
>
>  	dev_pagemap_kill(pgmap);
> @@ -121,14 +122,18 @@ void memunmap_pages(struct dev_pagemap *pgmap)
>  		put_page(pfn_to_page(pfn));
>  	dev_pagemap_cleanup(pgmap);
>
> +	start_pfn = pfn_first(pgmap);
> +	end_pfn = pfn_end(pgmap);
> +	nr_pages = end_pfn - start_pfn;
> +
>  	/* pages are dead and unused, undo the arch mapping */
> -	nid = page_to_nid(pfn_to_page(pfn));
> +	nid = page_to_nid(pfn_to_page(start_pfn));
>
>  	mem_hotplug_begin();
> -	remove_pfn_range_from_zone(page_zone(pfn_to_page(pfn)), pfn,
> -				   PHYS_PFN(resource_size(res)));
> +	remove_pfn_range_from_zone(page_zone(pfn_to_page(start_pfn)),
> +				   start_pfn, nr_pages);
>  	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
> -		__remove_pages(pfn, PHYS_PFN(resource_size(res)), NULL);
> +		__remove_pages(start_pfn, nr_pages, NULL);
>  	} else {
>  		arch_remove_memory(nid, res->start, resource_size(res),
>  				   pgmap_altmap(pgmap));

Just to make sure, my patches did not break that, right (IOW, broken
upstream)?
On 9/26/19 6:13 PM, David Hildenbrand wrote:
> On 26.09.19 14:25, Aneesh Kumar K.V wrote:
>> With altmap, all the resource pfns are not initialized. While initializing
>> pfn, altmap reserve space is skipped. Hence when removing pfn from zone skip
>> pfns that were never initialized.
>> [...]
>
> Just to make sure, my patches did not break that, right (IOW, broken
> upstream)?

That is correct. Your patches helped to remove other usages of wrong pfns.
The last few left got fixed in this patch.

-aneesh
> With altmap, all the resource pfns are not initialized. While initializing
> pfn, altmap reserve space is skipped. Hence when removing pfn from zone skip
> pfns that were never initialized.
>
> Update memunmap_pages to calculate start and end pfn based on altmap
> values. This fixes a kernel crash that is observed when destroying namespace.
> [...]
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

Reviewed-by: Pankaj Gupta <pagupta@redhat.com>
On Thu, 26 Sep 2019 17:55:51 +0530 "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> wrote:

> With altmap, all the resource pfns are not initialized. While initializing
> pfn, altmap reserve space is skipped. Hence when removing pfn from zone skip
> pfns that were never initialized.
> [...]

Doesn't apply to mainline or -next.  Which tree is this against?
On 9/27/19 4:15 AM, Andrew Morton wrote:
> On Thu, 26 Sep 2019 17:55:51 +0530 "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> wrote:
>
>> With altmap, all the resource pfns are not initialized. [...]
>
> Doesn't apply to mainline or -next. Which tree is this against?

After applying the patches from David on mainline. That is the reason I
replied to this thread. I should have mentioned in the email that it is
based on patch series "[PATCH v4 0/8] mm/memory_hotplug: Shrink zones
before removing memory"

-aneesh
On 27.09.19 03:51, Aneesh Kumar K.V wrote:
> On 9/27/19 4:15 AM, Andrew Morton wrote:
>> [...]
>> Doesn't apply to mainline or -next. Which tree is this against?
>
> After applying the patches from David on mainline. That is the reason I
> replied to this thread. I should have mentioned in the email that it is
> based on patch series "[PATCH v4 0/8] mm/memory_hotplug: Shrink zones
> before removing memory"

So if I am not wrong, my patch "[PATCH v4 4/8] mm/memory_hotplug: Poison
memmap in remove_pfn_range_from_zone()" makes it show up that we
actually call __remove_pages() with wrong parameters, right?

If so, I guess it would be better for you to fix it before my series and
I will rebase my series on top of that.
On 9/27/19 1:16 PM, David Hildenbrand wrote:
> On 27.09.19 03:51, Aneesh Kumar K.V wrote:
>> [...]
>
> So if I am not wrong, my patch "[PATCH v4 4/8] mm/memory_hotplug: Poison
> memmap in remove_pfn_range_from_zone()" makes it show up that we
> actually call __remove_pages() with wrong parameters, right?
>
> If so, I guess it would be better for you to fix it before my series and
> I will rebase my series on top of that.

I posted a patch that can be applied to mainline. I sent that as a reply
to this email. Can you include that and PATCH 2 as first two patches in
your series? That should help to locate the full patch series needed
for fixing the kernel crash.

-aneesh
On 27.09.19 12:36, Aneesh Kumar K.V wrote:
> [...]
>
> I posted a patch that can be applied to mainline. I sent that as a reply
> to this email. Can you include that and PATCH 2 as first two patches in
> your series? That should help to locate the full patch series needed
> for fixing the kernel crash.

I can drag these along, unless Andrew wants to pick them up right away
(or we're waiting for more feedback).

Is there a Fixes: tag we can add to the first patch?
On 9/27/19 4:10 PM, David Hildenbrand wrote:
> [...]
>
> I can drag these along, unless Andrew wants to pick them up right away
> (or we're waiting for more feedback).

Considering this patch alone won't fix the issue, it would be nice if we
could club them with the rest of the changes.

> Is there a Fixes: tag we can add to the first patch?

IIUC this was always broken.

-aneesh
On 27.09.19 13:35, Aneesh Kumar K.V wrote:
> [...]
>
> Considering this patch alone won't fix the issue, it would be nice if we
> could club them with the rest of the changes.

I'll drag them along, adding Pankaj's RBs. If they get picked up
independently, fine :)
diff --git a/mm/memremap.c b/mm/memremap.c
index 390bb3544589..76b98110031e 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -113,7 +113,8 @@ static void dev_pagemap_cleanup(struct dev_pagemap *pgmap)
 void memunmap_pages(struct dev_pagemap *pgmap)
 {
 	struct resource *res = &pgmap->res;
-	unsigned long pfn = PHYS_PFN(res->start);
+	unsigned long start_pfn, end_pfn;
+	unsigned long pfn, nr_pages;
 	int nid;

 	dev_pagemap_kill(pgmap);
@@ -121,14 +122,18 @@ void memunmap_pages(struct dev_pagemap *pgmap)
 		put_page(pfn_to_page(pfn));
 	dev_pagemap_cleanup(pgmap);

+	start_pfn = pfn_first(pgmap);
+	end_pfn = pfn_end(pgmap);
+	nr_pages = end_pfn - start_pfn;
+
 	/* pages are dead and unused, undo the arch mapping */
-	nid = page_to_nid(pfn_to_page(pfn));
+	nid = page_to_nid(pfn_to_page(start_pfn));

 	mem_hotplug_begin();
-	remove_pfn_range_from_zone(page_zone(pfn_to_page(pfn)), pfn,
-				   PHYS_PFN(resource_size(res)));
+	remove_pfn_range_from_zone(page_zone(pfn_to_page(start_pfn)),
+				   start_pfn, nr_pages);
 	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
-		__remove_pages(pfn, PHYS_PFN(resource_size(res)), NULL);
+		__remove_pages(start_pfn, nr_pages, NULL);
 	} else {
 		arch_remove_memory(nid, res->start, resource_size(res),
 				   pgmap_altmap(pgmap));
With altmap, all the resource pfns are not initialized. While initializing
pfn, altmap reserve space is skipped. Hence when removing pfn from zone skip
pfns that were never initialized.

Update memunmap_pages to calculate start and end pfn based on altmap
values. This fixes a kernel crash that is observed when destroying namespace.

[   74.745056] BUG: Unable to handle kernel data access at 0xc00c000001400000
[   74.745256] Faulting instruction address: 0xc0000000000b58b0
cpu 0x2: Vector: 300 (Data Access) at [c00000026ea93580]
    pc: c0000000000b58b0: memset+0x68/0x104
    lr: c0000000003eb008: page_init_poison+0x38/0x50
...
current = 0xc000000271c67d80
paca    = 0xc00000003fffd680   irqmask: 0x03   irq_happened: 0x01
pid   = 3665, comm = ndctl
[link register   ] c0000000003eb008 page_init_poison+0x38/0x50
[c00000026ea93830] c0000000004754d4 remove_pfn_range_from_zone+0x64/0x3e0
[c00000026ea938a0] c0000000004b8a60 memunmap_pages+0x300/0x400
[c00000026ea93930] c0000000009e32a0 devm_action_release+0x30/0x50
...

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 mm/memremap.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)