Message ID | 20200206231629.14151-3-richardw.yang@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Fixes "mm/sparsemem: support sub-section hotplug" | expand |
On Thu, Feb 6, 2020 at 3:17 PM Wei Yang <richardw.yang@linux.intel.com> wrote: > > memmap should be the physical address to page struct instead of virtual > address to pfn. > > Since we call this only for SPARSEMEM_VMEMMAP, pfn_to_page() is valid at > this point. > > Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug") > Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> > CC: Dan Williams <dan.j.williams@intel.com> > --- > mm/sparse.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/sparse.c b/mm/sparse.c > index b5da121bdd6e..56816f653588 100644 > --- a/mm/sparse.c > +++ b/mm/sparse.c > @@ -888,7 +888,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, > /* Align memmap to section boundary in the subsection case */ > if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) && > section_nr_to_pfn(section_nr) != start_pfn) > - memmap = pfn_to_kaddr(section_nr_to_pfn(section_nr)); > + memmap = pfn_to_page(section_nr_to_pfn(section_nr)); Yes, this looks obviously correct. This might be tripping up makedumpfile. Do you see any practical effects of this bug? The kernel mostly avoids ->section_mem_map in the vmemmap case and in the !vmemmap case section_nr_to_pfn(section_nr) should always equal start_pfn.
Hi Dan, On 02/06/20 at 06:19pm, Dan Williams wrote: > On Thu, Feb 6, 2020 at 3:17 PM Wei Yang <richardw.yang@linux.intel.com> wrote: > > diff --git a/mm/sparse.c b/mm/sparse.c > > index b5da121bdd6e..56816f653588 100644 > > --- a/mm/sparse.c > > +++ b/mm/sparse.c > > @@ -888,7 +888,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, > > /* Align memmap to section boundary in the subsection case */ > > if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) && > > section_nr_to_pfn(section_nr) != start_pfn) > > - memmap = pfn_to_kaddr(section_nr_to_pfn(section_nr)); > > + memmap = pfn_to_page(section_nr_to_pfn(section_nr)); > > Yes, this looks obviously correct. This might be tripping up > makedumpfile. Do you see any practical effects of this bug? The kernel > mostly avoids ->section_mem_map in the vmemmap case and in the > !vmemmap case section_nr_to_pfn(section_nr) should always equal > start_pfn. The practical effects is that the memmap for the first unaligned section will be lost when destroy namespace to hot remove it. Because we encode the ->section_mem_map into mem_section, and get memmap from the related mem_section to free it in section_deactivate(). In fact in vmemmap, we don't need to encode the ->section_mem_map with memmap. By the way, sub-section support is only valid in vmemmap case, right? Seems yes from code, but I don't find any document to prove it. Thanks Baoquan
On Thu, Feb 6, 2020 at 7:10 PM Baoquan He <bhe@redhat.com> wrote: > > Hi Dan, > > On 02/06/20 at 06:19pm, Dan Williams wrote: > > On Thu, Feb 6, 2020 at 3:17 PM Wei Yang <richardw.yang@linux.intel.com> wrote: > > > diff --git a/mm/sparse.c b/mm/sparse.c > > > index b5da121bdd6e..56816f653588 100644 > > > --- a/mm/sparse.c > > > +++ b/mm/sparse.c > > > @@ -888,7 +888,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, > > > /* Align memmap to section boundary in the subsection case */ > > > if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) && > > > section_nr_to_pfn(section_nr) != start_pfn) > > > - memmap = pfn_to_kaddr(section_nr_to_pfn(section_nr)); > > > + memmap = pfn_to_page(section_nr_to_pfn(section_nr)); > > > > Yes, this looks obviously correct. This might be tripping up > > makedumpfile. Do you see any practical effects of this bug? The kernel > > mostly avoids ->section_mem_map in the vmemmap case and in the > > !vmemmap case section_nr_to_pfn(section_nr) should always equal > > start_pfn. > > The practical effects is that the memmap for the first unaligned section will be lost > when destroy namespace to hot remove it. Because we encode the ->section_mem_map > into mem_section, and get memmap from the related mem_section to free it in > section_deactivate(). In fact in vmemmap, we don't need to encode the ->section_mem_map > with memmap. Right, but can you actually trigger that in the SPARSEMEM_VMEMMAP=n case? > By the way, sub-section support is only valid in vmemmap case, right? Yes. > Seems yes from code, but I don't find any document to prove it. check_pfn_span() enforces this requirement.
On 02/06/20 at 07:21pm, Dan Williams wrote: > On Thu, Feb 6, 2020 at 7:10 PM Baoquan He <bhe@redhat.com> wrote: > > > > Hi Dan, > > > > On 02/06/20 at 06:19pm, Dan Williams wrote: > > > On Thu, Feb 6, 2020 at 3:17 PM Wei Yang <richardw.yang@linux.intel.com> wrote: > > > > diff --git a/mm/sparse.c b/mm/sparse.c > > > > index b5da121bdd6e..56816f653588 100644 > > > > --- a/mm/sparse.c > > > > +++ b/mm/sparse.c > > > > @@ -888,7 +888,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, > > > > /* Align memmap to section boundary in the subsection case */ > > > > if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) && > > > > section_nr_to_pfn(section_nr) != start_pfn) > > > > - memmap = pfn_to_kaddr(section_nr_to_pfn(section_nr)); > > > > + memmap = pfn_to_page(section_nr_to_pfn(section_nr)); > > > > > > Yes, this looks obviously correct. This might be tripping up > > > makedumpfile. Do you see any practical effects of this bug? The kernel > > > mostly avoids ->section_mem_map in the vmemmap case and in the > > > !vmemmap case section_nr_to_pfn(section_nr) should always equal > > > start_pfn. > > > > The practical effects is that the memmap for the first unaligned section will be lost > > when destroy namespace to hot remove it. Because we encode the ->section_mem_map > > into mem_section, and get memmap from the related mem_section to free it in > > section_deactivate(). In fact in vmemmap, we don't need to encode the ->section_mem_map > > with memmap. > > Right, but can you actually trigger that in the SPARSEMEM_VMEMMAP=n case? I think no, the lost memmap should only happen in vmemmap case. > > > By the way, sub-section support is only valid in vmemmap case, right? > > Yes. > > > Seems yes from code, but I don't find any document to prove it. > > check_pfn_span() enforces this requirement. Thanks for your confirmation. Do you mind if I add some document sentences somewhere make clear this?
On Thu, Feb 6, 2020 at 7:36 PM Baoquan He <bhe@redhat.com> wrote: > > On 02/06/20 at 07:21pm, Dan Williams wrote: > > On Thu, Feb 6, 2020 at 7:10 PM Baoquan He <bhe@redhat.com> wrote: > > > > > > Hi Dan, > > > > > > On 02/06/20 at 06:19pm, Dan Williams wrote: > > > > On Thu, Feb 6, 2020 at 3:17 PM Wei Yang <richardw.yang@linux.intel.com> wrote: > > > > > diff --git a/mm/sparse.c b/mm/sparse.c > > > > > index b5da121bdd6e..56816f653588 100644 > > > > > --- a/mm/sparse.c > > > > > +++ b/mm/sparse.c > > > > > @@ -888,7 +888,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, > > > > > /* Align memmap to section boundary in the subsection case */ > > > > > if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) && > > > > > section_nr_to_pfn(section_nr) != start_pfn) > > > > > - memmap = pfn_to_kaddr(section_nr_to_pfn(section_nr)); > > > > > + memmap = pfn_to_page(section_nr_to_pfn(section_nr)); > > > > > > > > Yes, this looks obviously correct. This might be tripping up > > > > makedumpfile. Do you see any practical effects of this bug? The kernel > > > > mostly avoids ->section_mem_map in the vmemmap case and in the > > > > !vmemmap case section_nr_to_pfn(section_nr) should always equal > > > > start_pfn. > > > > > > The practical effects is that the memmap for the first unaligned section will be lost > > > when destroy namespace to hot remove it. Because we encode the ->section_mem_map > > > into mem_section, and get memmap from the related mem_section to free it in > > > section_deactivate(). In fact in vmemmap, we don't need to encode the ->section_mem_map > > > with memmap. > > > > Right, but can you actually trigger that in the SPARSEMEM_VMEMMAP=n case? > > I think no, the lost memmap should only happen in vmemmap case. > > > > > > By the way, sub-section support is only valid in vmemmap case, right? > > > > Yes. > > > > > Seems yes from code, but I don't find any document to prove it. > > > > check_pfn_span() enforces this requirement. > > Thanks for your confirmation. Do you mind if I add some document > sentences somewhere make clear this? > Sure, I'd be happy to review as well.
On 02/07/20 at 07:16am, Wei Yang wrote: > memmap should be the physical address to page struct instead of virtual > address to pfn. Maybe not, memmap stores a virtual address. > > Since we call this only for SPARSEMEM_VMEMMAP, pfn_to_page() is valid at > this point. > > Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug") > Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> > CC: Dan Williams <dan.j.williams@intel.com> > --- > mm/sparse.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/sparse.c b/mm/sparse.c > index b5da121bdd6e..56816f653588 100644 > --- a/mm/sparse.c > +++ b/mm/sparse.c > @@ -888,7 +888,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, > /* Align memmap to section boundary in the subsection case */ > if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) && > section_nr_to_pfn(section_nr) != start_pfn) > - memmap = pfn_to_kaddr(section_nr_to_pfn(section_nr)); > + memmap = pfn_to_page(section_nr_to_pfn(section_nr)); With Dan's confirmation, sub-section is only valid in vmemmap case. I think the old if (section_nr_to_pfn(section_nr) != start_pfn) is enough to filter out non vmemmap case. So only below code is good: + memmap = pfn_to_page(section_nr_to_pfn(section_nr)); > sparse_init_one_section(ms, section_nr, memmap, ms->usage, 0); > > return 0; > -- > 2.17.1 >
On Thu, Feb 06, 2020 at 06:19:46PM -0800, Dan Williams wrote: >On Thu, Feb 6, 2020 at 3:17 PM Wei Yang <richardw.yang@linux.intel.com> wrote: >> >> memmap should be the physical address to page struct instead of virtual >> address to pfn. >> >> Since we call this only for SPARSEMEM_VMEMMAP, pfn_to_page() is valid at >> this point. >> >> Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug") >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> >> CC: Dan Williams <dan.j.williams@intel.com> >> --- >> mm/sparse.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/mm/sparse.c b/mm/sparse.c >> index b5da121bdd6e..56816f653588 100644 >> --- a/mm/sparse.c >> +++ b/mm/sparse.c >> @@ -888,7 +888,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, >> /* Align memmap to section boundary in the subsection case */ >> if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) && >> section_nr_to_pfn(section_nr) != start_pfn) >> - memmap = pfn_to_kaddr(section_nr_to_pfn(section_nr)); >> + memmap = pfn_to_page(section_nr_to_pfn(section_nr)); > >Yes, this looks obviously correct. This might be tripping up >makedumpfile. Do you see any practical effects of this bug? The kernel >mostly avoids ->section_mem_map in the vmemmap case and in the >!vmemmap case section_nr_to_pfn(section_nr) should always equal >start_pfn. To summarize: * vmemmap, ->section_mem_map is not used mostly * !vmemmap, we are sure range is section aligned Sounds we don't need to handle this?
On Fri, Feb 07, 2020 at 12:11:34PM +0800, Baoquan He wrote: >On 02/07/20 at 07:16am, Wei Yang wrote: >> memmap should be the physical address to page struct instead of virtual >> address to pfn. > >Maybe not, memmap stores a virtual address. > >> >> Since we call this only for SPARSEMEM_VMEMMAP, pfn_to_page() is valid at >> this point. >> >> Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug") >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> >> CC: Dan Williams <dan.j.williams@intel.com> >> --- >> mm/sparse.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/mm/sparse.c b/mm/sparse.c >> index b5da121bdd6e..56816f653588 100644 >> --- a/mm/sparse.c >> +++ b/mm/sparse.c >> @@ -888,7 +888,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, >> /* Align memmap to section boundary in the subsection case */ >> if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) && >> section_nr_to_pfn(section_nr) != start_pfn) >> - memmap = pfn_to_kaddr(section_nr_to_pfn(section_nr)); >> + memmap = pfn_to_page(section_nr_to_pfn(section_nr)); > >With Dan's confirmation, sub-section is only valid in vmemmap case. I >think the old if (section_nr_to_pfn(section_nr) != start_pfn) is enough >to filter out non vmemmap case. So only below code is good: > > + memmap = pfn_to_page(section_nr_to_pfn(section_nr)); > You mean replace pfn_to_kaddr with pfn_to_page ? >> sparse_init_one_section(ms, section_nr, memmap, ms->usage, 0); >> >> return 0; >> -- >> 2.17.1 >>
On Fri, Feb 07, 2020 at 11:36:36AM +0800, Baoquan He wrote: >On 02/06/20 at 07:21pm, Dan Williams wrote: >> On Thu, Feb 6, 2020 at 7:10 PM Baoquan He <bhe@redhat.com> wrote: >> > >> > Hi Dan, >> > >> > On 02/06/20 at 06:19pm, Dan Williams wrote: >> > > On Thu, Feb 6, 2020 at 3:17 PM Wei Yang <richardw.yang@linux.intel.com> wrote: >> > > > diff --git a/mm/sparse.c b/mm/sparse.c >> > > > index b5da121bdd6e..56816f653588 100644 >> > > > --- a/mm/sparse.c >> > > > +++ b/mm/sparse.c >> > > > @@ -888,7 +888,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, >> > > > /* Align memmap to section boundary in the subsection case */ >> > > > if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) && >> > > > section_nr_to_pfn(section_nr) != start_pfn) >> > > > - memmap = pfn_to_kaddr(section_nr_to_pfn(section_nr)); >> > > > + memmap = pfn_to_page(section_nr_to_pfn(section_nr)); >> > > >> > > Yes, this looks obviously correct. This might be tripping up >> > > makedumpfile. Do you see any practical effects of this bug? The kernel >> > > mostly avoids ->section_mem_map in the vmemmap case and in the >> > > !vmemmap case section_nr_to_pfn(section_nr) should always equal >> > > start_pfn. >> > >> > The practical effects is that the memmap for the first unaligned section will be lost >> > when destroy namespace to hot remove it. Because we encode the ->section_mem_map >> > into mem_section, and get memmap from the related mem_section to free it in >> > section_deactivate(). In fact in vmemmap, we don't need to encode the ->section_mem_map >> > with memmap. >> >> Right, but can you actually trigger that in the SPARSEMEM_VMEMMAP=n case? > >I think no, the lost memmap should only happen in vmemmap case. > >> >> > By the way, sub-section support is only valid in vmemmap case, right? >> >> Yes. >> >> > Seems yes from code, but I don't find any document to prove it. >> >> check_pfn_span() enforces this requirement. I saw this function, but those combination of vmemmap and !vmemmap make my brain not work properly. > >Thanks for your confirmation. Do you mind if I add some document >sentences somewhere make clear this? Thanks, hope this would help the future audience.
On Thu, Feb 06, 2020 at 06:19:46PM -0800, Dan Williams wrote: >On Thu, Feb 6, 2020 at 3:17 PM Wei Yang <richardw.yang@linux.intel.com> wrote: >> >> memmap should be the physical address to page struct instead of virtual >> address to pfn. >> >> Since we call this only for SPARSEMEM_VMEMMAP, pfn_to_page() is valid at >> this point. >> >> Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug") >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> >> CC: Dan Williams <dan.j.williams@intel.com> >> --- >> mm/sparse.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/mm/sparse.c b/mm/sparse.c >> index b5da121bdd6e..56816f653588 100644 >> --- a/mm/sparse.c >> +++ b/mm/sparse.c >> @@ -888,7 +888,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, >> /* Align memmap to section boundary in the subsection case */ >> if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) && >> section_nr_to_pfn(section_nr) != start_pfn) >> - memmap = pfn_to_kaddr(section_nr_to_pfn(section_nr)); >> + memmap = pfn_to_page(section_nr_to_pfn(section_nr)); > >Yes, this looks obviously correct. This might be tripping up >makedumpfile. Do you see any practical effects of this bug? The kernel >mostly avoids ->section_mem_map in the vmemmap case and in the >!vmemmap case section_nr_to_pfn(section_nr) should always equal >start_pfn. I took another look into the code. Looks there is no practical effect after this. Because in the vmemmap case, we don't need ->section_mem_map to retrieve the real memmap. But leave a inconsistent data in section_mem_map is not a good practice.
On Thu, Feb 06, 2020 at 07:21:49PM -0800, Dan Williams wrote: >On Thu, Feb 6, 2020 at 7:10 PM Baoquan He <bhe@redhat.com> wrote: >> >> Hi Dan, >> >> On 02/06/20 at 06:19pm, Dan Williams wrote: >> > On Thu, Feb 6, 2020 at 3:17 PM Wei Yang <richardw.yang@linux.intel.com> wrote: >> > > diff --git a/mm/sparse.c b/mm/sparse.c >> > > index b5da121bdd6e..56816f653588 100644 >> > > --- a/mm/sparse.c >> > > +++ b/mm/sparse.c >> > > @@ -888,7 +888,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, >> > > /* Align memmap to section boundary in the subsection case */ >> > > if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) && >> > > section_nr_to_pfn(section_nr) != start_pfn) >> > > - memmap = pfn_to_kaddr(section_nr_to_pfn(section_nr)); >> > > + memmap = pfn_to_page(section_nr_to_pfn(section_nr)); >> > >> > Yes, this looks obviously correct. This might be tripping up >> > makedumpfile. Do you see any practical effects of this bug? The kernel >> > mostly avoids ->section_mem_map in the vmemmap case and in the >> > !vmemmap case section_nr_to_pfn(section_nr) should always equal >> > start_pfn. >> >> The practical effects is that the memmap for the first unaligned section will be lost >> when destroy namespace to hot remove it. Because we encode the ->section_mem_map >> into mem_section, and get memmap from the related mem_section to free it in >> section_deactivate(). In fact in vmemmap, we don't need to encode the ->section_mem_map >> with memmap. > >Right, but can you actually trigger that in the SPARSEMEM_VMEMMAP=n case? > >> By the way, sub-section support is only valid in vmemmap case, right? > >Yes. Just one question from curiosity. Why we don't want sub-section for !vmemmap case? Because it will wast memory for memmap? > >> Seems yes from code, but I don't find any document to prove it. > >check_pfn_span() enforces this requirement.
On Fri, Feb 7, 2020 at 4:15 AM Wei Yang <richard.weiyang@gmail.com> wrote: > > On Thu, Feb 06, 2020 at 07:21:49PM -0800, Dan Williams wrote: > >On Thu, Feb 6, 2020 at 7:10 PM Baoquan He <bhe@redhat.com> wrote: > >> > >> Hi Dan, > >> > >> On 02/06/20 at 06:19pm, Dan Williams wrote: > >> > On Thu, Feb 6, 2020 at 3:17 PM Wei Yang <richardw.yang@linux.intel.com> wrote: > >> > > diff --git a/mm/sparse.c b/mm/sparse.c > >> > > index b5da121bdd6e..56816f653588 100644 > >> > > --- a/mm/sparse.c > >> > > +++ b/mm/sparse.c > >> > > @@ -888,7 +888,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, > >> > > /* Align memmap to section boundary in the subsection case */ > >> > > if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) && > >> > > section_nr_to_pfn(section_nr) != start_pfn) > >> > > - memmap = pfn_to_kaddr(section_nr_to_pfn(section_nr)); > >> > > + memmap = pfn_to_page(section_nr_to_pfn(section_nr)); > >> > > >> > Yes, this looks obviously correct. This might be tripping up > >> > makedumpfile. Do you see any practical effects of this bug? The kernel > >> > mostly avoids ->section_mem_map in the vmemmap case and in the > >> > !vmemmap case section_nr_to_pfn(section_nr) should always equal > >> > start_pfn. > >> > >> The practical effects is that the memmap for the first unaligned section will be lost > >> when destroy namespace to hot remove it. Because we encode the ->section_mem_map > >> into mem_section, and get memmap from the related mem_section to free it in > >> section_deactivate(). In fact in vmemmap, we don't need to encode the ->section_mem_map > >> with memmap. > > > >Right, but can you actually trigger that in the SPARSEMEM_VMEMMAP=n case? > > > >> By the way, sub-section support is only valid in vmemmap case, right? > > > >Yes. > > Just one question from curiosity. Why we don't want sub-section for !vmemmap > case? Because it will wast memory for memmap? The effort and maintenance burden outweighs the benefit.
On 02/07/20 at 11:26am, Wei Yang wrote: > On Thu, Feb 06, 2020 at 06:19:46PM -0800, Dan Williams wrote: > >On Thu, Feb 6, 2020 at 3:17 PM Wei Yang <richardw.yang@linux.intel.com> wrote: > >> > >> memmap should be the physical address to page struct instead of virtual > >> address to pfn. > >> > >> Since we call this only for SPARSEMEM_VMEMMAP, pfn_to_page() is valid at > >> this point. > >> > >> Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug") > >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> > >> CC: Dan Williams <dan.j.williams@intel.com> > >> --- > >> mm/sparse.c | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/mm/sparse.c b/mm/sparse.c > >> index b5da121bdd6e..56816f653588 100644 > >> --- a/mm/sparse.c > >> +++ b/mm/sparse.c > >> @@ -888,7 +888,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, > >> /* Align memmap to section boundary in the subsection case */ > >> if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) && > >> section_nr_to_pfn(section_nr) != start_pfn) > >> - memmap = pfn_to_kaddr(section_nr_to_pfn(section_nr)); > >> + memmap = pfn_to_page(section_nr_to_pfn(section_nr)); > > > >Yes, this looks obviously correct. This might be tripping up > >makedumpfile. Do you see any practical effects of this bug? The kernel > >mostly avoids ->section_mem_map in the vmemmap case and in the > >!vmemmap case section_nr_to_pfn(section_nr) should always equal > >start_pfn. > > I took another look into the code. Looks there is no practical effect after > this. Because in the vmemmap case, we don't need ->section_mem_map to retrieve > the real memmap. > > But leave a inconsistent data in section_mem_map is not a good practice. Yeah, it does has no pratical effect. I tried to create sub-section alighed namespace, then trigger crash, makedumpfile isn't impacted. Because pmem memory is only added, but not onlined. We don't report it to kdump, makedumpfile will ignore it. I think it's worth fixing it to encode a correct memmap address. We don't know if in the future this will break anything. Thanks Baoquan
> Am 09.02.2020 um 14:50 schrieb Baoquan He <bhe@redhat.com>: > > On 02/07/20 at 11:26am, Wei Yang wrote: >>> On Thu, Feb 06, 2020 at 06:19:46PM -0800, Dan Williams wrote: >>> On Thu, Feb 6, 2020 at 3:17 PM Wei Yang <richardw.yang@linux.intel.com> wrote: >>>> >>>> memmap should be the physical address to page struct instead of virtual >>>> address to pfn. >>>> >>>> Since we call this only for SPARSEMEM_VMEMMAP, pfn_to_page() is valid at >>>> this point. >>>> >>>> Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug") >>>> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> >>>> CC: Dan Williams <dan.j.williams@intel.com> >>>> --- >>>> mm/sparse.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/mm/sparse.c b/mm/sparse.c >>>> index b5da121bdd6e..56816f653588 100644 >>>> --- a/mm/sparse.c >>>> +++ b/mm/sparse.c >>>> @@ -888,7 +888,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, >>>> /* Align memmap to section boundary in the subsection case */ >>>> if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) && >>>> section_nr_to_pfn(section_nr) != start_pfn) >>>> - memmap = pfn_to_kaddr(section_nr_to_pfn(section_nr)); >>>> + memmap = pfn_to_page(section_nr_to_pfn(section_nr)); >>> >>> Yes, this looks obviously correct. This might be tripping up >>> makedumpfile. Do you see any practical effects of this bug? The kernel >>> mostly avoids ->section_mem_map in the vmemmap case and in the >>> !vmemmap case section_nr_to_pfn(section_nr) should always equal >>> start_pfn. >> >> I took another look into the code. Looks there is no practical effect after >> this. Because in the vmemmap case, we don't need ->section_mem_map to retrieve >> the real memmap. >> >> But leave a inconsistent data in section_mem_map is not a good practice. > > Yeah, it does has no pratical effect. I tried to create sub-section > alighed namespace, then trigger crash, makedumpfile isn't impacted. > Because pmem memory is only added, but not onlined. We don't report it > to kdump, makedumpfile will ignore it. > > I think it's worth fixing it to encode a correct memmap address. We > don't know if in the future this will break anything. We can have system memory and devmem overlap within a section (which is still buggy and to be fixed in other regard - e.g., pfn_to_online_page() does not work correctly). E.g., 64 mb of (boot) system memory in a section. Then you can hot-add devmem that spans the remaining 64 mb of that section. So some of that memory will be kdumped - and should be fixed if broken. Cheers > > Thanks > Baoquan
On Sun, Feb 09, 2020 at 03:14:28PM +0100, David Hildenbrand wrote: > > >> Am 09.02.2020 um 14:50 schrieb Baoquan He <bhe@redhat.com>: >> >> On 02/07/20 at 11:26am, Wei Yang wrote: >>>> On Thu, Feb 06, 2020 at 06:19:46PM -0800, Dan Williams wrote: >>>> On Thu, Feb 6, 2020 at 3:17 PM Wei Yang <richardw.yang@linux.intel.com> wrote: >>>>> >>>>> memmap should be the physical address to page struct instead of virtual >>>>> address to pfn. >>>>> >>>>> Since we call this only for SPARSEMEM_VMEMMAP, pfn_to_page() is valid at >>>>> this point. >>>>> >>>>> Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug") >>>>> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> >>>>> CC: Dan Williams <dan.j.williams@intel.com> >>>>> --- >>>>> mm/sparse.c | 2 +- >>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>> >>>>> diff --git a/mm/sparse.c b/mm/sparse.c >>>>> index b5da121bdd6e..56816f653588 100644 >>>>> --- a/mm/sparse.c >>>>> +++ b/mm/sparse.c >>>>> @@ -888,7 +888,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, >>>>> /* Align memmap to section boundary in the subsection case */ >>>>> if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) && >>>>> section_nr_to_pfn(section_nr) != start_pfn) >>>>> - memmap = pfn_to_kaddr(section_nr_to_pfn(section_nr)); >>>>> + memmap = pfn_to_page(section_nr_to_pfn(section_nr)); >>>> >>>> Yes, this looks obviously correct. This might be tripping up >>>> makedumpfile. Do you see any practical effects of this bug? The kernel >>>> mostly avoids ->section_mem_map in the vmemmap case and in the >>>> !vmemmap case section_nr_to_pfn(section_nr) should always equal >>>> start_pfn. >>> >>> I took another look into the code. Looks there is no practical effect after >>> this. Because in the vmemmap case, we don't need ->section_mem_map to retrieve >>> the real memmap. >>> >>> But leave a inconsistent data in section_mem_map is not a good practice. >> >> Yeah, it does has no pratical effect. I tried to create sub-section >> alighed namespace, then trigger crash, makedumpfile isn't impacted. >> Because pmem memory is only added, but not onlined. We don't report it >> to kdump, makedumpfile will ignore it. >> >> I think it's worth fixing it to encode a correct memmap address. We >> don't know if in the future this will break anything. > >We can have system memory and devmem overlap within a section (which is still buggy and to be fixed in other regard - e.g., pfn_to_online_page() does not work correctly). > >E.g., 64 mb of (boot) system memory in a section. Then you can hot-add devmem that spans the remaining 64 mb of that section. > >So some of that memory will be kdumped - and should be fixed if broken. > >Cheers Thanks for the explanation, I will add this in the changelog. > > >> >> Thanks >> Baoquan
diff --git a/mm/sparse.c b/mm/sparse.c index b5da121bdd6e..56816f653588 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -888,7 +888,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, /* Align memmap to section boundary in the subsection case */ if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) && section_nr_to_pfn(section_nr) != start_pfn) - memmap = pfn_to_kaddr(section_nr_to_pfn(section_nr)); + memmap = pfn_to_page(section_nr_to_pfn(section_nr)); sparse_init_one_section(ms, section_nr, memmap, ms->usage, 0); return 0;
memmap should be the physical address to page struct instead of virtual address to pfn. Since we call this only for SPARSEMEM_VMEMMAP, pfn_to_page() is valid at this point. Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug") Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> CC: Dan Williams <dan.j.williams@intel.com> --- mm/sparse.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)