Message ID | 20200220043316.19668-1-bhe@redhat.com (mailing list archive) |
---|---|
Series | mm/hotplug: Only use subsection map in VMEMMAP case |

On Thu 20-02-20 12:33:09, Baoquan He wrote:
> Memory sub-section hotplug was added to fix the issue that nvdimm could
> be mapped at non-section aligned starting address. A subsection map is
> added into struct mem_section_usage to implement it. However, sub-section
> is only supported in VMEMMAP case.

Why? Is there any fundamental reason or just a lack of implementation?
VMEMMAP should be really only an implementation detail unless I am
missing something subtle.

> Hence there's no need to operate
> subsection map in SPARSEMEM|!VMEMMAP case. In this patchset, change
> codes to make sub-section map and the relevant operation only available
> in VMEMMAP case.
>
> And since sub-section hotplug added, the hot add/remove functionality
> have been broken in SPARSEMEM|!VMEMMAP case. Wei Yang and I, each of us
> make one patch to fix one of the failures. In this patchset, the patch
> 1/7 from me is used to fix the hot remove failure. Wei Yang's patch has
> been merged by Andrew.

Not sure I understand. Are there more issues to be fixed?

> include/linux/mmzone.h | 2 +
> mm/sparse.c | 178 +++++++++++++++++++++++++++++------------
> 2 files changed, 127 insertions(+), 53 deletions(-)

Why do we need to add so much code to remove a functionality from one
memory model?

On 02/20/20 at 11:38am, Michal Hocko wrote:
> On Thu 20-02-20 12:33:09, Baoquan He wrote:
> > Memory sub-section hotplug was added to fix the issue that nvdimm could
> > be mapped at non-section aligned starting address. A subsection map is
> > added into struct mem_section_usage to implement it. However, sub-section
> > is only supported in VMEMMAP case.
>
> Why? Is there any fundamental reason or just a lack of implementation?
> VMEMMAP should be really only an implementation detail unless I am
> missing something subtle.

Thanks for checking.

VMEMMAP is one of two ways to convert a PFN to the corresponding
'struct page' in SPARSE model. I mentioned them as VMEMMAP case, or
!VMEMMAP case because we called them like this previously when reviewed
patches, hope it won't cause confusion.

Currently, config ZONE_DEVICE depends on SPARSEMEM_VMEMMAP. The
subsection_map is added to struct mem_section_usage to track which sub
section is present, VMEMMAP fills those bits which corresponding
sub-sections are present, while !VMEMMAP, namely classic SPARSE, fills
the whole map always.

As we know, VMEMMAP builds page table to map a cluster of 'struct page'
into the corresponding area of 'vmemmap'. Subsection hotplug can be
supported naturally, w/o any change, just map needed region related to
sub-sections on demand. For !VMEMMAP, it allocates memmap with
alloc_pages() or vmalloc, thing is a little complicated, e.g the mixed
section, boot memory occupies the starting area, later pmem hot added to
the rear part.

About !VMEMMAP which doesn't support sub-section hotplug, Dan said
it's more because the effort and maintenance burden outweighs the
benefit. And the current 64 bit ARCHes all enable
SPARSEMEM_VMEMMAP_ENABLE by default.

So no need to keep subsection_map and its handling in SPARSE|!VMEMMAP.

> > Hence there's no need to operate
> > subsection map in SPARSEMEM|!VMEMMAP case. In this patchset, change
> > codes to make sub-section map and the relevant operation only available
> > in VMEMMAP case.
> >
> > And since sub-section hotplug added, the hot add/remove functionality
> > have been broken in SPARSEMEM|!VMEMMAP case. Wei Yang and I, each of us
> > make one patch to fix one of the failures. In this patchset, the patch
> > 1/7 from me is used to fix the hot remove failure. Wei Yang's patch has
> > been merged by Andrew.
>
> Not sure I understand. Are there more issues to be fixed?

Only these two. Wei Yang firstly posted the patch to fix the hot add
failure in SPARSE|!VMEMMAP. When I reviewed his patch and tested, found
hot remove failed too. So the patch 1/7 is to fix the hot remove failure
in !VMEMMAP. With these two patches, hot add/remove works well in
!VMEMMAP. Not sure if it's clear.

> > include/linux/mmzone.h | 2 +
> > mm/sparse.c | 178 +++++++++++++++++++++++++++++------------
> > 2 files changed, 127 insertions(+), 53 deletions(-)
>
> Why do we need to add so much code to remove a functionality from one
> memory model?

Hmm, Dan also asked this before.

The adding mainly happens in patch 2, 3, 4, including the two newly
added function definitions, the code comments above them, and those added
dummy functions for !VMEMMAP.

Thanks
Baoquan
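
The subsection_map behaviour described in the reply above (VMEMMAP sets only the bits of the sub-sections that are present, classic SPARSE always fills the whole map) can be illustrated with a small userspace sketch. This is not the kernel code: `struct usage_sketch`, `subsection_mask()` and `fill_map_sketch()` are hypothetical names, and the geometry (4 KiB pages, 128 MiB sections, 2 MiB sub-sections, so 64 sub-sections per section) is an assumption modelled on x86_64.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry: 4 KiB pages, 128 MiB sections, 2 MiB sub-sections. */
#define PAGES_PER_SECTION       32768UL
#define PAGES_PER_SUBSECTION    512UL
#define SUBSECTIONS_PER_SECTION (PAGES_PER_SECTION / PAGES_PER_SUBSECTION)

/* Hypothetical stand-in for the map kept in struct mem_section_usage. */
struct usage_sketch {
    uint64_t subsection_map; /* one bit per sub-section of the section */
};

/*
 * Mask of the sub-sections touched by [pfn, pfn + nr_pages).
 * Assumes nr_pages > 0 and that the range stays inside one section.
 */
static uint64_t subsection_mask(unsigned long pfn, unsigned long nr_pages)
{
    unsigned long off = pfn % PAGES_PER_SECTION;
    unsigned long first = off / PAGES_PER_SUBSECTION;
    unsigned long last = (off + nr_pages - 1) / PAGES_PER_SUBSECTION;
    uint64_t mask = 0;

    for (unsigned long i = first; i <= last && i < SUBSECTIONS_PER_SECTION; i++)
        mask |= (uint64_t)1 << i;
    return mask;
}

/*
 * VMEMMAP case: mark only the sub-sections that are being added.
 * Classic SPARSE (!VMEMMAP) case: the whole map is always filled.
 */
static void fill_map_sketch(struct usage_sketch *u, unsigned long pfn,
                            unsigned long nr_pages, bool vmemmap)
{
    if (vmemmap)
        u->subsection_map |= subsection_mask(pfn, nr_pages);
    else
        u->subsection_map = UINT64_MAX;
}
```

Under these assumptions, adding 1024 pages at offset 1024 inside a section sets two bits (sub-sections 2 and 3) in the VMEMMAP case, while the !VMEMMAP case ends up with all 64 bits set regardless of the range.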

>>> include/linux/mmzone.h | 2 +
>>> mm/sparse.c | 178 +++++++++++++++++++++++++++++------------
>>> 2 files changed, 127 insertions(+), 53 deletions(-)
>>
>> Why do we need to add so much code to remove a functionality from one
>> memory model?
>
> Hmm, Dan also asked this before.
>
> The adding mainly happens in patch 2, 3, 4, including the two newly
> added function definitions, the code comments above them, and those added
> dummy functions for !VMEMMAP.

AFAIKS, it's mostly a bunch of newly added comments on top of functions.
E.g., the comment for fill_subsection_map() alone spans 12 LOC in total.
I do wonder if we have to be that verbose. We are barely that verbose on
MM code (and usually I don't see much benefit unless it's a function
with many users from many different places).

On Tue 25-02-20 10:10:45, David Hildenbrand wrote:
> >>> include/linux/mmzone.h | 2 +
> >>> mm/sparse.c | 178 +++++++++++++++++++++++++++++------------
> >>> 2 files changed, 127 insertions(+), 53 deletions(-)
> >>
> >> Why do we need to add so much code to remove a functionality from one
> >> memory model?
> >
> > Hmm, Dan also asked this before.
> >
> > The adding mainly happens in patch 2, 3, 4, including the two newly
> > added function definitions, the code comments above them, and those added
> > dummy functions for !VMEMMAP.
>
> AFAIKS, it's mostly a bunch of newly added comments on top of functions.
> E.g., the comment for fill_subsection_map() alone spans 12 LOC in total.
> I do wonder if we have to be that verbose. We are barely that verbose on
> MM code (and usually I don't see much benefit unless it's a function
> with many users from many different places).

I would tend to agree here. Not that I am against kernel doc
documentation but these are internal functions and the comment doesn't
really give any better insight IMHO. I would be much more inclined if
this was the general pattern in the respective file but it just stands
out.

On Fri 21-02-20 22:28:47, Baoquan He wrote:
> On 02/20/20 at 11:38am, Michal Hocko wrote:
> > On Thu 20-02-20 12:33:09, Baoquan He wrote:
> > > Memory sub-section hotplug was added to fix the issue that nvdimm could
> > > be mapped at non-section aligned starting address. A subsection map is
> > > added into struct mem_section_usage to implement it. However, sub-section
> > > is only supported in VMEMMAP case.
> >
> > Why? Is there any fundamental reason or just a lack of implementation?
> > VMEMMAP should be really only an implementation detail unless I am
> > missing something subtle.
>
> Thanks for checking.
>
> VMEMMAP is one of two ways to convert a PFN to the corresponding
> 'struct page' in SPARSE model. I mentioned them as VMEMMAP case, or
> !VMEMMAP case because we called them like this previously when reviewed
> patches, hope it won't cause confusion.
>
> Currently, config ZONE_DEVICE depends on SPARSEMEM_VMEMMAP. The
> subsection_map is added to struct mem_section_usage to track which sub
> section is present, VMEMMAP fills those bits which corresponding
> sub-sections are present, while !VMEMMAP, namely classic SPARSE, fills
> the whole map always.
>
> As we know, VMEMMAP builds page table to map a cluster of 'struct page'
> into the corresponding area of 'vmemmap'. Subsection hotplug can be
> supported naturally, w/o any change, just map needed region related to
> sub-sections on demand. For !VMEMMAP, it allocates memmap with
> alloc_pages() or vmalloc, thing is a little complicated, e.g the mixed
> section, boot memory occupies the starting area, later pmem hot added to
> the rear part.
>
> About !VMEMMAP which doesn't support sub-section hotplug, Dan said
> it's more because the effort and maintenance burden outweighs the
> benefit. And the current 64 bit ARCHes all enable
> SPARSEMEM_VMEMMAP_ENABLE by default.

OK, if this is the primary argument then make sure to document it in the
changelog (cover letter).

On 02/25/20 at 11:02am, Michal Hocko wrote:
> On Tue 25-02-20 10:10:45, David Hildenbrand wrote:
> > >>> include/linux/mmzone.h | 2 +
> > >>> mm/sparse.c | 178 +++++++++++++++++++++++++++++------------
> > >>> 2 files changed, 127 insertions(+), 53 deletions(-)
> > >>
> > >> Why do we need to add so much code to remove a functionality from one
> > >> memory model?
> > >
> > > Hmm, Dan also asked this before.
> > >
> > > The adding mainly happens in patch 2, 3, 4, including the two newly
> > > added function definitions, the code comments above them, and those added
> > > dummy functions for !VMEMMAP.
> >
> > AFAIKS, it's mostly a bunch of newly added comments on top of functions.
> > E.g., the comment for fill_subsection_map() alone spans 12 LOC in total.
> > I do wonder if we have to be that verbose. We are barely that verbose on
> > MM code (and usually I don't see much benefit unless it's a function
> > with many users from many different places).
>
> I would tend to agree here. Not that I am against kernel doc
> documentation but these are internal functions and the comment doesn't
> really give any better insight IMHO. I would be much more inclined if
> this was the general pattern in the respective file but it just stands
> out.

I saw there are internal functions which have code comments, e.g
shrink_slab() in mm/vmscan.c, not only this one place, there are several
places. I personally prefer to see code comment for function if
possible, this can save time, e.g people can skip the bitmap operation
when read code if not necessary. And here I mainly want to tell there
are different returned value to note different behaviour when call them.

Anyway, it's fine to me to remove them. The two functions are internal,
and not so complicated. I will remove them since you both object.
However, I disagree with the saying that we should not add code comment
for internal function.

Thanks
Baoquan

On 02/25/20 at 11:03am, Michal Hocko wrote:
> On Fri 21-02-20 22:28:47, Baoquan He wrote:
> > On 02/20/20 at 11:38am, Michal Hocko wrote:
> > > On Thu 20-02-20 12:33:09, Baoquan He wrote:
> > > > Memory sub-section hotplug was added to fix the issue that nvdimm could
> > > > be mapped at non-section aligned starting address. A subsection map is
> > > > added into struct mem_section_usage to implement it. However, sub-section
> > > > is only supported in VMEMMAP case.
> > >
> > > Why? Is there any fundamental reason or just a lack of implementation?
> > > VMEMMAP should be really only an implementation detail unless I am
> > > missing something subtle.
> >
> > Thanks for checking.
> >
> > VMEMMAP is one of two ways to convert a PFN to the corresponding
> > 'struct page' in SPARSE model. I mentioned them as VMEMMAP case, or
> > !VMEMMAP case because we called them like this previously when reviewed
> > patches, hope it won't cause confusion.
> >
> > Currently, config ZONE_DEVICE depends on SPARSEMEM_VMEMMAP. The
> > subsection_map is added to struct mem_section_usage to track which sub
> > section is present, VMEMMAP fills those bits which corresponding
> > sub-sections are present, while !VMEMMAP, namely classic SPARSE, fills
> > the whole map always.
> >
> > As we know, VMEMMAP builds page table to map a cluster of 'struct page'
> > into the corresponding area of 'vmemmap'. Subsection hotplug can be
> > supported naturally, w/o any change, just map needed region related to
> > sub-sections on demand. For !VMEMMAP, it allocates memmap with
> > alloc_pages() or vmalloc, thing is a little complicated, e.g the mixed
> > section, boot memory occupies the starting area, later pmem hot added to
> > the rear part.
> >
> > About !VMEMMAP which doesn't support sub-section hotplug, Dan said
> > it's more because the effort and maintenance burden outweighs the
> > benefit. And the current 64 bit ARCHes all enable
> > SPARSEMEM_VMEMMAP_ENABLE by default.
>
> OK, if this is the primary argument then make sure to document it in the
> changelog (cover letter).

Will add it when repost.

On Wed 26-02-20 11:42:36, Baoquan He wrote:
> On 02/25/20 at 11:02am, Michal Hocko wrote:
> > On Tue 25-02-20 10:10:45, David Hildenbrand wrote:
> > > >>> include/linux/mmzone.h | 2 +
> > > >>> mm/sparse.c | 178 +++++++++++++++++++++++++++++------------
> > > >>> 2 files changed, 127 insertions(+), 53 deletions(-)
> > > >>
> > > >> Why do we need to add so much code to remove a functionality from one
> > > >> memory model?
> > > >
> > > > Hmm, Dan also asked this before.
> > > >
> > > > The adding mainly happens in patch 2, 3, 4, including the two newly
> > > > added function definitions, the code comments above them, and those added
> > > > dummy functions for !VMEMMAP.
> > >
> > > AFAIKS, it's mostly a bunch of newly added comments on top of functions.
> > > E.g., the comment for fill_subsection_map() alone spans 12 LOC in total.
> > > I do wonder if we have to be that verbose. We are barely that verbose on
> > > MM code (and usually I don't see much benefit unless it's a function
> > > with many users from many different places).
> >
> > I would tend to agree here. Not that I am against kernel doc
> > documentation but these are internal functions and the comment doesn't
> > really give any better insight IMHO. I would be much more inclined if
> > this was the general pattern in the respective file but it just stands
> > out.
>
> I saw there are internal functions which have code comments, e.g
> shrink_slab() in mm/vmscan.c, not only this one place, there are several
> places. I personally prefer to see code comment for function if
> possible, this can save time, e.g people can skip the bitmap operation
> when read code if not necessary. And here I mainly want to tell there
> are different returned value to note different behaviour when call them.

... yet nobody really cares about different return code. It is an error
that is propagated up the call chain and that's all.

I also like when there is a kernel doc comment that helps to understand
the intended usage, context the function can be called from, potential
side effects, locking requirements and other details people need to know
when calling functions. But have a look at
/**
 * clear_subsection_map - Clear subsection map of one memory region
 *
 * @pfn - start pfn of the memory range
 * @nr_pages - number of pfns to add in the region
 *
 * This is only intended for hotplug, and clear the related subsection
 * map inside one section.
 *
 * Return:
 * * -EINVAL - Section already deactived.
 * * 0 - Subsection map is emptied.
 * * 1 - Subsection map is not empty.
 */

the only useful information in here is that this is a hotplug stuff but
I would be completely lost about the intention without already knowing
what is this whole subsection about.
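
For readers without the patch at hand, the three return values listed in the quoted comment can be modelled roughly as below. This is only a userspace sketch under assumptions: `clear_map_sketch()` is a hypothetical name, the map is reduced to a single 64-bit word, and treating "some requested bits are not set" as the "already deactivated" case is an interpretation of the comment, not a copy of the patch.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Clear the bits in `mask` from `*map`.
 *
 * Returns (mirroring the values named in the quoted comment):
 *   -EINVAL  some of the requested bits are not set (already cleared)
 *    0       the map is empty after clearing
 *    1       other sub-sections are still marked present
 */
static int clear_map_sketch(uint64_t *map, uint64_t mask)
{
    if ((*map & mask) != mask)
        return -EINVAL;

    *map &= ~mask;
    return *map != 0;
}
```

In this model a caller can tell "error" apart from "map now empty" and "map still in use" with a single call, which is the distinction the comment was trying to document.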

On 02/26/20 at 10:14am, Michal Hocko wrote:
> On Wed 26-02-20 11:42:36, Baoquan He wrote:
> > On 02/25/20 at 11:02am, Michal Hocko wrote:
> > > On Tue 25-02-20 10:10:45, David Hildenbrand wrote:
> > > > >>> include/linux/mmzone.h | 2 +
> > > > >>> mm/sparse.c | 178 +++++++++++++++++++++++++++++------------
> > > > >>> 2 files changed, 127 insertions(+), 53 deletions(-)
> > > > >>
> > > > >> Why do we need to add so much code to remove a functionality from one
> > > > >> memory model?
> > > > >
> > > > > Hmm, Dan also asked this before.
> > > > >
> > > > > The adding mainly happens in patch 2, 3, 4, including the two newly
> > > > > added function definitions, the code comments above them, and those added
> > > > > dummy functions for !VMEMMAP.
> > > >
> > > > AFAIKS, it's mostly a bunch of newly added comments on top of functions.
> > > > E.g., the comment for fill_subsection_map() alone spans 12 LOC in total.
> > > > I do wonder if we have to be that verbose. We are barely that verbose on
> > > > MM code (and usually I don't see much benefit unless it's a function
> > > > with many users from many different places).
> > >
> > > I would tend to agree here. Not that I am against kernel doc
> > > documentation but these are internal functions and the comment doesn't
> > > really give any better insight IMHO. I would be much more inclined if
> > > this was the general pattern in the respective file but it just stands
> > > out.
> >
> > I saw there are internal functions which have code comments, e.g
> > shrink_slab() in mm/vmscan.c, not only this one place, there are several
> > places. I personally prefer to see code comment for function if
> > possible, this can save time, e.g people can skip the bitmap operation
> > when read code if not necessary. And here I mainly want to tell there
> > are different returned value to note different behaviour when call them.
>
> ... yet nobody really cares about different return code. It is an error
> that is propagated up the call chain and that's all.
>
> I also like when there is a kernel doc comment that helps to understand
> the intended usage, context the function can be called from, potential
> side effects, locking requirements and other details people need to know

Fair enough. As I have said, I didn't intend to stick to add kernel doc
comments for these two functions. Will remove them. Thanks for
reviewing.

> when calling functions. But have a look at
> /**
>  * clear_subsection_map - Clear subsection map of one memory region
>  *
>  * @pfn - start pfn of the memory range
>  * @nr_pages - number of pfns to add in the region
>  *
>  * This is only intended for hotplug, and clear the related subsection
>  * map inside one section.
>  *
>  * Return:
>  * * -EINVAL - Section already deactived.
>  * * 0 - Subsection map is emptied.
>  * * 1 - Subsection map is not empty.
>  */
>
> the only useful information in here is that this is a hotplug stuff but
> I would be completely lost about the intention without already knowing
> what is this whole subsection about.
>
> --
> Michal Hocko
> SUSE Labs
>