Message ID: 20180627013116.12411-5-bhe@redhat.com (mailing list archive)
State: New, archived
> Signed-off-by: Baoquan He <bhe@redhat.com>
>
> Signed-off-by: Baoquan He <bhe@redhat.com>

Please remove the duplicated Signed-off-by.

>  	if (!usemap) {
>  		ms->section_mem_map = 0;
> +		nr_consumed_maps++;

Currently, we do not set ms->section_mem_map to 0 when we fail to allocate usemap; we set section_mem_map to 0 only when we fail to allocate the memmap. I think this is an existing bug.

Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
On 06/27/18 at 11:19pm, Pavel Tatashin wrote:
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> >
> > Signed-off-by: Baoquan He <bhe@redhat.com>
>
> Please remove duplicated signed-off

Done.

> >  	if (!usemap) {
> >  		ms->section_mem_map = 0;
> > +		nr_consumed_maps++;
>
> Currently, we do not set ms->section_mem_map to 0 when fail to
> allocate usemap, only when fail to allocate mmap we set
> section_mem_map to 0. I think this is an existing bug.

Yes, I found it while changing the code. Later in sparse_init(), I added a check for whether usemap is available; if not, we also do "ms->section_mem_map = 0;" to clear its ->section_mem_map.

To be thorough here, we would also need to free the relevant memmap, because while the usemaps are allocated together, the memmap could have been allocated one section at a time. I didn't do that because the usemap allocation is the smaller one; if even that allocation fails at this early stage of system initialization, the kernel won't live long, so it's not worth complicating the code for it.

> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
> +	/* The numner of present sections stored in nr_present_sections
              ^ "number", please.

This comment needs CodingStyle love.

> +	 * are kept the same since mem sections are marked as present in
              ^ s/are/is/

This sentence is really odd to me.  What is the point?  Just that we are not making sections present?  Could we just say that instead of referring to functions and variable names?

> +	 * memory_present(). In this for loop, we need check which sections
> +	 * failed to allocate memmap or usemap, then clear its
> +	 * ->section_mem_map accordingly.

Rather than referring to the for loop, how about we actually comment the code that is doing this operation?

> During this process, we need

This is missing a "to": "we need _to_ increase".

> +	 * increase 'nr_consumed_maps' whether its allocation of memmap
> +	 * or usemap failed or not, so that after we handle the i-th
> +	 * memory section, can get memmap and usemap of (i+1)-th section
> +	 * correctly. */

This makes no sense to me.  Why are we incrementing 'nr_consumed_maps' when we do not consume one?

You say that we increment it so that things will work, but not how or why it makes things work.  I'm confused.
> > +	 * increase 'nr_consumed_maps' whether its allocation of memmap
> > +	 * or usemap failed or not, so that after we handle the i-th
> > +	 * memory section, can get memmap and usemap of (i+1)-th section
> > +	 * correctly. */
>
> This makes no sense to me.  Why are we incrementing 'nr_consumed_maps'
> when we do not consume one?
>
> You say that we increment it so that things will work, but not how or
> why it makes things work.  I'm confused.

Hi Dave,

nr_consumed_maps is a local counter. map_map contains the struct pages for each section. In order to assign them to the correct sections, this local counter must be incremented even when some parts of map_map are empty.

Here is an example:

Node1:
 map_map[0] -> Struct pages ...
 map_map[1] -> NULL
Node2:
 map_map[2] -> Struct pages ...

We always want to configure a section from Node2 with struct pages from Node2, even if there are holes in-between. The same goes for usemap.

Pavel
On 06/29/2018 10:48 AM, Pavel Tatashin wrote:
> Here is example:
> Node1:
>  map_map[0] -> Struct pages ...
>  map_map[1] -> NULL
> Node2:
>  map_map[2] -> Struct pages ...
>
> We always want to configure section from Node2 with struct pages from
> Node2. Even, if there are holes in-between. The same with usemap.

Right...  But your example consumes two mem_map[]s.

But, from scanning the code, we increment nr_consumed_maps three times.  Correct?
On 06/29/2018 01:52 PM, Dave Hansen wrote:
> On 06/29/2018 10:48 AM, Pavel Tatashin wrote:
>> Here is example:
>> Node1:
>>  map_map[0] -> Struct pages ...
>>  map_map[1] -> NULL
>> Node2:
>>  map_map[2] -> Struct pages ...
>>
>> We always want to configure section from Node2 with struct pages from
>> Node2. Even, if there are holes in-between. The same with usemap.
>
> Right...  But your example consumes two mem_map[]s.
>
> But, from scanning the code, we increment nr_consumed_maps three times.
> Correct?

Correct: it should be incremented on every iteration of the loop, no matter whether the entries contained valid data or NULLs. So we increment in three places:

 if map_map[] has an invalid entry: increment and continue
 if usemap_map[] has an invalid entry: increment and continue
 at the end of the loop, when everything was valid: increment

This is done so that nr_consumed_maps does not get out of sync with the current pnum. pnum is not equal to nr_consumed_maps, as there may be holes in pnums, but there is a one-to-one correlation.

Pavel
On 06/29/2018 11:01 AM, Pavel Tatashin wrote:
> Correct: it should be incremented on every iteration of the loop. No
> matter if the entries contained valid data or NULLs. So we increment
> in three places:
>
> if map_map[] has invalid entry, increment, continue
> if usemap_map[] has invalid entry, increment, continue
> at the end of the loop, everything was valid we increment it
>
> This is done so nr_consumed_maps does not get out of sync with the
> current pnum. pnum does not equal to nr_consumed_maps, as there are
> may be holes in pnums, but there is one-to-one correlation.

Can this be made more clear in the code?
> > This is done so nr_consumed_maps does not get out of sync with the
> > current pnum. pnum does not equal to nr_consumed_maps, as there are
> > may be holes in pnums, but there is one-to-one correlation.
>
> Can this be made more clear in the code?

Absolutely. I've done it here:
http://lkml.kernel.org/r/20180628173010.23849-1-pasha.tatashin@oracle.com

Pavel
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 640e68f8324b..a98ec2fb6915 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -281,6 +281,7 @@ void __init sparse_mem_maps_populate_node(struct page **map_map,
 	unsigned long pnum;
 	unsigned long size = sizeof(struct page) * PAGES_PER_SECTION;
 	void *vmemmap_buf_start;
+	int nr_consumed_maps = 0;
 
 	size = ALIGN(size, PMD_SIZE);
 	vmemmap_buf_start = __earlyonly_bootmem_alloc(nodeid, size * map_count,
@@ -297,8 +298,8 @@ void __init sparse_mem_maps_populate_node(struct page **map_map,
 		if (!present_section_nr(pnum))
 			continue;
 
-		map_map[pnum] = sparse_mem_map_populate(pnum, nodeid, NULL);
-		if (map_map[pnum])
+		map_map[nr_consumed_maps] = sparse_mem_map_populate(pnum, nodeid, NULL);
+		if (map_map[nr_consumed_maps++])
 			continue;
 		ms = __nr_to_section(pnum);
 		pr_err("%s: sparsemem memory map backing failed some memory will not be available\n",
diff --git a/mm/sparse.c b/mm/sparse.c
index b2848cc6e32a..2eb8ee72e44d 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -386,6 +386,7 @@ static void __init sparse_early_usemaps_alloc_node(void *data,
 	unsigned long pnum;
 	unsigned long **usemap_map = (unsigned long **)data;
 	int size = usemap_size();
+	int nr_consumed_maps = 0;
 
 	usemap = sparse_early_usemaps_alloc_pgdat_section(NODE_DATA(nodeid),
 							  size * usemap_count);
@@ -397,9 +398,10 @@ static void __init sparse_early_usemaps_alloc_node(void *data,
 	for (pnum = pnum_begin; pnum < pnum_end; pnum++) {
 		if (!present_section_nr(pnum))
 			continue;
-		usemap_map[pnum] = usemap;
+		usemap_map[nr_consumed_maps] = usemap;
 		usemap += size;
-		check_usemap_section_nr(nodeid, usemap_map[pnum]);
+		check_usemap_section_nr(nodeid, usemap_map[nr_consumed_maps]);
+		nr_consumed_maps++;
 	}
 }
 
@@ -424,29 +426,33 @@ void __init sparse_mem_maps_populate_node(struct page **map_map,
 	void *map;
 	unsigned long pnum;
 	unsigned long size = sizeof(struct page) * PAGES_PER_SECTION;
+	int nr_consumed_maps;
 
 	size = PAGE_ALIGN(size);
 	map = memblock_virt_alloc_try_nid_raw(size * map_count,
 					      PAGE_SIZE, __pa(MAX_DMA_ADDRESS),
 					      BOOTMEM_ALLOC_ACCESSIBLE, nodeid);
 	if (map) {
+		nr_consumed_maps = 0;
 		for (pnum = pnum_begin; pnum < pnum_end; pnum++) {
 			if (!present_section_nr(pnum))
 				continue;
-			map_map[pnum] = map;
+			map_map[nr_consumed_maps] = map;
 			map += size;
+			nr_consumed_maps++;
 		}
 		return;
 	}
 
 	/* fallback */
+	nr_consumed_maps = 0;
 	for (pnum = pnum_begin; pnum < pnum_end; pnum++) {
 		struct mem_section *ms;
 
 		if (!present_section_nr(pnum))
 			continue;
-		map_map[pnum] = sparse_mem_map_populate(pnum, nodeid, NULL);
-		if (map_map[pnum])
+		map_map[nr_consumed_maps] = sparse_mem_map_populate(pnum, nodeid, NULL);
+		if (map_map[nr_consumed_maps++])
 			continue;
 		ms = __nr_to_section(pnum);
 		pr_err("%s: sparsemem memory map backing failed some memory will not be available\n",
@@ -526,6 +532,7 @@ static void __init alloc_usemap_and_memmap(void (*alloc_func)
 			/* new start, update count etc*/
 			nodeid_begin = nodeid;
 			pnum_begin = pnum;
+			data += map_count * data_unit_size;
 			map_count = 1;
 		}
 	/* ok, last chunk */
@@ -544,6 +551,7 @@ void __init sparse_init(void)
 	unsigned long *usemap;
 	unsigned long **usemap_map;
 	int size;
+	int nr_consumed_maps = 0;
 #ifdef CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER
 	int size2;
 	struct page **map_map;
@@ -566,7 +574,7 @@ void __init sparse_init(void)
 	 * powerpc need to call sparse_init_one_section right after each
 	 * sparse_early_mem_map_alloc, so allocate usemap_map at first.
 	 */
-	size = sizeof(unsigned long *) * NR_MEM_SECTIONS;
+	size = sizeof(unsigned long *) * nr_present_sections;
 	usemap_map = memblock_virt_alloc(size, 0);
 	if (!usemap_map)
 		panic("can not allocate usemap_map\n");
@@ -575,7 +583,7 @@ void __init sparse_init(void)
 					sizeof(usemap_map[0]));
 
 #ifdef CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER
-	size2 = sizeof(struct page *) * NR_MEM_SECTIONS;
+	size2 = sizeof(struct page *) * nr_present_sections;
 	map_map = memblock_virt_alloc(size2, 0);
 	if (!map_map)
 		panic("can not allocate map_map\n");
@@ -584,27 +592,44 @@ void __init sparse_init(void)
 					sizeof(map_map[0]));
 #endif
 
+	/* The numner of present sections stored in nr_present_sections
+	 * are kept the same since mem sections are marked as present in
+	 * memory_present(). In this for loop, we need check which sections
+	 * failed to allocate memmap or usemap, then clear its
+	 * ->section_mem_map accordingly. During this process, we need
+	 * increase 'nr_consumed_maps' whether its allocation of memmap
+	 * or usemap failed or not, so that after we handle the i-th
+	 * memory section, can get memmap and usemap of (i+1)-th section
+	 * correctly. */
 	for_each_present_section_nr(0, pnum) {
 		struct mem_section *ms;
+
+		if (nr_consumed_maps >= nr_present_sections) {
+			pr_err("nr_consumed_maps goes beyond nr_present_sections\n");
+			break;
+		}
 		ms = __nr_to_section(pnum);
-		usemap = usemap_map[pnum];
+		usemap = usemap_map[nr_consumed_maps];
 		if (!usemap) {
 			ms->section_mem_map = 0;
+			nr_consumed_maps++;
 			continue;
 		}
 
 #ifdef CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER
-		map = map_map[pnum];
+		map = map_map[nr_consumed_maps];
 #else
 		map = sparse_early_mem_map_alloc(pnum);
 #endif
 		if (!map) {
 			ms->section_mem_map = 0;
+			nr_consumed_maps++;
 			continue;
 		}
 
 		sparse_init_one_section(__nr_to_section(pnum), pnum, map,
 							usemap);
+		nr_consumed_maps++;
 	}
 
 	vmemmap_populate_print_last();