Message ID | 20231122182419.30633-1-fancer.lancer@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | MIPS: mm: Fix some memory-related issues | expand |
On Wed, 22 Nov 2023 21:23:58 +0300 Serge Semin <fancer.lancer@gmail.com> wrote: > Just recently I've rebased my MIPS32-related work from kernel 6.5-rc4 onto > the latest kernel 6.7-rc1 and immediately got into a bootup-time > mm-related bug (see patches 3-5 in this series). After fixing it I decided > it was time to submit for review the generic MIPS code fixes which I have > been collecting in my local repo for the last year. I was going to submit > them a bit later after I finished working on a patchset with my SoC > arch-specific changes, but since it was getting bigger and bigger, it > turned to be reasonable to spill out the generic part of series right away > especially seeing it might get to be useful in the most recent kernel. It would have been better to separate out the two tiny unrelated MM patches from this series. I'll steal them - if they later turn up via the MIPS tree then that's OK.
On Wed, Nov 22, 2023 at 09:24:04PM +0300, Serge Semin wrote: > Based on the init_unavailable_range() method and it's callee semantics no > multi-line info messages are intended to be printed to the console. Thus > append the '\n' symbol to the respective info string. > > Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> > --- > mm/mm_init.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/mm_init.c b/mm/mm_init.c > index 3fa33e2d32ba..db8b91175834 100644 > --- a/mm/mm_init.c > +++ b/mm/mm_init.c > @@ -827,7 +827,7 @@ static void __init init_unavailable_range(unsigned long spfn, > } > > if (pgcnt) > - pr_info("On node %d, zone %s: %lld pages in unavailable ranges", > + pr_info("On node %d, zone %s: %lld pages in unavailable ranges\n", > node, zone_names[zone], pgcnt); > } > > -- > 2.42.1 >
On Wed, Nov 22, 2023 at 10:29:00AM -0800, Andrew Morton wrote: > On Wed, 22 Nov 2023 21:23:58 +0300 Serge Semin <fancer.lancer@gmail.com> wrote: > > > Just recently I've rebased my MIPS32-related work from kernel 6.5-rc4 onto > > the latest kernel 6.7-rc1 and immediately got into a bootup-time > > mm-related bug (see patches 3-5 in this series). After fixing it I decided > > it was time to submit for review the generic MIPS code fixes which I have > > been collecting in my local repo for the last year. I was going to submit > > them a bit later after I finished working on a patchset with my SoC > > arch-specific changes, but since it was getting bigger and bigger, it > > turned to be reasonable to spill out the generic part of series right away > > especially seeing it might get to be useful in the most recent kernel. > > It would have been better to separate out the two tiny unrelated MM > patches from this series. One of them isn't completely unrelated to the series content. The biggest problem I fixed in the patch [PATCH 3/7] mips: Fix max_mapnr being uninitialized on early stages Link: https://lore.kernel.org/linux-mips/20231122182419.30633-4-fancer.lancer@gmail.com/ of this series. I was sure that it was a correct fix at least for having the pfn_valid() method working incorrectly, but I had doubts whether the memory mapped IO pages were supposed to be left uninitialized by the arch code relying on the init_unavailable_range() doing that especially seeing it was printing a warning about having unavailable ranges. If it turned out to be incorrect I would have needed to drop the patch [PATCH 5/7] mm/mm_init.c: Extend init unavailable range doc info Link: https://lore.kernel.org/linux-mips/20231122182419.30633-6-fancer.lancer@gmail.com/ and fix that problem too in the framework of the MIPS arch. My alternative assumption regarding that problem was that the arch-code should have used memblock_reserve() method for the IO ranges, so then the calls-chain: mem_init() +-> memblock_free_all() +-> free_low_memory_core_early() +-> memmap_init_reserved_pages() +-> memmap_init_reserved_pages(v) +-> for_each_reserved_mem_region(region) +-> reserve_bootmem_region(start, end, nid); would have properly initialized the IO-pages reserved earlier by means of the memblock_reserve() method. But it turned out that reserve_bootmem_region() was available only when CONFIG_DEFERRED_STRUCT_PAGE_INIT was enabled which didn't seem to be widespreadly utilized in the arch code. Not finding a better option I decided to stick to the solution relying on the init_unavailable_range() method doing the trick and just fix the method kdoc. Seeing you accepted the patch [PATCH 5/7] mm/mm_init.c: Extend init unavailable range doc info it was a correct decision. > I'll steal them - if they later turn up via > the MIPS tree then that's OK. Ok. Thanks for picking them up. I'll drop those two patches from the series on v2. -Serge(y)
Hi Serge, On Wed, Nov 22, 2023 at 09:24:03PM +0300, Serge Semin wrote: > Besides of the already described reasons the pages backended memory holes > might be persistent due to having memory mapped IO spaces behind those > ranges in the framework of flatmem kernel config. Add such note to the > init_unavailable_range() method kdoc in order to point out to one more > reason of having the function executed for such regions. > > Signed-off-by: Serge Semin <fancer.lancer@gmail.com> > > --- > > Please let me know if the IO-space pages must be initialized somehow > differently rather relying on free_area_init() executing the > init_unavailable_range() method. Maybe I'm missing something, but why do you need struct pages in the IO space? > --- > mm/mm_init.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/mm/mm_init.c b/mm/mm_init.c > index 077bfe393b5e..3fa33e2d32ba 100644 > --- a/mm/mm_init.c > +++ b/mm/mm_init.c > @@ -796,6 +796,7 @@ overlap_memmap_init(unsigned long zone, unsigned long *pfn) > * - physical memory bank size is not necessarily the exact multiple of the > * arbitrary section size > * - early reserved memory may not be listed in memblock.memory > + * - memory mapped IO space > * - memory layouts defined with memmap= kernel parameter may not align > * nicely with memmap sections > * > -- > 2.42.1 >
On Thu, Nov 23, 2023 at 01:42:39PM +0300, Serge Semin wrote: > On Thu, Nov 23, 2023 at 12:18:54PM +0200, Mike Rapoport wrote: > > On Wed, Nov 22, 2023 at 09:24:03PM +0300, Serge Semin wrote: > > > Besides of the already described reasons the pages backended memory holes > > > might be persistent due to having memory mapped IO spaces behind those > > > ranges in the framework of flatmem kernel config. Add such note to the > > > init_unavailable_range() method kdoc in order to point out to one more > > > reason of having the function executed for such regions. > > > > > > Signed-off-by: Serge Semin <fancer.lancer@gmail.com> > > > > > > --- > > > > > > Please let me know if the IO-space pages must be initialized somehow > > > differently rather relying on free_area_init() executing the > > > init_unavailable_range() method. > > > > > Maybe I'm missing something, but why do you need struct pages in the > > IO space? > > In my case at the very least that's due to having a SRAM device > available in the middle of the MMIO-space. The region is getting > mapped using the ioremap_wc() method (Uncached Write-Combine CA), > which eventually is converted to calling get_vm_area() and > ioremap_page_range() (see ioremap_prot() function on MIPS), which in > its turn use the page structs for mapping. Another similar case is > using ioremap_wc() in the PCIe outbound ATU space mapping of > the graphic/video cards framebuffers. ioremap_page_range() does not need struct pages, but rather physical addresses. > In general having the pages array defined for the IO-memory is > required for mapping the IO-space other than just uncached (my sram > case for example) or, for instance, with special access attribute for > the user-space (if I am not missing something in a way VM works in > that case). No, struct pages are not required to map IO space. If you need to map MMIO to userspace there's remap_pfn_range() for that. My guess is that your system has a hole in the physical memory mappings and with FLATMEM that hole will have essentially unused struct pages, which are initialized by init_unavailable_range(). But from mm perspective this is still a hole even though there's some MMIO ranges in that hole. Now, if that hole is large you are wasting memory for unused memory map and it maybe worth considering using SPARSEMEM. > -Serge(y) > > > > > > --- > > > mm/mm_init.c | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/mm/mm_init.c b/mm/mm_init.c > > > index 077bfe393b5e..3fa33e2d32ba 100644 > > > --- a/mm/mm_init.c > > > +++ b/mm/mm_init.c > > > @@ -796,6 +796,7 @@ overlap_memmap_init(unsigned long zone, unsigned long *pfn) > > > * - physical memory bank size is not necessarily the exact multiple of the > > > * arbitrary section size > > > * - early reserved memory may not be listed in memblock.memory > > > + * - memory mapped IO space > > > * - memory layouts defined with memmap= kernel parameter may not align > > > * nicely with memmap sections > > > * > > > -- > > > 2.42.1 > > > > > > > -- > > Sincerely yours, > > Mike. > >
On Fri, Nov 24, 2023 at 02:18:44PM +0300, Serge Semin wrote: > On Fri, Nov 24, 2023 at 10:19:00AM +0200, Mike Rapoport wrote: > > On Thu, Nov 23, 2023 at 01:42:39PM +0300, Serge Semin wrote: > > > On Thu, Nov 23, 2023 at 12:18:54PM +0200, Mike Rapoport wrote: > > > > On Wed, Nov 22, 2023 at 09:24:03PM +0300, Serge Semin wrote: > > > > > Besides of the already described reasons the pages backended memory holes > > > > > might be persistent due to having memory mapped IO spaces behind those > > > > > ranges in the framework of flatmem kernel config. Add such note to the > > > > > init_unavailable_range() method kdoc in order to point out to one more > > > > > reason of having the function executed for such regions. > > > > > > > > > > Signed-off-by: Serge Semin <fancer.lancer@gmail.com> > > > > > > > > > > --- > > > > > > > > > > Please let me know if the IO-space pages must be initialized somehow > > > > > differently rather relying on free_area_init() executing the > > > > > init_unavailable_range() method. > > > > > > > > > > > Maybe I'm missing something, but why do you need struct pages in the > > > > IO space? > > > > > > In my case at the very least that's due to having a SRAM device > > > available in the middle of the MMIO-space. The region is getting > > > mapped using the ioremap_wc() method (Uncached Write-Combine CA), > > > which eventually is converted to calling get_vm_area() and > > > ioremap_page_range() (see ioremap_prot() function on MIPS), which in > > > its turn use the page structs for mapping. Another similar case is > > > using ioremap_wc() in the PCIe outbound ATU space mapping of > > > the graphic/video cards framebuffers. > > > > ioremap_page_range() does not need struct pages, but rather physical > > addresses. > > Unless I miss something or MIPS32 is somehow special/wrong in that > matter, but from my just got experience it actually does at least in > the framework of the __update_cache() implementation which is called > in the set_ptes() method (former set_pte_at()), which in its turn > is eventually invoked by vmap_range_noflush() and finally by > ioremap_page_range(). See the patch > [PATCH 3/7] mips: Fix max_mapnr being uninitialized on early stages > Link: https://lore.kernel.org/linux-mips/20231122182419.30633-4-fancer.lancer@gmail.com/ > of this series and the stack-trace of the bug fixed by that patch. > > Is it wrong that on MIPS32 ioremap_page_range() eventually relies on > the page structs? It has been like that for, I don't know, long time. > If so then the sparse memory config might be broken on MIPS32..( Do you mind posting your physical memory layout? If I understand correctly, you have a hole in your RAM and there is MMIO region somewhere in that hole. With FLATMEM the memory map exists for that hole and hence pfn_valid() returns 1 for the MMIO range as well. That makes __update_cache() to check folio state and that check would fail if the memory map contained garbage. But since the hole in the memory map is initialized with init_unavailable_range() you get a valid struct page/struct folio and everything is fine. With that, the init_unavailable_range() docs need not mention IO space at all, they should mention holes within FLATMEM memory map. As for SPARSEMEM, if the hole does not belong to any section, pfn_valid() will be false for it and __update_cache() won't try to access memory map. > > > In general having the pages array defined for the IO-memory is > > > required for mapping the IO-space other than just uncached (my sram > > > case for example) or, for instance, with special access attribute for > > > the user-space (if I am not missing something in a way VM works in > > > that case). > > > > > No, struct pages are not required to map IO space. If you need to map MMIO > > to userspace there's remap_pfn_range() for that. > > Is this correct for both flat and sparse memory config? In anyway > please see my comment above about the problem I recently got. > > > > > My guess is that your system has a hole in the physical memory mappings and > > with FLATMEM that hole will have essentially unused struct pages, which are > > initialized by init_unavailable_range(). But from mm perspective this is > > still a hole even though there's some MMIO ranges in that hole. > > Absolutely right. Here is the physical memory layout in my system. > 0 - 128MB: RAM > 128MB - 512MB: Memory mapped IO > 512MB - 768MB..8.256GB: RAM > > > > > Now, if that hole is large you are wasting memory for unused memory map and > > it maybe worth considering using SPARSEMEM. > > Do you think it's worth to move to the sparse memory configuration in > order to save the 384MB of mapping with the 16K page model? AFAIU flat > memory config is more performant. Performance is critical on the most > of the SoC applications especially when using the 10G ethernet or > the high-speed PCIe devices. > > -Serge(y) > > > > > > -Serge(y) > > > > > > > > > > > > --- > > > > > mm/mm_init.c | 1 + > > > > > 1 file changed, 1 insertion(+) > > > > > > > > > > diff --git a/mm/mm_init.c b/mm/mm_init.c > > > > > index 077bfe393b5e..3fa33e2d32ba 100644 > > > > > --- a/mm/mm_init.c > > > > > +++ b/mm/mm_init.c > > > > > @@ -796,6 +796,7 @@ overlap_memmap_init(unsigned long zone, unsigned long *pfn) > > > > > * - physical memory bank size is not necessarily the exact multiple of the > > > > > * arbitrary section size > > > > > * - early reserved memory may not be listed in memblock.memory > > > > > + * - memory mapped IO space > > > > > * - memory layouts defined with memmap= kernel parameter may not align > > > > > * nicely with memmap sections > > > > > * > > > > > -- > > > > > 2.42.1 > > > > > > > > > > > > > -- > > > > Sincerely yours, > > > > Mike. > > > > > > > > -- > > Sincerely yours, > > Mike.
On Tue, Nov 28, 2023 at 01:51:32PM +0300, Serge Semin wrote: > On Tue, Nov 28, 2023 at 09:13:39AM +0200, Mike Rapoport wrote: > > On Fri, Nov 24, 2023 at 02:18:44PM +0300, Serge Semin wrote: > > > Do you mind posting your physical memory layout? > > I actually already did in response to the last part of your previous > message. You must have missed it. Here is the copy of the message: Sorry, for some reason I didn't scroll down your previous mail :) > > On Fri, Nov 24, 2023 at 02:18:44PM +0300, Serge Semin wrote: > > > On Fri, Nov 24, 2023 at 10:19:00AM +0200, Mike Rapoport wrote: > > > ... > > > > > > > > My guess is that your system has a hole in the physical memory mappings and > > > > with FLATMEM that hole will have essentially unused struct pages, which are > > > > initialized by init_unavailable_range(). But from mm perspective this is > > > > still a hole even though there's some MMIO ranges in that hole. > > > > > > Absolutely right. Here is the physical memory layout in my system. > > > 0 - 128MB: RAM > > > 128MB - 512MB: Memory mapped IO > > > 512MB - 768MB..8.256GB: RAM > > > > > > > > > > > Now, if that hole is large you are wasting memory for unused memory map and > > > > it maybe worth considering using SPARSEMEM. > > > > > > Do you think it's worth to move to the sparse memory configuration in > > > order to save the 384MB of mapping with the 16K page model? AFAIU flat > > > memory config is more performant. Performance is critical on the most > > > of the SoC applications especially when using the 10G ethernet or > > > the high-speed PCIe devices. > > Could you also answer to my question above regarding using the > sparsemem instead on my hw memory layout? Currently MIPS defines section size to 256MB, so with your memory layout with SPARSMEM there will be two sections of 256MB, at 0 and at 512MB, so you'll save memory map for 256M which is roughly 1M with 16k pages. It's possible With SPARSEMEM the pfn_to_page() and page_to_pfn() are a bit longer in terms of assembly instructions, but I really doubt you'll notice any performance difference in real world applications. > > With FLATMEM the memory map exists for that > > hole and hence pfn_valid() returns 1 for the MMIO range as well. That makes > > __update_cache() to check folio state and that check would fail if the memory > > map contained garbage. But since the hole in the memory map is initialized > > with init_unavailable_range() you get a valid struct page/struct folio and > > everything is fine. > > Right. That's what currently happens on MIPS32 and that's what I had > to fix in the framework of this series by the next patch: > Link: https://lore.kernel.org/linux-mips/20231122182419.30633-4-fancer.lancer@gmail.com/ > flatmem version of the pfn_valid() method has been broken due to > max_mapnr being uninitialized before mem_init() is called. So > init_unavailable_range() didn't initialize the pages on the early > bootup stage. Thus afterwards, when max_mapnr has finally got a valid > value any attempts to call the __update_cache() method on the MMIO > memory hole caused the unaligned access crash. The fix for max_mapnr makes pfn_valid()==1 for the entire memory map and this fixes up the struct pages in the hole. > > > > With that, the init_unavailable_range() docs need not mention IO space at > > all, they should mention holes within FLATMEM memory map. > > Ok. I'll resend the patch with mentioning flatmem holes instead of > mentioning the IO-spaces. > > > > > As for SPARSEMEM, if the hole does not belong to any section, pfn_valid() > > will be false for it and __update_cache() won't try to access memory map. > > Ah, I see. In case of the SPARSEMEM config an another version of > pfn_valid() will be called. It's defined in the include/linux/mmzone.h > header file. Right? If so then no problem there indeed. Yes, SPARSMEM uses pfn_valid() defined in include/linux/mmzone.h > -Serge(y)
Just recently I've rebased my MIPS32-related work from kernel 6.5-rc4 onto the latest kernel 6.7-rc1 and immediately got into a bootup-time mm-related bug (see patches 3-5 in this series). After fixing it I decided it was time to submit for review the generic MIPS code fixes which I have been collecting in my local repo for the last year. I was going to submit them a bit later after I finished working on a patchset with my SoC arch-specific changes, but since it was getting bigger and bigger, it turned to be reasonable to spill out the generic part of series right away especially seeing it might get to be useful in the most recent kernel. So this series starts with the MIPS-specific dmi_early_remap() implementation fix. It is utilized by the DMI driver in the framework of the dmi_setup() method, which is called at the very early boot stage - in setup_arch((). No VM available at that stage which is required for the ioremap_cache() to properly work. Thus it was a mistake to have the dmi_early_remap() macro-function defined as ioremap_cache(). It should have been ioremap_uc() in first place. After that goes a fix for the high-memory zone PFNs calculation procedure on MIPS. It turned out that after some not that recent commit the IO-memory PFNs got to the high-memory even though they were directly reachable, thus should have been left in the normal zone. Then a series of fixes for the recently discovered mm-bug is presented. Any attempt to re-map the IO-memory with the cached attribute caused the bootup procedure to crash with the "Unhandled kernel unaligned access" message. After some digging I found out that the problem was in the uninitialized IO-memory pages. Please see the patch "mips: Fix max_mapnr being uninitialized on early stages" description for the detailed explanation of the problem and suggested fix. Afterwards I submitted several cleanup patches for the MIPS/mm and generic mm code. The patchset is closed with a small improvement which sets the MIPS board/machine name to the dump-stack module in order to print arch-personalized oopses in the same way as it's done on ARM, ARM64, RISC-V, etc. That's it for today.) Thanks for review in advance. Any tests are very welcome. Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Aleksandar Rikalo <aleksandar.rikalo@syrmia.com> Cc: Aleksandar Rikalo <arikalo@gmail.com> Cc: Dragan Mladjenovic <dragan.mladjenovic@syrmia.com> Cc: Chao-ying Fu <cfu@wavecomp.com> Cc: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: Yinglu Yang <yangyinglu@loongson.cn>, Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Marc Zyngier <maz@kernel.org> Cc: linux-mips@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org Serge Semin (7): mips: dmi: Fix early remap on MIPS32 mips: Fix incorrect max_low_pfn adjustment mips: Fix max_mapnr being uninitialized on early stages mips: Optimize max_mapnr init procedure mm/mm_init.c: Extend init unavailable range doc info mm/mm_init.c: Append '\n' to the unavailable ranges log-message mips: Set dump-stack arch description arch/mips/include/asm/dmi.h | 2 +- arch/mips/kernel/prom.c | 2 ++ arch/mips/kernel/setup.c | 4 ++-- arch/mips/mm/init.c | 16 +++++++++------- mm/mm_init.c | 3 ++- 5 files changed, 16 insertions(+), 11 deletions(-)