Message ID | 20230109221624.592315-1-isaacmanjarres@google.com (mailing list archive) |
---|---|
Headers | show |
Series | Fixes for kmemleak tracking with CMA regions | expand |
Hi Isaac, Please cc me on kmemleak patches. I only noticed when Andrew picket them up. On Mon, Jan 09, 2023 at 02:16:21PM -0800, Isaac J. Manjarres wrote: > When trying to boot a device with an ARM64 kernel with the following > config options enabled: > > CONFIG_DEBUG_PAGEALLOC=y > CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT=y > CONFIG_DEBUG_KMEMLEAK=y > > a page-fault is encountered when kmemleak starts to scan the list of gray > or allocated objects that it maintains. Upon closer inspection, it was > observed that these page-faults always occurred when kmemleak attempted > to scan a CMA region. What I don't understand is why kmemleak scans such CMA regions. The only reason for a kmemleak_ignore_phys() call in cma_declare_contiguous_nid() is because the kmemleak_alloc_phys() hook was called on the memblock_alloc_range_nid() path, so we don't want this scanned. Do you have a backtrace? > At the moment, kmemleak is made aware of CMA regions that are specified > through the devicetree to be created at specific memory addresses or > dynamically allocated within a range of addresses. However, if the > CMA region is constrained to a certain range of addresses through the > command line, the region is reserved through the memblock_reserve() > function, but kmemleak_alloc_phys() is not invoked. The combination of kmemleak_alloc_phys() + kmemleak_free_part_phys() in your series is equivalent to not adding it at all in the first place. > Furthermore, > kmemleak is never informed about CMA regions being freed to buddy at > boot, which is problematic when CONFIG_DEBUG_PAGEALLOC is enabled, as > all CMA regions are unmapped from the kernel's address space, and > subsequently causes a page-fault when kmemleak attempts to scan any > of them. kmemleak would only scan such objects if it knows about them. So I think it's only the case where CMA does a memblock allocation. The kmemleak_ignore_phys() should tell kmemleak not to touch this region but it's probably better to just free it altogether (i.e. replace the ignore with the free kmemleak callback). Would this be sufficient for your scenario? > This series makes it so that kmemleak is aware of every CMA region before > they are freed to the buddy allocator, so that at that time, kmemleak > can be informed that each region is about to be freed, and thus it > should not attempt to scan those regions. I may be missing something but I don't get why kmemleak needs to be informed only to tell kmemleak shortly after to remove them from its list of objects.
On Wed, Jan 18, 2023 at 05:16:46PM +0000, Catalin Marinas wrote: > Hi Isaac, > > Please cc me on kmemleak patches. I only noticed when Andrew picket them > up. Will do, sorry about that. > > What I don't understand is why kmemleak scans such CMA regions. The only > reason for a kmemleak_ignore_phys() call in cma_declare_contiguous_nid() > is because the kmemleak_alloc_phys() hook was called on the > memblock_alloc_range_nid() path, so we don't want this scanned. The reason is because kmemleak_ignore_phys() is only called within cma_declare_contiguous_nid(), which is not called for every CMA region. For instance, CMA regions which are specified through the devicetree and not constrained to a fixed address are allocated through early_init_dt_alloc_reserved_memory_arch(), which eventually calls kmemleak_alloc_phys() through memblock_phys_alloc_range(). When the CMA region is constrained to a particular address, it is allocated through early_init_dt_reserve_memory(), which is followed up by a call to kmemleak_alloc_phys() due to this commit: https://lore.kernel.org/all/20211123090641.3654006-1-calvinzhang.cool@gmail.com/T/#u I'm not sure if that commit is appropriate, given that reserved regions that still have their direct mappings intact may be used for DMA, which isn't appropriate for kmemleak scanning. In any one of these cases, the kmemleak object is never removed, nor is kmemleak told to ignore it, so it ends up scanning it later. > Do you have a backtrace? Yes: [ 61.155981][ T97] Unable to handle kernel paging request at virtual address ... [ 61.156241][ T97] Hardware name: Oriole EVT 1.1 (DT) [ 61.156243][ T97] pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 61.156246][ T97] pc : scan_block+0xbc/0x3c8 [ 61.156253][ T97] lr : scan_block+0xac/0x3c8 [ 61.156291][ T97] Call trace: [ 61.156293][ T97] scan_block+0xbc/0x3c8 [ 61.156296][ T97] scan_gray_list+0x130/0x23c [ 61.156299][ T97] kmemleak_scan+0x408/0x71c [ 61.156302][ T97] kmemleak_scan_thread+0xc0/0xf0 [ 61.156306][ T97] kthread+0x114/0x15c [ 61.156311][ T97] ret_from_fork+0x10/0x20 when I looked at the PTE from the page table walk, I saw that the virtual address mapped to the physical address within a CMA region, and that the page was unmapped (because of CONFIG_DEBUG_PAGEALLOC). > kmemleak would only scan such objects if it knows about them. So I think > it's only the case where CMA does a memblock allocation. The > kmemleak_ignore_phys() should tell kmemleak not to touch this region but > it's probably better to just free it altogether (i.e. replace the ignore > with the free kmemleak callback). Would this be sufficient for your > scenario? I agree that freeing the kmemleak object is a better strategy. However, replacing the call to kmemleak_ignore_phys() wouldn't be sufficient, as there are other scenarios that would still leave behind kmemleak objects to be scanned. That's why I ended up freeing the kmemleak object in a path that is common for all CMA areas. > I may be missing something but I don't get why kmemleak needs to be > informed only to tell kmemleak shortly after to remove them from its > list of objects. That's a good point, and I agree that it's pointless; ideally whatever way the CMA region is allocated would not inform kmemleak, and we wouldn't have to deal with kmemleak. We could use a flag like MEMBLOCK_ALLOC_NOLEAKTRACE to tell memblock to skip the call to kmemleak_alloc_phys(), but that wouldn't work as is, since MEMBLOCK_ALLOC_NOLEAKTRACE is meant to be used as the "end" parameter for memblock_alloc() and friends. For CMA regions "end" can be used (e.g. the case where the CMA region is supposed to be within a certain range). Perhaps we should change memblock to take MEMBLOCK_ALLOC_NOLEAKTRACE as a flag under a "flags" argument instead? --Isaac
On Thu, Jan 19, 2023 at 04:20:56PM -0800, Isaac Manjarres wrote: > On Wed, Jan 18, 2023 at 05:16:46PM +0000, Catalin Marinas wrote: > > What I don't understand is why kmemleak scans such CMA regions. The only > > reason for a kmemleak_ignore_phys() call in cma_declare_contiguous_nid() > > is because the kmemleak_alloc_phys() hook was called on the > > memblock_alloc_range_nid() path, so we don't want this scanned. > The reason is because kmemleak_ignore_phys() is only called within > cma_declare_contiguous_nid(), which is not called for every CMA region. > > For instance, CMA regions which are specified through the devicetree > and not constrained to a fixed address are allocated through > early_init_dt_alloc_reserved_memory_arch(), which eventually calls > kmemleak_alloc_phys() through memblock_phys_alloc_range(). > > When the CMA region is constrained to a particular address, it is allocated > through early_init_dt_reserve_memory(), which is followed up by a call to > kmemleak_alloc_phys() due to this commit: > https://lore.kernel.org/all/20211123090641.3654006-1-calvinzhang.cool@gmail.com/T/#u Thanks for digging this out. This patch shouldn't have ended up upstream (commit 972fa3a7c17c "mm: kmemleak: alloc gray object for reserved region with direct map"). I thought both Calvin Zhang and I agreed that it's not the correct approach (not even sure there was a real problem to fix). Do you still get the any faults with the above commit reverted? I'd prefer this if it works rather than adding unnecessary kmemleak_alloc/free callbacks that pretty much cancel each-other. > I'm not sure if that commit is appropriate, given that reserved regions > that still have their direct mappings intact may be used for DMA, which > isn't appropriate for kmemleak scanning. It's not. I think it should be reverted. > > kmemleak would only scan such objects if it knows about them. So I think > > it's only the case where CMA does a memblock allocation. The > > kmemleak_ignore_phys() should tell kmemleak not to touch this region but > > it's probably better to just free it altogether (i.e. replace the ignore > > with the free kmemleak callback). Would this be sufficient for your > > scenario? > > I agree that freeing the kmemleak object is a better strategy. However, > replacing the call to kmemleak_ignore_phys() wouldn't be sufficient, > as there are other scenarios that would still leave behind kmemleak > objects to be scanned. That's why I ended up freeing the kmemleak object > in a path that is common for all CMA areas. The only reason for kmemleak_ignore_phys() was to counter the actual kmemleak_alloc() call from the memblock code on the CMA allocation.
On Tue, 24 Jan 2023 15:48:57 +0000 Catalin Marinas <catalin.marinas@arm.com> wrote: > Thanks for digging this out. This patch shouldn't have ended up upstream > (commit 972fa3a7c17c "mm: kmemleak: alloc gray object for reserved > region with direct map"). I thought both Calvin Zhang and I agreed that > it's not the correct approach (not even sure there was a real problem to > fix). > > Do you still get the any faults with the above commit reverted? I'd > prefer this if it works rather than adding unnecessary > kmemleak_alloc/free callbacks that pretty much cancel each-other. > > > I'm not sure if that commit is appropriate, given that reserved regions > > that still have their direct mappings intact may be used for DMA, which > > isn't appropriate for kmemleak scanning. > > It's not. I think it should be reverted. Could someone please send along a patch to revert this, along with the explanation for doing so? And please consider a cc:stable.
On Tue, Jan 24, 2023 at 03:48:57PM +0000, Catalin Marinas wrote: > Thanks for digging this out. This patch shouldn't have ended up upstream > (commit 972fa3a7c17c "mm: kmemleak: alloc gray object for reserved > region with direct map"). I thought both Calvin Zhang and I agreed that > it's not the correct approach (not even sure there was a real problem to > fix). > > Do you still get the any faults with the above commit reverted? I'd > prefer this if it works rather than adding unnecessary > kmemleak_alloc/free callbacks that pretty much cancel each-other. Yes, I still see the same problem after reverting that commit. The problem still persists because there are CMA areas that are allocated through memblock_phys_alloc_range(), which invokes kmemleak_alloc_phys(). The allocation call stack is along the lines of: kmemleak_alloc_phys() memblock_alloc_range_nid() memblock_phys_alloc_range() __reserved_mem_alloc_size() fdt_init_reserved_mem() I also followed up on my suggestion about adding a flags parameter to the memblock allocation functions to be able to use MEMBLOCK_ALLOC_NOLEAKTRACE in this particular scenario, but that would involve changing many call-sites, which doesn't make much sense given that there are only 4 call-sites that actually use this flag. Maybe adding a new memblock allocation function that allows this flag to be passed as just a flag can be used to avoid creating these kmemleak objects for CMA allocations? --Isaac
On Tue, Jan 24, 2023 at 12:20:15PM -0800, Andrew Morton wrote: > On Tue, 24 Jan 2023 15:48:57 +0000 Catalin Marinas <catalin.marinas@arm.com> wrote: > > > Thanks for digging this out. This patch shouldn't have ended up upstream > > (commit 972fa3a7c17c "mm: kmemleak: alloc gray object for reserved > > region with direct map"). I thought both Calvin Zhang and I agreed that > > it's not the correct approach (not even sure there was a real problem to > > fix). > > > > Do you still get the any faults with the above commit reverted? I'd > > prefer this if it works rather than adding unnecessary > > kmemleak_alloc/free callbacks that pretty much cancel each-other. > > > > > I'm not sure if that commit is appropriate, given that reserved regions > > > that still have their direct mappings intact may be used for DMA, which > > > isn't appropriate for kmemleak scanning. > > > > It's not. I think it should be reverted. > > Could someone please send along a patch to revert this, along > with the explanation for doing so? And please consider a cc:stable. Yes, I can send a revert patch later today. My patches that are currently in mm-unstable depend on this patch though, so those would have to be dropped from that branch as well. --Isaac
On Tue, Jan 24, 2023 at 01:19:57PM -0800, Isaac Manjarres wrote: > On Tue, Jan 24, 2023 at 03:48:57PM +0000, Catalin Marinas wrote: > > Thanks for digging this out. This patch shouldn't have ended up upstream > > (commit 972fa3a7c17c "mm: kmemleak: alloc gray object for reserved > > region with direct map"). I thought both Calvin Zhang and I agreed that > > it's not the correct approach (not even sure there was a real problem to > > fix). > > > > Do you still get the any faults with the above commit reverted? I'd > > prefer this if it works rather than adding unnecessary > > kmemleak_alloc/free callbacks that pretty much cancel each-other. > > Yes, I still see the same problem after reverting that commit. The problem > still persists because there are CMA areas that are allocated through > memblock_phys_alloc_range(), which invokes kmemleak_alloc_phys(). The > allocation call stack is along the lines of: > > kmemleak_alloc_phys() > memblock_alloc_range_nid() > memblock_phys_alloc_range() > __reserved_mem_alloc_size() > fdt_init_reserved_mem() Ah, that's another place that kmemleak shouldn't care about. > I also followed up on my suggestion about adding a flags parameter to > the memblock allocation functions to be able to use > MEMBLOCK_ALLOC_NOLEAKTRACE in this particular scenario, but that would > involve changing many call-sites, which doesn't make much sense given > that there are only 4 call-sites that actually use this flag. Yeah, the current way of passing MEMBLOCK_ALLOC_NOLEAKTRACE is not ideal. We did it more like a quick hack. See some past discussion here (and adding Mike to this thread): https://lore.kernel.org/all/YYUChdTeXP%2FOQUwS@arm.com/ > Maybe adding a new memblock allocation function that allows this flag to > be passed as just a flag can be used to avoid creating these kmemleak > objects for CMA allocations? That's an option. If there's too much churn to add a flag, an alternative is to use the bottom bit of 'end' to set the noleaktrace flag. Yet another idea is to avoid the kmemleak callback on all the 'phys' memblock allocations. We can add the callback to the higher level memblock_alloc() which returns a VA but the lower level 'phys' variants could simply avoid it. However, I think we still need the MEMBLOCK_ALLOC_NOLEAKTRACE flag for the kasan shadow allocation. Well, given that this flag is not widely used, we can add explicit kmemleak_ignore() calls in those four places. I think the latter, if it works, would be the least intrusive.
On Wed, Jan 25, 2023 at 12:08:58PM +0000, Catalin Marinas wrote: > That's an option. If there's too much churn to add a flag, an > alternative is to use the bottom bit of 'end' to set the noleaktrace > flag. Using the least significant bit won't work; there are allocations for CMA regions that can be specified to occur within the first 4 GB of memory, and would have an alloc-ranges of [0 0xffff_ffff]. I also don't think there's anything in the memblock documentation that ensures that those bits are supposed to be clear all the time. > Yet another idea is to avoid the kmemleak callback on all the 'phys' > memblock allocations. We can add the callback to the higher level > memblock_alloc() which returns a VA but the lower level 'phys' variants > could simply avoid it. However, I think we still need the > MEMBLOCK_ALLOC_NOLEAKTRACE flag for the kasan shadow allocation. Well, > given that this flag is not widely used, we can add explicit > kmemleak_ignore() calls in those four places. > > I think the latter, if it works, would be the least intrusive. I agree; I think using kmemleak_ignore() would be best. I will split that into series: 1 series that fixes the kmemleak issue with CMA regions by reverting Calvin's patch and adding a call to kmemleak_ignore in the call-stack I referenced earlier, and then another series that cleans up the usage of the flag. --Isaac