Message ID | c252ab82-dc4a-4267-84c3-4a20471524c5@yoseli.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | m68k 54418 fails to execute user space | expand |
Hi all, I keep going with my investigations, hoping that someone will have an idea to help :-) ! I think I found something, but not sure: On 19/06/2024 14:29, Jean-Michel Hautbois wrote: > Hello ! > > I am (still :-)) trying to make an upstream kernel work on my mcf54418 > custom board. The kernel boots nicely now, but I have an issue when > executing binaries. > > I tried uclibc-ng and glibc, with similar results. > > First, I need a patch to change the ELF_ET_DYN_BASE value [1]. > Next, I switched from jffs2 to ubifs, but the behavior is the same, with > one or the other (just that jffs2 needs an extra patch to use > dirty_folio callback) [2]. > > Now, I wrote a very dumb and classical program: > > $> cat hello.c > #include <stdio.h> > > int main() > { > printf("Hello World on coldfire\n"); > } > > And I linked it statically, and dynamically (later called hello-stat and > hello-dyn). > > -- I added another patch to help debug the issue: > diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c > index a1f46919934c..d01bd3276204 100644 > --- a/fs/ubifs/file.c > +++ b/fs/ubifs/file.c > @@ -1474,6 +1474,7 @@ static bool ubifs_dirty_folio(struct address_space > *mapping, > * An attempt to dirty a page without budgeting for it - should > not > * happen. > */ > + VM_BUG_ON_FOLIO(ret != false, folio); > ubifs_assert(c, ret == false); > return ret; > } > > Now comes the (fun ?) part. > > When I execute /bin/bash as my init, and I launch the hello-dyn, I get: > [ 4.140000] Run /bin/bash as init process > [ 4.150000] with arguments: > [ 4.150000] /bin/bash > [ 4.150000] with environment: > [ 4.160000] HOME=/ > [ 4.160000] TERM=linux > bash: cannot set terminal process group (-1): Inappropriate ioctl for > device > bash: no job control in this shell > bash-5.2# [ 166.510000] random: crng init done > > bash-5.2# /bin/hello-dyn > Hello World on coldfir[ 382.450000] page: refcount:2 mapcount:1 > mapping:4206dc84 index:0x94 pfn:0x20ac9 > e > [ 382.460000] aops:0x413b1470 ino:9f dentry name:"libc.so.6" > [ 382.460000] flags: 0x38(uptodate|dirty|lru|zone=0) > [ 382.470000] raw: 00000038 4fed43dc 4fed4544 4206dc84 00000094 > 00000000 00000000 00000002 > [ 382.480000] raw: 41592000 > [ 382.480000] page dumped because: VM_BUG_ON_FOLIO(ret != false) > [ 382.490000] kernel BUG at fs/ubifs/file.c:1477! > [ 382.490000] *** TRAP #7 *** FORMAT=4 > [ 382.490000] Current process id is 26 > [ 382.490000] BAD KERNEL TRAP: 00000000 > [ 382.490000] PC: [<410fba84>] ubifs_dirty_folio+0x3e/0x42 > [ 382.490000] SR: 2010 SP: 4286bda4 a2: 41f24510 > [ 382.490000] d0: 00000023 d1: 000000b9 d2: 4fed4444 d3: 6014c000 > [ 382.490000] d4: 41593419 d5: 60150000 a0: 4286bce8 a1: 414d5180 > [ 382.490000] Process hello-dyn (pid: 26, task=41f24510) > [ 382.490000] Frame format=4 eff addr=41450e8d pc=414674a3 > [ 382.490000] Stack from 4286bde0: > [ 382.490000] 000005c5 4145aeed 41467485 4fed4444 410586de > 4206dc84 4fed4444 41593419 > [ 382.490000] 4286bf1a 41074196 4fed4444 00000000 ffffffff > fffffffe 4286be9e 60150000 > [ 382.490000] 00000001 41387c58 41073de0 4107797c 00000001 > 4283f184 ffffffff 42884600 > [ 382.490000] 60150000 42884600 4283a3d0 4286a000 4283a3fc > 00000000 00000000 00000000 > [ 382.490000] 00000000 4286bf5c 41074456 4286bf1a 4283f184 > 6014c000 60150000 4286be9e > [ 382.490000] 4283a3fc 4286bef6 4286bf1a 4102a05a 4283a3d4 > 4283f344 4283a3d0 42830000 > [ 382.490000] Call Trace: [<410586de>] folio_mark_dirty+0x44/0x58 > [ 382.490000] [<41074196>] unmap_page_range+0x3b6/0x602 > [ 382.490000] [<41387c58>] mas_find+0x0/0xfa > [ 382.490000] [<41073de0>] unmap_page_range+0x0/0x602 > [ 382.490000] [<4107797c>] vma_next+0x0/0x14 > [ 382.490000] [<41074456>] unmap_vmas+0x74/0x98 > [ 382.490000] [<4102a05a>] up_read+0x0/0x36 > [ 382.490000] [<41079ab2>] exit_mmap+0xd4/0x1c0 > [ 382.490000] [<41009300>] arch_local_irq_enable+0x0/0xc > [ 382.490000] [<410092f4>] arch_local_irq_disable+0x0/0xc > [ 382.490000] [<41006b76>] __mmput+0x2e/0x86 > [ 382.490000] [<4100a080>] do_exit+0x21e/0x6e8 > [ 382.490000] [<4100a6c8>] sys_exit_group+0x0/0x14 > [ 382.490000] [<4100a686>] do_group_exit+0x22/0x64 > [ 382.490000] [<4100a6dc>] pid_child_should_wake+0x0/0x56 > [ 382.490000] [<41005848>] system_call+0x54/0xa8 > [ 382.490000] > [ 382.490000] Code: 74a3 4879 4145 0e8d 4eb9 4139 adb8 4e47 <241f> 4e75 > 91c8 f4e8 5288 f4e8 5288 f4e8 5288 f4e8 41e8 000d b1fc 0000 07f0 63e6 > [ 382.490000] Disabling lock debugging due to kernel taint > [ 382.490000] note: hello-dyn[26] exited with irqs disabled > [ 382.500000] Fixing recursive fault but reboot is needed! > [ 387.520000] UBIFS error (ubi0:0 pid 24): ubifs_assert_failed: UBIFS > assert failed: folio->private != NULL, in fs/ubifs/file.c:1016 > [ 387.530000] UBIFS warning (ubi0:0 pid 24): ubifs_ro_mode: switched to > read-only mode, error -22 > [ 387.540000] CPU: 0 PID: 24 Comm: kworker/u4:1 Tainted: G D > 6.10.0-rc4-00050-g21c7ef993c85 #491 > [ 387.540000] Workqueue: writeback wb_workfn (flush-ubifs_0_0) > [ 387.540000] Stack from 41c7fc1c: > [ 387.540000] 41c7fc1c 4145ed0d 4145ed0d 00000000 0012d0a0 > 41e4d000 413a271a 4145ed0d > [ 387.540000] 410fbf84 41e4d000 ffffffea 414674a3 000003f8 > 4fed4444 41c7fd72 41c7fcea > [ 387.540000] 41c7fce6 41c7fd72 00002230 41058b72 41051efc > 41c7fd72 41051efc 41c7fd72 > [ 387.540000] 41058e74 4fed4444 41c7fd72 00000000 00000064 > 00000400 4281c200 4206dc84 > [ 387.540000] 00000000 410fb9dc 4206dc84 41c7fd72 410fbf00 > 00000000 41059f34 4206dc84 > [ 387.540000] 41c7fd72 4206dbac 00000400 00000000 00000004 > 41c7fd72 4206dbac 41c7fd72 > [ 387.540000] Call Trace: [<413a271a>] dump_stack+0xc/0x10 > [ 387.540000] [<410fbf84>] ubifs_writepage+0x84/0x1e0 > [ 387.540000] [<41058b72>] writeback_iter+0x0/0x2b0 > [ 387.540000] [<41051efc>] folio_unlock+0x0/0x5c > [ 387.540000] [<41051efc>] folio_unlock+0x0/0x5c > [ 387.540000] [<41058e74>] write_cache_pages+0x52/0x6e > [ 387.540000] [<410fb9dc>] ubifs_writepages+0x14/0x1a > [ 387.540000] [<410fbf00>] ubifs_writepage+0x0/0x1e0 > [ 387.540000] [<41059f34>] do_writepages+0x76/0x192 > [ 387.540000] [<41024378>] vruntime_eligible+0x70/0x8c > [ 387.540000] [<410b7b06>] __writeback_single_inode+0x46/0x182 > [ 387.540000] [<410b7e9a>] writeback_sb_inodes+0x2c/0x31a > [ 387.540000] [<410b7ffc>] writeback_sb_inodes+0x18e/0x31a > [ 387.540000] [<4109784a>] super_trylock_shared+0x0/0x40 > [ 387.540000] [<410b7e6e>] writeback_sb_inodes+0x0/0x31a > [ 387.540000] [<4102a05a>] up_read+0x0/0x36 > [ 387.540000] [<41020000>] devm_register_sys_off_handler+0x18/0x64 > [ 387.540000] [<410b81f4>] __writeback_inodes_wb+0x6c/0x9e > [ 387.540000] [<41059ce8>] wb_over_bg_thresh+0x0/0x7c > [ 387.540000] [<410b8332>] queue_io+0x0/0xaa > [ 387.540000] [<410b8484>] wb_writeback.isra.0+0xa8/0x132 > [ 387.540000] [<410b84ea>] wb_writeback.isra.0+0x10e/0x132 > [ 387.540000] [<410b72d8>] arch_local_irq_enable+0x0/0xc > [ 387.540000] [<410b72cc>] arch_local_irq_disable+0x0/0xc > [ 387.540000] [<410b740a>] finish_writeback_work+0x0/0x44 > [ 387.540000] [<410b83dc>] wb_writeback.isra.0+0x0/0x132 > [ 387.540000] [<410b8ac4>] wb_workfn+0x188/0x286 > [ 387.540000] [<41393fba>] sized_strscpy+0x0/0x13e > [ 387.540000] [<41016312>] list_del_init+0x0/0x18 > [ 387.540000] [<41021ccc>] arch_local_irq_disable+0x0/0x12 > [ 387.540000] [<41018382>] process_scheduled_works+0x172/0x220 > [ 387.540000] [<410162e8>] arch_local_irq_disable+0x0/0xc > [ 387.540000] [<41017b8c>] assign_work+0x0/0x54 > [ 387.540000] [<41018afa>] worker_thread+0x0/0x212 > [ 387.540000] [<41018cd2>] worker_thread+0x1d8/0x212 > [ 387.540000] [<4101d3f4>] kthread_exit+0x0/0x14 > [ 387.540000] [<41018afa>] worker_thread+0x0/0x212 > [ 387.540000] [<4101d6ac>] kthread+0x9a/0xa4 > [ 387.540000] [<4101d612>] kthread+0x0/0xa4 > [ 387.540000] [<410033d0>] ret_from_kernel_thread+0xc/0x14 > [ 387.540000] > [ 387.550000] UBIFS error (ubi0:0 pid 24): ubifs_assert_failed: UBIFS > assert failed: !c->ro_media && !c->ro_mount, in fs/ubifs/journal.c:108 > [ 387.570000] UBIFS error (ubi0:0 pid 24): make_reservation: cannot > reserve 4144 bytes in jhead 2, error -30 > [ 387.580000] UBIFS error (ubi0:0 pid 24): do_writepage: cannot write > folio 148 of inode 159, error -30 > [ 387.590000] UBIFS error (ubi0:0 pid 24): ubifs_assert_failed: UBIFS > assert failed: folio->private != NULL, in fs/ubifs/file.c:944 > [ 387.600000] UBIFS error (ubi0:0 pid 24): ubifs_assert_failed: UBIFS > assert failed: c->bi.dd_growth >= 0, in fs/ubifs/budget.c:550 > > If I reboot, and launch hello-stat from bash, I get: > [ 4.140000] Run /bin/bash as init process > [ 4.150000] with arguments: > [ 4.150000] /bin/bash > [ 4.150000] with environment: > [ 4.160000] HOME=/ > [ 4.160000] TERM=linux > bash: cannot set terminal process group (-1): Inappropriate ioctl for > device > bash: no job control in this shell > bash-5.2# /bin/hello-stat > > I don't have a printf() printed, and a second launch results in: > bash-5.2# /bin/hello-stat > [ 18.540000] page: refcount:2 mapcount:1 mapping:4207134c index:0x29 > pfn:0x27f23 > [ 18.550000] aops:0x413b1470 ino:8f dentry name:"hello-stat" > [ 18.560000] flags: 0x38(uptodate|dirty|lru|zone=0) > [ 18.560000] raw: 00000038 4fed5a14 414f3fdc 4207134c 00000029 > 00000000 00000000 00000002 > [ 18.570000] raw: 4fe46000 > [ 18.570000] page dumped because: VM_BUG_ON_FOLIO(ret != false) > [ 18.580000] kernel BUG at fs/ubifs/file.c:1477! > [ 18.580000] *** TRAP #7 *** FORMAT=4 > [ 18.580000] Current process id is 28 > [ 18.580000] BAD KERNEL TRAP: 00000000 > [ 18.580000] PC: [<410fba84>] ubifs_dirty_folio+0x3e/0x42 > [ 18.580000] SR: 2010 SP: 4288bc9c a2: 41f24a10 > [ 18.580000] d0: 00000023 d1: 000000b8 d2: 4ffda0ec d3: 80054000 > [ 18.580000] d4: 4fe47419 d5: 8005a000 a0: 4288bbe0 a1: 414d5180 > [ 18.580000] Process hello-stat (pid: 28, task=41f24a10) > [ 18.580000] Frame format=4 eff addr=41450e8d pc=414674a3 > [ 18.580000] Stack from 4288bcd8: > [ 18.580000] 000005c5 4145aeed 41467485 4ffda0ec 410586de > 4207134c 4ffda0ec 4fe47419 > [ 18.580000] 4288be12 41074196 4ffda0ec 00000000 ffffffff > fffffffe 4288bd96 8005a000 > [ 18.580000] 00000001 41387c58 41073de0 4107797c 00000001 > 4283eb64 ffffffff 42890800 > [ 18.580000] 8005a000 42890800 4283a1f0 4288a000 4283a21c > 00000000 00000000 00000000 > [ 18.580000] 00000000 4288be54 41074456 4288be12 4283eb64 > 80052000 8005a000 4288bd96 > [ 18.580000] 4283a21c 4288bdee 4288be12 4102a05a 4283a1f4 > 4283eaf4 4283a1f0 42830000 > [ 18.580000] Call Trace: [<410586de>] folio_mark_dirty+0x44/0x58 > [ 18.580000] [<41074196>] unmap_page_range+0x3b6/0x602 > [ 18.580000] [<41387c58>] mas_find+0x0/0xfa > [ 18.580000] [<41073de0>] unmap_page_range+0x0/0x602 > [ 18.580000] [<4107797c>] vma_next+0x0/0x14 > [ 18.580000] [<41074456>] unmap_vmas+0x74/0x98 > [ 18.580000] [<4102a05a>] up_read+0x0/0x36 > [ 18.580000] [<41079ab2>] exit_mmap+0xd4/0x1c0 > [ 18.580000] [<41009300>] arch_local_irq_enable+0x0/0xc > [ 18.580000] [<410092f4>] arch_local_irq_disable+0x0/0xc > [ 18.580000] [<41014283>] __sys_setreuid+0x99/0x138 > [ 18.580000] [<41006b76>] __mmput+0x2e/0x86 > [ 18.580000] [<4100a080>] do_exit+0x21e/0x6e8 > [ 18.580000] [<4100a686>] do_group_exit+0x22/0x64 > [ 18.580000] [<4101276e>] get_signal+0x420/0x502 > [ 18.580000] [<41003d68>] test_ti_thread_flag+0x0/0x14 > [ 18.580000] [<41004444>] do_notify_resume+0x4a/0x4e2 > [ 18.580000] [<41011cac>] force_sig_fault_to_task+0x40/0x4c > [ 18.580000] [<410058fe>] Lsignal_return+0x14/0x26 > [ 18.580000] > [ 18.580000] Code: 74a3 4879 4145 0e8d 4eb9 4139 adb8 4e47 <241f> 4e75 > 91c8 f4e8 5288 f4e8 5288 f4e8 5288 f4e8 41e8 000d b1fc 0000 07f0 63e6 > [ 18.580000] Disabling lock debugging due to kernel taint > [ 18.580000] note: hello-stat[28] exited with irqs disabled > [ 18.590000] Fixing recursive fault but reboot is needed! > [ 23.600000] UBIFS error (ubi0:0 pid 24): ubifs_assert_failed: UBIFS > assert failed: folio->private != NULL, in fs/ubifs/file.c:1016 > [ 23.610000] UBIFS warning (ubi0:0 pid 24): ubifs_ro_mode: switched to > read-only mode, error -22 > [ 23.620000] CPU: 0 PID: 24 Comm: kworker/u4:1 Tainted: G D > 6.10.0-rc4-00050-g21c7ef993c85 #491 > [ 23.620000] Workqueue: writeback wb_workfn (flush-ubifs_0_0) > [ 23.620000] Stack from 41c7fc1c: > [ 23.620000] 41c7fc1c 4145ed0d 4145ed0d 00000000 00233cf8 > 41e4d000 413a271a 4145ed0d > [ 23.620000] 410fbf84 41e4d000 ffffffea 414674a3 000003f8 > 4ffda0ec 41c7fd72 41c7fcea > [ 23.620000] 41c7fce6 41c7fd72 ffff9408 41058b72 41051efc > 41c7fd72 41051efc 41c7fd72 > [ 23.620000] 41058e74 4ffda0ec 41c7fd72 00000000 00000064 > 00000400 4281c200 4207134c > [ 23.620000] 00000000 410fb9dc 4207134c 41c7fd72 410fbf00 > 00000000 41059f34 4207134c > [ 23.620000] 41c7fd72 42071274 00000400 00000000 00000004 > 41c7fd72 42071274 41c7fd72 > [ 23.620000] Call Trace: [<413a271a>] dump_stack+0xc/0x10 > [ 23.620000] [<410fbf84>] ubifs_writepage+0x84/0x1e0 > [ 23.620000] [<41058b72>] writeback_iter+0x0/0x2b0 > [ 23.620000] [<41051efc>] folio_unlock+0x0/0x5c > [ 23.620000] [<41051efc>] folio_unlock+0x0/0x5c > [ 23.620000] [<41058e74>] write_cache_pages+0x52/0x6e > [ 23.620000] [<410fb9dc>] ubifs_writepages+0x14/0x1a > [ 23.620000] [<410fbf00>] ubifs_writepage+0x0/0x1e0 > [ 23.620000] [<41059f34>] do_writepages+0x76/0x192 > [ 23.620000] [<41024378>] vruntime_eligible+0x70/0x8c > [ 23.620000] [<410b7b06>] __writeback_single_inode+0x46/0x182 > [ 23.620000] [<410b7e9a>] writeback_sb_inodes+0x2c/0x31a > [ 23.620000] [<410b7ffc>] writeback_sb_inodes+0x18e/0x31a > [ 23.620000] [<4109784a>] super_trylock_shared+0x0/0x40 > [ 23.620000] [<410b7e6e>] writeback_sb_inodes+0x0/0x31a > [ 23.620000] [<4102a05a>] up_read+0x0/0x36 > [ 23.620000] [<41020000>] devm_register_sys_off_handler+0x18/0x64 > [ 23.620000] [<410b81f4>] __writeback_inodes_wb+0x6c/0x9e > [ 23.620000] [<41059ce8>] wb_over_bg_thresh+0x0/0x7c > [ 23.620000] [<410b8332>] queue_io+0x0/0xaa > [ 23.620000] [<410b8484>] wb_writeback.isra.0+0xa8/0x132 > [ 23.620000] [<410b84ea>] wb_writeback.isra.0+0x10e/0x132 > [ 23.620000] [<410b72d8>] arch_local_irq_enable+0x0/0xc > [ 23.620000] [<410b72cc>] arch_local_irq_disable+0x0/0xc > [ 23.620000] [<410b740a>] finish_writeback_work+0x0/0x44 > [ 23.620000] [<410b83dc>] wb_writeback.isra.0+0x0/0x132 > [ 23.620000] [<410b8ac4>] wb_workfn+0x188/0x286 > [ 23.620000] [<41393fba>] sized_strscpy+0x0/0x13e > [ 23.620000] [<41016312>] list_del_init+0x0/0x18 > [ 23.620000] [<41021ccc>] arch_local_irq_disable+0x0/0x12 > [ 23.620000] [<41018382>] process_scheduled_works+0x172/0x220 > [ 23.620000] [<410162e8>] arch_local_irq_disable+0x0/0xc > [ 23.620000] [<41017b8c>] assign_work+0x0/0x54 > [ 23.620000] [<41018afa>] worker_thread+0x0/0x212 > [ 23.620000] [<41018cd2>] worker_thread+0x1d8/0x212 > [ 23.620000] [<4101d3f4>] kthread_exit+0x0/0x14 > [ 23.620000] [<41018afa>] worker_thread+0x0/0x212 > [ 23.620000] [<4101d6ac>] kthread+0x9a/0xa4 > [ 23.620000] [<4101d612>] kthread+0x0/0xa4 > [ 23.620000] [<410033d0>] ret_from_kernel_thread+0xc/0x14 > [ 23.620000] > [ 23.630000] UBIFS error (ubi0:0 pid 24): ubifs_assert_failed: UBIFS > assert failed: !c->ro_media && !c->ro_mount, in fs/ubifs/journal.c:108 > [ 23.650000] UBIFS error (ubi0:0 pid 24): make_reservation: cannot > reserve 4144 bytes in jhead 2, error -30 > [ 23.660000] UBIFS error (ubi0:0 pid 24): do_writepage: cannot write > folio 41 of inode 143, error -30 > [ 23.670000] UBIFS error (ubi0:0 pid 24): ubifs_assert_failed: UBIFS > assert failed: folio->private != NULL, in fs/ubifs/file.c:944 > [ 23.680000] UBIFS error (ubi0:0 pid 24): ubifs_assert_failed: UBIFS > assert failed: c->bi.dd_growth >= 0, in fs/ubifs/budget.c:550 > > Now, if I change the init process, and call /bin/hello-dyn instead: > [ 4.140000] Run /bin/hello-dyn as init process > [ 4.150000] with arguments: > [ 4.150000] /bin/hello-dyn > [ 4.150000] with environment: > [ 4.160000] HOME=/ > [ 4.160000] TERM=linux > Hello World on coldfire > [ 4.470000] Kernel panic - not syncing: Attempted to kill init! > exitcode=0x00000000 > [ 4.470000] CPU: 0 PID: 1 Comm: hello-dyn Not tainted > 6.10.0-rc4-00050-g21c7ef993c85 #491 > [ 4.470000] Stack from 4186df2c: > [ 4.470000] 4186df2c 4145ed0d 4145ed0d 41009300 00000001 > 000000ff 413a271a 4145ed0d > [ 4.470000] 4139a532 41009300 00000000 000000ff 00000001 > 418a0010 00000000 410092f4 > [ 4.470000] 4186dfac 41009fd4 41452c80 00000000 00000000 > 4100a6c8 00000000 00000001 > [ 4.470000] 00000000 00000001 41872010 6014f468 6014c000 > 41872010 6014f468 6014c000 > [ 4.470000] 601501cc 4100a686 00000000 4186c000 00000000 > 4100a6dc 00000000 41005848 > [ 4.470000] 00000000 00000001 000000f7 00000000 00000001 > 60161700 60161700 601501d4 > [ 4.470000] Call Trace: [<41009300>] arch_local_irq_enable+0x0/0xc > [ 4.470000] [<413a271a>] dump_stack+0xc/0x10 > [ 4.470000] [<4139a532>] panic+0xc6/0x25c > [ 4.470000] [<41009300>] arch_local_irq_enable+0x0/0xc > [ 4.470000] [<410092f4>] arch_local_irq_disable+0x0/0xc > [ 4.470000] [<41009fd4>] do_exit+0x172/0x6e8 > [ 4.470000] [<4100a6c8>] sys_exit_group+0x0/0x14 > [ 4.470000] [<4100a686>] do_group_exit+0x22/0x64 > [ 4.470000] [<4100a6dc>] pid_child_should_wake+0x0/0x56 > [ 4.470000] [<41005848>] system_call+0x54/0xa8 > [ 4.470000] > [ 4.470000] ---[ end Kernel panic - not syncing: Attempted to kill > init! exitcode=0x00000000 ]--- > > Which is expected. So, I modified hello-dyn a bit: > $> cat hello.c > #include <stdlib.h> > #include <stdio.h> > #include <unistd.h> > > int main() > { > printf("Hello World on coldfire, execute bash !\n"); > /* Let's now execute /bin/bash with execve */ > char *args[] = {"/bin/bash", NULL}; > execve(args[0], args, NULL); > /* If we reach here, it means execve failed */ > perror("execve"); > exit(EXIT_FAILURE); > } > > And now, I get: > Hello World on coldfire, execute bash ! > bash: cannot set terminal process group (-1): Inappropriate ioctl for > device > bash: no job control in this shell > bash-5.2# ls > [ 122.030000] page: refcount:3 mapcount:2 mapping:4206cdc4 index:0x94 > pfn:0x20ab2 > [ 122.030000] aops:0x413b1470 ino:9f dentry name:"libc.so.6" > [ 122.040000] flags: 0x38(uptodate|dirty|lru|zone=0) > [ 122.040000] raw: 00000038 4fed3fec 4fed4208 4206cdc4 00000094 > 00000000 00000001 00000003 > [ 122.050000] raw: 41564000 > [ 122.050000] page dumped because: VM_BUG_ON_FOLIO(ret != false) > [ 122.060000] kernel BUG at fs/ubifs/file.c:1477! > [ 122.060000] *** TRAP #7 *** FORMAT=4 > [ 122.060000] Current process id is 26 > [ 122.060000] BAD KERNEL TRAP: 00000000 > [ 122.060000] PC: [<410fba84>] ubifs_dirty_folio+0x3e/0x42 > [ 122.060000] SR: 2010 SP: 42871cec a2: 41f24510 > [ 122.060000] d0: 00000023 d1: 000000b9 d2: 4fed4108 d3: 601ce000 > [ 122.060000] d4: 41565411 d5: 601d2000 a0: 42871c30 a1: 414d5180 > [ 122.060000] Process bash (pid: 26, task=41f24510) > [ 122.060000] Frame format=4 eff addr=41450e8d pc=414674a3 > <snip> When I printk the do_page_fault first debug, I get for the first call to ls: bash-5.2# ls [ 14.700000] do page fault: [ 14.700000] regs->sr=0x0, regs->pc=0x70069ee6, address=0x70069ee6, 0, (ptrval) This call works almost fine (I still have the assert failed: folio->private != NULL issue). And when I call it a second time, I get: bash-5.2# ls [ 19.820000] do page fault: [ 19.820000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700e2004, 2, (ptrval) The address corresponds to the defined zone ELF_ET_DYN_BASE as I set it to 0x70000000. regs->pc is not the same as the address. It might be unrelevant, but any help is appreciated to understand the process behind :-). I keep digging, and I am in the asm part which fears me a bit ! Thanks ! JM
Jean-Michel, On 24/06/24 20:56, Jean-Michel Hautbois wrote: > > When I printk the do_page_fault first debug, I get for the first call > to ls: > bash-5.2# ls > [ 14.700000] do page fault: > [ 14.700000] regs->sr=0x0, regs->pc=0x70069ee6, address=0x70069ee6, > 0, (ptrval) Page not present, read fault. Please disable obfuscation of kernel pointer addresses by printk. Maybe also disable address space randomization while debugging this. > This call works almost fine (I still have the assert failed: > folio->private != NULL issue). > > And when I call it a second time, I get: > bash-5.2# ls > [ 19.820000] do page fault: > [ 19.820000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700e2004, > 2, (ptrval) Page not present, write fault. It would be helpful if you could get a dump of /proc/1/maps before the execve() syscall in your helloworld init replacement. That might confirm all these addresses are legit (assuming mappings survive across execve(), that is), and what they correspond to. > > The address corresponds to the defined zone ELF_ET_DYN_BASE as I set > it to 0x70000000. > > regs->pc is not the same as the address. It might be unrelevant, but > any help is appreciated to understand the process behind :-). > > I keep digging, and I am in the asm part which fears me a bit ! I don't see that you'd need to look at any asm code here. Cheers, Michael > > Thanks ! > JM
Hi Michael, On 26/06/2024 03:56, Michael Schmitz wrote: > Jean-Michel, > > On 24/06/24 20:56, Jean-Michel Hautbois wrote: >> >> When I printk the do_page_fault first debug, I get for the first call >> to ls: >> bash-5.2# ls >> [ 14.700000] do page fault: >> [ 14.700000] regs->sr=0x0, regs->pc=0x70069ee6, address=0x70069ee6, >> 0, (ptrval) > > Page not present, read fault. Please disable obfuscation of kernel > pointer addresses by printk. Maybe also disable address space > randomization while debugging this. Randomization is disabled, I think I don't have hash pointers... Will heck again, thanks for the explanation. > >> This call works almost fine (I still have the assert failed: >> folio->private != NULL issue). >> >> And when I call it a second time, I get: >> bash-5.2# ls >> [ 19.820000] do page fault: >> [ 19.820000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700e2004, >> 2, (ptrval) > > Page not present, write fault. > > It would be helpful if you could get a dump of /proc/1/maps before the > execve() syscall in your helloworld init replacement. That might confirm > all these addresses are legit (assuming mappings survive across > execve(), that is), and what they correspond to. Executing mount /proc will make it impossible to then call cat /proc/1/maps :-(. Can I enable a printk somewhere in the kernel to do the same maybe ? JM > >> >> The address corresponds to the defined zone ELF_ET_DYN_BASE as I set >> it to 0x70000000. >> >> regs->pc is not the same as the address. It might be unrelevant, but >> any help is appreciated to understand the process behind :-). >> >> I keep digging, and I am in the asm part which fears me a bit ! > > I don't see that you'd need to look at any asm code here. > > Cheers, > > Michael > >> >> Thanks ! >> JM
Michael, On 26/06/2024 03:56, Michael Schmitz wrote: > Jean-Michel, > > On 24/06/24 20:56, Jean-Michel Hautbois wrote: >> >> When I printk the do_page_fault first debug, I get for the first call >> to ls: >> bash-5.2# ls >> [ 14.700000] do page fault: >> [ 14.700000] regs->sr=0x0, regs->pc=0x70069ee6, address=0x70069ee6, >> 0, (ptrval) > > Page not present, read fault. Please disable obfuscation of kernel > pointer addresses by printk. Maybe also disable address space > randomization while debugging this. > >> This call works almost fine (I still have the assert failed: >> folio->private != NULL issue). >> >> And when I call it a second time, I get: >> bash-5.2# ls >> [ 19.820000] do page fault: >> [ 19.820000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700e2004, >> 2, (ptrval) > > Page not present, write fault. > > It would be helpful if you could get a dump of /proc/1/maps before the > execve() syscall in your helloworld init replacement. That might confirm > all these addresses are legit (assuming mappings survive across > execve(), that is), and what they correspond to. I changed the code a bit, and display an error only when error_code is not 0. The result is for the bash launch: [ 3.864000] Run /bin/bash as init process [ 3.865000] with arguments: [ 3.866000] /bin/bash [ 3.868000] with environment: [ 3.869000] HOME=/ [ 3.871000] TERM=linux [ 4.250000] do page fault: [ 4.250000] regs->sr=0x2000, regs->pc=0x41366984, address=0x700b3364, 2, 41fb0000 [ 4.258000] do page fault: [ 4.258000] regs->sr=0x2000, regs->pc=0x41366984, address=0x6002140c, 2, 41fb0000 [ 4.273000] do page fault: [ 4.273000] regs->sr=0x0, regs->pc=0x60011996, address=0x6001ff74, 3, 41fb0000 [ 4.332000] do page fault: [ 4.332000] regs->sr=0x10, regs->pc=0x60012bf4, address=0x700adf44, 3, 41fb0000 [ 4.497000] do page fault: [ 4.497000] regs->sr=0x0, regs->pc=0x60016344, address=0x600592d0, 2, 41fb0000 [ 4.505000] do page fault: [ 4.505000] regs->sr=0x10, regs->pc=0x60007320, address=0x60055f54, 3, 41fb0000 [ 4.533000] do page fault: [ 4.533000] regs->sr=0x0, regs->pc=0x60016344, address=0x60066244, 2, 41fb0000 [ 4.539000] do page fault: [ 4.539000] regs->sr=0x10, regs->pc=0x60007320, address=0x60065f54, 3, 41fb0000 [ 4.635000] do page fault: [ 4.635000] regs->sr=0x0, regs->pc=0x600098d2, address=0x60068014, 2, 41fb0000 [ 4.637000] do page fault: [ 4.637000] regs->sr=0x0, regs->pc=0x60016344, address=0x600a2d1c, 2, 41fb0000 [ 4.647000] do page fault: [ 4.647000] regs->sr=0x10, regs->pc=0x60007320, address=0x600a1f54, 3, 41fb0000 [ 5.163000] do page fault: [ 5.163000] regs->sr=0x0, regs->pc=0x60016344, address=0x601d1d28, 2, 41fb0000 [ 5.175000] do page fault: [ 5.175000] regs->sr=0x10, regs->pc=0x60007320, address=0x601cdf44, 3, 41fb0000 [ 5.212000] do page fault: [ 5.212000] regs->sr=0x14, regs->pc=0x6000b04a, address=0x601ce004, 2, 41fb0000 [ 5.373000] do page fault: [ 5.373000] regs->sr=0x14, regs->pc=0x6000b04a, address=0x60056004, 2, 41fb0000 [ 5.396000] random: crng init done [ 5.401000] do page fault: [ 5.401000] regs->sr=0x14, regs->pc=0x6000b04a, address=0x700ae004, 2, 41fb0000 [ 5.414000] do page fault: [ 5.414000] regs->sr=0x0, regs->pc=0x6000b42a, address=0x700b0000, 2, 41fb0000 [ 5.487000] do page fault: [ 5.487000] regs->sr=0x0, regs->pc=0x6019cbf0, address=0x601d5010, 2, 41fb0000 [ 5.488000] do page fault: [ 5.488000] regs->sr=0x0, regs->pc=0x6019cbf6, address=0x601db288, 2, 41fb0000 [ 5.552000] do page fault: [ 5.552000] regs->sr=0x0, regs->pc=0x600e1032, address=0x601d21d4, 3, 41fb0000 [ 5.569000] do page fault: [ 5.569000] regs->sr=0x0, regs->pc=0x600df938, address=0x700ba8e4, 2, 41fb0000 [ 5.580000] do page fault: [ 5.580000] regs->sr=0x8, regs->pc=0x70038476, address=0x700b4890, 2, 41fb0000 [ 5.634000] do page fault: [ 5.634000] regs->sr=0x0, regs->pc=0x6011cbf2, address=0x700be004, 2, 41fb0000 [ 5.677000] do page fault: [ 5.677000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700c71b4, 2, 41fb0000 [ 5.684000] do page fault: [ 5.684000] regs->sr=0x2004, regs->pc=0x410a724a, address=0x700c0000, 2, 41fb0000 [ 5.686000] do page fault: [ 5.686000] regs->sr=0x2004, regs->pc=0x410a724a, address=0x700c2000, 2, 41fb0000 [ 6.058000] do page fault: [ 6.058000] regs->sr=0x10, regs->pc=0x7004b63c, address=0x700b8d2a, 3, 41fb0000 [ 6.206000] do page fault: [ 6.206000] regs->sr=0x0, regs->pc=0x700365d2, address=0x6005a468, 2, 41fb0000 [ 6.231000] do page fault: [ 6.231000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700c483c, 2, 41fb0000 bash: cannot set terminal process group (-1): Inappropriate ioctl for device bash: no job control in this shell [ 6.327000] do page fault: [ 6.327000] regs->sr=0x4, regs->pc=0x70092344, address=0x700bc410, 3, 41fb0000 [ 6.349000] do page fault: [ 6.349000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700c80f4, 2, 41fb0000 [ 6.438000] do page fault: [ 6.438000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700ca114, 2, 41fb0000 [ 6.452000] do page fault: [ 6.452000] regs->sr=0x8, regs->pc=0x600f28d0, address=0xbffcfd4c, 2, 41fb0000 [ 6.481000] do page fault: [ 6.481000] regs->sr=0x0, regs->pc=0x6008b6a8, address=0xbffcd1a0, 2, 41fb0000 [ 6.493000] do page fault: [ 6.493000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700cc474, 2, 41fb0000 [ 6.516000] do page fault: [ 6.516000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700ce1bc, 2, 41fb0000 [ 6.517000] do page fault: [ 6.517000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700d000c, 2, 41fb0000 [ 6.519000] do page fault: [ 6.519000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700d246c, 2, 41fb0000 [ 6.520000] do page fault: [ 6.520000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700d42bc, 2, 41fb0000 [ 6.528000] do page fault: [ 6.528000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700d6004, 2, 41fb0000 [ 6.543000] do page fault: [ 6.543000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700d8584, 2, 41fb0000 [ 6.544000] do page fault: [ 6.544000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700da3d4, 2, 41fb0000 [ 6.546000] do page fault: [ 6.546000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700dc224, 2, 41fb0000 [ 6.547000] do page fault: [ 6.547000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700de074, 2, 41fb0000 [ 6.576000] do page fault: [ 6.576000] regs->sr=0x0, regs->pc=0x6011ca12, address=0x700e02c4, 2, 41fb0000 bash-5.2# My kernel is loaded at 0x41000000. JM
Hi Michael, On 26/06/2024 03:56, Michael Schmitz wrote: > Jean-Michel, > > On 24/06/24 20:56, Jean-Michel Hautbois wrote: >> >> When I printk the do_page_fault first debug, I get for the first call >> to ls: >> bash-5.2# ls >> [ 14.700000] do page fault: >> [ 14.700000] regs->sr=0x0, regs->pc=0x70069ee6, address=0x70069ee6, >> 0, (ptrval) > > Page not present, read fault. Please disable obfuscation of kernel > pointer addresses by printk. Maybe also disable address space > randomization while debugging this. > >> This call works almost fine (I still have the assert failed: >> folio->private != NULL issue). >> >> And when I call it a second time, I get: >> bash-5.2# ls >> [ 19.820000] do page fault: >> [ 19.820000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700e2004, >> 2, (ptrval) > > Page not present, write fault. > > It would be helpful if you could get a dump of /proc/1/maps before the > execve() syscall in your helloworld init replacement. That might confirm > all these addresses are legit (assuming mappings survive across > execve(), that is), and what they correspond to. > >> >> The address corresponds to the defined zone ELF_ET_DYN_BASE as I set >> it to 0x70000000. >> >> regs->pc is not the same as the address. It might be unrelevant, but >> any help is appreciated to understand the process behind :-). >> >> I keep digging, and I am in the asm part which fears me a bit ! > > I don't see that you'd need to look at any asm code here. I add a small test in do_page_fault, and in case of an error, it panics. The result follows: ./scripts/decode_stacktrace.sh vmlinux < /tmp/trace.log [ 3.857000] Run /bin/bash as init process [ 3.858000] with arguments: [ 3.861000] /bin/bash [ 3.862000] with environment: [ 3.863000] HOME=/ [ 3.864000] TERM=linux [ 4.242000] do page fault: [ 4.242000] regs->sr=0x2000, regs->pc=0x41366924, address=0x700b3364, 2, 41fb0000 [ 4.242000] Kernel panic - not syncing: page fault error [ 4.242000] CPU: 0 PID: 1 Comm: bash Not tainted 6.10.0-rc5-g927da6cf01fe-dirty #25 [ 4.242000] Stack from 4186dda8: [ 4.242000] 4186dda8 41423aa4 41423aa4 700b3300 00000001 00000000 4136ee10 41423aa4 [ 4.242000] 41366d7a 700b3364 700b3364 00000000 0000000d 4186de60 41fb0000 41d51a60 [ 4.242000] 41005696 41416a90 41416a4d 00002000 41366924 700b3364 00000002 41fb0000 [ 4.242000] 0000000a 700b3364 00000000 0000000d 00000012 41d51a00 4186de60 41d51a60 [ 4.242000] 41fb81c0 41d51a60 410052fe 4100529a 4186de60 700b3364 00000002 00000000 [ 4.242000] 700bc414 00000003 00008000 700ac000 41003660 4186de60 00000000 00000000 [ 4.242000] Call Trace: dump_stack (lib/dump_stack.c:124) [ 4.242000] panic (kernel/panic.c:266 kernel/panic.c:368) [ 4.242000] do_page_fault (arch/m68k/mm/fault.c:88 (discriminator 1)) [ 4.242000] __clear_user (arch/m68k/lib/uaccess.c:108) [ 4.242000] buserr_c (arch/m68k/kernel/traps.c:725 arch/m68k/kernel/traps.c:775) [ 4.242000] buserr_c (arch/m68k/kernel/traps.c:748 arch/m68k/kernel/traps.c:775) [ 4.242000] buserr (arch/m68k/kernel/entry.S:116) [ 4.242000] ma_slots (lib/maple_tree.c:759) [ 4.242000] __clear_user (arch/m68k/lib/uaccess.c:108) [ 4.242000] elf_load (fs/binfmt_elf.c:125 (discriminator 1) fs/binfmt_elf.c:421 (discriminator 1)) [ 4.242000] load_elf_binary (fs/binfmt_elf.c:1132) [ 4.242000] memset (arch/m68k/lib/memset.c:11) [ 4.242000] load_misc_binary (fs/binfmt_misc.c:97 fs/binfmt_misc.c:146 fs/binfmt_misc.c:213) [ 4.242000] memset (arch/m68k/lib/memset.c:11) [ 4.242000] bprm_execve (fs/exec.c:1797 fs/exec.c:1839 fs/exec.c:1891 fs/exec.c:1867) [ 4.242000] copy_strings_kernel (fs/exec.c:669) [ 4.242000] count_strings_kernel (fs/exec.c:473) [ 4.242000] kernel_execve (fs/exec.c:2058) [ 4.242000] __dynamic_pr_debug (lib/dynamic_debug.c:865) [ 4.242000] run_init_process (init/main.c:1389) [ 4.242000] _printk (kernel/printk/printk.c:2365) [ 4.242000] kernel_init (init/main.c:1508) [ 4.242000] kernel_init (init/main.c:1459) [ 4.242000] ret_from_kernel_thread (arch/m68k/kernel/entry.S:142) [ 4.242000] [ 4.242000] ---[ end Kernel panic - not syncing: page fault error ]--- Looks like a memory mapping failure, but why ? My JTAG at this point dumps a list of 0s at 0x41fb0000 and my SDRAM starts at 0x40000000 and ends at 0x50000000 (256MB). It looks like a TLB write miss which is obscure to me :-). I tried to use the /proc but as expected it is not alive after mounting it. Thanks, JM > Cheers, > > Michael > >> >> Thanks ! >> JM
Jean-Michel, On 27/06/24 01:28, Jean-Michel Hautbois wrote: > Hi Michael, > > On 26/06/2024 03:56, Michael Schmitz wrote: >> Jean-Michel, >> >> On 24/06/24 20:56, Jean-Michel Hautbois wrote: >>> >>> When I printk the do_page_fault first debug, I get for the first >>> call to ls: >>> bash-5.2# ls >>> [ 14.700000] do page fault: >>> [ 14.700000] regs->sr=0x0, regs->pc=0x70069ee6, >>> address=0x70069ee6, 0, (ptrval) >> >> Page not present, read fault. Please disable obfuscation of kernel >> pointer addresses by printk. Maybe also disable address space >> randomization while debugging this. >> >>> This call works almost fine (I still have the assert failed: >>> folio->private != NULL issue). >>> >>> And when I call it a second time, I get: >>> bash-5.2# ls >>> [ 19.820000] do page fault: >>> [ 19.820000] regs->sr=0x0, regs->pc=0x6011d65a, >>> address=0x700e2004, 2, (ptrval) >> >> Page not present, write fault. >> >> It would be helpful if you could get a dump of /proc/1/maps before >> the execve() syscall in your helloworld init replacement. That might >> confirm all these addresses are legit (assuming mappings survive >> across execve(), that is), and what they correspond to. >> >>> >>> The address corresponds to the defined zone ELF_ET_DYN_BASE as I set >>> it to 0x70000000. >>> >>> regs->pc is not the same as the address. It might be unrelevant, but >>> any help is appreciated to understand the process behind :-). >>> >>> I keep digging, and I am in the asm part which fears me a bit ! >> >> I don't see that you'd need to look at any asm code here. > > I add a small test in do_page_fault, and in case of an error, it > panics. The result follows: Please take a look at the comments at the start of arch/m68k/mm/fault.c:do_page_fault(). The meaning of the bits in error_code are explained there. error_code != 0 is just one possible case out of the four that are handled by do_page_fault(). It does not signify 'no error' - if there hadn't been a page fault, do_page_fault() would not have been called. You just forced a panic each time a write fault and/or a protection fault happens. Write faults are absolutely expected to happen when loading a library - ld.so needs to perform relocation after loading a dynamic library, and that means writes to the GOT in the library's data segment (PIC assumed). > ./scripts/decode_stacktrace.sh vmlinux < /tmp/trace.log > [ 3.857000] Run /bin/bash as init process > [ 3.858000] with arguments: > [ 3.861000] /bin/bash > [ 3.862000] with environment: > [ 3.863000] HOME=/ > [ 3.864000] TERM=linux > [ 4.242000] do page fault: > [ 4.242000] regs->sr=0x2000, regs->pc=0x41366924, > address=0x700b3364, 2, 41fb0000 > [ 4.242000] Kernel panic - not syncing: page fault error > [ 4.242000] CPU: 0 PID: 1 Comm: bash Not tainted > 6.10.0-rc5-g927da6cf01fe-dirty #25 > [ 4.242000] Stack from 4186dda8: > [ 4.242000] 4186dda8 41423aa4 41423aa4 700b3300 00000001 > 00000000 4136ee10 41423aa4 > [ 4.242000] 41366d7a 700b3364 700b3364 00000000 0000000d > 4186de60 41fb0000 41d51a60 > [ 4.242000] 41005696 41416a90 41416a4d 00002000 41366924 > 700b3364 00000002 41fb0000 > [ 4.242000] 0000000a 700b3364 00000000 0000000d 00000012 > 41d51a00 4186de60 41d51a60 > [ 4.242000] 41fb81c0 41d51a60 410052fe 4100529a 4186de60 > 700b3364 00000002 00000000 > [ 4.242000] 700bc414 00000003 00008000 700ac000 41003660 > 4186de60 00000000 00000000 > [ 4.242000] Call Trace: dump_stack (lib/dump_stack.c:124) > [ 4.242000] panic (kernel/panic.c:266 kernel/panic.c:368) > [ 4.242000] do_page_fault (arch/m68k/mm/fault.c:88 (discriminator 1)) > [ 4.242000] __clear_user (arch/m68k/lib/uaccess.c:108) > [ 4.242000] buserr_c (arch/m68k/kernel/traps.c:725 > arch/m68k/kernel/traps.c:775) > [ 4.242000] buserr_c (arch/m68k/kernel/traps.c:748 > arch/m68k/kernel/traps.c:775) > [ 4.242000] buserr (arch/m68k/kernel/entry.S:116) > [ 4.242000] ma_slots (lib/maple_tree.c:759) > [ 4.242000] __clear_user (arch/m68k/lib/uaccess.c:108) > [ 4.242000] elf_load (fs/binfmt_elf.c:125 (discriminator 1) > fs/binfmt_elf.c:421 (discriminator 1)) > [ 4.242000] load_elf_binary (fs/binfmt_elf.c:1132) > [ 4.242000] memset (arch/m68k/lib/memset.c:11) > [ 4.242000] load_misc_binary (fs/binfmt_misc.c:97 > fs/binfmt_misc.c:146 fs/binfmt_misc.c:213) > [ 4.242000] memset (arch/m68k/lib/memset.c:11) > [ 4.242000] bprm_execve (fs/exec.c:1797 fs/exec.c:1839 > fs/exec.c:1891 fs/exec.c:1867) > [ 4.242000] copy_strings_kernel (fs/exec.c:669) > [ 4.242000] count_strings_kernel (fs/exec.c:473) > [ 4.242000] kernel_execve (fs/exec.c:2058) > [ 4.242000] __dynamic_pr_debug (lib/dynamic_debug.c:865) > [ 4.242000] run_init_process (init/main.c:1389) > [ 4.242000] _printk (kernel/printk/printk.c:2365) > [ 4.242000] kernel_init (init/main.c:1508) > [ 4.242000] kernel_init (init/main.c:1459) > [ 4.242000] ret_from_kernel_thread (arch/m68k/kernel/entry.S:142) > [ 4.242000] > [ 4.242000] ---[ end Kernel panic - not syncing: page fault error ]--- > > Looks like a memory mapping failure, but why ? > My JTAG at this point dumps a list of 0s at 0x41fb0000 and my SDRAM > starts at 0x40000000 and ends at 0x50000000 (256MB). 0x41fb0000 seems to be init's page directory. The fault address is in the range where I'd expect dynamic libraries to reside. > > It looks like a TLB write miss which is obscure to me :-). > > I tried to use the /proc but as expected it is not alive after > mounting it. The memory map ought to be accessible through sysrq - an alternative would be to modify the ELF binfmt handler and dump the map once ld.so has finished with relocations. Cheers, Michael > Thanks, > JM > > >> Cheers, >> >> Michael >> >>> >>> Thanks ! >>> JM
Hi Michael, On 26/06/2024 21:36, Michael Schmitz wrote: > Jean-Michel, > > On 27/06/24 01:28, Jean-Michel Hautbois wrote: >> Hi Michael, >> >> On 26/06/2024 03:56, Michael Schmitz wrote: >>> Jean-Michel, >>> >>> On 24/06/24 20:56, Jean-Michel Hautbois wrote: >>>> >>>> When I printk the do_page_fault first debug, I get for the first >>>> call to ls: >>>> bash-5.2# ls >>>> [ 14.700000] do page fault: >>>> [ 14.700000] regs->sr=0x0, regs->pc=0x70069ee6, >>>> address=0x70069ee6, 0, (ptrval) >>> >>> Page not present, read fault. Please disable obfuscation of kernel >>> pointer addresses by printk. Maybe also disable address space >>> randomization while debugging this. >>> >>>> This call works almost fine (I still have the assert failed: >>>> folio->private != NULL issue). >>>> >>>> And when I call it a second time, I get: >>>> bash-5.2# ls >>>> [ 19.820000] do page fault: >>>> [ 19.820000] regs->sr=0x0, regs->pc=0x6011d65a, >>>> address=0x700e2004, 2, (ptrval) >>> >>> Page not present, write fault. >>> >>> It would be helpful if you could get a dump of /proc/1/maps before >>> the execve() syscall in your helloworld init replacement. That might >>> confirm all these addresses are legit (assuming mappings survive >>> across execve(), that is), and what they correspond to. >>> >>>> >>>> The address corresponds to the defined zone ELF_ET_DYN_BASE as I set >>>> it to 0x70000000. >>>> >>>> regs->pc is not the same as the address. It might be unrelevant, but >>>> any help is appreciated to understand the process behind :-). >>>> >>>> I keep digging, and I am in the asm part which fears me a bit ! >>> >>> I don't see that you'd need to look at any asm code here. >> >> I add a small test in do_page_fault, and in case of an error, it >> panics. The result follows: > > Please take a look at the comments at the start of > arch/m68k/mm/fault.c:do_page_fault(). The meaning of the bits in > error_code are explained there. > > error_code != 0 is just one possible case out of the four that are > handled by do_page_fault(). It does not signify 'no error' - if there > hadn't been a page fault, do_page_fault() would not have been called. > > You just forced a panic each time a write fault and/or a protection > fault happens. Write faults are absolutely expected to happen when > loading a library - ld.so needs to perform relocation after loading a > dynamic library, and that means writes to the GOT in the library's data > segment (PIC assumed). I feel dumb -_- thanks for the explanation (I *knew* that page faults are expected, but as I couldn't not see when it works or not, I kept focus on this silly part). > > >> ./scripts/decode_stacktrace.sh vmlinux < /tmp/trace.log >> [ 3.857000] Run /bin/bash as init process >> [ 3.858000] with arguments: >> [ 3.861000] /bin/bash >> [ 3.862000] with environment: >> [ 3.863000] HOME=/ >> [ 3.864000] TERM=linux >> [ 4.242000] do page fault: >> [ 4.242000] regs->sr=0x2000, regs->pc=0x41366924, >> address=0x700b3364, 2, 41fb0000 >> [ 4.242000] Kernel panic - not syncing: page fault error >> [ 4.242000] CPU: 0 PID: 1 Comm: bash Not tainted >> 6.10.0-rc5-g927da6cf01fe-dirty #25 >> [ 4.242000] Stack from 4186dda8: >> [ 4.242000] 4186dda8 41423aa4 41423aa4 700b3300 00000001 >> 00000000 4136ee10 41423aa4 >> [ 4.242000] 41366d7a 700b3364 700b3364 00000000 0000000d >> 4186de60 41fb0000 41d51a60 >> [ 4.242000] 41005696 41416a90 41416a4d 00002000 41366924 >> 700b3364 00000002 41fb0000 >> [ 4.242000] 0000000a 700b3364 00000000 0000000d 00000012 >> 41d51a00 4186de60 41d51a60 >> [ 4.242000] 41fb81c0 41d51a60 410052fe 4100529a 4186de60 >> 700b3364 00000002 00000000 >> [ 4.242000] 700bc414 00000003 00008000 700ac000 41003660 >> 4186de60 00000000 00000000 >> [ 4.242000] Call Trace: dump_stack (lib/dump_stack.c:124) >> [ 4.242000] panic (kernel/panic.c:266 kernel/panic.c:368) >> [ 4.242000] do_page_fault (arch/m68k/mm/fault.c:88 (discriminator 1)) >> [ 4.242000] __clear_user (arch/m68k/lib/uaccess.c:108) >> [ 4.242000] buserr_c (arch/m68k/kernel/traps.c:725 >> arch/m68k/kernel/traps.c:775) >> [ 4.242000] buserr_c (arch/m68k/kernel/traps.c:748 >> arch/m68k/kernel/traps.c:775) >> [ 4.242000] buserr (arch/m68k/kernel/entry.S:116) >> [ 4.242000] ma_slots (lib/maple_tree.c:759) >> [ 4.242000] __clear_user (arch/m68k/lib/uaccess.c:108) >> [ 4.242000] elf_load (fs/binfmt_elf.c:125 (discriminator 1) >> fs/binfmt_elf.c:421 (discriminator 1)) >> [ 4.242000] load_elf_binary (fs/binfmt_elf.c:1132) >> [ 4.242000] memset (arch/m68k/lib/memset.c:11) >> [ 4.242000] load_misc_binary (fs/binfmt_misc.c:97 >> fs/binfmt_misc.c:146 fs/binfmt_misc.c:213) >> [ 4.242000] memset (arch/m68k/lib/memset.c:11) >> [ 4.242000] bprm_execve (fs/exec.c:1797 fs/exec.c:1839 >> fs/exec.c:1891 fs/exec.c:1867) >> [ 4.242000] copy_strings_kernel (fs/exec.c:669) >> [ 4.242000] count_strings_kernel (fs/exec.c:473) >> [ 4.242000] kernel_execve (fs/exec.c:2058) >> [ 4.242000] __dynamic_pr_debug (lib/dynamic_debug.c:865) >> [ 4.242000] run_init_process (init/main.c:1389) >> [ 4.242000] _printk (kernel/printk/printk.c:2365) >> [ 4.242000] kernel_init (init/main.c:1508) >> [ 4.242000] kernel_init (init/main.c:1459) >> [ 4.242000] ret_from_kernel_thread (arch/m68k/kernel/entry.S:142) >> [ 4.242000] >> [ 4.242000] ---[ end Kernel panic - not syncing: page fault error ]--- >> >> Looks like a memory mapping failure, but why ? >> My JTAG at this point dumps a list of 0s at 0x41fb0000 and my SDRAM >> starts at 0x40000000 and ends at 0x50000000 (256MB). > 0x41fb0000 seems to be init's page directory. The fault address is in > the range where I'd expect dynamic libraries to reside. How can I check it is init's page directory ? >> >> It looks like a TLB write miss which is obscure to me :-). >> >> I tried to use the /proc but as expected it is not alive after >> mounting it. > > The memory map ought to be accessible through sysrq - an alternative > would be to modify the ELF binfmt handler and dump the map once ld.so > has finished with relocations. I can use sysrq but AFAIK I can show the memory info, not the memory map ? Modifying the ELF binfmt handler might be doable... JM > > Cheers, > > Michael > > >> Thanks, >> JM >> >> >>> Cheers, >>> >>> Michael >>> >>>> >>>> Thanks ! >>>> JM
Michael, On 26/06/2024 21:36, Michael Schmitz wrote: > Jean-Michel, > > On 27/06/24 01:28, Jean-Michel Hautbois wrote: >> Hi Michael, >> >> On 26/06/2024 03:56, Michael Schmitz wrote: >>> Jean-Michel, >>> >>> On 24/06/24 20:56, Jean-Michel Hautbois wrote: >>>> >>>> When I printk the do_page_fault first debug, I get for the first >>>> call to ls: >>>> bash-5.2# ls >>>> [ 14.700000] do page fault: >>>> [ 14.700000] regs->sr=0x0, regs->pc=0x70069ee6, >>>> address=0x70069ee6, 0, (ptrval) >>> >>> Page not present, read fault. Please disable obfuscation of kernel >>> pointer addresses by printk. Maybe also disable address space >>> randomization while debugging this. >>> >>>> This call works almost fine (I still have the assert failed: >>>> folio->private != NULL issue). >>>> >>>> And when I call it a second time, I get: >>>> bash-5.2# ls >>>> [ 19.820000] do page fault: >>>> [ 19.820000] regs->sr=0x0, regs->pc=0x6011d65a, >>>> address=0x700e2004, 2, (ptrval) >>> >>> Page not present, write fault. >>> >>> It would be helpful if you could get a dump of /proc/1/maps before >>> the execve() syscall in your helloworld init replacement. That might >>> confirm all these addresses are legit (assuming mappings survive >>> across execve(), that is), and what they correspond to. >>> >>>> >>>> The address corresponds to the defined zone ELF_ET_DYN_BASE as I set >>>> it to 0x70000000. >>>> >>>> regs->pc is not the same as the address. It might be unrelevant, but >>>> any help is appreciated to understand the process behind :-). >>>> >>>> I keep digging, and I am in the asm part which fears me a bit ! >>> >>> I don't see that you'd need to look at any asm code here. >> >> I add a small test in do_page_fault, and in case of an error, it >> panics. The result follows: > > Please take a look at the comments at the start of > arch/m68k/mm/fault.c:do_page_fault(). The meaning of the bits in > error_code are explained there. > > error_code != 0 is just one possible case out of the four that are > handled by do_page_fault(). It does not signify 'no error' - if there > hadn't been a page fault, do_page_fault() would not have been called. > > You just forced a panic each time a write fault and/or a protection > fault happens. Write faults are absolutely expected to happen when > loading a library - ld.so needs to perform relocation after loading a > dynamic library, and that means writes to the GOT in the library's data > segment (PIC assumed). > > >> ./scripts/decode_stacktrace.sh vmlinux < /tmp/trace.log >> [ 3.857000] Run /bin/bash as init process >> [ 3.858000] with arguments: >> [ 3.861000] /bin/bash >> [ 3.862000] with environment: >> [ 3.863000] HOME=/ >> [ 3.864000] TERM=linux >> [ 4.242000] do page fault: >> [ 4.242000] regs->sr=0x2000, regs->pc=0x41366924, >> address=0x700b3364, 2, 41fb0000 >> [ 4.242000] Kernel panic - not syncing: page fault error >> [ 4.242000] CPU: 0 PID: 1 Comm: bash Not tainted >> 6.10.0-rc5-g927da6cf01fe-dirty #25 >> [ 4.242000] Stack from 4186dda8: >> [ 4.242000] 4186dda8 41423aa4 41423aa4 700b3300 00000001 >> 00000000 4136ee10 41423aa4 >> [ 4.242000] 41366d7a 700b3364 700b3364 00000000 0000000d >> 4186de60 41fb0000 41d51a60 >> [ 4.242000] 41005696 41416a90 41416a4d 00002000 41366924 >> 700b3364 00000002 41fb0000 >> [ 4.242000] 0000000a 700b3364 00000000 0000000d 00000012 >> 41d51a00 4186de60 41d51a60 >> [ 4.242000] 41fb81c0 41d51a60 410052fe 4100529a 4186de60 >> 700b3364 00000002 00000000 >> [ 4.242000] 700bc414 00000003 00008000 700ac000 41003660 >> 4186de60 00000000 00000000 >> [ 4.242000] Call Trace: dump_stack (lib/dump_stack.c:124) >> [ 4.242000] panic (kernel/panic.c:266 kernel/panic.c:368) >> [ 4.242000] do_page_fault (arch/m68k/mm/fault.c:88 (discriminator 1)) >> [ 4.242000] __clear_user (arch/m68k/lib/uaccess.c:108) >> [ 4.242000] buserr_c (arch/m68k/kernel/traps.c:725 >> arch/m68k/kernel/traps.c:775) >> [ 4.242000] buserr_c (arch/m68k/kernel/traps.c:748 >> arch/m68k/kernel/traps.c:775) >> [ 4.242000] buserr (arch/m68k/kernel/entry.S:116) >> [ 4.242000] ma_slots (lib/maple_tree.c:759) >> [ 4.242000] __clear_user (arch/m68k/lib/uaccess.c:108) >> [ 4.242000] elf_load (fs/binfmt_elf.c:125 (discriminator 1) >> fs/binfmt_elf.c:421 (discriminator 1)) >> [ 4.242000] load_elf_binary (fs/binfmt_elf.c:1132) >> [ 4.242000] memset (arch/m68k/lib/memset.c:11) >> [ 4.242000] load_misc_binary (fs/binfmt_misc.c:97 >> fs/binfmt_misc.c:146 fs/binfmt_misc.c:213) >> [ 4.242000] memset (arch/m68k/lib/memset.c:11) >> [ 4.242000] bprm_execve (fs/exec.c:1797 fs/exec.c:1839 >> fs/exec.c:1891 fs/exec.c:1867) >> [ 4.242000] copy_strings_kernel (fs/exec.c:669) >> [ 4.242000] count_strings_kernel (fs/exec.c:473) >> [ 4.242000] kernel_execve (fs/exec.c:2058) >> [ 4.242000] __dynamic_pr_debug (lib/dynamic_debug.c:865) >> [ 4.242000] run_init_process (init/main.c:1389) >> [ 4.242000] _printk (kernel/printk/printk.c:2365) >> [ 4.242000] kernel_init (init/main.c:1508) >> [ 4.242000] kernel_init (init/main.c:1459) >> [ 4.242000] ret_from_kernel_thread (arch/m68k/kernel/entry.S:142) >> [ 4.242000] >> [ 4.242000] ---[ end Kernel panic - not syncing: page fault error ]--- >> >> Looks like a memory mapping failure, but why ? >> My JTAG at this point dumps a list of 0s at 0x41fb0000 and my SDRAM >> starts at 0x40000000 and ends at 0x50000000 (256MB). > 0x41fb0000 seems to be init's page directory. The fault address is in > the range where I'd expect dynamic libraries to reside. >> >> It looks like a TLB write miss which is obscure to me :-). >> >> I tried to use the /proc but as expected it is not alive after >> mounting it. > > The memory map ought to be accessible through sysrq - an alternative > would be to modify the ELF binfmt handler and dump the map once ld.so > has finished with relocations. I added a dump in the binfmt_elf file: diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index a43897b03ce9..395f556f3a90 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -816,6 +816,63 @@ static int parse_elf_properties(struct file *f, const struct elf_phdr *phdr, return ret == -ENOENT ? 0 : ret; } +static int dump_memory_map(struct task_struct *task) +{ + struct mm_struct *mm = task->mm; + struct vm_area_struct *vma; + MA_STATE(mas, &mm->mm_mt, 0, -1); + struct file *file; + struct path *path; + char *buf; + char *pathname; + + // Acquire the read lock for mmap_lock + down_read(&mm->mmap_lock); + mas_lock(&mas); + for (vma = mas_find(&mas, ULONG_MAX); vma; vma = mas_find(&mas, ULONG_MAX)) { + if (vma->vm_file) { + buf = (char *)__get_free_page(GFP_KERNEL); + if (!buf) { + continue; // Handle memory allocation failure + } + + file = vma->vm_file; + path = &file->f_path; + pathname = d_path(path, buf, PAGE_SIZE); + if (IS_ERR(pathname)) { + pathname = NULL; + } + + pr_info("%lx-%lx %c%c%c%c %08lx %02x:%02x %lu %s\n", + vma->vm_start, vma->vm_end, + vma->vm_flags & VM_READ ? 'r' : '-', + vma->vm_flags & VM_WRITE ? 'w' : '-', + vma->vm_flags & VM_EXEC ? 'x' : '-', + vma->vm_flags & VM_MAYSHARE ? 's' : 'p', + vma->vm_pgoff << PAGE_SHIFT, + MAJOR(file->f_inode->i_rdev), + MINOR(file->f_inode->i_rdev), + file->f_inode->i_ino, + pathname ? pathname : ""); + + free_page((unsigned long)buf); + } else { + pr_info("%lx-%lx %c%c%c%c %08lx 00:00 0\n", + vma->vm_start, vma->vm_end, + vma->vm_flags & VM_READ ? 'r' : '-', + vma->vm_flags & VM_WRITE ? 'w' : '-', + vma->vm_flags & VM_EXEC ? 'x' : '-', + vma->vm_flags & VM_MAYSHARE ? 's' : 'p', + vma->vm_pgoff << PAGE_SHIFT); + } + } + mas_unlock(&mas); + // Release the read lock for mmap_lock + up_read(&mm->mmap_lock); + + return 0; +} + static int load_elf_binary(struct linux_binprm *bprm) { struct file *interpreter = NULL; /* to shut gcc up */ @@ -1299,6 +1356,9 @@ static int load_elf_binary(struct linux_binprm *bprm) finalize_exec(bprm); START_THREAD(elf_ex, regs, elf_entry, bprm->p); + if (current->pid == 1) { // Check if this is the init process + dump_memory_map(current); + } retval = 0; out: return retval; I think it is quick and dirty, but seems to do the trick. I then get in my console: [ 4.265000] 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 [ 4.266000] 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 [ 4.267000] 70000000-700ac000 r-xp 00000000 00:00 27 /bin/bash [ 4.268000] 700ac000-700b4000 rw-p 000ac000 00:00 27 /bin/bash [ 4.269000] 700b4000-700be000 rwxp 700b4000 00:00 0 [ 4.270000] bfe7a000-bfe9c000 rw-p bffde000 00:00 0 But nothing rings a bell at this level for me... Thanks !
Hi JM, On 27/6/24 22:36, Jean-Michel Hautbois wrote: > Michael, > > On 26/06/2024 21:36, Michael Schmitz wrote: >> Jean-Michel, >> >> On 27/06/24 01:28, Jean-Michel Hautbois wrote: >>> Hi Michael, >>> >>> On 26/06/2024 03:56, Michael Schmitz wrote: >>>> Jean-Michel, >>>> >>>> On 24/06/24 20:56, Jean-Michel Hautbois wrote: >>>>> >>>>> When I printk the do_page_fault first debug, I get for the first call to ls: >>>>> bash-5.2# ls >>>>> [ 14.700000] do page fault: >>>>> [ 14.700000] regs->sr=0x0, regs->pc=0x70069ee6, address=0x70069ee6, 0, (ptrval) >>>> >>>> Page not present, read fault. Please disable obfuscation of kernel pointer addresses by printk. Maybe also disable address space randomization while debugging this. >>>> >>>>> This call works almost fine (I still have the assert failed: folio->private != NULL issue). >>>>> >>>>> And when I call it a second time, I get: >>>>> bash-5.2# ls >>>>> [ 19.820000] do page fault: >>>>> [ 19.820000] regs->sr=0x0, regs->pc=0x6011d65a, address=0x700e2004, 2, (ptrval) >>>> >>>> Page not present, write fault. >>>> >>>> It would be helpful if you could get a dump of /proc/1/maps before the execve() syscall in your helloworld init replacement. That might confirm all these addresses are legit (assuming mappings survive across execve(), that is), and what they correspond to. >>>> >>>>> >>>>> The address corresponds to the defined zone ELF_ET_DYN_BASE as I set it to 0x70000000. >>>>> >>>>> regs->pc is not the same as the address. It might be unrelevant, but any help is appreciated to understand the process behind :-). >>>>> >>>>> I keep digging, and I am in the asm part which fears me a bit ! >>>> >>>> I don't see that you'd need to look at any asm code here. >>> >>> I add a small test in do_page_fault, and in case of an error, it panics. The result follows: >> >> Please take a look at the comments at the start of arch/m68k/mm/fault.c:do_page_fault(). The meaning of the bits in error_code are explained there. >> >> error_code != 0 is just one possible case out of the four that are handled by do_page_fault(). It does not signify 'no error' - if there hadn't been a page fault, do_page_fault() would not have been called. >> >> You just forced a panic each time a write fault and/or a protection fault happens. Write faults are absolutely expected to happen when loading a library - ld.so needs to perform relocation after loading a dynamic library, and that means writes to the GOT in the library's data segment (PIC assumed). >> >> >>> ./scripts/decode_stacktrace.sh vmlinux < /tmp/trace.log >>> [ 3.857000] Run /bin/bash as init process >>> [ 3.858000] with arguments: >>> [ 3.861000] /bin/bash >>> [ 3.862000] with environment: >>> [ 3.863000] HOME=/ >>> [ 3.864000] TERM=linux >>> [ 4.242000] do page fault: >>> [ 4.242000] regs->sr=0x2000, regs->pc=0x41366924, address=0x700b3364, 2, 41fb0000 >>> [ 4.242000] Kernel panic - not syncing: page fault error >>> [ 4.242000] CPU: 0 PID: 1 Comm: bash Not tainted 6.10.0-rc5-g927da6cf01fe-dirty #25 >>> [ 4.242000] Stack from 4186dda8: >>> [ 4.242000] 4186dda8 41423aa4 41423aa4 700b3300 00000001 00000000 4136ee10 41423aa4 >>> [ 4.242000] 41366d7a 700b3364 700b3364 00000000 0000000d 4186de60 41fb0000 41d51a60 >>> [ 4.242000] 41005696 41416a90 41416a4d 00002000 41366924 700b3364 00000002 41fb0000 >>> [ 4.242000] 0000000a 700b3364 00000000 0000000d 00000012 41d51a00 4186de60 41d51a60 >>> [ 4.242000] 41fb81c0 41d51a60 410052fe 4100529a 4186de60 700b3364 00000002 00000000 >>> [ 4.242000] 700bc414 00000003 00008000 700ac000 41003660 4186de60 00000000 00000000 >>> [ 4.242000] Call Trace: dump_stack (lib/dump_stack.c:124) >>> [ 4.242000] panic (kernel/panic.c:266 kernel/panic.c:368) >>> [ 4.242000] do_page_fault (arch/m68k/mm/fault.c:88 (discriminator 1)) >>> [ 4.242000] __clear_user (arch/m68k/lib/uaccess.c:108) >>> [ 4.242000] buserr_c (arch/m68k/kernel/traps.c:725 arch/m68k/kernel/traps.c:775) >>> [ 4.242000] buserr_c (arch/m68k/kernel/traps.c:748 arch/m68k/kernel/traps.c:775) >>> [ 4.242000] buserr (arch/m68k/kernel/entry.S:116) >>> [ 4.242000] ma_slots (lib/maple_tree.c:759) >>> [ 4.242000] __clear_user (arch/m68k/lib/uaccess.c:108) >>> [ 4.242000] elf_load (fs/binfmt_elf.c:125 (discriminator 1) fs/binfmt_elf.c:421 (discriminator 1)) >>> [ 4.242000] load_elf_binary (fs/binfmt_elf.c:1132) >>> [ 4.242000] memset (arch/m68k/lib/memset.c:11) >>> [ 4.242000] load_misc_binary (fs/binfmt_misc.c:97 fs/binfmt_misc.c:146 fs/binfmt_misc.c:213) >>> [ 4.242000] memset (arch/m68k/lib/memset.c:11) >>> [ 4.242000] bprm_execve (fs/exec.c:1797 fs/exec.c:1839 fs/exec.c:1891 fs/exec.c:1867) >>> [ 4.242000] copy_strings_kernel (fs/exec.c:669) >>> [ 4.242000] count_strings_kernel (fs/exec.c:473) >>> [ 4.242000] kernel_execve (fs/exec.c:2058) >>> [ 4.242000] __dynamic_pr_debug (lib/dynamic_debug.c:865) >>> [ 4.242000] run_init_process (init/main.c:1389) >>> [ 4.242000] _printk (kernel/printk/printk.c:2365) >>> [ 4.242000] kernel_init (init/main.c:1508) >>> [ 4.242000] kernel_init (init/main.c:1459) >>> [ 4.242000] ret_from_kernel_thread (arch/m68k/kernel/entry.S:142) >>> [ 4.242000] >>> [ 4.242000] ---[ end Kernel panic - not syncing: page fault error ]--- >>> >>> Looks like a memory mapping failure, but why ? >>> My JTAG at this point dumps a list of 0s at 0x41fb0000 and my SDRAM starts at 0x40000000 and ends at 0x50000000 (256MB). >> 0x41fb0000 seems to be init's page directory. The fault address is in the range where I'd expect dynamic libraries to reside. >>> >>> It looks like a TLB write miss which is obscure to me :-). >>> >>> I tried to use the /proc but as expected it is not alive after mounting it. >> >> The memory map ought to be accessible through sysrq - an alternative would be to modify the ELF binfmt handler and dump the map once ld.so has finished with relocations. > > I added a dump in the binfmt_elf file: > diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c > index a43897b03ce9..395f556f3a90 100644 > --- a/fs/binfmt_elf.c > +++ b/fs/binfmt_elf.c > @@ -816,6 +816,63 @@ static int parse_elf_properties(struct file *f, const struct elf_phdr *phdr, > return ret == -ENOENT ? 0 : ret; > } > > +static int dump_memory_map(struct task_struct *task) > +{ > + struct mm_struct *mm = task->mm; > + struct vm_area_struct *vma; > + MA_STATE(mas, &mm->mm_mt, 0, -1); > + struct file *file; > + struct path *path; > + char *buf; > + char *pathname; > + > + // Acquire the read lock for mmap_lock > + down_read(&mm->mmap_lock); > + mas_lock(&mas); > + for (vma = mas_find(&mas, ULONG_MAX); vma; vma = mas_find(&mas, ULONG_MAX)) { > + if (vma->vm_file) { > + buf = (char *)__get_free_page(GFP_KERNEL); > + if (!buf) { > + continue; // Handle memory allocation failure > + } > + > + file = vma->vm_file; > + path = &file->f_path; > + pathname = d_path(path, buf, PAGE_SIZE); > + if (IS_ERR(pathname)) { > + pathname = NULL; > + } > + > + pr_info("%lx-%lx %c%c%c%c %08lx %02x:%02x %lu %s\n", > + vma->vm_start, vma->vm_end, > + vma->vm_flags & VM_READ ? 'r' : '-', > + vma->vm_flags & VM_WRITE ? 'w' : '-', > + vma->vm_flags & VM_EXEC ? 'x' : '-', > + vma->vm_flags & VM_MAYSHARE ? 's' : 'p', > + vma->vm_pgoff << PAGE_SHIFT, > + MAJOR(file->f_inode->i_rdev), > + MINOR(file->f_inode->i_rdev), > + file->f_inode->i_ino, > + pathname ? pathname : ""); > + > + free_page((unsigned long)buf); > + } else { > + pr_info("%lx-%lx %c%c%c%c %08lx 00:00 0\n", > + vma->vm_start, vma->vm_end, > + vma->vm_flags & VM_READ ? 'r' : '-', > + vma->vm_flags & VM_WRITE ? 'w' : '-', > + vma->vm_flags & VM_EXEC ? 'x' : '-', > + vma->vm_flags & VM_MAYSHARE ? 's' : 'p', > + vma->vm_pgoff << PAGE_SHIFT); > + } > + } > + mas_unlock(&mas); > + // Release the read lock for mmap_lock > + up_read(&mm->mmap_lock); > + > + return 0; > +} > + > static int load_elf_binary(struct linux_binprm *bprm) > { > struct file *interpreter = NULL; /* to shut gcc up */ > @@ -1299,6 +1356,9 @@ static int load_elf_binary(struct linux_binprm *bprm) > > finalize_exec(bprm); > START_THREAD(elf_ex, regs, elf_entry, bprm->p); > + if (current->pid == 1) { // Check if this is the init process > + dump_memory_map(current); > + } > retval = 0; > out: > return retval; > > I think it is quick and dirty, but seems to do the trick. > I then get in my console: > [ 4.265000] 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 > [ 4.266000] 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 > [ 4.267000] 70000000-700ac000 r-xp 00000000 00:00 27 /bin/bash > [ 4.268000] 700ac000-700b4000 rw-p 000ac000 00:00 27 /bin/bash > [ 4.269000] 700b4000-700be000 rwxp 700b4000 00:00 0 > [ 4.270000] bfe7a000-bfe9c000 rw-p bffde000 00:00 0 > > But nothing rings a bell at this level for me... > Thanks ! Here is the same dump trace generated on my newly resurrected M5475EVB for comparison: [snip] Freeing unused kernel image (initmem) memory: 80K This architecture does not have kernel memory protection. Run /sbin/init as init process Run /etc/init as init process Run /bin/init as init process process '/bin/init' started with executable stack 60000000-60008000 r-xp 00000000 00:00 550544 /lib/ld-uClibc-0.9.33.2.so 60008000-6000c000 rw-p 00006000 00:00 550544 /lib/ld-uClibc-0.9.33.2.so 80000000-80004000 r-xp 00000000 00:00 1882624 /bin/init 80004000-80008000 rw-p 00002000 00:00 1882624 /bin/init bfc9a000-bfcbc000 rwxp bffde000 00:00 0 Welcome to ... Execution otherwise continues as normal to a shell after this. Regards Greg
Hi Greg, On 27/06/2024 16:46, Greg Ungerer wrote: > Hi JM, > > On 27/6/24 22:36, Jean-Michel Hautbois wrote: >> Michael, >> >> On 26/06/2024 21:36, Michael Schmitz wrote: >>> Jean-Michel, >>> >>> On 27/06/24 01:28, Jean-Michel Hautbois wrote: >>>> Hi Michael, >>>> >>>> On 26/06/2024 03:56, Michael Schmitz wrote: >>>>> Jean-Michel, >>>>> >>>>> On 24/06/24 20:56, Jean-Michel Hautbois wrote: >>>>>> >>>>>> When I printk the do_page_fault first debug, I get for the first >>>>>> call to ls: >>>>>> bash-5.2# ls >>>>>> [ 14.700000] do page fault: >>>>>> [ 14.700000] regs->sr=0x0, regs->pc=0x70069ee6, >>>>>> address=0x70069ee6, 0, (ptrval) >>>>> >>>>> Page not present, read fault. Please disable obfuscation of kernel >>>>> pointer addresses by printk. Maybe also disable address space >>>>> randomization while debugging this. >>>>> >>>>>> This call works almost fine (I still have the assert failed: >>>>>> folio->private != NULL issue). >>>>>> >>>>>> And when I call it a second time, I get: >>>>>> bash-5.2# ls >>>>>> [ 19.820000] do page fault: >>>>>> [ 19.820000] regs->sr=0x0, regs->pc=0x6011d65a, >>>>>> address=0x700e2004, 2, (ptrval) >>>>> >>>>> Page not present, write fault. >>>>> >>>>> It would be helpful if you could get a dump of /proc/1/maps before >>>>> the execve() syscall in your helloworld init replacement. That >>>>> might confirm all these addresses are legit (assuming mappings >>>>> survive across execve(), that is), and what they correspond to. >>>>> >>>>>> >>>>>> The address corresponds to the defined zone ELF_ET_DYN_BASE as I >>>>>> set it to 0x70000000. >>>>>> >>>>>> regs->pc is not the same as the address. It might be unrelevant, >>>>>> but any help is appreciated to understand the process behind :-). >>>>>> >>>>>> I keep digging, and I am in the asm part which fears me a bit ! >>>>> >>>>> I don't see that you'd need to look at any asm code here. >>>> >>>> I add a small test in do_page_fault, and in case of an error, it >>>> panics. The result follows: >>> >>> Please take a look at the comments at the start of >>> arch/m68k/mm/fault.c:do_page_fault(). The meaning of the bits in >>> error_code are explained there. >>> >>> error_code != 0 is just one possible case out of the four that are >>> handled by do_page_fault(). It does not signify 'no error' - if there >>> hadn't been a page fault, do_page_fault() would not have been called. >>> >>> You just forced a panic each time a write fault and/or a protection >>> fault happens. Write faults are absolutely expected to happen when >>> loading a library - ld.so needs to perform relocation after loading a >>> dynamic library, and that means writes to the GOT in the library's >>> data segment (PIC assumed). >>> >>> >>>> ./scripts/decode_stacktrace.sh vmlinux < /tmp/trace.log >>>> [ 3.857000] Run /bin/bash as init process >>>> [ 3.858000] with arguments: >>>> [ 3.861000] /bin/bash >>>> [ 3.862000] with environment: >>>> [ 3.863000] HOME=/ >>>> [ 3.864000] TERM=linux >>>> [ 4.242000] do page fault: >>>> [ 4.242000] regs->sr=0x2000, regs->pc=0x41366924, >>>> address=0x700b3364, 2, 41fb0000 >>>> [ 4.242000] Kernel panic - not syncing: page fault error >>>> [ 4.242000] CPU: 0 PID: 1 Comm: bash Not tainted >>>> 6.10.0-rc5-g927da6cf01fe-dirty #25 >>>> [ 4.242000] Stack from 4186dda8: >>>> [ 4.242000] 4186dda8 41423aa4 41423aa4 700b3300 00000001 >>>> 00000000 4136ee10 41423aa4 >>>> [ 4.242000] 41366d7a 700b3364 700b3364 00000000 0000000d >>>> 4186de60 41fb0000 41d51a60 >>>> [ 4.242000] 41005696 41416a90 41416a4d 00002000 41366924 >>>> 700b3364 00000002 41fb0000 >>>> [ 4.242000] 0000000a 700b3364 00000000 0000000d 00000012 >>>> 41d51a00 4186de60 41d51a60 >>>> [ 4.242000] 41fb81c0 41d51a60 410052fe 4100529a 4186de60 >>>> 700b3364 00000002 00000000 >>>> [ 4.242000] 700bc414 00000003 00008000 700ac000 41003660 >>>> 4186de60 00000000 00000000 >>>> [ 4.242000] Call Trace: dump_stack (lib/dump_stack.c:124) >>>> [ 4.242000] panic (kernel/panic.c:266 kernel/panic.c:368) >>>> [ 4.242000] do_page_fault (arch/m68k/mm/fault.c:88 (discriminator >>>> 1)) >>>> [ 4.242000] __clear_user (arch/m68k/lib/uaccess.c:108) >>>> [ 4.242000] buserr_c (arch/m68k/kernel/traps.c:725 >>>> arch/m68k/kernel/traps.c:775) >>>> [ 4.242000] buserr_c (arch/m68k/kernel/traps.c:748 >>>> arch/m68k/kernel/traps.c:775) >>>> [ 4.242000] buserr (arch/m68k/kernel/entry.S:116) >>>> [ 4.242000] ma_slots (lib/maple_tree.c:759) >>>> [ 4.242000] __clear_user (arch/m68k/lib/uaccess.c:108) >>>> [ 4.242000] elf_load (fs/binfmt_elf.c:125 (discriminator 1) >>>> fs/binfmt_elf.c:421 (discriminator 1)) >>>> [ 4.242000] load_elf_binary (fs/binfmt_elf.c:1132) >>>> [ 4.242000] memset (arch/m68k/lib/memset.c:11) >>>> [ 4.242000] load_misc_binary (fs/binfmt_misc.c:97 >>>> fs/binfmt_misc.c:146 fs/binfmt_misc.c:213) >>>> [ 4.242000] memset (arch/m68k/lib/memset.c:11) >>>> [ 4.242000] bprm_execve (fs/exec.c:1797 fs/exec.c:1839 >>>> fs/exec.c:1891 fs/exec.c:1867) >>>> [ 4.242000] copy_strings_kernel (fs/exec.c:669) >>>> [ 4.242000] count_strings_kernel (fs/exec.c:473) >>>> [ 4.242000] kernel_execve (fs/exec.c:2058) >>>> [ 4.242000] __dynamic_pr_debug (lib/dynamic_debug.c:865) >>>> [ 4.242000] run_init_process (init/main.c:1389) >>>> [ 4.242000] _printk (kernel/printk/printk.c:2365) >>>> [ 4.242000] kernel_init (init/main.c:1508) >>>> [ 4.242000] kernel_init (init/main.c:1459) >>>> [ 4.242000] ret_from_kernel_thread (arch/m68k/kernel/entry.S:142) >>>> [ 4.242000] >>>> [ 4.242000] ---[ end Kernel panic - not syncing: page fault error >>>> ]--- >>>> >>>> Looks like a memory mapping failure, but why ? >>>> My JTAG at this point dumps a list of 0s at 0x41fb0000 and my SDRAM >>>> starts at 0x40000000 and ends at 0x50000000 (256MB). >>> 0x41fb0000 seems to be init's page directory. The fault address is in >>> the range where I'd expect dynamic libraries to reside. >>>> >>>> It looks like a TLB write miss which is obscure to me :-). >>>> >>>> I tried to use the /proc but as expected it is not alive after >>>> mounting it. >>> >>> The memory map ought to be accessible through sysrq - an alternative >>> would be to modify the ELF binfmt handler and dump the map once ld.so >>> has finished with relocations. >> >> I added a dump in the binfmt_elf file: >> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c >> index a43897b03ce9..395f556f3a90 100644 >> --- a/fs/binfmt_elf.c >> +++ b/fs/binfmt_elf.c >> @@ -816,6 +816,63 @@ static int parse_elf_properties(struct file *f, >> const struct elf_phdr *phdr, >> return ret == -ENOENT ? 0 : ret; >> } >> >> +static int dump_memory_map(struct task_struct *task) >> +{ >> + struct mm_struct *mm = task->mm; >> + struct vm_area_struct *vma; >> + MA_STATE(mas, &mm->mm_mt, 0, -1); >> + struct file *file; >> + struct path *path; >> + char *buf; >> + char *pathname; >> + >> + // Acquire the read lock for mmap_lock >> + down_read(&mm->mmap_lock); >> + mas_lock(&mas); >> + for (vma = mas_find(&mas, ULONG_MAX); vma; vma = mas_find(&mas, >> ULONG_MAX)) { >> + if (vma->vm_file) { >> + buf = (char *)__get_free_page(GFP_KERNEL); >> + if (!buf) { >> + continue; // Handle memory allocation failure >> + } >> + >> + file = vma->vm_file; >> + path = &file->f_path; >> + pathname = d_path(path, buf, PAGE_SIZE); >> + if (IS_ERR(pathname)) { >> + pathname = NULL; >> + } >> + >> + pr_info("%lx-%lx %c%c%c%c %08lx %02x:%02x %lu %s\n", >> + vma->vm_start, vma->vm_end, >> + vma->vm_flags & VM_READ ? 'r' : '-', >> + vma->vm_flags & VM_WRITE ? 'w' : '-', >> + vma->vm_flags & VM_EXEC ? 'x' : '-', >> + vma->vm_flags & VM_MAYSHARE ? 's' : 'p', >> + vma->vm_pgoff << PAGE_SHIFT, >> + MAJOR(file->f_inode->i_rdev), >> + MINOR(file->f_inode->i_rdev), >> + file->f_inode->i_ino, >> + pathname ? pathname : ""); >> + >> + free_page((unsigned long)buf); >> + } else { >> + pr_info("%lx-%lx %c%c%c%c %08lx 00:00 0\n", >> + vma->vm_start, vma->vm_end, >> + vma->vm_flags & VM_READ ? 'r' : '-', >> + vma->vm_flags & VM_WRITE ? 'w' : '-', >> + vma->vm_flags & VM_EXEC ? 'x' : '-', >> + vma->vm_flags & VM_MAYSHARE ? 's' : 'p', >> + vma->vm_pgoff << PAGE_SHIFT); >> + } >> + } >> + mas_unlock(&mas); >> + // Release the read lock for mmap_lock >> + up_read(&mm->mmap_lock); >> + >> + return 0; >> +} >> + >> static int load_elf_binary(struct linux_binprm *bprm) >> { >> struct file *interpreter = NULL; /* to shut gcc up */ >> @@ -1299,6 +1356,9 @@ static int load_elf_binary(struct linux_binprm >> *bprm) >> >> finalize_exec(bprm); >> START_THREAD(elf_ex, regs, elf_entry, bprm->p); >> + if (current->pid == 1) { // Check if this is the init process >> + dump_memory_map(current); >> + } >> retval = 0; >> out: >> return retval; >> >> I think it is quick and dirty, but seems to do the trick. >> I then get in my console: >> [ 4.265000] 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >> [ 4.266000] 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 >> [ 4.267000] 70000000-700ac000 r-xp 00000000 00:00 27 /bin/bash >> [ 4.268000] 700ac000-700b4000 rw-p 000ac000 00:00 27 /bin/bash >> [ 4.269000] 700b4000-700be000 rwxp 700b4000 00:00 0 >> [ 4.270000] bfe7a000-bfe9c000 rw-p bffde000 00:00 0 >> >> But nothing rings a bell at this level for me... >> Thanks ! > > Here is the same dump trace generated on my newly resurrected M5475EVB > for comparison: > > [snip] > Freeing unused kernel image (initmem) memory: 80K > This architecture does not have kernel memory protection. > Run /sbin/init as init process > Run /etc/init as init process > Run /bin/init as init process > process '/bin/init' started with executable stack I don't have this message, I suppose it is related to uClibc vs libc ? > 60000000-60008000 r-xp 00000000 00:00 550544 /lib/ld-uClibc-0.9.33.2.so > 60008000-6000c000 rw-p 00006000 00:00 550544 /lib/ld-uClibc-0.9.33.2.so > 80000000-80004000 r-xp 00000000 00:00 1882624 /bin/init > 80004000-80008000 rw-p 00002000 00:00 1882624 /bin/init You init is at 0x8000000 and not 0x7000000... Interesting. Even if I don't think it has a big impact... > bfc9a000-bfcbc000 rwxp bffde000 00:00 0 > Welcome to > ... > > Execution otherwise continues as normal to a shell after this. > > Regards > Greg > >
Jean-Michel, Am 28.06.2024 um 00:36 schrieb Jean-Michel Hautbois: >>> ./scripts/decode_stacktrace.sh vmlinux < /tmp/trace.log >>> [ 3.857000] Run /bin/bash as init process >>> [ 3.858000] with arguments: >>> [ 3.861000] /bin/bash >>> [ 3.862000] with environment: >>> [ 3.863000] HOME=/ >>> [ 3.864000] TERM=linux >>> [ 4.242000] do page fault: >>> [ 4.242000] regs->sr=0x2000, regs->pc=0x41366924, >>> address=0x700b3364, 2, 41fb0000 >>> [ 4.242000] Kernel panic - not syncing: page fault error >>> [ 4.242000] CPU: 0 PID: 1 Comm: bash Not tainted >>> 6.10.0-rc5-g927da6cf01fe-dirty #25 >>> [ 4.242000] Stack from 4186dda8: >>> [ 4.242000] 4186dda8 41423aa4 41423aa4 700b3300 00000001 >>> 00000000 4136ee10 41423aa4 >>> [ 4.242000] 41366d7a 700b3364 700b3364 00000000 0000000d >>> 4186de60 41fb0000 41d51a60 >>> [ 4.242000] 41005696 41416a90 41416a4d 00002000 41366924 >>> 700b3364 00000002 41fb0000 >>> [ 4.242000] 0000000a 700b3364 00000000 0000000d 00000012 >>> 41d51a00 4186de60 41d51a60 >>> [ 4.242000] 41fb81c0 41d51a60 410052fe 4100529a 4186de60 >>> 700b3364 00000002 00000000 >>> [ 4.242000] 700bc414 00000003 00008000 700ac000 41003660 >>> 4186de60 00000000 00000000 >>> [ 4.242000] Call Trace: dump_stack (lib/dump_stack.c:124) >>> [ 4.242000] panic (kernel/panic.c:266 kernel/panic.c:368) >>> [ 4.242000] do_page_fault (arch/m68k/mm/fault.c:88 (discriminator 1)) >>> [ 4.242000] __clear_user (arch/m68k/lib/uaccess.c:108) >>> [ 4.242000] buserr_c (arch/m68k/kernel/traps.c:725 >>> arch/m68k/kernel/traps.c:775) >>> [ 4.242000] buserr_c (arch/m68k/kernel/traps.c:748 >>> arch/m68k/kernel/traps.c:775) >>> [ 4.242000] buserr (arch/m68k/kernel/entry.S:116) >>> [ 4.242000] ma_slots (lib/maple_tree.c:759) >>> [ 4.242000] __clear_user (arch/m68k/lib/uaccess.c:108) >>> [ 4.242000] elf_load (fs/binfmt_elf.c:125 (discriminator 1) >>> fs/binfmt_elf.c:421 (discriminator 1)) >>> [ 4.242000] load_elf_binary (fs/binfmt_elf.c:1132) >>> [ 4.242000] memset (arch/m68k/lib/memset.c:11) >>> [ 4.242000] load_misc_binary (fs/binfmt_misc.c:97 >>> fs/binfmt_misc.c:146 fs/binfmt_misc.c:213) >>> [ 4.242000] memset (arch/m68k/lib/memset.c:11) >>> [ 4.242000] bprm_execve (fs/exec.c:1797 fs/exec.c:1839 >>> fs/exec.c:1891 fs/exec.c:1867) >>> [ 4.242000] copy_strings_kernel (fs/exec.c:669) >>> [ 4.242000] count_strings_kernel (fs/exec.c:473) >>> [ 4.242000] kernel_execve (fs/exec.c:2058) >>> [ 4.242000] __dynamic_pr_debug (lib/dynamic_debug.c:865) >>> [ 4.242000] run_init_process (init/main.c:1389) >>> [ 4.242000] _printk (kernel/printk/printk.c:2365) >>> [ 4.242000] kernel_init (init/main.c:1508) >>> [ 4.242000] kernel_init (init/main.c:1459) >>> [ 4.242000] ret_from_kernel_thread (arch/m68k/kernel/entry.S:142) >>> [ 4.242000] >>> [ 4.242000] ---[ end Kernel panic - not syncing: page fault error >>> ]--- >>> >>> Looks like a memory mapping failure, but why ? >>> My JTAG at this point dumps a list of 0s at 0x41fb0000 and my SDRAM >>> starts at 0x40000000 and ends at 0x50000000 (256MB). >> 0x41fb0000 seems to be init's page directory. The fault address is in >> the range where I'd expect dynamic libraries to reside. >>> >>> It looks like a TLB write miss which is obscure to me :-). >>> >>> I tried to use the /proc but as expected it is not alive after >>> mounting it. >> >> The memory map ought to be accessible through sysrq - an alternative >> would be to modify the ELF binfmt handler and dump the map once ld.so >> has finished with relocations. > > I added a dump in the binfmt_elf file: > diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c > index a43897b03ce9..395f556f3a90 100644 > --- a/fs/binfmt_elf.c > +++ b/fs/binfmt_elf.c > @@ -816,6 +816,63 @@ static int parse_elf_properties(struct file *f, > const struct elf_phdr *phdr, > return ret == -ENOENT ? 0 : ret; > } > > +static int dump_memory_map(struct task_struct *task) > +{ > + struct mm_struct *mm = task->mm; > + struct vm_area_struct *vma; > + MA_STATE(mas, &mm->mm_mt, 0, -1); > + struct file *file; > + struct path *path; > + char *buf; > + char *pathname; > + > + // Acquire the read lock for mmap_lock > + down_read(&mm->mmap_lock); > + mas_lock(&mas); > + for (vma = mas_find(&mas, ULONG_MAX); vma; vma = mas_find(&mas, > ULONG_MAX)) { > + if (vma->vm_file) { > + buf = (char *)__get_free_page(GFP_KERNEL); > + if (!buf) { > + continue; // Handle memory allocation failure > + } > + > + file = vma->vm_file; > + path = &file->f_path; > + pathname = d_path(path, buf, PAGE_SIZE); > + if (IS_ERR(pathname)) { > + pathname = NULL; > + } > + > + pr_info("%lx-%lx %c%c%c%c %08lx %02x:%02x %lu %s\n", > + vma->vm_start, vma->vm_end, > + vma->vm_flags & VM_READ ? 'r' : '-', > + vma->vm_flags & VM_WRITE ? 'w' : '-', > + vma->vm_flags & VM_EXEC ? 'x' : '-', > + vma->vm_flags & VM_MAYSHARE ? 's' : 'p', > + vma->vm_pgoff << PAGE_SHIFT, > + MAJOR(file->f_inode->i_rdev), > + MINOR(file->f_inode->i_rdev), > + file->f_inode->i_ino, > + pathname ? pathname : ""); > + > + free_page((unsigned long)buf); > + } else { > + pr_info("%lx-%lx %c%c%c%c %08lx 00:00 0\n", > + vma->vm_start, vma->vm_end, > + vma->vm_flags & VM_READ ? 'r' : '-', > + vma->vm_flags & VM_WRITE ? 'w' : '-', > + vma->vm_flags & VM_EXEC ? 'x' : '-', > + vma->vm_flags & VM_MAYSHARE ? 's' : 'p', > + vma->vm_pgoff << PAGE_SHIFT); > + } > + } > + mas_unlock(&mas); > + // Release the read lock for mmap_lock > + up_read(&mm->mmap_lock); > + > + return 0; > +} > + > static int load_elf_binary(struct linux_binprm *bprm) > { > struct file *interpreter = NULL; /* to shut gcc up */ > @@ -1299,6 +1356,9 @@ static int load_elf_binary(struct linux_binprm *bprm) > > finalize_exec(bprm); > START_THREAD(elf_ex, regs, elf_entry, bprm->p); > + if (current->pid == 1) { // Check if this is the init process > + dump_memory_map(current); > + } > retval = 0; > out: > return retval; > > I think it is quick and dirty, but seems to do the trick. > I then get in my console: > [ 4.265000] 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 > [ 4.266000] 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 > [ 4.267000] 70000000-700ac000 r-xp 00000000 00:00 27 /bin/bash > [ 4.268000] 700ac000-700b4000 rw-p 000ac000 00:00 27 /bin/bash > [ 4.269000] 700b4000-700be000 rwxp 700b4000 00:00 0 > [ 4.270000] bfe7a000-bfe9c000 rw-p bffde000 00:00 0 > > But nothing rings a bell at this level for me... You are again using bash as init process, so the process of interest (/bin/ls) is a child of init and does not get logged here. But we are getting some mapping information at last. No shared libraries mapped as yet, but that's the same in Greg's dump. Comparing with what I get from cat /proc/1/maps on a 030 system., the rwxp section at 0x700b4000 is the process heap, and the rwp section at 0xbfe7a000 is the process stack. I forgot to take into account that libraries are loaded only the binary starts executing. Can you print the same map dump in the exit syscall code path? That ought to show all loaded libraries at that point. Cheers, Michael > Thanks !
Hi Michael, On 28/06/2024 01:58, Michael Schmitz wrote: > Jean-Michel, > > Am 28.06.2024 um 00:36 schrieb Jean-Michel Hautbois: >>>> ./scripts/decode_stacktrace.sh vmlinux < /tmp/trace.log >>>> [ 3.857000] Run /bin/bash as init process >>>> [ 3.858000] with arguments: >>>> [ 3.861000] /bin/bash >>>> [ 3.862000] with environment: >>>> [ 3.863000] HOME=/ >>>> [ 3.864000] TERM=linux >>>> [ 4.242000] do page fault: >>>> [ 4.242000] regs->sr=0x2000, regs->pc=0x41366924, >>>> address=0x700b3364, 2, 41fb0000 >>>> [ 4.242000] Kernel panic - not syncing: page fault error >>>> [ 4.242000] CPU: 0 PID: 1 Comm: bash Not tainted >>>> 6.10.0-rc5-g927da6cf01fe-dirty #25 >>>> [ 4.242000] Stack from 4186dda8: >>>> [ 4.242000] 4186dda8 41423aa4 41423aa4 700b3300 00000001 >>>> 00000000 4136ee10 41423aa4 >>>> [ 4.242000] 41366d7a 700b3364 700b3364 00000000 0000000d >>>> 4186de60 41fb0000 41d51a60 >>>> [ 4.242000] 41005696 41416a90 41416a4d 00002000 41366924 >>>> 700b3364 00000002 41fb0000 >>>> [ 4.242000] 0000000a 700b3364 00000000 0000000d 00000012 >>>> 41d51a00 4186de60 41d51a60 >>>> [ 4.242000] 41fb81c0 41d51a60 410052fe 4100529a 4186de60 >>>> 700b3364 00000002 00000000 >>>> [ 4.242000] 700bc414 00000003 00008000 700ac000 41003660 >>>> 4186de60 00000000 00000000 >>>> [ 4.242000] Call Trace: dump_stack (lib/dump_stack.c:124) >>>> [ 4.242000] panic (kernel/panic.c:266 kernel/panic.c:368) >>>> [ 4.242000] do_page_fault (arch/m68k/mm/fault.c:88 (discriminator >>>> 1)) >>>> [ 4.242000] __clear_user (arch/m68k/lib/uaccess.c:108) >>>> [ 4.242000] buserr_c (arch/m68k/kernel/traps.c:725 >>>> arch/m68k/kernel/traps.c:775) >>>> [ 4.242000] buserr_c (arch/m68k/kernel/traps.c:748 >>>> arch/m68k/kernel/traps.c:775) >>>> [ 4.242000] buserr (arch/m68k/kernel/entry.S:116) >>>> [ 4.242000] ma_slots (lib/maple_tree.c:759) >>>> [ 4.242000] __clear_user (arch/m68k/lib/uaccess.c:108) >>>> [ 4.242000] elf_load (fs/binfmt_elf.c:125 (discriminator 1) >>>> fs/binfmt_elf.c:421 (discriminator 1)) >>>> [ 4.242000] load_elf_binary (fs/binfmt_elf.c:1132) >>>> [ 4.242000] memset (arch/m68k/lib/memset.c:11) >>>> [ 4.242000] load_misc_binary (fs/binfmt_misc.c:97 >>>> fs/binfmt_misc.c:146 fs/binfmt_misc.c:213) >>>> [ 4.242000] memset (arch/m68k/lib/memset.c:11) >>>> [ 4.242000] bprm_execve (fs/exec.c:1797 fs/exec.c:1839 >>>> fs/exec.c:1891 fs/exec.c:1867) >>>> [ 4.242000] copy_strings_kernel (fs/exec.c:669) >>>> [ 4.242000] count_strings_kernel (fs/exec.c:473) >>>> [ 4.242000] kernel_execve (fs/exec.c:2058) >>>> [ 4.242000] __dynamic_pr_debug (lib/dynamic_debug.c:865) >>>> [ 4.242000] run_init_process (init/main.c:1389) >>>> [ 4.242000] _printk (kernel/printk/printk.c:2365) >>>> [ 4.242000] kernel_init (init/main.c:1508) >>>> [ 4.242000] kernel_init (init/main.c:1459) >>>> [ 4.242000] ret_from_kernel_thread (arch/m68k/kernel/entry.S:142) >>>> [ 4.242000] >>>> [ 4.242000] ---[ end Kernel panic - not syncing: page fault error >>>> ]--- >>>> >>>> Looks like a memory mapping failure, but why ? >>>> My JTAG at this point dumps a list of 0s at 0x41fb0000 and my SDRAM >>>> starts at 0x40000000 and ends at 0x50000000 (256MB). >>> 0x41fb0000 seems to be init's page directory. The fault address is in >>> the range where I'd expect dynamic libraries to reside. >>>> >>>> It looks like a TLB write miss which is obscure to me :-). >>>> >>>> I tried to use the /proc but as expected it is not alive after >>>> mounting it. >>> >>> The memory map ought to be accessible through sysrq - an alternative >>> would be to modify the ELF binfmt handler and dump the map once ld.so >>> has finished with relocations. >> >> I added a dump in the binfmt_elf file: >> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c >> index a43897b03ce9..395f556f3a90 100644 >> --- a/fs/binfmt_elf.c >> +++ b/fs/binfmt_elf.c >> @@ -816,6 +816,63 @@ static int parse_elf_properties(struct file *f, >> const struct elf_phdr *phdr, >> return ret == -ENOENT ? 0 : ret; >> } >> >> +static int dump_memory_map(struct task_struct *task) >> +{ >> + struct mm_struct *mm = task->mm; >> + struct vm_area_struct *vma; >> + MA_STATE(mas, &mm->mm_mt, 0, -1); >> + struct file *file; >> + struct path *path; >> + char *buf; >> + char *pathname; >> + >> + // Acquire the read lock for mmap_lock >> + down_read(&mm->mmap_lock); >> + mas_lock(&mas); >> + for (vma = mas_find(&mas, ULONG_MAX); vma; vma = mas_find(&mas, >> ULONG_MAX)) { >> + if (vma->vm_file) { >> + buf = (char *)__get_free_page(GFP_KERNEL); >> + if (!buf) { >> + continue; // Handle memory allocation failure >> + } >> + >> + file = vma->vm_file; >> + path = &file->f_path; >> + pathname = d_path(path, buf, PAGE_SIZE); >> + if (IS_ERR(pathname)) { >> + pathname = NULL; >> + } >> + >> + pr_info("%lx-%lx %c%c%c%c %08lx %02x:%02x %lu %s\n", >> + vma->vm_start, vma->vm_end, >> + vma->vm_flags & VM_READ ? 'r' : '-', >> + vma->vm_flags & VM_WRITE ? 'w' : '-', >> + vma->vm_flags & VM_EXEC ? 'x' : '-', >> + vma->vm_flags & VM_MAYSHARE ? 's' : 'p', >> + vma->vm_pgoff << PAGE_SHIFT, >> + MAJOR(file->f_inode->i_rdev), >> + MINOR(file->f_inode->i_rdev), >> + file->f_inode->i_ino, >> + pathname ? pathname : ""); >> + >> + free_page((unsigned long)buf); >> + } else { >> + pr_info("%lx-%lx %c%c%c%c %08lx 00:00 0\n", >> + vma->vm_start, vma->vm_end, >> + vma->vm_flags & VM_READ ? 'r' : '-', >> + vma->vm_flags & VM_WRITE ? 'w' : '-', >> + vma->vm_flags & VM_EXEC ? 'x' : '-', >> + vma->vm_flags & VM_MAYSHARE ? 's' : 'p', >> + vma->vm_pgoff << PAGE_SHIFT); >> + } >> + } >> + mas_unlock(&mas); >> + // Release the read lock for mmap_lock >> + up_read(&mm->mmap_lock); >> + >> + return 0; >> +} >> + >> static int load_elf_binary(struct linux_binprm *bprm) >> { >> struct file *interpreter = NULL; /* to shut gcc up */ >> @@ -1299,6 +1356,9 @@ static int load_elf_binary(struct linux_binprm >> *bprm) >> >> finalize_exec(bprm); >> START_THREAD(elf_ex, regs, elf_entry, bprm->p); >> + if (current->pid == 1) { // Check if this is the init process >> + dump_memory_map(current); >> + } >> retval = 0; >> out: >> return retval; >> >> I think it is quick and dirty, but seems to do the trick. >> I then get in my console: >> [ 4.265000] 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >> [ 4.266000] 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 >> [ 4.267000] 70000000-700ac000 r-xp 00000000 00:00 27 /bin/bash >> [ 4.268000] 700ac000-700b4000 rw-p 000ac000 00:00 27 /bin/bash >> [ 4.269000] 700b4000-700be000 rwxp 700b4000 00:00 0 >> [ 4.270000] bfe7a000-bfe9c000 rw-p bffde000 00:00 0 >> >> But nothing rings a bell at this level for me... > > You are again using bash as init process, so the process of interest > (/bin/ls) is a child of init and does not get logged here. > > But we are getting some mapping information at last. No shared libraries > mapped as yet, but that's the same in Greg's dump. > > Comparing with what I get from cat /proc/1/maps on a 030 system., the > rwxp section at 0x700b4000 is the process heap, and the rwp section at > 0xbfe7a000 is the process stack. > > I forgot to take into account that libraries are loaded only the binary > starts executing. Can you print the same map dump in the exit syscall > code path? That ought to show all loaded libraries at that point. Thanks for your suggestion ! I changed it a bit, and I added a call in do_exit() as suggested. When executing ls I get: bash-5.2# ls -l load_elf_binary: Dump memory for ls (31): mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox mmap: bfc1e000-bfc40000 rw-p bffde000 00:00 0 [stack] do_exit: Dump memory for ls (31): mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 mmap: 6001e000-60020000 r--p 0001c000 00:00 178 /lib/ld.so.1 mmap: 60020000-60022000 rw-p 0001e000 00:00 178 /lib/ld.so.1 mmap: 60022000-6002c000 r-xp 00000000 00:00 193 /lib/libresolv.so.2 mmap: 6002c000-6002e000 r--p 00008000 00:00 193 /lib/libresolv.so.2 mmap: 6002e000-60030000 rw-p 0000a000 00:00 193 /lib/libresolv.so.2 mmap: 60030000-60032000 rw-p 60030000 00:00 0 mmap: 60032000-6015a000 r-xp 00000000 00:00 185 /lib/libc.so.6 mmap: 6015a000-6015c000 r--p 00126000 00:00 185 /lib/libc.so.6 mmap: 6015c000-60160000 rw-p 00128000 00:00 185 /lib/libc.so.6 mmap: 60160000-6016e000 rw-p 60160000 00:00 0 mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox mmap: 700c2000-700c4000 r--p 000c0000 00:00 28 /bin/busybox mmap: 700c4000-700ca000 rw-p 000c2000 00:00 28 /bin/busybox mmap: 700ca000-700ec000 rwxp 700ca000 00:00 0 [heap] mmap: bfc1e000-bfc40000 rw-p bffde000 00:00 0 [stack] When I call it a second time, I get: bash-5.2# ls -l load_elf_binary: Dump memory for ls (33): mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox mmap: bfb5a000-bfb7c000 rw-p bffde000 00:00 0 [stack] do_exit: Dump memory for ls (33): mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 mmap: 6001e000-60020000 r--p 0001c000 00:00 178 /lib/ld.so.1 mmap: 60020000-60022000 rw-p 0001e000 00:00 178 /lib/ld.so.1 mmap: 60022000-6002c000 r-xp 00000000 00:00 193 /lib/libresolv.so.2 mmap: 6002c000-6002e000 r--p 00008000 00:00 193 /lib/libresolv.so.2 mmap: 6002e000-60030000 rw-p 0000a000 00:00 193 /lib/libresolv.so.2 mmap: 60030000-60032000 rw-p 60030000 00:00 0 mmap: 60032000-6015a000 r-xp 00000000 00:00 185 /lib/libc.so.6 mmap: 6015a000-6015c000 r--p 00126000 00:00 185 /lib/libc.so.6 mmap: 6015c000-60160000 rw-p 00128000 00:00 185 /lib/libc.so.6 mmap: 60160000-6016e000 rw-p 60160000 00:00 0 mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox mmap: 700c2000-700c4000 r--p 000c0000 00:00 28 /bin/busybox mmap: 700c4000-700ca000 rw-p 000c2000 00:00 28 /bin/busybox mmap: bfb5a000-bfb7c000 rw-p bffde000 00:00 0 [stack] The first call generates the "ls" output, not the second one. The helper looks like: diff --git a/mm/mmap.c b/mm/mmap.c index 83b4682ec85c..14d861e9cba2 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -76,6 +76,87 @@ int mmap_rnd_compat_bits __read_mostly = CONFIG_ARCH_MMAP_RND_COMPAT_BITS; static bool ignore_rlimit_data; core_param(ignore_rlimit_data, ignore_rlimit_data, bool, 0644); +int dump_memory_map(struct task_struct *task) +{ + struct mm_struct *mm = task->mm; + struct vm_area_struct *vma; + struct file *file; + struct path *path; + char *buf; + char *pathname; + + if (!mm) { + return -ENOMEM; + } + + MA_STATE(mas, &mm->mm_mt, 0, -1); + // Acquire the read lock for mmap_lock + down_read(&mm->mmap_lock); + mas_lock(&mas); + for (vma = mas_find(&mas, ULONG_MAX); vma; vma = mas_find(&mas, ULONG_MAX)) { + char perms[5] = "---p"; // Default permissions + // Set permissions based on vm_flags + if (vma->vm_flags & VM_READ) perms[0] = 'r'; + if (vma->vm_flags & VM_WRITE) perms[1] = 'w'; + if (vma->vm_flags & VM_EXEC) perms[2] = 'x'; + if (vma->vm_flags & VM_MAYSHARE) perms[3] = 's'; + + if (vma->vm_file) { // If there's an associated file + buf = (char *)__get_free_page(GFP_KERNEL); + if (!buf) { + continue; // Handle memory allocation failure + } + + file = vma->vm_file; + path = &file->f_path; + pathname = d_path(path, buf, PAGE_SIZE); + if (IS_ERR(pathname)) { + pathname = NULL; + } + + // Print memory area information with file path + pr_info("%08lx-%08lx %s %08lx %02x:%02x %lu %s\n", + vma->vm_start, vma->vm_end, + perms, + vma->vm_pgoff << PAGE_SHIFT, + MAJOR(file_inode(file)->i_rdev), + MINOR(file_inode(file)->i_rdev), + file_inode(file)->i_ino, + pathname ? pathname : ""); + + free_page((unsigned long)buf); + } else { + char *special_area_name = NULL; + + // Check for heap + if (vma->vm_end > mm->start_brk && vma->vm_start < mm->brk) { + special_area_name = "[heap]"; + } + // Check for stack + else if (vma->vm_start <= mm->start_stack && vma->vm_end >= mm->start_stack) { + special_area_name = "[stack]"; + } + // Check for vdso + else if (vma->vm_flags & VM_EXEC && vma->vm_flags & VM_READ && !vma->vm_file) { + special_area_name = "[vdso]"; + } + + // Print memory area information without file path + pr_info("%08lx-%08lx %s %08lx 00:00 0 %s\n", + vma->vm_start, vma->vm_end, + perms, + vma->vm_pgoff << PAGE_SHIFT, + special_area_name ? special_area_name : " "); + } + } + mas_unlock(&mas); + // Release the read lock for mmap_lock + up_read(&mm->mmap_lock); + + return 0; +} +EXPORT_SYMBOL(dump_memory_map); Thanks, JM
Jean-Michel, Am 28.06.2024 um 19:24 schrieb Jean-Michel Hautbois: >> I forgot to take into account that libraries are loaded only the >> binary starts executing. Can you print the same map dump in the exit >> syscall code path? That ought to show all loaded libraries at that point. > > Thanks for your suggestion ! > I changed it a bit, and I added a call in do_exit() as suggested. When > executing ls I get: > > bash-5.2# ls -l > load_elf_binary: Dump memory for ls (31): > mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 > mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 > mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox > mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox > mmap: bfc1e000-bfc40000 rw-p bffde000 00:00 0 [stack] > > do_exit: Dump memory for ls (31): > mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 > mmap: 6001e000-60020000 r--p 0001c000 00:00 178 /lib/ld.so.1 > mmap: 60020000-60022000 rw-p 0001e000 00:00 178 /lib/ld.so.1 > mmap: 60022000-6002c000 r-xp 00000000 00:00 193 /lib/libresolv.so.2 > mmap: 6002c000-6002e000 r--p 00008000 00:00 193 /lib/libresolv.so.2 > mmap: 6002e000-60030000 rw-p 0000a000 00:00 193 /lib/libresolv.so.2 > mmap: 60030000-60032000 rw-p 60030000 00:00 0 > mmap: 60032000-6015a000 r-xp 00000000 00:00 185 /lib/libc.so.6 > mmap: 6015a000-6015c000 r--p 00126000 00:00 185 /lib/libc.so.6 > mmap: 6015c000-60160000 rw-p 00128000 00:00 185 /lib/libc.so.6 > mmap: 60160000-6016e000 rw-p 60160000 00:00 0 > mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox > mmap: 700c2000-700c4000 r--p 000c0000 00:00 28 /bin/busybox > mmap: 700c4000-700ca000 rw-p 000c2000 00:00 28 /bin/busybox > mmap: 700ca000-700ec000 rwxp 700ca000 00:00 0 [heap] > mmap: bfc1e000-bfc40000 rw-p bffde000 00:00 0 [stack] > > When I call it a second time, I get: > > bash-5.2# ls -l > load_elf_binary: Dump memory for ls (33): > mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 > mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 > mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox > mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox > mmap: bfb5a000-bfb7c000 rw-p bffde000 00:00 0 [stack] > do_exit: Dump memory for ls (33): > mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 > mmap: 6001e000-60020000 r--p 0001c000 00:00 178 /lib/ld.so.1 > mmap: 60020000-60022000 rw-p 0001e000 00:00 178 /lib/ld.so.1 > mmap: 60022000-6002c000 r-xp 00000000 00:00 193 /lib/libresolv.so.2 > mmap: 6002c000-6002e000 r--p 00008000 00:00 193 /lib/libresolv.so.2 > mmap: 6002e000-60030000 rw-p 0000a000 00:00 193 /lib/libresolv.so.2 > mmap: 60030000-60032000 rw-p 60030000 00:00 0 > mmap: 60032000-6015a000 r-xp 00000000 00:00 185 /lib/libc.so.6 > mmap: 6015a000-6015c000 r--p 00126000 00:00 185 /lib/libc.so.6 > mmap: 6015c000-60160000 rw-p 00128000 00:00 185 /lib/libc.so.6 > mmap: 60160000-6016e000 rw-p 60160000 00:00 0 > mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox > mmap: 700c2000-700c4000 r--p 000c0000 00:00 28 /bin/busybox > mmap: 700c4000-700ca000 rw-p 000c2000 00:00 28 /bin/busybox No heap in this second call. Can you print mm->start_brk and mm->brk please? The process memory layout is a little unusual (I would have expected the binary to be mapped before the dynamic libraries, not after). Is that expected on Coldfire, Greg? Cheers, Michael > mmap: bfb5a000-bfb7c000 rw-p bffde000 00:00 0 [stack] > > The first call generates the "ls" output, not the second one. > The helper looks like: > diff --git a/mm/mmap.c b/mm/mmap.c > index 83b4682ec85c..14d861e9cba2 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -76,6 +76,87 @@ int mmap_rnd_compat_bits __read_mostly = > CONFIG_ARCH_MMAP_RND_COMPAT_BITS; > static bool ignore_rlimit_data; > core_param(ignore_rlimit_data, ignore_rlimit_data, bool, 0644); > > +int dump_memory_map(struct task_struct *task) > +{ > + struct mm_struct *mm = task->mm; > + struct vm_area_struct *vma; > + struct file *file; > + struct path *path; > + char *buf; > + char *pathname; > + > + if (!mm) { > + return -ENOMEM; > + } > + > + MA_STATE(mas, &mm->mm_mt, 0, -1); > + // Acquire the read lock for mmap_lock > + down_read(&mm->mmap_lock); > + mas_lock(&mas); > + for (vma = mas_find(&mas, ULONG_MAX); vma; vma = mas_find(&mas, > ULONG_MAX)) { > + char perms[5] = "---p"; // Default permissions > + // Set permissions based on vm_flags > + if (vma->vm_flags & VM_READ) perms[0] = 'r'; > + if (vma->vm_flags & VM_WRITE) perms[1] = 'w'; > + if (vma->vm_flags & VM_EXEC) perms[2] = 'x'; > + if (vma->vm_flags & VM_MAYSHARE) perms[3] = 's'; > + > + if (vma->vm_file) { // If there's an associated file > + buf = (char *)__get_free_page(GFP_KERNEL); > + if (!buf) { > + continue; // Handle memory allocation > failure > + } > + > + file = vma->vm_file; > + path = &file->f_path; > + pathname = d_path(path, buf, PAGE_SIZE); > + if (IS_ERR(pathname)) { > + pathname = NULL; > + } > + > + // Print memory area information with file path > + pr_info("%08lx-%08lx %s %08lx %02x:%02x %lu %s\n", > + vma->vm_start, vma->vm_end, > + perms, > + vma->vm_pgoff << PAGE_SHIFT, > + MAJOR(file_inode(file)->i_rdev), > + MINOR(file_inode(file)->i_rdev), > + file_inode(file)->i_ino, > + pathname ? pathname : ""); > + > + free_page((unsigned long)buf); > + } else { > + char *special_area_name = NULL; > + > + // Check for heap > + if (vma->vm_end > mm->start_brk && vma->vm_start > < mm->brk) { > + special_area_name = "[heap]"; > + } > + // Check for stack > + else if (vma->vm_start <= mm->start_stack && > vma->vm_end >= mm->start_stack) { > + special_area_name = "[stack]"; > + } > + // Check for vdso > + else if (vma->vm_flags & VM_EXEC && > vma->vm_flags & VM_READ && !vma->vm_file) { > + special_area_name = "[vdso]"; > + } > + > + // Print memory area information without file path > + pr_info("%08lx-%08lx %s %08lx 00:00 0 %s\n", > + vma->vm_start, vma->vm_end, > + perms, > + vma->vm_pgoff << PAGE_SHIFT, > + special_area_name ? special_area_name : > " "); > + } > + } > + mas_unlock(&mas); > + // Release the read lock for mmap_lock > + up_read(&mm->mmap_lock); > + > + return 0; > +} > +EXPORT_SYMBOL(dump_memory_map); > > > Thanks, > JM
Michael, On 28/06/2024 09:48, Michael Schmitz wrote: > Jean-Michel, > > Am 28.06.2024 um 19:24 schrieb Jean-Michel Hautbois: >>> I forgot to take into account that libraries are loaded only the >>> binary starts executing. Can you print the same map dump in the exit >>> syscall code path? That ought to show all loaded libraries at that >>> point. >> >> Thanks for your suggestion ! >> I changed it a bit, and I added a call in do_exit() as suggested. When >> executing ls I get: >> >> bash-5.2# ls -l >> load_elf_binary: Dump memory for ls (31): >> mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >> mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 >> mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox >> mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox >> mmap: bfc1e000-bfc40000 rw-p bffde000 00:00 0 [stack] >> >> do_exit: Dump memory for ls (31): >> mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >> mmap: 6001e000-60020000 r--p 0001c000 00:00 178 /lib/ld.so.1 >> mmap: 60020000-60022000 rw-p 0001e000 00:00 178 /lib/ld.so.1 >> mmap: 60022000-6002c000 r-xp 00000000 00:00 193 /lib/libresolv.so.2 >> mmap: 6002c000-6002e000 r--p 00008000 00:00 193 /lib/libresolv.so.2 >> mmap: 6002e000-60030000 rw-p 0000a000 00:00 193 /lib/libresolv.so.2 >> mmap: 60030000-60032000 rw-p 60030000 00:00 0 >> mmap: 60032000-6015a000 r-xp 00000000 00:00 185 /lib/libc.so.6 >> mmap: 6015a000-6015c000 r--p 00126000 00:00 185 /lib/libc.so.6 >> mmap: 6015c000-60160000 rw-p 00128000 00:00 185 /lib/libc.so.6 >> mmap: 60160000-6016e000 rw-p 60160000 00:00 0 >> mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox >> mmap: 700c2000-700c4000 r--p 000c0000 00:00 28 /bin/busybox >> mmap: 700c4000-700ca000 rw-p 000c2000 00:00 28 /bin/busybox >> mmap: 700ca000-700ec000 rwxp 700ca000 00:00 0 [heap] >> mmap: bfc1e000-bfc40000 rw-p bffde000 00:00 0 [stack] >> >> When I call it a second time, I get: >> >> bash-5.2# ls -l >> load_elf_binary: Dump memory for ls (33): >> mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >> mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 >> mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox >> mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox >> mmap: bfb5a000-bfb7c000 rw-p bffde000 00:00 0 [stack] >> do_exit: Dump memory for ls (33): >> mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >> mmap: 6001e000-60020000 r--p 0001c000 00:00 178 /lib/ld.so.1 >> mmap: 60020000-60022000 rw-p 0001e000 00:00 178 /lib/ld.so.1 >> mmap: 60022000-6002c000 r-xp 00000000 00:00 193 /lib/libresolv.so.2 >> mmap: 6002c000-6002e000 r--p 00008000 00:00 193 /lib/libresolv.so.2 >> mmap: 6002e000-60030000 rw-p 0000a000 00:00 193 /lib/libresolv.so.2 >> mmap: 60030000-60032000 rw-p 60030000 00:00 0 >> mmap: 60032000-6015a000 r-xp 00000000 00:00 185 /lib/libc.so.6 >> mmap: 6015a000-6015c000 r--p 00126000 00:00 185 /lib/libc.so.6 >> mmap: 6015c000-60160000 rw-p 00128000 00:00 185 /lib/libc.so.6 >> mmap: 60160000-6016e000 rw-p 60160000 00:00 0 >> mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox >> mmap: 700c2000-700c4000 r--p 000c0000 00:00 28 /bin/busybox >> mmap: 700c4000-700ca000 rw-p 000c2000 00:00 28 /bin/busybox > > No heap in this second call. Can you print mm->start_brk and mm->brk > please? Sure: mmap: start_brk: 700ca000, brk: 700ca000 > > The process memory layout is a little unusual (I would have expected the > binary to be mapped before the dynamic libraries, not after). Is that > expected on Coldfire, Greg? > > Cheers, > > Michael > > >> mmap: bfb5a000-bfb7c000 rw-p bffde000 00:00 0 [stack] >> >> The first call generates the "ls" output, not the second one. >> The helper looks like: >> diff --git a/mm/mmap.c b/mm/mmap.c >> index 83b4682ec85c..14d861e9cba2 100644 >> --- a/mm/mmap.c >> +++ b/mm/mmap.c >> @@ -76,6 +76,87 @@ int mmap_rnd_compat_bits __read_mostly = >> CONFIG_ARCH_MMAP_RND_COMPAT_BITS; >> static bool ignore_rlimit_data; >> core_param(ignore_rlimit_data, ignore_rlimit_data, bool, 0644); >> >> +int dump_memory_map(struct task_struct *task) >> +{ >> + struct mm_struct *mm = task->mm; >> + struct vm_area_struct *vma; >> + struct file *file; >> + struct path *path; >> + char *buf; >> + char *pathname; >> + >> + if (!mm) { >> + return -ENOMEM; >> + } >> + >> + MA_STATE(mas, &mm->mm_mt, 0, -1); >> + // Acquire the read lock for mmap_lock >> + down_read(&mm->mmap_lock); >> + mas_lock(&mas); >> + for (vma = mas_find(&mas, ULONG_MAX); vma; vma = mas_find(&mas, >> ULONG_MAX)) { >> + char perms[5] = "---p"; // Default permissions >> + // Set permissions based on vm_flags >> + if (vma->vm_flags & VM_READ) perms[0] = 'r'; >> + if (vma->vm_flags & VM_WRITE) perms[1] = 'w'; >> + if (vma->vm_flags & VM_EXEC) perms[2] = 'x'; >> + if (vma->vm_flags & VM_MAYSHARE) perms[3] = 's'; >> + >> + if (vma->vm_file) { // If there's an associated file >> + buf = (char *)__get_free_page(GFP_KERNEL); >> + if (!buf) { >> + continue; // Handle memory allocation >> failure >> + } >> + >> + file = vma->vm_file; >> + path = &file->f_path; >> + pathname = d_path(path, buf, PAGE_SIZE); >> + if (IS_ERR(pathname)) { >> + pathname = NULL; >> + } >> + >> + // Print memory area information with file path >> + pr_info("%08lx-%08lx %s %08lx %02x:%02x %lu >> %s\n", >> + vma->vm_start, vma->vm_end, >> + perms, >> + vma->vm_pgoff << PAGE_SHIFT, >> + MAJOR(file_inode(file)->i_rdev), >> + MINOR(file_inode(file)->i_rdev), >> + file_inode(file)->i_ino, >> + pathname ? pathname : ""); >> + >> + free_page((unsigned long)buf); >> + } else { >> + char *special_area_name = NULL; >> + >> + // Check for heap >> + if (vma->vm_end > mm->start_brk && vma->vm_start >> < mm->brk) { >> + special_area_name = "[heap]"; >> + } >> + // Check for stack >> + else if (vma->vm_start <= mm->start_stack && >> vma->vm_end >= mm->start_stack) { >> + special_area_name = "[stack]"; >> + } >> + // Check for vdso >> + else if (vma->vm_flags & VM_EXEC && >> vma->vm_flags & VM_READ && !vma->vm_file) { >> + special_area_name = "[vdso]"; >> + } >> + >> + // Print memory area information without file >> path >> + pr_info("%08lx-%08lx %s %08lx 00:00 0 %s\n", >> + vma->vm_start, vma->vm_end, >> + perms, >> + vma->vm_pgoff << PAGE_SHIFT, >> + special_area_name ? special_area_name : >> " "); >> + } >> + } >> + mas_unlock(&mas); >> + // Release the read lock for mmap_lock >> + up_read(&mm->mmap_lock); >> + >> + return 0; >> +} >> +EXPORT_SYMBOL(dump_memory_map); >> >> >> Thanks, >> JM
Michael, On 28/06/2024 09:48, Michael Schmitz wrote: > Jean-Michel, > > Am 28.06.2024 um 19:24 schrieb Jean-Michel Hautbois: >>> I forgot to take into account that libraries are loaded only the >>> binary starts executing. Can you print the same map dump in the exit >>> syscall code path? That ought to show all loaded libraries at that >>> point. >> >> Thanks for your suggestion ! >> I changed it a bit, and I added a call in do_exit() as suggested. When >> executing ls I get: >> >> bash-5.2# ls -l >> load_elf_binary: Dump memory for ls (31): >> mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >> mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 >> mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox >> mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox >> mmap: bfc1e000-bfc40000 rw-p bffde000 00:00 0 [stack] >> >> do_exit: Dump memory for ls (31): >> mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >> mmap: 6001e000-60020000 r--p 0001c000 00:00 178 /lib/ld.so.1 >> mmap: 60020000-60022000 rw-p 0001e000 00:00 178 /lib/ld.so.1 >> mmap: 60022000-6002c000 r-xp 00000000 00:00 193 /lib/libresolv.so.2 >> mmap: 6002c000-6002e000 r--p 00008000 00:00 193 /lib/libresolv.so.2 >> mmap: 6002e000-60030000 rw-p 0000a000 00:00 193 /lib/libresolv.so.2 >> mmap: 60030000-60032000 rw-p 60030000 00:00 0 >> mmap: 60032000-6015a000 r-xp 00000000 00:00 185 /lib/libc.so.6 >> mmap: 6015a000-6015c000 r--p 00126000 00:00 185 /lib/libc.so.6 >> mmap: 6015c000-60160000 rw-p 00128000 00:00 185 /lib/libc.so.6 >> mmap: 60160000-6016e000 rw-p 60160000 00:00 0 >> mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox >> mmap: 700c2000-700c4000 r--p 000c0000 00:00 28 /bin/busybox >> mmap: 700c4000-700ca000 rw-p 000c2000 00:00 28 /bin/busybox >> mmap: 700ca000-700ec000 rwxp 700ca000 00:00 0 [heap] >> mmap: bfc1e000-bfc40000 rw-p bffde000 00:00 0 [stack] >> >> When I call it a second time, I get: >> >> bash-5.2# ls -l >> load_elf_binary: Dump memory for ls (33): >> mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >> mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 >> mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox >> mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox >> mmap: bfb5a000-bfb7c000 rw-p bffde000 00:00 0 [stack] >> do_exit: Dump memory for ls (33): >> mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >> mmap: 6001e000-60020000 r--p 0001c000 00:00 178 /lib/ld.so.1 >> mmap: 60020000-60022000 rw-p 0001e000 00:00 178 /lib/ld.so.1 >> mmap: 60022000-6002c000 r-xp 00000000 00:00 193 /lib/libresolv.so.2 >> mmap: 6002c000-6002e000 r--p 00008000 00:00 193 /lib/libresolv.so.2 >> mmap: 6002e000-60030000 rw-p 0000a000 00:00 193 /lib/libresolv.so.2 >> mmap: 60030000-60032000 rw-p 60030000 00:00 0 >> mmap: 60032000-6015a000 r-xp 00000000 00:00 185 /lib/libc.so.6 >> mmap: 6015a000-6015c000 r--p 00126000 00:00 185 /lib/libc.so.6 >> mmap: 6015c000-60160000 rw-p 00128000 00:00 185 /lib/libc.so.6 >> mmap: 60160000-6016e000 rw-p 60160000 00:00 0 >> mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox >> mmap: 700c2000-700c4000 r--p 000c0000 00:00 28 /bin/busybox >> mmap: 700c4000-700ca000 rw-p 000c2000 00:00 28 /bin/busybox > > No heap in this second call. Can you print mm->start_brk and mm->brk > please? > > The process memory layout is a little unusual (I would have expected the > binary to be mapped before the dynamic libraries, not after). Is that > expected on Coldfire, Greg? > I did the test, and called ls twice to see what differs: bash-5.2# ls -l > /dev/null load_elf_binary: Dump memory for ls (31): mmap: start_brk: 700ca000, brk: 700ca000 mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox mmap: bfa48000-bfa6a000 rw-p bffde000 00:00 0 [stack] do_exit: Dump memory for ls (31): mmap: start_brk: 700ca000, brk: 700ec000 mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 mmap: 6001e000-60020000 r--p 0001c000 00:00 178 /lib/ld.so.1 mmap: 60020000-60022000 rw-p 0001e000 00:00 178 /lib/ld.so.1 mmap: 60022000-6002c000 r-xp 00000000 00:00 193 /lib/libresolv.so.2 mmap: 6002c000-6002e000 r--p 00008000 00:00 193 /lib/libresolv.so.2 mmap: 6002e000-60030000 rw-p 0000a000 00:00 193 /lib/libresolv.so.2 mmap: 60030000-60032000 rw-p 60030000 00:00 0 mmap: 60032000-6015a000 r-xp 00000000 00:00 185 /lib/libc.so.6 mmap: 6015a000-6015c000 r--p 00126000 00:00 185 /lib/libc.so.6 mmap: 6015c000-60160000 rw-p 00128000 00:00 185 /lib/libc.so.6 mmap: 60160000-6016e000 rw-p 60160000 00:00 0 mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox mmap: 700c2000-700c4000 r--p 000c0000 00:00 28 /bin/busybox mmap: 700c4000-700ca000 rw-p 000c2000 00:00 28 /bin/busybox mmap: 700ca000-700ec000 rwxp 700ca000 00:00 0 [heap] mmap: bfa48000-bfa6a000 rw-p bffde000 00:00 0 [stack] bash-5.2# ls -l > /dev/null load_elf_binary: Dump memory for ls (33): mmap: start_brk: 700ca000, brk: 700ca000 mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox mmap: bf894000-bf8b6000 rw-p bffde000 00:00 0 [stack] do_exit: Dump memory for ls (33): mmap: start_brk: 700ca000, brk: 700ca000 mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 mmap: 6001e000-60020000 r--p 0001c000 00:00 178 /lib/ld.so.1 mmap: 60020000-60022000 rw-p 0001e000 00:00 178 /lib/ld.so.1 mmap: 60022000-6002c000 r-xp 00000000 00:00 193 /lib/libresolv.so.2 mmap: 6002c000-6002e000 r--p 00008000 00:00 193 /lib/libresolv.so.2 mmap: 6002e000-60030000 rw-p 0000a000 00:00 193 /lib/libresolv.so.2 mmap: 60030000-60032000 rw-p 60030000 00:00 0 mmap: 60032000-6015a000 r-xp 00000000 00:00 185 /lib/libc.so.6 mmap: 6015a000-6015c000 r--p 00126000 00:00 185 /lib/libc.so.6 mmap: 6015c000-60160000 rw-p 00128000 00:00 185 /lib/libc.so.6 mmap: 60160000-6016e000 rw-p 60160000 00:00 0 mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox mmap: 700c2000-700c4000 r--p 000c0000 00:00 28 /bin/busybox mmap: 700c4000-700ca000 rw-p 000c2000 00:00 28 /bin/busybox mmap: bf894000-bf8b6000 rw-p bffde000 00:00 0 [stack] The second time, there seems to be no heap... JM > Cheers, > > Michael > > >> mmap: bfb5a000-bfb7c000 rw-p bffde000 00:00 0 [stack] >> >> The first call generates the "ls" output, not the second one. >> The helper looks like: >> diff --git a/mm/mmap.c b/mm/mmap.c >> index 83b4682ec85c..14d861e9cba2 100644 >> --- a/mm/mmap.c >> +++ b/mm/mmap.c >> @@ -76,6 +76,87 @@ int mmap_rnd_compat_bits __read_mostly = >> CONFIG_ARCH_MMAP_RND_COMPAT_BITS; >> static bool ignore_rlimit_data; >> core_param(ignore_rlimit_data, ignore_rlimit_data, bool, 0644); >> >> +int dump_memory_map(struct task_struct *task) >> +{ >> + struct mm_struct *mm = task->mm; >> + struct vm_area_struct *vma; >> + struct file *file; >> + struct path *path; >> + char *buf; >> + char *pathname; >> + >> + if (!mm) { >> + return -ENOMEM; >> + } >> + >> + MA_STATE(mas, &mm->mm_mt, 0, -1); >> + // Acquire the read lock for mmap_lock >> + down_read(&mm->mmap_lock); >> + mas_lock(&mas); >> + for (vma = mas_find(&mas, ULONG_MAX); vma; vma = mas_find(&mas, >> ULONG_MAX)) { >> + char perms[5] = "---p"; // Default permissions >> + // Set permissions based on vm_flags >> + if (vma->vm_flags & VM_READ) perms[0] = 'r'; >> + if (vma->vm_flags & VM_WRITE) perms[1] = 'w'; >> + if (vma->vm_flags & VM_EXEC) perms[2] = 'x'; >> + if (vma->vm_flags & VM_MAYSHARE) perms[3] = 's'; >> + >> + if (vma->vm_file) { // If there's an associated file >> + buf = (char *)__get_free_page(GFP_KERNEL); >> + if (!buf) { >> + continue; // Handle memory allocation >> failure >> + } >> + >> + file = vma->vm_file; >> + path = &file->f_path; >> + pathname = d_path(path, buf, PAGE_SIZE); >> + if (IS_ERR(pathname)) { >> + pathname = NULL; >> + } >> + >> + // Print memory area information with file path >> + pr_info("%08lx-%08lx %s %08lx %02x:%02x %lu >> %s\n", >> + vma->vm_start, vma->vm_end, >> + perms, >> + vma->vm_pgoff << PAGE_SHIFT, >> + MAJOR(file_inode(file)->i_rdev), >> + MINOR(file_inode(file)->i_rdev), >> + file_inode(file)->i_ino, >> + pathname ? pathname : ""); >> + >> + free_page((unsigned long)buf); >> + } else { >> + char *special_area_name = NULL; >> + >> + // Check for heap >> + if (vma->vm_end > mm->start_brk && vma->vm_start >> < mm->brk) { >> + special_area_name = "[heap]"; >> + } >> + // Check for stack >> + else if (vma->vm_start <= mm->start_stack && >> vma->vm_end >= mm->start_stack) { >> + special_area_name = "[stack]"; >> + } >> + // Check for vdso >> + else if (vma->vm_flags & VM_EXEC && >> vma->vm_flags & VM_READ && !vma->vm_file) { >> + special_area_name = "[vdso]"; >> + } >> + >> + // Print memory area information without file >> path >> + pr_info("%08lx-%08lx %s %08lx 00:00 0 %s\n", >> + vma->vm_start, vma->vm_end, >> + perms, >> + vma->vm_pgoff << PAGE_SHIFT, >> + special_area_name ? special_area_name : >> " "); >> + } >> + } >> + mas_unlock(&mas); >> + // Release the read lock for mmap_lock >> + up_read(&mm->mmap_lock); >> + >> + return 0; >> +} >> +EXPORT_SYMBOL(dump_memory_map); >> >> >> Thanks, >> JM
Jean-Michel, Am 28.06.2024 um 23:25 schrieb Jean-Michel Hautbois: >> >> No heap in this second call. Can you print mm->start_brk and mm->brk >> please? >> >> The process memory layout is a little unusual (I would have expected >> the binary to be mapped before the dynamic libraries, not after). Is >> that expected on Coldfire, Greg? >> > > I did the test, and called ls twice to see what differs: > bash-5.2# ls -l > /dev/null > load_elf_binary: Dump memory for ls (31): > mmap: start_brk: 700ca000, brk: 700ca000 > mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 > mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 > mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox > mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox > mmap: bfa48000-bfa6a000 rw-p bffde000 00:00 0 [stack] > do_exit: Dump memory for ls (31): > mmap: start_brk: 700ca000, brk: 700ec000 ls managed to allocate heap space OK ... > mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 > mmap: 6001e000-60020000 r--p 0001c000 00:00 178 /lib/ld.so.1 > mmap: 60020000-60022000 rw-p 0001e000 00:00 178 /lib/ld.so.1 > mmap: 60022000-6002c000 r-xp 00000000 00:00 193 /lib/libresolv.so.2 > mmap: 6002c000-6002e000 r--p 00008000 00:00 193 /lib/libresolv.so.2 > mmap: 6002e000-60030000 rw-p 0000a000 00:00 193 /lib/libresolv.so.2 > mmap: 60030000-60032000 rw-p 60030000 00:00 0 > mmap: 60032000-6015a000 r-xp 00000000 00:00 185 /lib/libc.so.6 > mmap: 6015a000-6015c000 r--p 00126000 00:00 185 /lib/libc.so.6 > mmap: 6015c000-60160000 rw-p 00128000 00:00 185 /lib/libc.so.6 > mmap: 60160000-6016e000 rw-p 60160000 00:00 0 > mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox > mmap: 700c2000-700c4000 r--p 000c0000 00:00 28 /bin/busybox > mmap: 700c4000-700ca000 rw-p 000c2000 00:00 28 /bin/busybox > mmap: 700ca000-700ec000 rwxp 700ca000 00:00 0 [heap] > mmap: bfa48000-bfa6a000 rw-p bffde000 00:00 0 [stack] > bash-5.2# ls -l > /dev/null > load_elf_binary: Dump memory for ls (33): > mmap: start_brk: 700ca000, brk: 700ca000 > mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 > mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 > mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox > mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox > mmap: bf894000-bf8b6000 rw-p bffde000 00:00 0 [stack] > do_exit: Dump memory for ls (33): > mmap: start_brk: 700ca000, brk: 700ca000 No heap space allocated here. I would have expected that to cause an error message from libc ... Can you print a brief meminfo summary such as found in fs/proc/meminfo.c (total, free, available, buffer and cached for starters)? Cheers, Michael > mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 > mmap: 6001e000-60020000 r--p 0001c000 00:00 178 /lib/ld.so.1 > mmap: 60020000-60022000 rw-p 0001e000 00:00 178 /lib/ld.so.1 > mmap: 60022000-6002c000 r-xp 00000000 00:00 193 /lib/libresolv.so.2 > mmap: 6002c000-6002e000 r--p 00008000 00:00 193 /lib/libresolv.so.2 > mmap: 6002e000-60030000 rw-p 0000a000 00:00 193 /lib/libresolv.so.2 > mmap: 60030000-60032000 rw-p 60030000 00:00 0 > mmap: 60032000-6015a000 r-xp 00000000 00:00 185 /lib/libc.so.6 > mmap: 6015a000-6015c000 r--p 00126000 00:00 185 /lib/libc.so.6 > mmap: 6015c000-60160000 rw-p 00128000 00:00 185 /lib/libc.so.6 > mmap: 60160000-6016e000 rw-p 60160000 00:00 0 > mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox > mmap: 700c2000-700c4000 r--p 000c0000 00:00 28 /bin/busybox > mmap: 700c4000-700ca000 rw-p 000c2000 00:00 28 /bin/busybox > mmap: bf894000-bf8b6000 rw-p bffde000 00:00 0 [stack] > > The second time, there seems to be no heap... > > JM > >> Cheers, >> >> Michael >> >> >>> mmap: bfb5a000-bfb7c000 rw-p bffde000 00:00 0 [stack] >>> >>> The first call generates the "ls" output, not the second one. >>> The helper looks like: >>> diff --git a/mm/mmap.c b/mm/mmap.c >>> index 83b4682ec85c..14d861e9cba2 100644 >>> --- a/mm/mmap.c >>> +++ b/mm/mmap.c >>> @@ -76,6 +76,87 @@ int mmap_rnd_compat_bits __read_mostly = >>> CONFIG_ARCH_MMAP_RND_COMPAT_BITS; >>> static bool ignore_rlimit_data; >>> core_param(ignore_rlimit_data, ignore_rlimit_data, bool, 0644); >>> >>> +int dump_memory_map(struct task_struct *task) >>> +{ >>> + struct mm_struct *mm = task->mm; >>> + struct vm_area_struct *vma; >>> + struct file *file; >>> + struct path *path; >>> + char *buf; >>> + char *pathname; >>> + >>> + if (!mm) { >>> + return -ENOMEM; >>> + } >>> + >>> + MA_STATE(mas, &mm->mm_mt, 0, -1); >>> + // Acquire the read lock for mmap_lock >>> + down_read(&mm->mmap_lock); >>> + mas_lock(&mas); >>> + for (vma = mas_find(&mas, ULONG_MAX); vma; vma = mas_find(&mas, >>> ULONG_MAX)) { >>> + char perms[5] = "---p"; // Default permissions >>> + // Set permissions based on vm_flags >>> + if (vma->vm_flags & VM_READ) perms[0] = 'r'; >>> + if (vma->vm_flags & VM_WRITE) perms[1] = 'w'; >>> + if (vma->vm_flags & VM_EXEC) perms[2] = 'x'; >>> + if (vma->vm_flags & VM_MAYSHARE) perms[3] = 's'; >>> + >>> + if (vma->vm_file) { // If there's an associated file >>> + buf = (char *)__get_free_page(GFP_KERNEL); >>> + if (!buf) { >>> + continue; // Handle memory allocation >>> failure >>> + } >>> + >>> + file = vma->vm_file; >>> + path = &file->f_path; >>> + pathname = d_path(path, buf, PAGE_SIZE); >>> + if (IS_ERR(pathname)) { >>> + pathname = NULL; >>> + } >>> + >>> + // Print memory area information with file path >>> + pr_info("%08lx-%08lx %s %08lx %02x:%02x %lu >>> %s\n", >>> + vma->vm_start, vma->vm_end, >>> + perms, >>> + vma->vm_pgoff << PAGE_SHIFT, >>> + MAJOR(file_inode(file)->i_rdev), >>> + MINOR(file_inode(file)->i_rdev), >>> + file_inode(file)->i_ino, >>> + pathname ? pathname : ""); >>> + >>> + free_page((unsigned long)buf); >>> + } else { >>> + char *special_area_name = NULL; >>> + >>> + // Check for heap >>> + if (vma->vm_end > mm->start_brk && vma->vm_start >>> < mm->brk) { >>> + special_area_name = "[heap]"; >>> + } >>> + // Check for stack >>> + else if (vma->vm_start <= mm->start_stack && >>> vma->vm_end >= mm->start_stack) { >>> + special_area_name = "[stack]"; >>> + } >>> + // Check for vdso >>> + else if (vma->vm_flags & VM_EXEC && >>> vma->vm_flags & VM_READ && !vma->vm_file) { >>> + special_area_name = "[vdso]"; >>> + } >>> + >>> + // Print memory area information without file >>> path >>> + pr_info("%08lx-%08lx %s %08lx 00:00 0 %s\n", >>> + vma->vm_start, vma->vm_end, >>> + perms, >>> + vma->vm_pgoff << PAGE_SHIFT, >>> + special_area_name ? special_area_name : >>> " "); >>> + } >>> + } >>> + mas_unlock(&mas); >>> + // Release the read lock for mmap_lock >>> + up_read(&mm->mmap_lock); >>> + >>> + return 0; >>> +} >>> +EXPORT_SYMBOL(dump_memory_map); >>> >>> >>> Thanks, >>> JM
Michael, On 29/06/2024 05:41, Michael Schmitz wrote: > Jean-Michel, > > Am 28.06.2024 um 23:25 schrieb Jean-Michel Hautbois: >>> >>> No heap in this second call. Can you print mm->start_brk and mm->brk >>> please? >>> >>> The process memory layout is a little unusual (I would have expected >>> the binary to be mapped before the dynamic libraries, not after). Is >>> that expected on Coldfire, Greg? >>> >> >> I did the test, and called ls twice to see what differs: >> bash-5.2# ls -l > /dev/null >> load_elf_binary: Dump memory for ls (31): >> mmap: start_brk: 700ca000, brk: 700ca000 >> mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >> mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 >> mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox >> mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox >> mmap: bfa48000-bfa6a000 rw-p bffde000 00:00 0 [stack] >> do_exit: Dump memory for ls (31): >> mmap: start_brk: 700ca000, brk: 700ec000 > > ls managed to allocate heap space OK ... > >> mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >> mmap: 6001e000-60020000 r--p 0001c000 00:00 178 /lib/ld.so.1 >> mmap: 60020000-60022000 rw-p 0001e000 00:00 178 /lib/ld.so.1 >> mmap: 60022000-6002c000 r-xp 00000000 00:00 193 /lib/libresolv.so.2 >> mmap: 6002c000-6002e000 r--p 00008000 00:00 193 /lib/libresolv.so.2 >> mmap: 6002e000-60030000 rw-p 0000a000 00:00 193 /lib/libresolv.so.2 >> mmap: 60030000-60032000 rw-p 60030000 00:00 0 >> mmap: 60032000-6015a000 r-xp 00000000 00:00 185 /lib/libc.so.6 >> mmap: 6015a000-6015c000 r--p 00126000 00:00 185 /lib/libc.so.6 >> mmap: 6015c000-60160000 rw-p 00128000 00:00 185 /lib/libc.so.6 >> mmap: 60160000-6016e000 rw-p 60160000 00:00 0 >> mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox >> mmap: 700c2000-700c4000 r--p 000c0000 00:00 28 /bin/busybox >> mmap: 700c4000-700ca000 rw-p 000c2000 00:00 28 /bin/busybox >> mmap: 700ca000-700ec000 rwxp 700ca000 00:00 0 [heap] >> mmap: bfa48000-bfa6a000 rw-p bffde000 00:00 0 [stack] >> bash-5.2# ls -l > /dev/null >> load_elf_binary: Dump memory for ls (33): >> mmap: start_brk: 700ca000, brk: 700ca000 >> mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >> mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 >> mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox >> mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox >> mmap: bf894000-bf8b6000 rw-p bffde000 00:00 0 [stack] >> do_exit: Dump memory for ls (33): >> mmap: start_brk: 700ca000, brk: 700ca000 > > No heap space allocated here. I would have expected that to cause an > error message from libc ... > > Can you print a brief meminfo summary such as found in fs/proc/meminfo.c > (total, free, available, buffer and cached for starters)? I dumped using sysrq, before and after each ls: bash-5.2# sysrq: Show Memory Mem-Info: active_anon:1 inactive_anon:39 isolated_anon:0 active_file:0 inactive_file:223 isolated_file:0 unevictable:0 dirty:0 writeback:0 slab_reclaimable:109 slab_unreclaimable:641 mapped:212 shmem:0 pagetables:3 sec_pagetables:0 bounce:0 kernel_misc_reclaimable:0 free:28227 free_pcp:200 free_cma:0 Node 0 active_anon:8kB inactive_anon:312kB active_file:0kB inactive_file:1784kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:1696kB dirty:0kB writeback:0kB shmem:0kB writeback_tmp:0kB kernel_stack:168kB pagetables:24kB s ec_pagetables:0kB all_unreclaimable? no DMA free:225816kB boost:0kB min:1952kB low:2440kB high:2928kB reserved_highatomic:0KB active_anon:8kB inactive_anon:312kB active_file:0kB inactive_file:1784kB unevictable:0kB writepending:0kB present:262144kB managed:238752kB mlocked:0k B bounce:0kB free_pcp:1600kB local_pcp:1600kB free_cma:0kB lowmem_reserve[]: 0 0 0 DMA: 3*8kB (UME) 0*16kB 2*32kB (UM) 1*64kB (M) 1*128kB (U) 1*256kB (U) 2*512kB (UM) 1*1024kB (E) 3*2048kB (UME) 3*4096kB (UME) 25*8192kB (M) = 225816kB 230 total pagecache pages 0 pages in swap cache Free swap = 0kB Total swap = 0kB 32768 pages RAM 0 pages HighMem/MovableOnly 2924 pages reserved bash-5.2# ls -l > /dev/null load_elf_binary: Dump memory for ls (31): mmap: start_brk: 700ca000, brk: 700ca000 mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox mmap: bfb18000-bfb3a000 rw-p bffde000 00:00 0 [stack] do_exit: Dump memory for ls (31): mmap: start_brk: 700ca000, brk: 700ec000 mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 mmap: 6001e000-60020000 r--p 0001c000 00:00 178 /lib/ld.so.1 mmap: 60020000-60022000 rw-p 0001e000 00:00 178 /lib/ld.so.1 mmap: 60022000-6002c000 r-xp 00000000 00:00 193 /lib/libresolv.so.2 mmap: 6002c000-6002e000 r--p 00008000 00:00 193 /lib/libresolv.so.2 mmap: 6002e000-60030000 rw-p 0000a000 00:00 193 /lib/libresolv.so.2 mmap: 60030000-60032000 rw-p 60030000 00:00 0 mmap: 60032000-6015a000 r-xp 00000000 00:00 185 /lib/libc.so.6 mmap: 6015a000-6015c000 r--p 00126000 00:00 185 /lib/libc.so.6 mmap: 6015c000-60160000 rw-p 00128000 00:00 185 /lib/libc.so.6 mmap: 60160000-6016e000 rw-p 60160000 00:00 0 mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox mmap: 700c2000-700c4000 r--p 000c0000 00:00 28 /bin/busybox mmap: 700c4000-700ca000 rw-p 000c2000 00:00 28 /bin/busybox mmap: 700ca000-700ec000 rwxp 700ca000 00:00 0 [heap] mmap: bfb18000-bfb3a000 rw-p bffde000 00:00 0 [stack] bash-5.2# sysrq: Show Memory Mem-Info: active_anon:0 inactive_anon:47 isolated_anon:0 active_file:83 inactive_file:197 isolated_file:0 unevictable:0 dirty:0 writeback:0 slab_reclaimable:110 slab_unreclaimable:650 mapped:219 shmem:0 pagetables:3 sec_pagetables:0 bounce:0 kernel_misc_reclaimable:0 free:28206 free_pcp:160 free_cma:0 Node 0 active_anon:0kB inactive_anon:376kB active_file:664kB inactive_file:1576kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:1752kB dirty:0kB writeback:0kB shmem:0kB writeback_tmp:0kB kernel_stack:168kB pagetables:24kB sec_pagetables:0kB all_unreclaimable? no DMA free:225648kB boost:0kB min:1952kB low:2440kB high:2928kB reserved_highatomic:0KB active_anon:0kB inactive_anon:376kB active_file:664kB inactive_file:1576kB unevictable:0kB writepending:0kB present:262144kB managed:238752kB mlocked: 0kB bounce:0kB free_pcp:1280kB local_pcp:1280kB free_cma:0kB lowmem_reserve[]: 0 0 0 DMA: 2*8kB (UE) 0*16kB 1*32kB (U) 1*64kB (M) 2*128kB (UM) 2*256kB (UM) 1*512kB (U) 1*1024kB (E) 3*2048kB (UME) 3*4096kB (UME) 25*8192kB (M) = 225648kB 280 total pagecache pages 0 pages in swap cache Free swap = 0kB Total swap = 0kB 32768 pages RAM 0 pages HighMem/MovableOnly 2924 pages reserved bash-5.2# ls -l > /dev/null load_elf_binary: Dump memory for ls (33): mmap: start_brk: 700ca000, brk: 700ca000 mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox mmap: bf89e000-bf8c0000 rw-p bffde000 00:00 0 [stack] do_exit: Dump memory for ls (33): mmap: start_brk: 700ca000, brk: 700ca000 mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 mmap: 6001e000-60020000 r--p 0001c000 00:00 178 /lib/ld.so.1 mmap: 60020000-60022000 rw-p 0001e000 00:00 178 /lib/ld.so.1 mmap: 60022000-6002c000 r-xp 00000000 00:00 193 /lib/libresolv.so.2 mmap: 6002c000-6002e000 r--p 00008000 00:00 193 /lib/libresolv.so.2 mmap: 6002e000-60030000 rw-p 0000a000 00:00 193 /lib/libresolv.so.2 mmap: 60030000-60032000 rw-p 60030000 00:00 0 mmap: 60032000-6015a000 r-xp 00000000 00:00 185 /lib/libc.so.6 mmap: 6015a000-6015c000 r--p 00126000 00:00 185 /lib/libc.so.6 mmap: 6015c000-60160000 rw-p 00128000 00:00 185 /lib/libc.so.6 mmap: 60160000-6016e000 rw-p 60160000 00:00 0 mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox mmap: 700c2000-700c4000 r--p 000c0000 00:00 28 /bin/busybox mmap: 700c4000-700ca000 rw-p 000c2000 00:00 28 /bin/busybox mmap: bf89e000-bf8c0000 rw-p bffde000 00:00 0 [stack] bash-5.2# sysrq: Show Memory Mem-Info: active_anon:0 inactive_anon:47 isolated_anon:0 active_file:196 inactive_file:84 isolated_file:0 unevictable:0 dirty:0 writeback:0 slab_reclaimable:110 slab_unreclaimable:650 mapped:219 shmem:0 pagetables:3 sec_pagetables:0 bounce:0 kernel_misc_reclaimable:0 free:28206 free_pcp:159 free_cma:0 Node 0 active_anon:0kB inactive_anon:376kB active_file:1568kB inactive_file:672kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:1752kB dirty:0kB writeback:0kB shmem:0kB writeback_tmp:0kB kernel_stack:176kB pagetables:24kB sec_pagetables:0kB all_unreclaimable? no DMA free:225648kB boost:0kB min:1952kB low:2440kB high:2928kB reserved_highatomic:0KB active_anon:0kB inactive_anon:376kB active_file:1568kB inactive_file:672kB unevictable:0kB writepending:0kB present:262144kB managed:238752kB mlocked: 0kB bounce:0kB free_pcp:1272kB local_pcp:1272kB free_cma:0kB lowmem_reserve[]: 0 0 0 DMA: 2*8kB (UE) 0*16kB 1*32kB (U) 1*64kB (M) 2*128kB (UM) 2*256kB (UM) 1*512kB (U) 1*1024kB (E) 3*2048kB (UME) 3*4096kB (UME) 25*8192kB (M) = 225648kB 280 total pagecache pages 0 pages in swap cache Free swap = 0kB Total swap = 0kB 32768 pages RAM 0 pages HighMem/MovableOnly 2924 pages reserved JM
Hi Andreas, Am 29.06.2024 um 18:57 schrieb Andreas Schwab: > On Jun 29 2024, Michael Schmitz wrote: > >> No heap space allocated here. I would have expected that to cause an error >> message from libc ... > > No, the heap starts at _end and there is still enough space before the > next page boundary. Right - not all that unusual then. Thanks, Michael
Hi Michael, On 28/6/24 17:48, Michael Schmitz wrote: > Jean-Michel, > > Am 28.06.2024 um 19:24 schrieb Jean-Michel Hautbois: >>> I forgot to take into account that libraries are loaded only the >>> binary starts executing. Can you print the same map dump in the exit >>> syscall code path? That ought to show all loaded libraries at that point. >> >> Thanks for your suggestion ! >> I changed it a bit, and I added a call in do_exit() as suggested. When >> executing ls I get: >> >> bash-5.2# ls -l >> load_elf_binary: Dump memory for ls (31): >> mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >> mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 >> mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox >> mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox >> mmap: bfc1e000-bfc40000 rw-p bffde000 00:00 0 [stack] >> >> do_exit: Dump memory for ls (31): >> mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >> mmap: 6001e000-60020000 r--p 0001c000 00:00 178 /lib/ld.so.1 >> mmap: 60020000-60022000 rw-p 0001e000 00:00 178 /lib/ld.so.1 >> mmap: 60022000-6002c000 r-xp 00000000 00:00 193 /lib/libresolv.so.2 >> mmap: 6002c000-6002e000 r--p 00008000 00:00 193 /lib/libresolv.so.2 >> mmap: 6002e000-60030000 rw-p 0000a000 00:00 193 /lib/libresolv.so.2 >> mmap: 60030000-60032000 rw-p 60030000 00:00 0 >> mmap: 60032000-6015a000 r-xp 00000000 00:00 185 /lib/libc.so.6 >> mmap: 6015a000-6015c000 r--p 00126000 00:00 185 /lib/libc.so.6 >> mmap: 6015c000-60160000 rw-p 00128000 00:00 185 /lib/libc.so.6 >> mmap: 60160000-6016e000 rw-p 60160000 00:00 0 >> mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox >> mmap: 700c2000-700c4000 r--p 000c0000 00:00 28 /bin/busybox >> mmap: 700c4000-700ca000 rw-p 000c2000 00:00 28 /bin/busybox >> mmap: 700ca000-700ec000 rwxp 700ca000 00:00 0 [heap] >> mmap: bfc1e000-bfc40000 rw-p bffde000 00:00 0 [stack] >> >> When I call it a second time, I get: >> >> bash-5.2# ls -l >> load_elf_binary: Dump memory for ls (33): >> mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >> mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 >> mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox >> mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox >> mmap: bfb5a000-bfb7c000 rw-p bffde000 00:00 0 [stack] >> do_exit: Dump memory for ls (33): >> mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >> mmap: 6001e000-60020000 r--p 0001c000 00:00 178 /lib/ld.so.1 >> mmap: 60020000-60022000 rw-p 0001e000 00:00 178 /lib/ld.so.1 >> mmap: 60022000-6002c000 r-xp 00000000 00:00 193 /lib/libresolv.so.2 >> mmap: 6002c000-6002e000 r--p 00008000 00:00 193 /lib/libresolv.so.2 >> mmap: 6002e000-60030000 rw-p 0000a000 00:00 193 /lib/libresolv.so.2 >> mmap: 60030000-60032000 rw-p 60030000 00:00 0 >> mmap: 60032000-6015a000 r-xp 00000000 00:00 185 /lib/libc.so.6 >> mmap: 6015a000-6015c000 r--p 00126000 00:00 185 /lib/libc.so.6 >> mmap: 6015c000-60160000 rw-p 00128000 00:00 185 /lib/libc.so.6 >> mmap: 60160000-6016e000 rw-p 60160000 00:00 0 >> mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox >> mmap: 700c2000-700c4000 r--p 000c0000 00:00 28 /bin/busybox >> mmap: 700c4000-700ca000 rw-p 000c2000 00:00 28 /bin/busybox > > No heap in this second call. Can you print mm->start_brk and mm->brk please? > > The process memory layout is a little unusual (I would have expected the binary to be mapped before the dynamic libraries, not after). Is that expected on Coldfire, Greg? I am not entirely sure of the history behind the layouts. But for the M547x family I have done most MMU work on this is typical. So like this: # cat /proc/1/maps 60000000-60008000 r-xp 00000000 1f:00 550544 /lib/ld-uClibc-0.9.33.2.so 60008000-6000a000 r--p 00006000 1f:00 550544 /lib/ld-uClibc-0.9.33.2.so 6000a000-6000c000 rw-p 00008000 1f:00 550544 /lib/ld-uClibc-0.9.33.2.so 6000c000-6000e000 rw-p 00000000 00:00 0 60010000-60014000 r-xp 00000000 1f:00 1194384 /lib/libcrypt-0.9.33.2.so 60014000-60016000 r--p 00002000 1f:00 1194384 /lib/libcrypt-0.9.33.2.so 60016000-60018000 rw-p 00004000 1f:00 1194384 /lib/libcrypt-0.9.33.2.so 60018000-60028000 rw-p 00000000 00:00 0 60028000-60080000 r-xp 00000000 1f:00 184160 /lib/libuClibc-0.9.33.2.so 60080000-60082000 r--p 00056000 1f:00 184160 /lib/libuClibc-0.9.33.2.so 60082000-60084000 rw-p 00058000 1f:00 184160 /lib/libuClibc-0.9.33.2.so 60084000-60086000 rw-p 00000000 00:00 0 80000000-80004000 r-xp 00000000 1f:00 1882624 /bin/init 80004000-80006000 r--p 00002000 1f:00 1882624 /bin/init 80006000-80008000 rw-p 00004000 1f:00 1882624 /bin/init 80008000-8000a000 rwxp 00000000 00:00 0 [heap] bfdba000-bfddc000 rwxp 00000000 00:00 0 [stack] The 0x60000000 library addresses are due to mmaping - and that is based on the definition of TASK_UNMAPPED_BASE for ColdFire, from arch/m68k/include/asm/processor.h /* This decides where the kernel will search for a free chunk of vm * space during mmap's. */ #ifdef CONFIG_MMU #if defined(CONFIG_COLDFIRE) #define TASK_UNMAPPED_BASE 0x60000000UL The application address range of 0x80000000 are baked in at link time: $ m68k-linux-objdump --headers bin/init init: file format elf32-m68k Sections: Idx Name Size VMA LMA File off Algn 0 .interp 0000000d 80000114 80000114 00000114 2**0 CONTENTS, ALLOC, LOAD, READONLY, DATA 1 .hash 00000170 80000124 80000124 00000124 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 2 .gnu.hash 000001b4 80000294 80000294 00000294 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 3 .dynsym 00000350 80000448 80000448 00000448 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 4 .dynstr 00000174 80000798 80000798 00000798 2**0 CONTENTS, ALLOC, LOAD, READONLY, DATA 5 .rela.dyn 00000018 8000090c 8000090c 0000090c 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 6 .rela.plt 00000240 80000924 80000924 00000924 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 7 .init 00000014 80000b64 80000b64 00000b64 2**1 CONTENTS, ALLOC, LOAD, READONLY, CODE 8 .plt 00000498 80000b78 80000b78 00000b78 2**2 CONTENTS, ALLOC, LOAD, READONLY, CODE 9 .text 00001258 80001010 80001010 00001010 2**2 CONTENTS, ALLOC, LOAD, READONLY, CODE 10 .fini 0000000e 80002268 80002268 00002268 2**1 CONTENTS, ALLOC, LOAD, READONLY, CODE 11 .rodata 00000257 80002276 80002276 00002276 2**0 CONTENTS, ALLOC, LOAD, READONLY, DATA 12 .eh_frame 00000004 800024d0 800024d0 000024d0 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 13 .ctors 00000008 80005f30 80005f30 00003f30 2**2 CONTENTS, ALLOC, LOAD, DATA 14 .dtors 00000008 80005f38 80005f38 00003f38 2**2 CONTENTS, ALLOC, LOAD, DATA 15 .dynamic 000000c0 80005f40 80005f40 00003f40 2**2 CONTENTS, ALLOC, LOAD, DATA 16 .got 000000cc 80006000 80006000 00004000 2**2 CONTENTS, ALLOC, LOAD, DATA 17 .data 00000008 800060cc 800060cc 000040cc 2**2 CONTENTS, ALLOC, LOAD, DATA 18 .bss 0000002c 800060d4 800060d4 000040d4 2**2 ALLOC I am not sure why JM's link has applications linked at 0x70000000. Is that a glibc thing? My examples above are all based on uClibc. While we are talking about link addresses, JM, can you tell me what your kernel is linked at? For me it is from a base near 0 (well actually 128k offset, but there is some meaning to that address) which matches the physical DRAM which starts at address 0: $ m68k-linux-objdump --headers linux/vmlinux linux/vmlinux: file format elf32-m68k Sections: Idx Name Size VMA LMA File off Algn 0 .text 002e1800 00020000 00020000 00002000 2**2 CONTENTS, ALLOC, LOAD, READONLY, CODE 1 .rodata 0003f398 00302000 00302000 002e4000 2**4 CONTENTS, ALLOC, LOAD, DATA ... I know the 54418 typically has DRAM at a different physical offset (I think it is 0x40000000?), so wondering if the VM layout was adjusted in some way to cater for that difference? Regards Greg > Cheers, > > Michael > > >> mmap: bfb5a000-bfb7c000 rw-p bffde000 00:00 0 [stack] >> >> The first call generates the "ls" output, not the second one. >> The helper looks like: >> diff --git a/mm/mmap.c b/mm/mmap.c >> index 83b4682ec85c..14d861e9cba2 100644 >> --- a/mm/mmap.c >> +++ b/mm/mmap.c >> @@ -76,6 +76,87 @@ int mmap_rnd_compat_bits __read_mostly = >> CONFIG_ARCH_MMAP_RND_COMPAT_BITS; >> static bool ignore_rlimit_data; >> core_param(ignore_rlimit_data, ignore_rlimit_data, bool, 0644); >> >> +int dump_memory_map(struct task_struct *task) >> +{ >> + struct mm_struct *mm = task->mm; >> + struct vm_area_struct *vma; >> + struct file *file; >> + struct path *path; >> + char *buf; >> + char *pathname; >> + >> + if (!mm) { >> + return -ENOMEM; >> + } >> + >> + MA_STATE(mas, &mm->mm_mt, 0, -1); >> + // Acquire the read lock for mmap_lock >> + down_read(&mm->mmap_lock); >> + mas_lock(&mas); >> + for (vma = mas_find(&mas, ULONG_MAX); vma; vma = mas_find(&mas, >> ULONG_MAX)) { >> + char perms[5] = "---p"; // Default permissions >> + // Set permissions based on vm_flags >> + if (vma->vm_flags & VM_READ) perms[0] = 'r'; >> + if (vma->vm_flags & VM_WRITE) perms[1] = 'w'; >> + if (vma->vm_flags & VM_EXEC) perms[2] = 'x'; >> + if (vma->vm_flags & VM_MAYSHARE) perms[3] = 's'; >> + >> + if (vma->vm_file) { // If there's an associated file >> + buf = (char *)__get_free_page(GFP_KERNEL); >> + if (!buf) { >> + continue; // Handle memory allocation >> failure >> + } >> + >> + file = vma->vm_file; >> + path = &file->f_path; >> + pathname = d_path(path, buf, PAGE_SIZE); >> + if (IS_ERR(pathname)) { >> + pathname = NULL; >> + } >> + >> + // Print memory area information with file path >> + pr_info("%08lx-%08lx %s %08lx %02x:%02x %lu %s\n", >> + vma->vm_start, vma->vm_end, >> + perms, >> + vma->vm_pgoff << PAGE_SHIFT, >> + MAJOR(file_inode(file)->i_rdev), >> + MINOR(file_inode(file)->i_rdev), >> + file_inode(file)->i_ino, >> + pathname ? pathname : ""); >> + >> + free_page((unsigned long)buf); >> + } else { >> + char *special_area_name = NULL; >> + >> + // Check for heap >> + if (vma->vm_end > mm->start_brk && vma->vm_start >> < mm->brk) { >> + special_area_name = "[heap]"; >> + } >> + // Check for stack >> + else if (vma->vm_start <= mm->start_stack && >> vma->vm_end >= mm->start_stack) { >> + special_area_name = "[stack]"; >> + } >> + // Check for vdso >> + else if (vma->vm_flags & VM_EXEC && >> vma->vm_flags & VM_READ && !vma->vm_file) { >> + special_area_name = "[vdso]"; >> + } >> + >> + // Print memory area information without file path >> + pr_info("%08lx-%08lx %s %08lx 00:00 0 %s\n", >> + vma->vm_start, vma->vm_end, >> + perms, >> + vma->vm_pgoff << PAGE_SHIFT, >> + special_area_name ? special_area_name : >> " "); >> + } >> + } >> + mas_unlock(&mas); >> + // Release the read lock for mmap_lock >> + up_read(&mm->mmap_lock); >> + >> + return 0; >> +} >> +EXPORT_SYMBOL(dump_memory_map); >> >> >> Thanks, >> JM
Hi Greg, Michael, On 01/07/2024 00:35, Greg Ungerer wrote: > Hi Michael, > > On 28/6/24 17:48, Michael Schmitz wrote: >> Jean-Michel, >> >> Am 28.06.2024 um 19:24 schrieb Jean-Michel Hautbois: >>>> I forgot to take into account that libraries are loaded only the >>>> binary starts executing. Can you print the same map dump in the exit >>>> syscall code path? That ought to show all loaded libraries at that >>>> point. >>> >>> Thanks for your suggestion ! >>> I changed it a bit, and I added a call in do_exit() as suggested. When >>> executing ls I get: >>> >>> bash-5.2# ls -l >>> load_elf_binary: Dump memory for ls (31): >>> mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >>> mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 >>> mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox >>> mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox >>> mmap: bfc1e000-bfc40000 rw-p bffde000 00:00 0 [stack] >>> >>> do_exit: Dump memory for ls (31): >>> mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >>> mmap: 6001e000-60020000 r--p 0001c000 00:00 178 /lib/ld.so.1 >>> mmap: 60020000-60022000 rw-p 0001e000 00:00 178 /lib/ld.so.1 >>> mmap: 60022000-6002c000 r-xp 00000000 00:00 193 /lib/libresolv.so.2 >>> mmap: 6002c000-6002e000 r--p 00008000 00:00 193 /lib/libresolv.so.2 >>> mmap: 6002e000-60030000 rw-p 0000a000 00:00 193 /lib/libresolv.so.2 >>> mmap: 60030000-60032000 rw-p 60030000 00:00 0 >>> mmap: 60032000-6015a000 r-xp 00000000 00:00 185 /lib/libc.so.6 >>> mmap: 6015a000-6015c000 r--p 00126000 00:00 185 /lib/libc.so.6 >>> mmap: 6015c000-60160000 rw-p 00128000 00:00 185 /lib/libc.so.6 >>> mmap: 60160000-6016e000 rw-p 60160000 00:00 0 >>> mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox >>> mmap: 700c2000-700c4000 r--p 000c0000 00:00 28 /bin/busybox >>> mmap: 700c4000-700ca000 rw-p 000c2000 00:00 28 /bin/busybox >>> mmap: 700ca000-700ec000 rwxp 700ca000 00:00 0 [heap] >>> mmap: bfc1e000-bfc40000 rw-p bffde000 00:00 0 [stack] >>> >>> When I call it a second time, I get: >>> >>> bash-5.2# ls -l >>> load_elf_binary: Dump memory for ls (33): >>> mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >>> mmap: 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1 >>> mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox >>> mmap: 700c2000-700ca000 rw-p 000c0000 00:00 28 /bin/busybox >>> mmap: bfb5a000-bfb7c000 rw-p bffde000 00:00 0 [stack] >>> do_exit: Dump memory for ls (33): >>> mmap: 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1 >>> mmap: 6001e000-60020000 r--p 0001c000 00:00 178 /lib/ld.so.1 >>> mmap: 60020000-60022000 rw-p 0001e000 00:00 178 /lib/ld.so.1 >>> mmap: 60022000-6002c000 r-xp 00000000 00:00 193 /lib/libresolv.so.2 >>> mmap: 6002c000-6002e000 r--p 00008000 00:00 193 /lib/libresolv.so.2 >>> mmap: 6002e000-60030000 rw-p 0000a000 00:00 193 /lib/libresolv.so.2 >>> mmap: 60030000-60032000 rw-p 60030000 00:00 0 >>> mmap: 60032000-6015a000 r-xp 00000000 00:00 185 /lib/libc.so.6 >>> mmap: 6015a000-6015c000 r--p 00126000 00:00 185 /lib/libc.so.6 >>> mmap: 6015c000-60160000 rw-p 00128000 00:00 185 /lib/libc.so.6 >>> mmap: 60160000-6016e000 rw-p 60160000 00:00 0 >>> mmap: 70000000-700c2000 r-xp 00000000 00:00 28 /bin/busybox >>> mmap: 700c2000-700c4000 r--p 000c0000 00:00 28 /bin/busybox >>> mmap: 700c4000-700ca000 rw-p 000c2000 00:00 28 /bin/busybox >> >> No heap in this second call. Can you print mm->start_brk and mm->brk >> please? >> >> The process memory layout is a little unusual (I would have expected >> the binary to be mapped before the dynamic libraries, not after). Is >> that expected on Coldfire, Greg? > > I am not entirely sure of the history behind the layouts. But for the > M547x family > I have done most MMU work on this is typical. So like this: > > # cat /proc/1/maps > 60000000-60008000 r-xp 00000000 1f:00 550544 /lib/ld-uClibc-0.9.33.2.so > 60008000-6000a000 r--p 00006000 1f:00 550544 /lib/ld-uClibc-0.9.33.2.so > 6000a000-6000c000 rw-p 00008000 1f:00 550544 /lib/ld-uClibc-0.9.33.2.so > 6000c000-6000e000 rw-p 00000000 00:00 0 > 60010000-60014000 r-xp 00000000 1f:00 1194384 /lib/libcrypt-0.9.33.2.so > 60014000-60016000 r--p 00002000 1f:00 1194384 /lib/libcrypt-0.9.33.2.so > 60016000-60018000 rw-p 00004000 1f:00 1194384 /lib/libcrypt-0.9.33.2.so > 60018000-60028000 rw-p 00000000 00:00 0 > 60028000-60080000 r-xp 00000000 1f:00 184160 /lib/libuClibc-0.9.33.2.so > 60080000-60082000 r--p 00056000 1f:00 184160 /lib/libuClibc-0.9.33.2.so > 60082000-60084000 rw-p 00058000 1f:00 184160 /lib/libuClibc-0.9.33.2.so > 60084000-60086000 rw-p 00000000 00:00 0 > 80000000-80004000 r-xp 00000000 1f:00 1882624 /bin/init > 80004000-80006000 r--p 00002000 1f:00 1882624 /bin/init > 80006000-80008000 rw-p 00004000 1f:00 1882624 /bin/init > 80008000-8000a000 rwxp 00000000 00:00 0 [heap] > bfdba000-bfddc000 rwxp 00000000 00:00 0 [stack] > > The 0x60000000 library addresses are due to mmaping - and that is based > on the > definition of TASK_UNMAPPED_BASE for ColdFire, from > arch/m68k/include/asm/processor.h > > /* This decides where the kernel will search for a free chunk of vm > * space during mmap's. > */ > #ifdef CONFIG_MMU > #if defined(CONFIG_COLDFIRE) > #define TASK_UNMAPPED_BASE 0x60000000UL > > > The application address range of 0x80000000 are baked in at link time: > > $ m68k-linux-objdump --headers bin/init > > init: file format elf32-m68k > > Sections: > Idx Name Size VMA LMA File off Algn > 0 .interp 0000000d 80000114 80000114 00000114 2**0 > CONTENTS, ALLOC, LOAD, READONLY, DATA > 1 .hash 00000170 80000124 80000124 00000124 2**2 > CONTENTS, ALLOC, LOAD, READONLY, DATA > 2 .gnu.hash 000001b4 80000294 80000294 00000294 2**2 > CONTENTS, ALLOC, LOAD, READONLY, DATA > 3 .dynsym 00000350 80000448 80000448 00000448 2**2 > CONTENTS, ALLOC, LOAD, READONLY, DATA > 4 .dynstr 00000174 80000798 80000798 00000798 2**0 > CONTENTS, ALLOC, LOAD, READONLY, DATA > 5 .rela.dyn 00000018 8000090c 8000090c 0000090c 2**2 > CONTENTS, ALLOC, LOAD, READONLY, DATA > 6 .rela.plt 00000240 80000924 80000924 00000924 2**2 > CONTENTS, ALLOC, LOAD, READONLY, DATA > 7 .init 00000014 80000b64 80000b64 00000b64 2**1 > CONTENTS, ALLOC, LOAD, READONLY, CODE > 8 .plt 00000498 80000b78 80000b78 00000b78 2**2 > CONTENTS, ALLOC, LOAD, READONLY, CODE > 9 .text 00001258 80001010 80001010 00001010 2**2 > CONTENTS, ALLOC, LOAD, READONLY, CODE > 10 .fini 0000000e 80002268 80002268 00002268 2**1 > CONTENTS, ALLOC, LOAD, READONLY, CODE > 11 .rodata 00000257 80002276 80002276 00002276 2**0 > CONTENTS, ALLOC, LOAD, READONLY, DATA > 12 .eh_frame 00000004 800024d0 800024d0 000024d0 2**2 > CONTENTS, ALLOC, LOAD, READONLY, DATA > 13 .ctors 00000008 80005f30 80005f30 00003f30 2**2 > CONTENTS, ALLOC, LOAD, DATA > 14 .dtors 00000008 80005f38 80005f38 00003f38 2**2 > CONTENTS, ALLOC, LOAD, DATA > 15 .dynamic 000000c0 80005f40 80005f40 00003f40 2**2 > CONTENTS, ALLOC, LOAD, DATA > 16 .got 000000cc 80006000 80006000 00004000 2**2 > CONTENTS, ALLOC, LOAD, DATA > 17 .data 00000008 800060cc 800060cc 000040cc 2**2 > CONTENTS, ALLOC, LOAD, DATA > 18 .bss 0000002c 800060d4 800060d4 000040d4 2**2 > ALLOC In my case I have: m68k-linux-objdump --headers ../buildroot/output/target/bin/bash ../buildroot/output/target/bin/bash: file format elf32-m68k Sections: Idx Name Size VMA LMA File off Algn 0 .interp 0000000d 00000154 00000154 00000154 2**0 CONTENTS, ALLOC, LOAD, READONLY, DATA 1 .note.ABI-tag 00000020 00000164 00000164 00000164 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 2 .hash 00002f74 00000184 00000184 00000184 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 3 .gnu.hash 00003180 000030f8 000030f8 000030f8 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 4 .dynsym 00007d40 00006278 00006278 00006278 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 5 .dynstr 0000771b 0000dfb8 0000dfb8 0000dfb8 2**0 CONTENTS, ALLOC, LOAD, READONLY, DATA 6 .gnu.version 00000fa8 000156d4 000156d4 000156d4 2**1 CONTENTS, ALLOC, LOAD, READONLY, DATA 7 .gnu.version_r 000000a0 0001667c 0001667c 0001667c 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 8 .rela.dyn 0000a788 0001671c 0001671c 0001671c 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 9 .rela.plt 00003624 00020ea4 00020ea4 00020ea4 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 10 .init 00000024 000244c8 000244c8 000244c8 2**1 CONTENTS, ALLOC, LOAD, READONLY, CODE 11 .plt 00006c60 000244ec 000244ec 000244ec 2**2 CONTENTS, ALLOC, LOAD, READONLY, CODE 12 .text 0006a008 0002b14c 0002b14c 0002b14c 2**2 CONTENTS, ALLOC, LOAD, READONLY, CODE 13 .fini 00000018 00095154 00095154 00095154 2**1 CONTENTS, ALLOC, LOAD, READONLY, CODE 14 .rodata 00016302 0009516c 0009516c 0009516c 2**1 CONTENTS, ALLOC, LOAD, READONLY, DATA 15 .eh_frame_hdr 0000002c 000ab470 000ab470 000ab470 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 16 .eh_frame 000000e8 000ab49c 000ab49c 000ab49c 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 17 .init_array 00000004 000ad2a8 000ad2a8 000ad2a8 2**1 CONTENTS, ALLOC, LOAD, DATA 18 .fini_array 00000004 000ad2ac 000ad2ac 000ad2ac 2**1 CONTENTS, ALLOC, LOAD, DATA 19 .data.rel.ro 00000c40 000ad2b0 000ad2b0 000ad2b0 2**1 CONTENTS, ALLOC, LOAD, DATA 20 .dynamic 00000110 000adef0 000adef0 000adef0 2**2 CONTENTS, ALLOC, LOAD, DATA 21 .got 000039d8 000ae000 000ae000 000ae000 2**2 CONTENTS, ALLOC, LOAD, DATA 22 .data 0000198c 000b19d8 000b19d8 000b19d8 2**2 CONTENTS, ALLOC, LOAD, DATA 23 .bss 000090b0 000b3364 000b3364 000b3364 2**2 ALLOC > > I am not sure why JM's link has applications linked at 0x70000000. > Is that a glibc thing? My examples above are all based on uClibc. > > While we are talking about link addresses, JM, can you tell me what > your kernel is linked at? For me it is from a base near 0 (well actually > 128k offset, but there is some meaning to that address) which matches > the physical DRAM which starts at address 0: > > $ m68k-linux-objdump --headers linux/vmlinux > > linux/vmlinux: file format elf32-m68k > > Sections: > Idx Name Size VMA LMA File off Algn > 0 .text 002e1800 00020000 00020000 00002000 2**2 > CONTENTS, ALLOC, LOAD, READONLY, CODE > 1 .rodata 0003f398 00302000 00302000 002e4000 2**4 > CONTENTS, ALLOC, LOAD, DATA > ... > > > I know the 54418 typically has DRAM at a different physical offset > (I think it is 0x40000000?), so wondering if the VM layout was > adjusted in some way to cater for that difference? The SDRAM is mapped at 0x40000000 yes. In my case it is linked like this: m68k-linux-objdump --headers vmlinux vmlinux: file format elf32-m68k Sections: Idx Name Size VMA LMA File off Algn 0 .text 00377ad0 41002000 41002000 00002000 2**2 CONTENTS, ALLOC, LOAD, READONLY, CODE 1 .rodata 000e95c0 4137a000 4137a000 0037a000 2**4 CONTENTS, ALLOC, LOAD, DATA 2 __ksymtab 00009e04 414635c0 414635c0 004635c0 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 3 __ksymtab_gpl 000079e0 4146d3c4 4146d3c4 0046d3c4 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 4 __ksymtab_strings 0001a88b 41474da4 41474da4 00474da4 2**0 CONTENTS, ALLOC, LOAD, READONLY, DATA 5 __param 000004d8 4148f630 4148f630 0048f630 2**1 CONTENTS, ALLOC, LOAD, READONLY, DATA 6 __modver 000000aa 4148fb08 4148fb08 0048fb08 2**1 CONTENTS, ALLOC, LOAD, DATA 7 .notes 00000054 4148fbb4 4148fbb4 0048fbb4 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 8 .data 00091b70 41490000 41490000 00490000 2**4 CONTENTS, ALLOC, LOAD, DATA 9 __ex_table 000014e8 41521b70 41521b70 00521b70 2**2 CONTENTS, ALLOC, LOAD, READONLY, DATA 10 .init.text 000121f4 41524000 41524000 00524000 2**1 CONTENTS, ALLOC, LOAD, READONLY, CODE 11 .init.data 0000275c 415361f4 415361f4 005361f4 2**1 CONTENTS, ALLOC, LOAD, DATA 12 .data..percpu 00000000 4153a000 4153a248 0053a248 2**0 CONTENTS, ALLOC, LOAD, DATA 13 .m68k_fixup 00000248 4153a000 4153a000 0053a000 2**0 CONTENTS, ALLOC, LOAD, DATA 14 .init.data 00001db8 4153a248 4153a248 0053a248 2**0 ALLOC 15 .bss 00035098 4153c000 4153c000 0053a248 2**4 ALLOC 16 .comment 00000035 00000000 00000000 0053a248 2**0 CONTENTS, READONLY Thanks, JM > Regards > Greg > > > > > >> Cheers, >> >> Michael >> >> >>> mmap: bfb5a000-bfb7c000 rw-p bffde000 00:00 0 [stack] >>> >>> The first call generates the "ls" output, not the second one. >>> The helper looks like: >>> diff --git a/mm/mmap.c b/mm/mmap.c >>> index 83b4682ec85c..14d861e9cba2 100644 >>> --- a/mm/mmap.c >>> +++ b/mm/mmap.c >>> @@ -76,6 +76,87 @@ int mmap_rnd_compat_bits __read_mostly = >>> CONFIG_ARCH_MMAP_RND_COMPAT_BITS; >>> static bool ignore_rlimit_data; >>> core_param(ignore_rlimit_data, ignore_rlimit_data, bool, 0644); >>> >>> +int dump_memory_map(struct task_struct *task) >>> +{ >>> + struct mm_struct *mm = task->mm; >>> + struct vm_area_struct *vma; >>> + struct file *file; >>> + struct path *path; >>> + char *buf; >>> + char *pathname; >>> + >>> + if (!mm) { >>> + return -ENOMEM; >>> + } >>> + >>> + MA_STATE(mas, &mm->mm_mt, 0, -1); >>> + // Acquire the read lock for mmap_lock >>> + down_read(&mm->mmap_lock); >>> + mas_lock(&mas); >>> + for (vma = mas_find(&mas, ULONG_MAX); vma; vma = mas_find(&mas, >>> ULONG_MAX)) { >>> + char perms[5] = "---p"; // Default permissions >>> + // Set permissions based on vm_flags >>> + if (vma->vm_flags & VM_READ) perms[0] = 'r'; >>> + if (vma->vm_flags & VM_WRITE) perms[1] = 'w'; >>> + if (vma->vm_flags & VM_EXEC) perms[2] = 'x'; >>> + if (vma->vm_flags & VM_MAYSHARE) perms[3] = 's'; >>> + >>> + if (vma->vm_file) { // If there's an associated file >>> + buf = (char *)__get_free_page(GFP_KERNEL); >>> + if (!buf) { >>> + continue; // Handle memory allocation >>> failure >>> + } >>> + >>> + file = vma->vm_file; >>> + path = &file->f_path; >>> + pathname = d_path(path, buf, PAGE_SIZE); >>> + if (IS_ERR(pathname)) { >>> + pathname = NULL; >>> + } >>> + >>> + // Print memory area information with file path >>> + pr_info("%08lx-%08lx %s %08lx %02x:%02x %lu >>> %s\n", >>> + vma->vm_start, vma->vm_end, >>> + perms, >>> + vma->vm_pgoff << PAGE_SHIFT, >>> + MAJOR(file_inode(file)->i_rdev), >>> + MINOR(file_inode(file)->i_rdev), >>> + file_inode(file)->i_ino, >>> + pathname ? pathname : ""); >>> + >>> + free_page((unsigned long)buf); >>> + } else { >>> + char *special_area_name = NULL; >>> + >>> + // Check for heap >>> + if (vma->vm_end > mm->start_brk && vma->vm_start >>> < mm->brk) { >>> + special_area_name = "[heap]"; >>> + } >>> + // Check for stack >>> + else if (vma->vm_start <= mm->start_stack && >>> vma->vm_end >= mm->start_stack) { >>> + special_area_name = "[stack]"; >>> + } >>> + // Check for vdso >>> + else if (vma->vm_flags & VM_EXEC && >>> vma->vm_flags & VM_READ && !vma->vm_file) { >>> + special_area_name = "[vdso]"; >>> + } >>> + >>> + // Print memory area information without file >>> path >>> + pr_info("%08lx-%08lx %s %08lx 00:00 0 %s\n", >>> + vma->vm_start, vma->vm_end, >>> + perms, >>> + vma->vm_pgoff << PAGE_SHIFT, >>> + special_area_name ? special_area_name : >>> " "); >>> + } >>> + } >>> + mas_unlock(&mas); >>> + // Release the read lock for mmap_lock >>> + up_read(&mm->mmap_lock); >>> + >>> + return 0; >>> +} >>> +EXPORT_SYMBOL(dump_memory_map); >>> >>> >>> Thanks, >>> JM
On Jul 01 2024, Greg Ungerer wrote: > I am not sure why JM's link has applications linked at 0x70000000. > Is that a glibc thing? My examples above are all based on uClibc. No, it is defined by the linker (TEXT_START_ADDR).
diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c index a1f46919934c..d01bd3276204 100644 --- a/fs/ubifs/file.c +++ b/fs/ubifs/file.c @@ -1474,6 +1474,7 @@ static bool ubifs_dirty_folio(struct address_space *mapping, * An attempt to dirty a page without budgeting for it - should not * happen. */ + VM_BUG_ON_FOLIO(ret != false, folio); ubifs_assert(c, ret == false); return ret;