Message ID | 20220113100622.12783-2-farbere@amazon.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | edac_align_ptr() bug fix and refactoring | expand |
On Thu, Jan 13, 2022 at 10:06:19AM +0000, Eliav Farber wrote: > Do alignment logic properly and use 'ptr' for calculating the remainder > of the alignment. > > This became an issue because 'struct edac_mc_layer' has a size that is > not zero modulo eight, and the next offset that was prepared for the > private-data was unaligned, causing an alignment exception. How exactly did this "become an issue"? I'm asking because I have been hearing about weird bugs with that pointer alignment contraption and have never managed to reproduce them myself or hear a proper explanation from people. And that thing is an abomination, frankly, and I'd like to get rid of it but maybe some other time... So, please explain more verbosely, a specific example or how I could reproduce it, would be even better. Thx.
On 1/25/2022 4:37 PM, Borislav Petkov wrote: > How exactly did this "become an issue"? One of the fields in our private-data structure is 'struct notifier_block' which has a next field of type 'struct notifier_block __rcu *'. The size of our private-data structure is greater than 8, and it comes after 'struct edac_mc_layer' which has a size that is not zero modulo eight, and also ends at an address that is not zero modulo eight. Because of the bug in edac_align_ptr(), our private-data structure which should have been aligned to 8 wasn't (it was aligned to 4), so notifier_block was also not aligned to 8, and finally next wasn't aligned to 8. > So, please explain more verbosely, a specific example or how I could > reproduce it, would be even better. Our al_mc_edac driver calls atomic_notifier_chain_register() on probe, to add the notifier_block to panic_notifier_list. We probe the driver more than once, and each time we use the same value for the priority field in the notifier_block (so the newer notifier_block should come later in panic_notifier_list). When the driver is probed for the second time, we get an unable to handle kernel paging request panic at rcu_assign_pointer() which is called from notifier_chain_register(). It happens when rcu_assign_pointer() tries to set the unaligned next pointer from the first probe, to point to the new notifier_block of the second probe. Unable to handle kernel paging request at virtual address ffff8013e8037f4c Mem abort info: ESR = 0x96000061 Exception class = DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000061 CM = 0, WnR = 1 swapper pgtable: 4k pages, 48-bit VAs, pgdp = (____ptrval____) [ffff8013e8037f4c] pgd=00000013ffff8003, pud=00680013c0000711 Internal error: Oops: 96000061 [#1] SMP Modules linked in: Process swapper/0 (pid: 1, stack limit = 0x(____ptrval____)) CPU: 10 PID: 1 Comm: swapper/0 Not tainted 4.19.191 #1016 Hardware name: Annapurna Labs Alpine V3 EVP (DT) pstate: 20000085 (nzCv daIf -PAN -UAO) pc : atomic_notifier_chain_register+0x80/0xb8 lr : atomic_notifier_chain_register+0x38/0xb8 sp : ffff0000097d3b10 x29: ffff0000097d3b10 x28: ffff000009108068 x27: ffff0000095ca000 x26: ffff8013ed0a8744 x25: ffff0000096e3000 x24: ffff000009199000 x23: ffff8013ed016810 x22: ffff8013ed016800 x21: 0000000000000000 x20: ffff8013ed0a8744 x19: ffff0000095ca6d8 x18: ffffffffffffffff x17: 0000000000000000 x16: 0000000000000000 x15: ffff0000091996c8 x14: 2820564544203a63 x13: 6d5f6c612072656c x12: 6c6f72746e6f6320 x11: 636164655f636d5f x10: ffff000009199918 x9 : ffff000009173018 x8 : ffff00000878ec80 x7 : 676e69766947203a x6 : 00000000000002b1 x5 : 000000000000003f x4 : 0000000000000000 x3 : 000000007fffffff x2 : ffff8013e8037f4c x1 : 0000000000000096 x0 : ffff0000091bacf8 Call trace: atomic_notifier_chain_register+0x80/0xb8 al_mc_edac_probe+0x224/0x468 platform_drv_probe+0x58/0xa8 really_probe+0x2cc/0x3b8 driver_probe_device+0x12c/0x148 __driver_attach+0x148/0x150 bus_for_each_dev+0x84/0xd8 driver_attach+0x30/0x40 bus_add_driver+0x174/0x2a8 driver_register+0x64/0x110 __platform_driver_register+0x54/0x60 al_mc_edac_driver_init+0x20/0x28 do_one_initcall+0x54/0x208 kernel_init_freeable+0x294/0x354 kernel_init+0x18/0x118 ret_from_fork+0x10/0x18 Code: 91002002 f9400400 b5ffff60 f9000680 (c89ffc54) ---[ end trace dba8c8c6291afa5b ]--- Kernel panic - not syncing: Fatal exception SMP: stopping secondary CPUs Kernel Offset: disabled CPU features: 0x0,20006008 Memory Limit: none ---[ end Kernel panic - not syncing: Fatal exception ]---
On Thu, Jan 27, 2022 at 11:58:58AM +0200, Farber, Eliav wrote: > One of the fields in our private-data structure is 'struct notifier_block' > which has a next field of type 'struct notifier_block __rcu *'. Just to make sure I understand you correctly: you're talking about some internal version of al_mc_edac - not what's upstream? > The size of our private-data structure is greater than 8, and it comes after > 'struct edac_mc_layer' which has a size that is not zero modulo eight, and > also ends at an address that is not zero modulo eight. > Because of the bug in edac_align_ptr(), our private-data structure which > should have been aligned to 8 wasn't (it was aligned to 4), so > notifier_block was also not aligned to 8, and finally next wasn't aligned > to 8. Ok, I think I see what's going on. So this: 8447c4d15e35 ("edac: Do alignment logic properly in edac_align_ptr()") changed that remainder computation thing and it was supposed to do what your patch does. Hell, even the commit message says so: "The logic was checking the sizeof the structure being allocated to determine whether an alignment fixup was required. This isn't right; what we actually care about is the alignment of the actual pointer that's about to be returned." So if we really do care about the alignment of the actual pointer we're about to return, then yes, we should check ptr - not p. Lemme add that newly discovered info to your patch and queue it. Thx.
On 2/15/2022 2:34 PM, Borislav Petkov wrote: > Just to make sure I understand you correctly: you're talking about some > internal version of al_mc_edac - not what's upstream? Yes, I'm talking about an internal version that wasn't up-streamed yet. -- Regards, Eliav
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c index fd440b35d76e..61d72bd96754 100644 --- a/drivers/edac/edac_mc.c +++ b/drivers/edac/edac_mc.c @@ -265,7 +265,7 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems) else return (char *)ptr; - r = (unsigned long)p % align; + r = (unsigned long)ptr % align; if (r == 0) return (char *)ptr;
Do alignment logic properly and use 'ptr' for calculating the remainder of the alignment. This became an issue because 'struct edac_mc_layer' has a size that is not zero modulo eight, and the next offset that was prepared for the private-data was unaligned, causing an alignment exception. Signed-off-by: Eliav Farber <farbere@amazon.com> --- drivers/edac/edac_mc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)