Message ID | 56C31D1D.50708@virtuozzo.com (mailing list archive) |
---|---|
State | New, archived |
On Tue, Feb 16, 2016 at 03:59:09PM +0300, Andrey Ryabinin wrote:
>
> On 02/15/2016 09:59 PM, Catalin Marinas wrote:
> > On Mon, Feb 15, 2016 at 05:28:02PM +0300, Andrey Ryabinin wrote:
> >> On 02/12/2016 07:06 PM, Catalin Marinas wrote:
> >>> So far, we have:
> >>>
> >>> KASAN+for-next/kernmap goes wrong
> >>> KASAN+UBSAN goes wrong
> >>>
> >>> Enabled individually, KASAN, UBSAN and for-next/kernmap seem fine. I may
> >>> have to trim for-next/core down until we figure out where the problem
> >>> is.
> >>>
> >>> BUG: KASAN: stack-out-of-bounds in find_busiest_group+0x164/0x16a0 at addr ffffffc93665bc8c
> >>
> >> Can it be related to TLB conflicts, which supposed to be fixed in
> >> "arm64: kasan: avoid TLB conflicts" patch from "arm64: mm: rework page
> >> table creation" series ?
> >
> > I can very easily reproduce this with a vanilla 4.5-rc1 series by
> > enabling inline instrumentation (maybe Mark's theory is true w.r.t.
> > image size).
> >
> > Some information, maybe you can shed some light on this. It seems to
> > happen only for secondary CPUs on the swapper stack (I think allocated
> > via fork_idle()). The code generated looks sane to me, so KASAN should
> > not complain but maybe there is some uninitialised shadow, hence the
> > error.
> >
> > The report:
> >
>
> Actually, the first report is a bit more useful. It shows that shadow memory was corrupted:
>
>  ffffffc93665bc00: f1 f1 f1 f1 00 f4 f4 f4 f2 f2 f2 f2 00 00 f1 f1
> >ffffffc93665bc80: f1 f1 00 00 00 00 f3 f3 00 f4 f4 f4 f3 f3 f3 f3
>                       ^
> F1 - left redzone, it indicates start of stack frame
> F3 - right redzone, it should be the end of stack frame.
>
> But here we have the second set of F1s without F3s which should close the first set of F1s.
> Also those two F3s in the middle cannot be right.
>
> So shadow is corrupted.
> Some hypotheses:
>
> 1) We share stack between several tasks (e.g. stack overflow, somehow corrupted SP).
> But this probably should cause kernel crash later, after kasan reports.
>
> 2) Shadow memory wasn't cleared. GCC poison memory on function entrance and unpoisons it before return.
> If we use some tricky way to exit from function this could cause false-positives like that.
> E.g. some hand-written assembly return code.
>
> 3) Screwed shadow mapping. I think the patch below should uncover such problem.
> It boot-tested on qemu and didn't show any problem

With that patch applied I get:

[    0.000000] kasan: screwed shadow mapping 62184, 62182
[    0.000000] kasan: KernelAddressSanitizer initialized

I'm using v4.5-rc1 with KASAN_INLINE, and a random collection of debug
options to bloat the kernel, per the prior theory that the text size had
something to do with the issue.

Later in the boot process I see lots of failures like:

[   13.292190] ==================================================================
[   13.299543] BUG: KASAN: stack-out-of-bounds in find_busiest_group+0x1950/0x19b8 at addr ffffffc936ad3c8c
[   13.309090] Read of size 4 by task swapper/3/0
[   13.313575] page:ffffffbde6dab4c0 count:0 mapcount:0 mapping: (null) index:0x0
[   13.321657] flags: 0x4000000000000000()
[   13.325539] page dumped because: kasan: bad access detected
[   13.331150] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.5.0-rc1+ #19
[   13.337528] Hardware name: ARM Juno development board (r1) (DT)
[   13.343471] Call trace:
[   13.345978] [<ffffffc000091400>] dump_backtrace+0x0/0x3c0
[   13.351416] [<ffffffc0000917e4>] show_stack+0x24/0x30
[   13.356507] [<ffffffc0008c3a64>] dump_stack+0xc4/0x150
[   13.361685] [<ffffffc0004032bc>] kasan_report_error+0x52c/0x558
[   13.367640] [<ffffffc0004033fc>] __asan_report_load4_noabort+0x54/0x60
[   13.374200] [<ffffffc0001a46e8>] find_busiest_group+0x1950/0x19b8
[   13.380327] [<ffffffc0001a49ec>] load_balance+0x29c/0x19e0
[   13.385851] [<ffffffc0001a67c0>] pick_next_task_fair+0x690/0xd88
[   13.391896] [<ffffffc001213cf4>] __schedule+0x85c/0x13c8
[   13.397248] [<ffffffc001214d7c>] schedule+0xe4/0x228
[   13.402256] [<ffffffc00121549c>] schedule_preempt_disabled+0x24/0xb8
[   13.408642] [<ffffffc0001b97f8>] cpu_startup_entry+0x188/0x738
[   13.414511] [<ffffffc00009bcfc>] secondary_start_kernel+0x244/0x2b8
[   13.420806] [<0000000080082efc>] 0x80082efc
[   13.425023] Memory state around the buggy address:
[   13.429854]  ffffffc936ad3b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   13.437153]  ffffffc936ad3c00: 00 00 00 00 00 00 f1 f1 f1 f1 f1 f1 00 00 f3 f3
[   13.444451] >ffffffc936ad3c80: f3 f3 00 00 00 00 00 00 00 f4 f4 f4 f3 f3 f3 f3
[   13.451742]                       ^
[   13.455274]  ffffffc936ad3d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   13.462572]  ffffffc936ad3d80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
[   13.469863] ==================================================================

I guess memory layout has something to do with this. FWIW on this board my
memory map comes from EFI:

[    0.000000] Processing EFI memory map:
[    0.000000] 0x000008000000-0x00000bffffff [Memory Mapped I/O |RUN| |XP| | | | | | | |UC]
[    0.000000] 0x00001c170000-0x00001c170fff [Memory Mapped I/O |RUN| |XP| | | | | | | |UC]
[    0.000000] 0x000080000000-0x00008000ffff [Loader Data | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x000080010000-0x00008007ffff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x000080080000-0x000081dbffff [Loader Data | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x000081dc0000-0x00009fdfffff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x00009fe00000-0x00009fe0ffff [Loader Data | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x00009fe10000-0x0000dfffffff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000e00f0000-0x0000f5a58fff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000f5a59000-0x0000f7793fff [Loader Code | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000f7794000-0x0000f9431fff [Loader Data | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000f9432000-0x0000f944ffff [Loader Code | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000f9450000-0x0000f945ffff [Runtime Data |RUN| |XP| | | | |WB|WT|WC|UC]*
[    0.000000] 0x0000f9460000-0x0000f94dffff [ACPI Reclaim Memory| | | | | | | |WB|WT|WC|UC]*
[    0.000000] 0x0000f94e0000-0x0000f94effff [ACPI Memory NVS | | | | | | | |WB|WT|WC|UC]*
[    0.000000] 0x0000f94f0000-0x0000f94fffff [Runtime Data |RUN| |XP| | | | |WB|WT|WC|UC]*
[    0.000000] 0x0000f9500000-0x0000f950ffff [Runtime Code |RUN| | | | |RO| |WB|WT|WC|UC]*
[    0.000000] 0x0000f9510000-0x0000f953ffff [Runtime Data |RUN| |XP| | | | |WB|WT|WC|UC]*
[    0.000000] 0x0000f9540000-0x0000f954ffff [Runtime Code |RUN| | | | |RO| |WB|WT|WC|UC]*
[    0.000000] 0x0000f9550000-0x0000f956ffff [Runtime Data |RUN| |XP| | | | |WB|WT|WC|UC]*
[    0.000000] 0x0000f9570000-0x0000f958ffff [ACPI Reclaim Memory| | | | | | | |WB|WT|WC|UC]*
[    0.000000] 0x0000f9590000-0x0000f960ffff [Runtime Data |RUN| |XP| | | | |WB|WT|WC|UC]*
[    0.000000] 0x0000f9610000-0x0000f961ffff [Runtime Code |RUN| | | | |RO| |WB|WT|WC|UC]*
[    0.000000] 0x0000f9620000-0x0000f96effff [Runtime Data |RUN| |XP| | | | |WB|WT|WC|UC]*
[    0.000000] 0x0000f96f0000-0x0000f96fffff [Runtime Data |RUN| |XP| | | | |WB|WT|WC|UC]*
[    0.000000] 0x0000f9700000-0x0000f970ffff [Runtime Code |RUN| | | | |RO| |WB|WT|WC|UC]*
[    0.000000] 0x0000f9710000-0x0000f974ffff [Runtime Data |RUN| |XP| | | | |WB|WT|WC|UC]*
[    0.000000] 0x0000f9750000-0x0000f975ffff [Runtime Code |RUN| | | | |RO| |WB|WT|WC|UC]*
[    0.000000] 0x0000f9760000-0x0000f97cffff [Runtime Data |RUN| |XP| | | | |WB|WT|WC|UC]*
[    0.000000] 0x0000f97d0000-0x0000f97dffff [Runtime Data |RUN| |XP| | | | |WB|WT|WC|UC]*
[    0.000000] 0x0000f97e0000-0x0000f97effff [Runtime Code |RUN| | | | |RO| |WB|WT|WC|UC]*
[    0.000000] 0x0000f97f0000-0x0000f981ffff [Runtime Data |RUN| |XP| | | | |WB|WT|WC|UC]*
[    0.000000] 0x0000f9820000-0x0000f9820fff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000f9821000-0x0000f9827fff [Loader Data | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000f9828000-0x0000f982bfff [Reserved | | | | | | | |WB|WT|WC|UC]*
[    0.000000] 0x0000f982c000-0x0000fdaedfff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fdaee000-0x0000fdfbefff [Boot Data | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fdfbf000-0x0000fdfbffff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fdfc0000-0x0000fdffbfff [Boot Data | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fdffc000-0x0000fe018fff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe019000-0x0000fe020fff [Boot Data | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe021000-0x0000fe022fff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe023000-0x0000fe02bfff [Boot Data | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe02c000-0x0000fe03afff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe03b000-0x0000fe03dfff [Boot Data | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe03e000-0x0000fe04efff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe04f000-0x0000fe057fff [Boot Data | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe058000-0x0000fe073fff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe074000-0x0000fe074fff [Boot Data | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe075000-0x0000fe078fff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe079000-0x0000fe07bfff [Boot Data | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe07c000-0x0000fe07dfff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe07e000-0x0000fe085fff [Boot Data | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe086000-0x0000fe087fff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe088000-0x0000fe171fff [Boot Data | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe172000-0x0000fe198fff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe199000-0x0000fe65ffff [Boot Data | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe660000-0x0000fe6a2fff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe6a3000-0x0000fe7effff [Boot Code | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe7f0000-0x0000fe7fffff [Runtime Data |RUN| |XP| | | | |WB|WT|WC|UC]*
[    0.000000] 0x0000fe800000-0x0000fe80ffff [Runtime Code |RUN| | | | |RO| |WB|WT|WC|UC]*
[    0.000000] 0x0000fe810000-0x0000fe82ffff [Runtime Data |RUN| |XP| | | | |WB|WT|WC|UC]*
[    0.000000] 0x0000fe830000-0x0000fe83ffff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe840000-0x0000fe88ffff [Runtime Data |RUN| |XP| | | | |WB|WT|WC|UC]*
[    0.000000] 0x0000fe890000-0x0000fe891fff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x0000fe892000-0x0000feffffff [Boot Data | | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x000880000000-0x00099bffffff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
[    0.000000] 0x00099c000000-0x0009ffffffff [Loader Data | | | | | | | |WB|WT|WC|UC]

Thanks,
Mark.
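The redzone argument quoted above (an F1 run opens a stack frame, an F3 run closes it) can be turned into a small check over a dumped shadow window. This is only a heuristic sketch in userspace C, not anything from the kernel: the function name and the sample "clean" frame are made up for illustration, and a window that starts in the middle of a frame could confuse it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stack redzone markers as described in the thread. */
#define STACK_LEFT  0xF1	/* left redzone: start of a stack frame */
#define STACK_RIGHT 0xF3	/* right redzone: end of a stack frame */

/*
 * Heuristic: walk the shadow bytes in address order and return the index
 * of the first F1 run that begins while an earlier F1 run has not yet
 * been closed by an F3 run. Returns -1 if no such run is found.
 */
static long first_unclosed_left_redzone(const uint8_t *shadow, size_t n)
{
	int open = 0;

	assert(shadow != NULL);
	for (size_t i = 0; i < n; i++) {
		/* Only look at the first byte of each run of equal values. */
		if (i > 0 && shadow[i - 1] == shadow[i])
			continue;
		if (shadow[i] == STACK_LEFT) {
			if (open)
				return (long)i;
			open = 1;
		} else if (shadow[i] == STACK_RIGHT) {
			open = 0;
		}
	}
	return -1;
}

/* The two shadow rows from the first report, in address order. */
static const uint8_t first_report[32] = {
	0xf1, 0xf1, 0xf1, 0xf1, 0x00, 0xf4, 0xf4, 0xf4,
	0xf2, 0xf2, 0xf2, 0xf2, 0x00, 0x00, 0xf1, 0xf1,
	0xf1, 0xf1, 0x00, 0x00, 0x00, 0x00, 0xf3, 0xf3,
	0x00, 0xf4, 0xf4, 0xf4, 0xf3, 0xf3, 0xf3, 0xf3,
};

/* Hypothetical well-formed frame for comparison (not from the thread). */
static const uint8_t clean_frame[10] = {
	0xf1, 0xf1, 0xf1, 0xf1, 0x00, 0x00, 0xf3, 0xf3, 0xf3, 0xf3,
};
```

Calling `first_unclosed_left_redzone(first_report, 32)` flags the second set of F1 bytes, which is the run the quoted analysis calls out as never being closed by F3s.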
On Tue, Feb 16, 2016 at 02:12:59PM +0000, Mark Rutland wrote:
> On Tue, Feb 16, 2016 at 03:59:09PM +0300, Andrey Ryabinin wrote:
> > So shadow is corrupted.
> > Some hypotheses:
> >
> > 1) We share stack between several tasks (e.g. stack overflow, somehow corrupted SP).
> > But this probably should cause kernel crash later, after kasan reports.
> >
> > 2) Shadow memory wasn't cleared. GCC poison memory on function entrance and unpoisons it before return.
> > If we use some tricky way to exit from function this could cause false-positives like that.
> > E.g. some hand-written assembly return code.
> >
> > 3) Screwed shadow mapping. I think the patch below should uncover such problem.
> > It boot-tested on qemu and didn't show any problem
>
> With that patch applied I get:
>
> [    0.000000] kasan: screwed shadow mapping 62184, 62182
> [    0.000000] kasan: KernelAddressSanitizer initialized
>
> I'm using v4.5-rc1 with KASAN_INLINE, and a random collection of debug options
> to bloat the kernel per prior theory that the text size had something to do
> with the issue.

I hacked kasan_init to dump info as it created each shadow region:

[    0.000000] kasan_init shadowing [ffffffc000000000-ffffffc060000000] @ [ffffff8800000000-ffffff880c000001] nid 0
[    0.000000] kasan_init shadowing [ffffffc0600f0000-ffffffc079450000] @ [ffffff880c01e000-ffffff880f28a001] nid 0
[    0.000000] kasan_init shadowing [ffffffc079450000-ffffffc079820000] @ [ffffff880f28a000-ffffff880f304001] nid 0
[    0.000000] kasan_init shadowing [ffffffc079820000-ffffffc079821000] @ [ffffff880f304000-ffffff880f304201] nid 0
[    0.000000] kasan_init shadowing [ffffffc079821000-ffffffc079822000] @ [ffffff880f304200-ffffff880f304401] nid 0
[    0.000000] kasan_init shadowing [ffffffc079822000-ffffffc079828000] @ [ffffff880f304400-ffffff880f305001] nid 0
[    0.000000] kasan_init shadowing [ffffffc079828000-ffffffc07982c000] @ [ffffff880f305000-ffffff880f305801] nid 0
[    0.000000] kasan_init shadowing [ffffffc07982c000-ffffffc07e7f0000] @ [ffffff880f305800-ffffff880fcfe001] nid 0
[    0.000000] kasan_init shadowing [ffffffc07e7f0000-ffffffc07e830000] @ [ffffff880fcfe000-ffffff880fd06001] nid 0
[    0.000000] kasan_init shadowing [ffffffc07e830000-ffffffc07e840000] @ [ffffff880fd06000-ffffff880fd08001] nid 0
[    0.000000] kasan_init shadowing [ffffffc07e840000-ffffffc07e890000] @ [ffffff880fd08000-ffffff880fd12001] nid 0
[    0.000000] kasan_init shadowing [ffffffc07e890000-ffffffc07f000000] @ [ffffff880fd12000-ffffff880fe00001] nid 0
[    0.000000] kasan_init shadowing [ffffffc800000000-ffffffc980000000] @ [ffffff8900000000-ffffff8930000001] nid 0
[    0.000000] kasan: screwed shadow mapping 62184, 62182
[    0.000000] kasan: KernelAddressSanitizer initialized

I note that the end of each shadow region overlaps the beginning of the
next due to the intentional end+1... Other than the waste of memory (and
the TLB conflict that gets solved by my pgtable rework), I'm not sure
that's a problem, though.

Mark.
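The address pairs in the kasan_init dump above follow the usual KASAN rule of one shadow byte per eight bytes of memory. A minimal userspace sketch of that arithmetic follows; the offset constant is an assumption inferred from the logged pairs (for example ffffffc000000000 mapping to ffffff8800000000), the 4 KiB page size is also an assumption about this configuration, and the helper names are illustrative rather than kernel API.

```c
#include <assert.h>
#include <stdint.h>

#define SHADOW_SCALE_SHIFT 3				/* 8 bytes per shadow byte */
#define SHADOW_OFFSET      0xdfffff9000000000ULL	/* assumption: inferred from the log */
#define SKETCH_PAGE_SIZE   4096ULL			/* assumption: 4 KiB pages */

/* Map a kernel virtual address to the address of its shadow byte. */
static uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

static uint64_t page_down(uint64_t addr)
{
	return addr & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Does the last shadow byte of a region ending at prev_end (exclusive)
 * live in the same shadow page as the first shadow byte of a region
 * starting at next_start?
 */
static int regions_share_shadow_page(uint64_t prev_end, uint64_t next_start)
{
	assert(prev_end > 0);
	return page_down(mem_to_shadow(prev_end - 1)) ==
	       page_down(mem_to_shadow(next_start));
}
```

Under these assumptions the 4 KiB region [ffffffc079820000-ffffffc079821000] needs only 0x200 bytes of shadow, so it and its neighbours land in the same shadow page, while the regions on either side of the hole at ffffffc060000000 do not.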
On 16 February 2016 at 13:59, Andrey Ryabinin <aryabinin@virtuozzo.com> wrote:
>
>
> On 02/15/2016 09:59 PM, Catalin Marinas wrote:
>> On Mon, Feb 15, 2016 at 05:28:02PM +0300, Andrey Ryabinin wrote:
>>> On 02/12/2016 07:06 PM, Catalin Marinas wrote:
>>>> So far, we have:
>>>>
>>>> KASAN+for-next/kernmap goes wrong
>>>> KASAN+UBSAN goes wrong
>>>>
>>>> Enabled individually, KASAN, UBSAN and for-next/kernmap seem fine. I may
>>>> have to trim for-next/core down until we figure out where the problem
>>>> is.
>>>>
>>>> BUG: KASAN: stack-out-of-bounds in find_busiest_group+0x164/0x16a0 at addr ffffffc93665bc8c
>>>
>>> Can it be related to TLB conflicts, which supposed to be fixed in
>>> "arm64: kasan: avoid TLB conflicts" patch from "arm64: mm: rework page
>>> table creation" series ?
>>
>> I can very easily reproduce this with a vanilla 4.5-rc1 series by
>> enabling inline instrumentation (maybe Mark's theory is true w.r.t.
>> image size).
>>
>> Some information, maybe you can shed some light on this. It seems to
>> happen only for secondary CPUs on the swapper stack (I think allocated
>> via fork_idle()). The code generated looks sane to me, so KASAN should
>> not complain but maybe there is some uninitialised shadow, hence the
>> error.
>>
>> The report:
>>
>
> Actually, the first report is a bit more useful. It shows that shadow memory was corrupted:
>
>  ffffffc93665bc00: f1 f1 f1 f1 00 f4 f4 f4 f2 f2 f2 f2 00 00 f1 f1
> >ffffffc93665bc80: f1 f1 00 00 00 00 f3 f3 00 f4 f4 f4 f3 f3 f3 f3
>                       ^
> F1 - left redzone, it indicates start of stack frame
> F3 - right redzone, it should be the end of stack frame.
>
> But here we have the second set of F1s without F3s which should close the first set of F1s.
> Also those two F3s in the middle cannot be right.
>
> So shadow is corrupted.
> Some hypotheses:
>
> 1) We share stack between several tasks (e.g. stack overflow, somehow corrupted SP).
> But this probably should cause kernel crash later, after kasan reports.
>
> 2) Shadow memory wasn't cleared. GCC poison memory on function entrance and unpoisons it before return.
> If we use some tricky way to exit from function this could cause false-positives like that.
> E.g. some hand-written assembly return code.
>
> 3) Screwed shadow mapping. I think the patch below should uncover such problem.
> It boot-tested on qemu and didn't show any problem
>

I think this patch gives false positive warnings in some cases:

>
> ---
>  arch/arm64/mm/kasan_init.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 55 insertions(+)
>
> diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
> index cf038c7..25d685c 100644
> --- a/arch/arm64/mm/kasan_init.c
> +++ b/arch/arm64/mm/kasan_init.c
> @@ -117,6 +117,59 @@ static void __init cpu_set_ttbr1(unsigned long ttbr1)
>  	: "r" (ttbr1));
>  }
>
> +static void verify_shadow(void)
> +{
> +	struct memblock_region *reg;
> +	int i = 0;
> +
> +	for_each_memblock(memory, reg) {
> +		void *start = (void *)__phys_to_virt(reg->base);
> +		void *end = (void *)__phys_to_virt(reg->base + reg->size);
> +		int *shadow_start, *shadow_end;
> +
> +		if (start >= end)
> +			break;
> +		shadow_start = (int *)((unsigned long)kasan_mem_to_shadow(start) & ~(PAGE_SIZE - 1));
> +		shadow_end = (int *)kasan_mem_to_shadow(end);

shadow_start and shadow_end can refer to the same page as in the
previous iteration. For instance, I have these two regions

0x00006e090000-0x00006e0adfff [Conventional Memory| | | | | | | |WB|WT|WC|UC]
0x00006e0ae000-0x00006e0affff [Loader Data | | | | | | | |WB|WT|WC|UC]

which are covered by different memblocks, since the second one is marked
as MEMBLOCK_NOMAP because it contains the UEFI memory map.

I get the following output

kasan: screwed shadow mapping 23575, 23573

which I think is simply a result of the fact that shadow_start refers
to the same page as in the previous iteration(s).

> +		for (; shadow_start < shadow_end; shadow_start += PAGE_SIZE/sizeof(int)) {
> +			*shadow_start = i;
> +			i++;
> +		}
> +	}
> +
> +	i = 0;
> +	for_each_memblock(memory, reg) {
> +		void *start = (void *)__phys_to_virt(reg->base);
> +		void *end = (void *)__phys_to_virt(reg->base + reg->size);
> +		int *shadow_start, *shadow_end;
> +
> +		if (start >= end)
> +			break;
> +		shadow_start = (int *)((unsigned long)kasan_mem_to_shadow(start) & ~(PAGE_SIZE - 1));
> +		shadow_end = (int *)kasan_mem_to_shadow(end);
> +		for (; shadow_start < shadow_end; shadow_start += PAGE_SIZE/sizeof(int)) {
> +			if (*shadow_start != i) {
> +				pr_err("screwed shadow mapping %d, %d\n", *shadow_start, i);
> +				goto clear;
> +			}
> +			i++;
> +		}
> +	}
> +clear:
> +	for_each_memblock(memory, reg) {
> +		void *start = (void *)__phys_to_virt(reg->base);
> +		void *end = (void *)__phys_to_virt(reg->base + reg->size);
> +		unsigned long shadow_start, shadow_end;
> +
> +		if (start >= end)
> +			break;
> +		shadow_start = ((unsigned long)kasan_mem_to_shadow(start) & ~(PAGE_SIZE - 1));
> +		shadow_end = (unsigned long)kasan_mem_to_shadow(end);
> +		memset((void *)shadow_start, 0, shadow_end - shadow_start);
> +	}
> +
> +}
> +
>  void __init kasan_init(void)
>  {
>  	struct memblock_region *reg;
> @@ -159,6 +212,8 @@ void __init kasan_init(void)
>  	cpu_set_ttbr1(__pa(swapper_pg_dir));
>  	flush_tlb_all();
>
> +	verify_shadow();
> +
>  	/* At this point kasan is fully initialized. Enable error messages */
>  	init_task.kasan_depth = 0;
>  	pr_info("KernelAddressSanitizer initialized\n");
> --
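The false positive described above can be reproduced in a toy userspace model of the two passes in verify_shadow(). This is a sketch under simplifying assumptions, not the kernel code: memory starts at address 0, the shadow offset is 0, pages are 4 KiB, one tag is kept per shadow page, and all `sim_` names and sample regions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_PAGE_SIZE   4096UL
#define SIM_SCALE_SHIFT 3
#define SIM_MAX_PAGES   64

struct sim_region {
	unsigned long base;
	unsigned long size;
};

/*
 * Pass 1 tags every shadow page of every region with a running counter,
 * pass 2 replays the same walk and compares. Returns 1 on the first
 * mismatch (storing the values a "screwed shadow mapping" message would
 * print), 0 if all tags read back as written.
 */
static int sim_verify_shadow(const struct sim_region *regs, size_t nregs,
			     int *found, int *expected)
{
	int tag[SIM_MAX_PAGES] = { 0 };
	int i = 0;

	for (size_t r = 0; r < nregs; r++) {
		unsigned long start = (regs[r].base >> SIM_SCALE_SHIFT) & ~(SIM_PAGE_SIZE - 1);
		unsigned long end = (regs[r].base + regs[r].size) >> SIM_SCALE_SHIFT;

		assert(end <= SIM_MAX_PAGES * SIM_PAGE_SIZE);
		for (; start < end; start += SIM_PAGE_SIZE)
			tag[start / SIM_PAGE_SIZE] = i++;
	}

	i = 0;
	for (size_t r = 0; r < nregs; r++) {
		unsigned long start = (regs[r].base >> SIM_SCALE_SHIFT) & ~(SIM_PAGE_SIZE - 1);
		unsigned long end = (regs[r].base + regs[r].size) >> SIM_SCALE_SHIFT;

		for (; start < end; start += SIM_PAGE_SIZE) {
			if (tag[start / SIM_PAGE_SIZE] != i) {
				*found = tag[start / SIM_PAGE_SIZE];
				*expected = i;
				return 1;
			}
			i++;
		}
	}
	return 0;
}

static int sim_has_false_positive(const struct sim_region *regs, size_t nregs)
{
	int found = 0, expected = 0;

	return sim_verify_shadow(regs, nregs, &found, &expected);
}

/* Adjacent regions whose shadow boundary (0x1200) falls inside a page. */
static const struct sim_region sim_shared[2] = {
	{ 0x0000, 0x9000 }, { 0x9000, 0x7000 },
};

/* Adjacent regions whose shadow boundary (0x1000) is page aligned. */
static const struct sim_region sim_aligned[2] = {
	{ 0x0000, 0x8000 }, { 0x8000, 0x8000 },
};
```

With `sim_shared`, the second region's rounded-down shadow_start lands on the page already tagged as the first region's last page and overwrites it, so the verification pass reports a mismatch even though the mapping is fine. With `sim_aligned` no page is visited twice and the check passes.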
On Tue, Feb 16, 2016 at 03:59:09PM +0300, Andrey Ryabinin wrote:
> Actually, the first report is a bit more useful. It shows that shadow memory was corrupted:
>
>  ffffffc93665bc00: f1 f1 f1 f1 00 f4 f4 f4 f2 f2 f2 f2 00 00 f1 f1
> >ffffffc93665bc80: f1 f1 00 00 00 00 f3 f3 00 f4 f4 f4 f3 f3 f3 f3
>                       ^
> F1 - left redzone, it indicates start of stack frame
> F3 - right redzone, it should be the end of stack frame.
>
> But here we have the second set of F1s without F3s which should close the first set of F1s.
> Also those two F3s in the middle cannot be right.
>
> So shadow is corrupted.
> Some hypotheses:

> 2) Shadow memory wasn't cleared. GCC poison memory on function entrance and unpoisons it before return.
> If we use some tricky way to exit from function this could cause false-positives like that.
> E.g. some hand-written assembly return code.

I think this is what's happening, at least for the idle case.

A second attempt at bisecting led me to commit e679660dbb8347f2 ("ARM:
8481/2: drivers: psci: replace psci firmware calls"). Reverting that
makes v4.5-rc1 boot without KASAN splats.

That patch turned __invoke_psci_fn_{smc,hvc} into (ASAN-instrumented) C
functions. Prior to that commit, __invoke_psci_fn_{smc,hvc} were
pure assembly functions which used no stack.

When we go down for idle, in __cpu_suspend_enter we stash some context
to the stack (in assembly). The CPU may return from a cold state via
cpu_resume, where we restore context from the stack.

However, after storing the context we call psci_suspend_finisher, which
calls psci_cpu_suspend, which calls invoke_psci_fn_*. As
psci_cpu_suspend and invoke_psci_fn_* are instrumented, they poison
memory on function entrance, but we never perform the unpoisoning.

That was always the case for psci_suspend_finisher, so there was a
latent issue that we were somehow avoiding. Perhaps we got lucky with
stack layout and never hit the poison.

I'm not sure how we fix that, as invoke_psci_fn_* may or may not return
for arbitrary reasons (e.g. a CPU_SUSPEND_CALL may or may not return
depending on whether an interrupt comes in at the right time).

Perhaps the simplest option is to not instrument invoke_psci_fn_* and
psci_suspend_finisher. Do we have a per-function annotation to avoid
KASAN instrumentation, like notrace? I need to investigate, but we may
also need notrace for similar reasons.

Andrey, on a tangential note, what do we do around hotplug? I assume
that we must unpoison the shadow region for the stack of a dead CPU,
but I wasn't able to figure out where we do that. Hopefully we're not
just getting lucky?

Thanks,
Mark.
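The failure mode described above, where an instrumented function poisons its redzones on entry and never runs the epilogue that clears them, can be modelled with a toy shadow array. This is a userspace sketch only: the stack size, slot numbers and function names are hypothetical, and the "resume" branch stands in for whatever code would clear the stack shadow after a CPU comes back, which the thread has not settled on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRANULE     8				/* stack bytes per shadow byte */
#define STACK_BYTES 512				/* hypothetical toy stack size */
#define SHADOW_LEN  (STACK_BYTES / GRANULE)
#define STACK_LEFT  0xF1
#define STACK_RIGHT 0xF3

static uint8_t shadow[SHADOW_LEN];

/* Model of an instrumented prologue: poison redzones around shadow slots [lo, hi). */
static void frame_enter(size_t lo, size_t hi)
{
	assert(lo + 2 <= hi && hi <= SHADOW_LEN);
	shadow[lo] = STACK_LEFT;
	shadow[hi - 1] = STACK_RIGHT;
}

/* Model of the epilogue, or of an explicit clear on a resume path. */
static void unpoison(size_t lo, size_t hi)
{
	assert(lo <= hi && hi <= SHADOW_LEN);
	memset(&shadow[lo], 0, hi - lo);
}

/* Model of an instrumented load: allowed only if the shadow byte is 0. */
static int access_ok(size_t stack_offset)
{
	assert(stack_offset < STACK_BYTES);
	return shadow[stack_offset / GRANULE] == 0;
}

/*
 * A callee sets up a frame. Either it returns normally, or it never
 * returns (the CPU powers down inside it) and the stack is later reused
 * by a different function whose local sits where the old left redzone
 * was. Returns whether that later access passes the check.
 */
static int later_access_ok(int callee_returns, int unpoison_on_resume)
{
	const size_t lo = 10, hi = 20;

	memset(shadow, 0, sizeof(shadow));
	frame_enter(lo, hi);
	if (callee_returns)
		unpoison(lo, hi);		/* normal epilogue */
	else if (unpoison_on_resume)
		unpoison(0, SHADOW_LEN);	/* clear the whole stack shadow */
	return access_ok(lo * GRANULE);
}
```

In this model `later_access_ok(0, 0)` fails with no real bug present, which matches the shape of the splats in the thread. Either keeping the callee uninstrumented or clearing the stale shadow on the way back avoids the report.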
On 02/17/2016 05:39 PM, Mark Rutland wrote:
> On Tue, Feb 16, 2016 at 03:59:09PM +0300, Andrey Ryabinin wrote:
>> Actually, the first report is a bit more useful. It shows that shadow memory was corrupted:
>>
>>  ffffffc93665bc00: f1 f1 f1 f1 00 f4 f4 f4 f2 f2 f2 f2 00 00 f1 f1
>> >ffffffc93665bc80: f1 f1 00 00 00 00 f3 f3 00 f4 f4 f4 f3 f3 f3 f3
>>                       ^
>> F1 - left redzone, it indicates start of stack frame
>> F3 - right redzone, it should be the end of stack frame.
>>
>> But here we have the second set of F1s without F3s which should close the first set of F1s.
>> Also those two F3s in the middle cannot be right.
>>
>> So shadow is corrupted.
>> Some hypotheses:
>
>> 2) Shadow memory wasn't cleared. GCC poison memory on function entrance and unpoisons it before return.
>> If we use some tricky way to exit from function this could cause false-positives like that.
>> E.g. some hand-written assembly return code.
>
> I think this is what's happening, at least for the idle case.
>
> A second attempt at bisecting led me to commit e679660dbb8347f2 ("ARM:
> 8481/2: drivers: psci: replace psci firmware calls"). Reverting that
> makes v4.5-rc1 boot without KASAN splats.
>
> That patch turned __invoke_psci_fn_{smc,hvc} into (ASAN-instrumented) C
> functions. Prior to that commit, __invoke_psci_fn_{smc,hvc} were
> pure assembly functions which used no stack.
>
> When we go down for idle, in __cpu_suspend_enter we stash some context
> to the stack (in assembly). The CPU may return from a cold state via
> cpu_resume, where we restore context from the stack.
>
> However, after storing the context we call psci_suspend_finisher, which
> calls psci_cpu_suspend, which calls invoke_psci_fn_*. As
> psci_cpu_suspend and invoke_psci_fn_* are instrumented, they poison
> memory on function entrance, but we never perform the unpoisoning.
>
> That was always the case for psci_suspend_finisher, so there was a
> latent issue that we were somehow avoiding. Perhaps we got lucky with
> stack layout and never hit the poison.
>
> I'm not sure how we fix that, as invoke_psci_fn_* may or may not return
> for arbitrary reasons (e.g. a CPU_SUSPEND_CALL may or may not return
> depending on whether an interrupt comes in at the right time).
>
> Perhaps the simplest option is to not instrument invoke_psci_fn_* and
> psci_suspend_finisher. Do we have a per-function annotation to avoid
> KASAN instrumentation, like notrace? I need to investigate, but we may
> also need notrace for similar reasons.

include/linux/compiler-gcc.h:
/*
 * Tell the compiler that address safety instrumentation (KASAN)
 * should not be applied to that function.
 * Conflicts with inlining: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368
 */
#define __no_sanitize_address __attribute__((no_sanitize_address))

>
> Andrey, on a tangential note, what do we do around hotplug? I assume
> that we must unpoison the shadow region for the stack of a dead CPU,
> but I wasn't able to figure out where we do that. Hopefully we're not
> just getting lucky?
>

We do nothing about it. AFAIU we need to clear swapper's stack,
somewhere in secondary_start_kernel() perhaps.

> Thanks,
> Mark.
>
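As a rough illustration of how the annotation quoted above might be applied, the sketch below marks a stand-in function so that the compiler skips address-sanitizer instrumentation for it. The function name and body are hypothetical and are not the PSCI code. Pairing the attribute with `noinline` is an assumption made here because of the inlining conflict mentioned in the quoted comment.

```c
#include <assert.h>

/* Same definition as the one quoted from include/linux/compiler-gcc.h. */
#ifndef __no_sanitize_address
#define __no_sanitize_address __attribute__((no_sanitize_address))
#endif

/* Hypothetical helper macro: keep the function out of instrumented callers. */
#define sketch_noinline __attribute__((noinline))

/*
 * Stand-in for a firmware call wrapper that may never return. With the
 * attribute, the compiler should not emit redzone poisoning for the
 * locals of this function, so nothing stale is left in shadow memory
 * if the epilogue never runs.
 */
static sketch_noinline __no_sanitize_address
long fw_call_sketch(long function_id, long arg0)
{
	volatile long scratch[2];

	scratch[0] = function_id;
	scratch[1] = arg0;
	return scratch[0] + scratch[1];
}
```

Callers further up the chain that are still instrumented would poison their own frames, so in the scenario from the thread each function on the path that may not return would presumably need the same treatment.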
On Wed, Feb 17, 2016 at 07:31:43PM +0300, Andrey Ryabinin wrote:
> On 02/17/2016 05:39 PM, Mark Rutland wrote:
> > Andrey, on a tangential note, what do we do around hotplug? I assume
> > that we must unpoison the shadow region for the stack of a dead CPU,
> > but I wasn't able to figure out where we do that. Hopefully we're not
> > just getting lucky?
>
> We do nothing about it. AFAIU we need to clear swapper's stack,
> somewhere in secondary_start_kernel() perhaps.

Oh, joy...

Surely other architectures (e.g. x86) will need to do something similar?
Do they do anything currently? I can't see that they do...

Mark.
diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index cf038c7..25d685c 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -117,6 +117,59 @@ static void __init cpu_set_ttbr1(unsigned long ttbr1)
 	: "r" (ttbr1));
 }
 
+static void verify_shadow(void)
+{
+	struct memblock_region *reg;
+	int i = 0;
+
+	for_each_memblock(memory, reg) {
+		void *start = (void *)__phys_to_virt(reg->base);
+		void *end = (void *)__phys_to_virt(reg->base + reg->size);
+		int *shadow_start, *shadow_end;
+
+		if (start >= end)
+			break;
+		shadow_start = (int *)((unsigned long)kasan_mem_to_shadow(start) & ~(PAGE_SIZE - 1));
+		shadow_end = (int *)kasan_mem_to_shadow(end);
+		for (; shadow_start < shadow_end; shadow_start += PAGE_SIZE/sizeof(int)) {
+			*shadow_start = i;
+			i++;
+		}
+	}
+
+	i = 0;
+	for_each_memblock(memory, reg) {
+		void *start = (void *)__phys_to_virt(reg->base);
+		void *end = (void *)__phys_to_virt(reg->base + reg->size);
+		int *shadow_start, *shadow_end;
+
+		if (start >= end)
+			break;
+		shadow_start = (int *)((unsigned long)kasan_mem_to_shadow(start) & ~(PAGE_SIZE - 1));
+		shadow_end = (int *)kasan_mem_to_shadow(end);
+		for (; shadow_start < shadow_end; shadow_start += PAGE_SIZE/sizeof(int)) {
+			if (*shadow_start != i) {
+				pr_err("screwed shadow mapping %d, %d\n", *shadow_start, i);
+				goto clear;
+			}
+			i++;
+		}
+	}
+clear:
+	for_each_memblock(memory, reg) {
+		void *start = (void *)__phys_to_virt(reg->base);
+		void *end = (void *)__phys_to_virt(reg->base + reg->size);
+		unsigned long shadow_start, shadow_end;
+
+		if (start >= end)
+			break;
+		shadow_start = ((unsigned long)kasan_mem_to_shadow(start) & ~(PAGE_SIZE - 1));
+		shadow_end = (unsigned long)kasan_mem_to_shadow(end);
+		memset((void *)shadow_start, 0, shadow_end - shadow_start);
+	}
+
+}
+
 void __init kasan_init(void)
 {
 	struct memblock_region *reg;
@@ -159,6 +212,8 @@ void __init kasan_init(void)
 	cpu_set_ttbr1(__pa(swapper_pg_dir));
 	flush_tlb_all();
 
+	verify_shadow();
+
 	/* At this point kasan is fully initialized. Enable error messages */
 	init_task.kasan_depth = 0;
 	pr_info("KernelAddressSanitizer initialized\n");