Message ID | CAG_fn=Vc5134sX6JRUoGp=W0to6eg56DuW3YErqeWuR_W_O9gQ@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 25 June 2018 at 16:59, Alexander Potapenko <glider@google.com> wrote: > Hi Ard, Mark, Andrew and others, > > AFAIU, commit 029c54b09599573015a5c18dbe59cbdf42742237 ("mm/vmalloc.c: > huge-vmap: fail gracefully on unexpected huge vmap mappings") was > supposed to make vmalloc_to_page() return NULL for pointers not > returned by vmalloc(). > But when I call vmalloc_to_page() for the pointer returned by > acpi_os_ioremap() (see the patch below) I see that the resulting > `struct page *` points to unmapped memory: > Why do you assume it maps memory? It could map a device's MMIO registers as well, which don't have struct pages associated with them. > ============================ > ACPI: Enabled 2 GPEs in block 00 to 0F > phys: 00000000fed00000, vmalloc: ffffc9000019a000, page: Isn't that phys address something like the HPET on a x86 system? > ffff8800fed00000 [ffffea0003fb4000] > BUG: unable to handle kernel paging request at ffffea0003fb4000 > PGD 3f7d5067 P4D 3f7d5067 PUD 3f7d4067 PMD 0 > Oops: 0000 [#1] SMP PTI > CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc2+ #1325 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/= > 2014 > RIP: 0010:acpi_os_map_iomem+0x1c5/0x210 ??:? > Code: 00 88 ff ff 4d 89 f8 48 c1 f9 06 4c 89 f6 48 c7 c7 60 5f 01 82 > 48 c1 e1 0c 48 01 c1 e8 6d 42 70 ff 4d 85 ff 0f 84 14 ff ff ff <49> 8b > 37 48 c7 c7 d2 61 01 82 e8 55 42 70 ff e9 00 ff ff ff 48 c7 > RSP: 0000:ffff88003e253840 EFLAGS: 00010286 > RAX: 000000000000005c RBX: ffff88003d857b80 RCX: ffffffff82245d38 > RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffffffff8288e86c > RBP: 00000000fed00000 R08: 00000000000000ae R09: 0000000000000007 > R10: 0000000000000000 R11: ffffffff828908ad R12: 0000000000001000 > R13: ffffc9000019a000 R14: 00000000fed00000 R15: ffffea0003fb4000 > FS: 0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:000000000000000= > 0 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: ffffea0003fb4000 CR3: 000000000220a000 CR4: 00000000000006f0 > Call Trace: > acpi_ex_system_memory_space_handler+0xca/0x19f ??:? > ============================ > > For memory error detection purposes I'm trying to map the addresses > from the vmalloc range to valid struct pages, or at least make sure > there's no struct page for a given address. > Looking up the vmap_area_root rbtree isn't an option, as this must be > done from instrumented code, including interrupt handlers. > I've been trying to employ vmalloc_to_page(), but looks like it > doesn't work for ioremapped addresses. > Is this at all possible? > > Patch showing the problem follows. I'm building using GCC 7.1.1 on a > defconfig for x86_64. > > --- a/drivers/acpi/osl.c > +++ b/drivers/acpi/osl.c > @@ -279,14 +279,23 @@ acpi_map_lookup_virt(void __iomem *virt, acpi_size si= > ze) > static void __iomem *acpi_map(acpi_physical_address pg_off, unsigned > long pg_sz) > { > unsigned long pfn; > + void __iomem *ret; > + struct page *page; > > pfn =3D pg_off >> PAGE_SHIFT; > if (should_use_kmap(pfn)) { > if (pg_sz > PAGE_SIZE) > return NULL; > return (void __iomem __force *)kmap(pfn_to_page(pfn)); > - } else > - return acpi_os_ioremap(pg_off, pg_sz); > + } else { > + ret =3D acpi_os_ioremap(pg_off, pg_sz); > + BUG_ON(!is_vmalloc_addr(ret)); > + page =3D vmalloc_to_page(ret); > + pr_err("phys: %px, vmalloc: %px, page: %px [%px]\n", > pg_off, ret, page_address(page), page); > + if (page) > + pr_err("flags: %d\n", page->flags); > + return ret; > + } > } > > Thanks, > Alexander Potapenko > Software Engineer > > Google Germany GmbH > Erika-Mann-Straße, 33 > 80636 München > > Geschäftsführer: Paul Manicle, Halimah DeLaine Prado > Registergericht und -nummer: Hamburg, HRB 86891 > Sitz der Gesellschaft: Hamburg
On Mon, Jun 25, 2018 at 04:59:23PM +0200, Alexander Potapenko wrote: > Hi Ard, Mark, Andrew and others, > > AFAIU, commit 029c54b09599573015a5c18dbe59cbdf42742237 ("mm/vmalloc.c: > huge-vmap: fail gracefully on unexpected huge vmap mappings") was > supposed to make vmalloc_to_page() return NULL for pointers not > returned by vmalloc(). It's a little more subtle than that -- avoiding an edge case where we unexpectedly hit huge mappings, rather than determining whether an address same from vmalloc(). > For memory error detection purposes I'm trying to map the addresses > from the vmalloc range to valid struct pages, or at least make sure > there's no struct page for a given address. > Looking up the vmap_area_root rbtree isn't an option, as this must be > done from instrumented code, including interrupt handlers. I'm not sure how you can do this without looking at VMAs. In general, the vmalloc area can contain addresses which are not memory, and this cannot be detremined from the address alone. You *might* be able to get away with pfn_valid(vmalloc_to_pfn(x)), but IIRC there's some disagreement on the precise meaning of pfn_valid(), so that might just tell you that the address happens to fall close to some valid memory. Thanks, Mark.
On Mon, Jun 25, 2018 at 5:37 PM Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > > On 25 June 2018 at 16:59, Alexander Potapenko <glider@google.com> wrote: > > Hi Ard, Mark, Andrew and others, > > > > AFAIU, commit 029c54b09599573015a5c18dbe59cbdf42742237 ("mm/vmalloc.c: > > huge-vmap: fail gracefully on unexpected huge vmap mappings") was > > supposed to make vmalloc_to_page() return NULL for pointers not > > returned by vmalloc(). > > But when I call vmalloc_to_page() for the pointer returned by > > acpi_os_ioremap() (see the patch below) I see that the resulting > > `struct page *` points to unmapped memory: > > > > Why do you assume it maps memory? It could map a device's MMIO > registers as well, which don't have struct pages associated with them. I might have been unclear. I'm just assuming that vmalloc_to_page() returns either a valid struct page or NULL for a valid pointer belonging to vmalloc area. In this case vmalloc_to_page() returns a wild pointer. > > ============================ > > ACPI: Enabled 2 GPEs in block 00 to 0F > > phys: 00000000fed00000, vmalloc: ffffc9000019a000, page: > > Isn't that phys address something like the HPET on a x86 system? Yes, probably. I just came across it while trying to instrument every memory access with code doing something along the lines of: if (is_vmalloc_addr(addr)) return vmalloc_to_page(addr)->metadata; else if (virt_to_page(addr)) return virt_to_page(addr)->metadata; , I don't think there's anything specific to that physical address. > > ffff8800fed00000 [ffffea0003fb4000] > > BUG: unable to handle kernel paging request at ffffea0003fb4000 > > PGD 3f7d5067 P4D 3f7d5067 PUD 3f7d4067 PMD 0 > > Oops: 0000 [#1] SMP PTI > > CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc2+ #1325 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/= > > 2014 > > RIP: 0010:acpi_os_map_iomem+0x1c5/0x210 ??:? > > Code: 00 88 ff ff 4d 89 f8 48 c1 f9 06 4c 89 f6 48 c7 c7 60 5f 01 82 > > 48 c1 e1 0c 48 01 c1 e8 6d 42 70 ff 4d 85 ff 0f 84 14 ff ff ff <49> 8b > > 37 48 c7 c7 d2 61 01 82 e8 55 42 70 ff e9 00 ff ff ff 48 c7 > > RSP: 0000:ffff88003e253840 EFLAGS: 00010286 > > RAX: 000000000000005c RBX: ffff88003d857b80 RCX: ffffffff82245d38 > > RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffffffff8288e86c > > RBP: 00000000fed00000 R08: 00000000000000ae R09: 0000000000000007 > > R10: 0000000000000000 R11: ffffffff828908ad R12: 0000000000001000 > > R13: ffffc9000019a000 R14: 00000000fed00000 R15: ffffea0003fb4000 > > FS: 0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:000000000000000= > > 0 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: ffffea0003fb4000 CR3: 000000000220a000 CR4: 00000000000006f0 > > Call Trace: > > acpi_ex_system_memory_space_handler+0xca/0x19f ??:? > > ============================ > > > > For memory error detection purposes I'm trying to map the addresses > > from the vmalloc range to valid struct pages, or at least make sure > > there's no struct page for a given address. > > Looking up the vmap_area_root rbtree isn't an option, as this must be > > done from instrumented code, including interrupt handlers. > > I've been trying to employ vmalloc_to_page(), but looks like it > > doesn't work for ioremapped addresses. > > Is this at all possible? > > > > Patch showing the problem follows. I'm building using GCC 7.1.1 on a > > defconfig for x86_64. > > > > --- a/drivers/acpi/osl.c > > +++ b/drivers/acpi/osl.c > > @@ -279,14 +279,23 @@ acpi_map_lookup_virt(void __iomem *virt, acpi_size si= > > ze) > > static void __iomem *acpi_map(acpi_physical_address pg_off, unsigned > > long pg_sz) > > { > > unsigned long pfn; > > + void __iomem *ret; > > + struct page *page; > > > > pfn =3D pg_off >> PAGE_SHIFT; > > if (should_use_kmap(pfn)) { > > if (pg_sz > PAGE_SIZE) > > return NULL; > > return (void __iomem __force *)kmap(pfn_to_page(pfn)); > > - } else > > - return acpi_os_ioremap(pg_off, pg_sz); > > + } else { > > + ret =3D acpi_os_ioremap(pg_off, pg_sz); > > + BUG_ON(!is_vmalloc_addr(ret)); > > + page =3D vmalloc_to_page(ret); > > + pr_err("phys: %px, vmalloc: %px, page: %px [%px]\n", > > pg_off, ret, page_address(page), page); > > + if (page) > > + pr_err("flags: %d\n", page->flags); > > + return ret; > > + } > > } > > > > Thanks, > > Alexander Potapenko > > Software Engineer > > > > Google Germany GmbH > > Erika-Mann-Straße, 33 > > 80636 München > > > > Geschäftsführer: Paul Manicle, Halimah DeLaine Prado > > Registergericht und -nummer: Hamburg, HRB 86891 > > Sitz der Gesellschaft: Hamburg
On 25 June 2018 at 18:07, Alexander Potapenko <glider@google.com> wrote: > On Mon, Jun 25, 2018 at 5:37 PM Ard Biesheuvel > <ard.biesheuvel@linaro.org> wrote: >> >> On 25 June 2018 at 16:59, Alexander Potapenko <glider@google.com> wrote: >> > Hi Ard, Mark, Andrew and others, >> > >> > AFAIU, commit 029c54b09599573015a5c18dbe59cbdf42742237 ("mm/vmalloc.c: >> > huge-vmap: fail gracefully on unexpected huge vmap mappings") was >> > supposed to make vmalloc_to_page() return NULL for pointers not >> > returned by vmalloc(). >> > But when I call vmalloc_to_page() for the pointer returned by >> > acpi_os_ioremap() (see the patch below) I see that the resulting >> > `struct page *` points to unmapped memory: >> > >> >> Why do you assume it maps memory? It could map a device's MMIO >> registers as well, which don't have struct pages associated with them. > I might have been unclear. I'm just assuming that vmalloc_to_page() > returns either a valid struct page or NULL for a valid pointer > belonging to vmalloc area. > In this case vmalloc_to_page() returns a wild pointer. > is_vmalloc_addr() only checks whether the mapping is within the boundaries of the VMALLOC region, and does not check whether it is in fact a VM_ALLOC or VM_MAP mapping, and vmalloc_to_page() should probably only be called on mappings of that type. The reason it is implemented like this may well be the issue that you highlight, i.e., that this cannot be done from every context. As for the patch, it was intended to ensure that vmalloc_to_page() does not blindly assume that VM_MAP regions are mapped down to pages, which is not the case for mappings of the kernel image on arm64. >> > ============================ >> > ACPI: Enabled 2 GPEs in block 00 to 0F >> > phys: 00000000fed00000, vmalloc: ffffc9000019a000, page: >> >> Isn't that phys address something like the HPET on a x86 system? > Yes, probably. I just came across it while trying to instrument every > memory access with code doing something along the lines of: > if (is_vmalloc_addr(addr)) > return vmalloc_to_page(addr)->metadata; > else > if (virt_to_page(addr)) > return virt_to_page(addr)->metadata; > > , I don't think there's anything specific to that physical address. >> > ffff8800fed00000 [ffffea0003fb4000] >> > BUG: unable to handle kernel paging request at ffffea0003fb4000 >> > PGD 3f7d5067 P4D 3f7d5067 PUD 3f7d4067 PMD 0 >> > Oops: 0000 [#1] SMP PTI >> > CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc2+ #1325 >> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/= >> > 2014 >> > RIP: 0010:acpi_os_map_iomem+0x1c5/0x210 ??:? >> > Code: 00 88 ff ff 4d 89 f8 48 c1 f9 06 4c 89 f6 48 c7 c7 60 5f 01 82 >> > 48 c1 e1 0c 48 01 c1 e8 6d 42 70 ff 4d 85 ff 0f 84 14 ff ff ff <49> 8b >> > 37 48 c7 c7 d2 61 01 82 e8 55 42 70 ff e9 00 ff ff ff 48 c7 >> > RSP: 0000:ffff88003e253840 EFLAGS: 00010286 >> > RAX: 000000000000005c RBX: ffff88003d857b80 RCX: ffffffff82245d38 >> > RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffffffff8288e86c >> > RBP: 00000000fed00000 R08: 00000000000000ae R09: 0000000000000007 >> > R10: 0000000000000000 R11: ffffffff828908ad R12: 0000000000001000 >> > R13: ffffc9000019a000 R14: 00000000fed00000 R15: ffffea0003fb4000 >> > FS: 0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:000000000000000= >> > 0 >> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> > CR2: ffffea0003fb4000 CR3: 000000000220a000 CR4: 00000000000006f0 >> > Call Trace: >> > acpi_ex_system_memory_space_handler+0xca/0x19f ??:? >> > ============================ >> > >> > For memory error detection purposes I'm trying to map the addresses >> > from the vmalloc range to valid struct pages, or at least make sure >> > there's no struct page for a given address. >> > Looking up the vmap_area_root rbtree isn't an option, as this must be >> > done from instrumented code, including interrupt handlers. >> > I've been trying to employ vmalloc_to_page(), but looks like it >> > doesn't work for ioremapped addresses. >> > Is this at all possible? >> > >> > Patch showing the problem follows. I'm building using GCC 7.1.1 on a >> > defconfig for x86_64. >> > >> > --- a/drivers/acpi/osl.c >> > +++ b/drivers/acpi/osl.c >> > @@ -279,14 +279,23 @@ acpi_map_lookup_virt(void __iomem *virt, acpi_size si= >> > ze) >> > static void __iomem *acpi_map(acpi_physical_address pg_off, unsigned >> > long pg_sz) >> > { >> > unsigned long pfn; >> > + void __iomem *ret; >> > + struct page *page; >> > >> > pfn =3D pg_off >> PAGE_SHIFT; >> > if (should_use_kmap(pfn)) { >> > if (pg_sz > PAGE_SIZE) >> > return NULL; >> > return (void __iomem __force *)kmap(pfn_to_page(pfn)); >> > - } else >> > - return acpi_os_ioremap(pg_off, pg_sz); >> > + } else { >> > + ret =3D acpi_os_ioremap(pg_off, pg_sz); >> > + BUG_ON(!is_vmalloc_addr(ret)); >> > + page =3D vmalloc_to_page(ret); >> > + pr_err("phys: %px, vmalloc: %px, page: %px [%px]\n", >> > pg_off, ret, page_address(page), page); >> > + if (page) >> > + pr_err("flags: %d\n", page->flags); >> > + return ret; >> > + } >> > } >> > >> > Thanks, >> > Alexander Potapenko >> > Software Engineer >> > >> > Google Germany GmbH >> > Erika-Mann-Straße, 33 >> > 80636 München >> > >> > Geschäftsführer: Paul Manicle, Halimah DeLaine Prado >> > Registergericht und -nummer: Hamburg, HRB 86891 >> > Sitz der Gesellschaft: Hamburg > > > > -- > Alexander Potapenko > Software Engineer > > Google Germany GmbH > Erika-Mann-Straße, 33 > 80636 München > > Geschäftsführer: Paul Manicle, Halimah DeLaine Prado > Registergericht und -nummer: Hamburg, HRB 86891 > Sitz der Gesellschaft: Hamburg
On Mon, Jun 25, 2018 at 6:00 PM Mark Rutland <mark.rutland@arm.com> wrote: > > On Mon, Jun 25, 2018 at 04:59:23PM +0200, Alexander Potapenko wrote: > > Hi Ard, Mark, Andrew and others, > > > > AFAIU, commit 029c54b09599573015a5c18dbe59cbdf42742237 ("mm/vmalloc.c: > > huge-vmap: fail gracefully on unexpected huge vmap mappings") was > > supposed to make vmalloc_to_page() return NULL for pointers not > > returned by vmalloc(). > > It's a little more subtle than that -- avoiding an edge case where we > unexpectedly hit huge mappings, rather than determining whether an > address same from vmalloc(). Ok, but anyway, acpi_os_ioremap() creates a huge page mapping via __ioremap_caller() (see https://elixir.bootlin.com/linux/latest/source/arch/x86/mm/ioremap.c#L133) Shouldn't these checks detect that as well? > > For memory error detection purposes I'm trying to map the addresses > > from the vmalloc range to valid struct pages, or at least make sure > > there's no struct page for a given address. > > Looking up the vmap_area_root rbtree isn't an option, as this must be > > done from instrumented code, including interrupt handlers. > > I'm not sure how you can do this without looking at VMAs. > > In general, the vmalloc area can contain addresses which are not memory, > and this cannot be detremined from the address alone. I thought this was exactly what vmalloc_to_page() did, but apparently no. > You *might* be able to get away with pfn_valid(vmalloc_to_pfn(x)), but > IIRC there's some disagreement on the precise meaning of pfn_valid(), so > that might just tell you that the address happens to fall close to some > valid memory. This appears to work, at least for ACPI mappings. I'll check other cases though. Thank you! > Thanks, > Mark.
On Mon, Jun 25, 2018 at 06:24:57PM +0200, Alexander Potapenko wrote: > On Mon, Jun 25, 2018 at 6:00 PM Mark Rutland <mark.rutland@arm.com> wrote: > > > > On Mon, Jun 25, 2018 at 04:59:23PM +0200, Alexander Potapenko wrote: > > > Hi Ard, Mark, Andrew and others, > > > > > > AFAIU, commit 029c54b09599573015a5c18dbe59cbdf42742237 ("mm/vmalloc.c: > > > huge-vmap: fail gracefully on unexpected huge vmap mappings") was > > > supposed to make vmalloc_to_page() return NULL for pointers not > > > returned by vmalloc(). > > > > It's a little more subtle than that -- avoiding an edge case where we > > unexpectedly hit huge mappings, rather than determining whether an > > address same from vmalloc(). > Ok, but anyway, acpi_os_ioremap() creates a huge page mapping via > __ioremap_caller() (see > https://elixir.bootlin.com/linux/latest/source/arch/x86/mm/ioremap.c#L133) > Shouldn't these checks detect that as well? It should catch such mappings, yes. > > > For memory error detection purposes I'm trying to map the addresses > > > from the vmalloc range to valid struct pages, or at least make sure > > > there's no struct page for a given address. > > > Looking up the vmap_area_root rbtree isn't an option, as this must be > > > done from instrumented code, including interrupt handlers. > > > > I'm not sure how you can do this without looking at VMAs. > > > > In general, the vmalloc area can contain addresses which are not memory, > > and this cannot be detremined from the address alone. > I thought this was exactly what vmalloc_to_page() did, but apparently no. > > > You *might* be able to get away with pfn_valid(vmalloc_to_pfn(x)), but > > IIRC there's some disagreement on the precise meaning of pfn_valid(), so > > that might just tell you that the address happens to fall close to some > > valid memory. > This appears to work, at least for ACPI mappings. I'll check other cases though. > Thank you! Great! Thanks, Mark.
On Mon, Jun 25, 2018 at 6:27 PM Mark Rutland <mark.rutland@arm.com> wrote: > > On Mon, Jun 25, 2018 at 06:24:57PM +0200, Alexander Potapenko wrote: > > On Mon, Jun 25, 2018 at 6:00 PM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > On Mon, Jun 25, 2018 at 04:59:23PM +0200, Alexander Potapenko wrote: > > > > Hi Ard, Mark, Andrew and others, > > > > > > > > AFAIU, commit 029c54b09599573015a5c18dbe59cbdf42742237 ("mm/vmalloc.c: > > > > huge-vmap: fail gracefully on unexpected huge vmap mappings") was > > > > supposed to make vmalloc_to_page() return NULL for pointers not > > > > returned by vmalloc(). > > > > > > It's a little more subtle than that -- avoiding an edge case where we > > > unexpectedly hit huge mappings, rather than determining whether an > > > address same from vmalloc(). > > Ok, but anyway, acpi_os_ioremap() creates a huge page mapping via > > __ioremap_caller() (see > > https://elixir.bootlin.com/linux/latest/source/arch/x86/mm/ioremap.c#L133) > > Shouldn't these checks detect that as well? > > It should catch such mappings, yes. > > > > > For memory error detection purposes I'm trying to map the addresses > > > > from the vmalloc range to valid struct pages, or at least make sure > > > > there's no struct page for a given address. > > > > Looking up the vmap_area_root rbtree isn't an option, as this must be > > > > done from instrumented code, including interrupt handlers. > > > > > > I'm not sure how you can do this without looking at VMAs. > > > > > > In general, the vmalloc area can contain addresses which are not memory, > > > and this cannot be detremined from the address alone. > > I thought this was exactly what vmalloc_to_page() did, but apparently no. > > > > > You *might* be able to get away with pfn_valid(vmalloc_to_pfn(x)), but > > > IIRC there's some disagreement on the precise meaning of pfn_valid(), so > > > that might just tell you that the address happens to fall close to some > > > valid memory. > > This appears to work, at least for ACPI mappings. I'll check other cases though. > > Thank you! pfn_valid(vmalloc_to_pfn(x)) works for me, so I'll stick to this solution for now. Thanks again! But just to clarify, should vmalloc_to_page() return NULL for a huge mapping returned by __ioremap_caller()? Your answer and that of Ard seem to be contradictory. Maybe it's a good idea to add the pfn_valid() check to vmalloc_to_page() just to be sure? > Great! > > Thanks, > Mark.
On Tue, Jun 26, 2018 at 12:00:00PM +0200, Alexander Potapenko wrote: > On Mon, Jun 25, 2018 at 6:27 PM Mark Rutland <mark.rutland@arm.com> wrote: > > > > On Mon, Jun 25, 2018 at 06:24:57PM +0200, Alexander Potapenko wrote: > > > On Mon, Jun 25, 2018 at 6:00 PM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > > > On Mon, Jun 25, 2018 at 04:59:23PM +0200, Alexander Potapenko wrote: > > > > > Hi Ard, Mark, Andrew and others, > > > > > > > > > > AFAIU, commit 029c54b09599573015a5c18dbe59cbdf42742237 ("mm/vmalloc.c: > > > > > huge-vmap: fail gracefully on unexpected huge vmap mappings") was > > > > > supposed to make vmalloc_to_page() return NULL for pointers not > > > > > returned by vmalloc(). > > > > > > > > It's a little more subtle than that -- avoiding an edge case where we > > > > unexpectedly hit huge mappings, rather than determining whether an > > > > address same from vmalloc(). > > > Ok, but anyway, acpi_os_ioremap() creates a huge page mapping via > > > __ioremap_caller() (see > > > https://elixir.bootlin.com/linux/latest/source/arch/x86/mm/ioremap.c#L133) > > > Shouldn't these checks detect that as well? > > > > It should catch such mappings, yes. > > > > > > > For memory error detection purposes I'm trying to map the addresses > > > > > from the vmalloc range to valid struct pages, or at least make sure > > > > > there's no struct page for a given address. > > > > > Looking up the vmap_area_root rbtree isn't an option, as this must be > > > > > done from instrumented code, including interrupt handlers. > > > > > > > > I'm not sure how you can do this without looking at VMAs. > > > > > > > > In general, the vmalloc area can contain addresses which are not memory, > > > > and this cannot be detremined from the address alone. > > > I thought this was exactly what vmalloc_to_page() did, but apparently no. > > > > > > > You *might* be able to get away with pfn_valid(vmalloc_to_pfn(x)), but > > > > IIRC there's some disagreement on the precise meaning of pfn_valid(), so > > > > that might just tell you that the address happens to fall close to some > > > > valid memory. > > > This appears to work, at least for ACPI mappings. I'll check other cases though. > > > Thank you! > pfn_valid(vmalloc_to_pfn(x)) works for me, so I'll stick to this > solution for now. Thanks again! > > But just to clarify, should vmalloc_to_page() return NULL for a huge > mapping returned by __ioremap_caller()? It will not always do so. It *may* return NULL, or it may return a potentially invalid pointer to struct page. > Your answer and that of Ard seem to be contradictory. > Maybe it's a good idea to add the pfn_valid() check to > vmalloc_to_page() just to be sure? Perhaps, though it really depends on the intended use case of vmalloc_to_page(). Thanks, Mark.
============================ ACPI: Enabled 2 GPEs in block 00 to 0F phys: 00000000fed00000, vmalloc: ffffc9000019a000, page: ffff8800fed00000 [ffffea0003fb4000] BUG: unable to handle kernel paging request at ffffea0003fb4000 PGD 3f7d5067 P4D 3f7d5067 PUD 3f7d4067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc2+ #1325 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/= 2014 RIP: 0010:acpi_os_map_iomem+0x1c5/0x210 ??:? Code: 00 88 ff ff 4d 89 f8 48 c1 f9 06 4c 89 f6 48 c7 c7 60 5f 01 82 48 c1 e1 0c 48 01 c1 e8 6d 42 70 ff 4d 85 ff 0f 84 14 ff ff ff <49> 8b 37 48 c7 c7 d2 61 01 82 e8 55 42 70 ff e9 00 ff ff ff 48 c7 RSP: 0000:ffff88003e253840 EFLAGS: 00010286 RAX: 000000000000005c RBX: ffff88003d857b80 RCX: ffffffff82245d38 RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffffffff8288e86c RBP: 00000000fed00000 R08: 00000000000000ae R09: 0000000000000007 R10: 0000000000000000 R11: ffffffff828908ad R12: 0000000000001000 R13: ffffc9000019a000 R14: 00000000fed00000 R15: ffffea0003fb4000 FS: 0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:000000000000000= 0 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffea0003fb4000 CR3: 000000000220a000 CR4: 00000000000006f0 Call Trace: acpi_ex_system_memory_space_handler+0xca/0x19f ??:? ============================ For memory error detection purposes I'm trying to map the addresses from the vmalloc range to valid struct pages, or at least make sure there's no struct page for a given address. Looking up the vmap_area_root rbtree isn't an option, as this must be done from instrumented code, including interrupt handlers. I've been trying to employ vmalloc_to_page(), but looks like it doesn't work for ioremapped addresses. Is this at all possible? Patch showing the problem follows. I'm building using GCC 7.1.1 on a defconfig for x86_64. --- a/drivers/acpi/osl.c +++ b/drivers/acpi/osl.c @@ -279,14 +279,23 @@ acpi_map_lookup_virt(void __iomem *virt, acpi_size si= ze) static void __iomem *acpi_map(acpi_physical_address pg_off, unsigned long pg_sz) { unsigned long pfn; + void __iomem *ret; + struct page *page; pfn =3D pg_off >> PAGE_SHIFT; if (should_use_kmap(pfn)) { if (pg_sz > PAGE_SIZE) return NULL; return (void __iomem __force *)kmap(pfn_to_page(pfn)); - } else - return acpi_os_ioremap(pg_off, pg_sz); + } else { + ret =3D acpi_os_ioremap(pg_off, pg_sz); + BUG_ON(!is_vmalloc_addr(ret)); + page =3D vmalloc_to_page(ret); + pr_err("phys: %px, vmalloc: %px, page: %px [%px]\n", pg_off, ret, page_address(page), page); + if (page) + pr_err("flags: %d\n", page->flags); + return ret; + }