Message ID | 20240305030516.41519-2-alexei.starovoitov@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 3e49a866c9dcbd8173e4f3e491293619a9e81fa4 |
Series | mm: Enforce ioremap address space and introduce sparse vm_area |
On 05.03.2024 04:05, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
>
> There are various users of get_vm_area() + ioremap_page_range() APIs.
> Enforce that get_vm_area() was requested as VM_IOREMAP type and range
> passed to ioremap_page_range() matches created vm_area to avoid
> accidentally ioremap-ing into wrong address range.
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---

This patch landed in today's linux-next as commit 3e49a866c9dc ("mm:
Enforce VM_IOREMAP flag and range in ioremap_page_range.").
Unfortunately it triggers the following warning on all my test machines
with PCI bridges. Here is an example reproduced with QEMU and ARM64
'virt' machine:

pci-host-generic 4010000000.pcie: host bridge /pcie@10000000 ranges:
pci-host-generic 4010000000.pcie:       IO 0x003eff0000..0x003effffff -> 0x0000000000
pci-host-generic 4010000000.pcie:      MEM 0x0010000000..0x003efeffff -> 0x0010000000
pci-host-generic 4010000000.pcie:      MEM 0x8000000000..0xffffffffff -> 0x8000000000
------------[ cut here ]------------
vm_area at addr fffffbfffe800000 is not marked as VM_IOREMAP
WARNING: CPU: 0 PID: 1 at mm/vmalloc.c:315 ioremap_page_range+0x8c/0x174
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.8.0-rc6+ #14694
Hardware name: linux,dummy-virt (DT)
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : ioremap_page_range+0x8c/0x174
lr : ioremap_page_range+0x8c/0x174
sp : ffff800083faba10
...
Call trace:
 ioremap_page_range+0x8c/0x174
 pci_remap_iospace+0x74/0x88
 devm_pci_remap_iospace+0x54/0xac
 devm_of_pci_bridge_init+0x160/0x1fc
 devm_pci_alloc_host_bridge+0xb4/0xd0
 pci_host_common_probe+0x44/0x1a0
 platform_probe+0x68/0xd8
 really_probe+0x148/0x2b4
 __driver_probe_device+0x78/0x12c
 driver_probe_device+0xdc/0x164
 __driver_attach+0x9c/0x1ac
 bus_for_each_dev+0x74/0xd4
 driver_attach+0x24/0x30
 bus_add_driver+0xe4/0x1e8
 driver_register+0x60/0x128
 __platform_driver_register+0x28/0x34
 gen_pci_driver_init+0x1c/0x28
 do_one_initcall+0x74/0x2f4
 kernel_init_freeable+0x28c/0x4dc
 kernel_init+0x24/0x1dc
 ret_from_fork+0x10/0x20
irq event stamp: 74360
hardirqs last  enabled at (74359): [<ffff80008012cb9c>] console_unlock+0x120/0x12c
hardirqs last disabled at (74360): [<ffff80008122daa0>] el1_dbg+0x24/0x8c
softirqs last  enabled at (71258): [<ffff800080010a60>] __do_softirq+0x4a0/0x4e8
softirqs last disabled at (71245): [<ffff8000800169b0>] ____do_softirq+0x10/0x1c
---[ end trace 0000000000000000 ]---
pci-host-generic 4010000000.pcie: error -22: failed to map resource [io 0x0000-0xffff]
pci-host-generic 4010000000.pcie: Memory resource size exceeds max for 32 bits
pci-host-generic 4010000000.pcie: ECAM at [mem 0x4010000000-0x401fffffff] for [bus 00-ff]
pci-host-generic 4010000000.pcie: PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [bus 00-ff]
pci_bus 0000:00: root bus resource [mem 0x10000000-0x3efeffff]
pci_bus 0000:00: root bus resource [mem 0x8000000000-0xffffffffff]
pci 0000:00:00.0: [1b36:0008] type 00 class 0x060000 conventional PCI endpoint

It looks that PCI related code must be somehow adjusted for this change.
>   mm/vmalloc.c | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index d12a17fc0c17..f42f98a127d5 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -307,8 +307,21 @@ static int vmap_range_noflush(unsigned long addr, unsigned long end,
>   int ioremap_page_range(unsigned long addr, unsigned long end,
>   		phys_addr_t phys_addr, pgprot_t prot)
>   {
> +	struct vm_struct *area;
>   	int err;
>
> +	area = find_vm_area((void *)addr);
> +	if (!area || !(area->flags & VM_IOREMAP)) {
> +		WARN_ONCE(1, "vm_area at addr %lx is not marked as VM_IOREMAP\n", addr);
> +		return -EINVAL;
> +	}
> +	if (addr != (unsigned long)area->addr ||
> +	    (void *)end != area->addr + get_vm_area_size(area)) {
> +		WARN_ONCE(1, "ioremap request [%lx,%lx) doesn't match vm_area [%lx, %lx)\n",
> +			  addr, end, (long)area->addr,
> +			  (long)area->addr + get_vm_area_size(area));
> +		return -ERANGE;
> +	}
>   	err = vmap_range_noflush(addr, end, phys_addr, pgprot_nx(prot),
>   				 ioremap_max_page_shift);
>   	flush_cache_vmap(addr, end);

Best regards
On Fri, Mar 8, 2024 at 9:14 AM Marek Szyprowski
<m.szyprowski@samsung.com> wrote:
>
> On 05.03.2024 04:05, Alexei Starovoitov wrote:
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > There are various users of get_vm_area() + ioremap_page_range() APIs.
> > Enforce that get_vm_area() was requested as VM_IOREMAP type and range
> > passed to ioremap_page_range() matches created vm_area to avoid
> > accidentally ioremap-ing into wrong address range.
> >
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > ---
>
> This patch landed in today's linux-next as commit 3e49a866c9dc ("mm:
> Enforce VM_IOREMAP flag and range in ioremap_page_range.").
> Unfortunately it triggers the following warning on all my test machines
> with PCI bridges. Here is an example reproduced with QEMU and ARM64
> 'virt' machine:

Sorry about the breakage.
Here is the thread where we're discussing the fix:
https://lore.kernel.org/bpf/CAADnVQLP=dxBb+RiMGXoaCEuRrbK387J6B+pfzWKF_F=aRgCPQ@mail.gmail.com/
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index d12a17fc0c17..f42f98a127d5 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -307,8 +307,21 @@ static int vmap_range_noflush(unsigned long addr, unsigned long end,
 int ioremap_page_range(unsigned long addr, unsigned long end,
 		phys_addr_t phys_addr, pgprot_t prot)
 {
+	struct vm_struct *area;
 	int err;
 
+	area = find_vm_area((void *)addr);
+	if (!area || !(area->flags & VM_IOREMAP)) {
+		WARN_ONCE(1, "vm_area at addr %lx is not marked as VM_IOREMAP\n", addr);
+		return -EINVAL;
+	}
+	if (addr != (unsigned long)area->addr ||
+	    (void *)end != area->addr + get_vm_area_size(area)) {
+		WARN_ONCE(1, "ioremap request [%lx,%lx) doesn't match vm_area [%lx, %lx)\n",
+			  addr, end, (long)area->addr,
+			  (long)area->addr + get_vm_area_size(area));
+		return -ERANGE;
+	}
 	err = vmap_range_noflush(addr, end, phys_addr, pgprot_nx(prot),
 				 ioremap_max_page_shift);
 	flush_cache_vmap(addr, end);
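
For readers who want to reason about the two new checks outside the kernel, here is a minimal userspace sketch of the same decision logic. It is not kernel code: `struct model_vm_area`, `MODEL_VM_IOREMAP` and `model_ioremap_check()` are hypothetical names invented for illustration, and passing a NULL area stands in for find_vm_area() returning nothing for the address. Only the ordering of the checks and the two error codes are taken from the patch above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit standing in for the kernel's VM_IOREMAP. */
#define MODEL_VM_IOREMAP 0x1u

/* Hypothetical stand-in for the kernel's struct vm_struct. */
struct model_vm_area {
	uintptr_t addr;     /* start of the reserved virtual range */
	size_t size;        /* usable size, as get_vm_area_size() would report */
	unsigned int flags; /* area type flags */
};

/*
 * Mirrors the order of the checks added to ioremap_page_range():
 *  1. an area must exist at addr and be marked as an ioremap area,
 *  2. the requested range [addr, end) must cover the area exactly.
 * area == NULL models "no vm_area found at addr".
 */
static int model_ioremap_check(const struct model_vm_area *area,
			       uintptr_t addr, uintptr_t end)
{
	if (!area || !(area->flags & MODEL_VM_IOREMAP))
		return -EINVAL;
	if (addr != area->addr || end != area->addr + area->size)
		return -ERANGE;
	return 0;
}
```

In this model, a caller that maps into an address range that was never reserved as an ioremap area gets -EINVAL, which corresponds to the "error -22" line in the log reported earlier in the thread. A partial mapping of a correctly flagged area gets -ERANGE instead.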