Message ID | 20180130151446.24698-4-igor.stoppa@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, 30 Jan 2018, Igor Stoppa wrote: > @@ -1769,6 +1774,9 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, > > kmemleak_vmalloc(area, size, gfp_mask); > > + for (page_counter = 0; page_counter < area->nr_pages; page_counter++) > + area->pages[page_counter]->area = area; > + > return addr; Well this introduces significant overhead for large sized allocation. Does this not matter because the areas are small? Would it not be better to use compound page allocations here? page_head(whatever) gets you the head page where you can store all sorts of information about the chunk of memory. -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 01/02/18 02:00, Christopher Lameter wrote: > On Tue, 30 Jan 2018, Igor Stoppa wrote: > >> @@ -1769,6 +1774,9 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, >> >> kmemleak_vmalloc(area, size, gfp_mask); >> >> + for (page_counter = 0; page_counter < area->nr_pages; page_counter++) >> + area->pages[page_counter]->area = area; >> + >> return addr; > > Well this introduces significant overhead for large sized allocation. Does > this not matter because the areas are small? Relatively significant? I do not object to your comment, but in practice i see that: - vmalloc is used relatively little - allocations do not seem to be huge - there seem to be way larger overheads in the handling of virtual pages (see my proposal for the LFS/m summit, about collapsing struct vm_struct and struct vmap_area) > Would it not be better to use compound page allocations here? > page_head(whatever) gets you the head page where you can store all sorts > of information about the chunk of memory. Can you please point me to this function/macro? I don't seem to be able to find it, at least not in 4.15 During hardened user copy permission check, I need to confirm if the memory range that would be exposed to userspace is a legitimate sub-range of a pmalloc allocation. So, I start with the pair (address, size) and I must end up to something I can compare it against. The idea here is to pass through struct_page and then the related vm_struct/vmap_area, which already has the information about the specific chunk of virtual memory. I cannot comment on your proposal because I do not know where to find the reference you made, or maybe I do not understand what you mean :-( -- igor -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Feb 1, 2018 at 11:42 PM, Igor Stoppa <igor.stoppa@huawei.com> wrote: > On 01/02/18 02:00, Christopher Lameter wrote: >> Would it not be better to use compound page allocations here? >> page_head(whatever) gets you the head page where you can store all sorts >> of information about the chunk of memory. > > Can you please point me to this function/macro? I don't seem to be able > to find it, at least not in 4.15 IIUC, he means PageHead(), which is also hard to grep for, since it is a constructed name, via Page##uname in include/linux/page-flags.h: __PAGEFLAG(Head, head, PF_ANY) CLEARPAGEFLAG(Head, head, PF_ANY) -Kees
On 01/02/18 23:11, Kees Cook wrote: > IIUC, he means PageHead(), which is also hard to grep for, since it is > a constructed name, via Page##uname in include/linux/page-flags.h: > > __PAGEFLAG(Head, head, PF_ANY) CLEARPAGEFLAG(Head, head, PF_ANY) Thank you, I'll try to provide a meaningful reply soon, but I'll be AFK during most of next 2 weeks, so it might be delayed :-( -- igor -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 1 Feb 2018, Igor Stoppa wrote: > > Would it not be better to use compound page allocations here? > > page_head(whatever) gets you the head page where you can store all sorts > > of information about the chunk of memory. > > Can you please point me to this function/macro? I don't seem to be able > to find it, at least not in 4.15 Ok its compound_head(). See also the use in the SLAB and SLUB allocator. > During hardened user copy permission check, I need to confirm if the > memory range that would be exposed to userspace is a legitimate > sub-range of a pmalloc allocation. If you save the size in the head page struct then you could do that pretty fast. > I cannot comment on your proposal because I do not know where to find > the reference you made, or maybe I do not understand what you mean :-( compund pages are higher order pages that are handled as a single page by the VM. See https://lwn.net/Articles/619514/ -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 02/02/18 20:43, Christopher Lameter wrote: > On Thu, 1 Feb 2018, Igor Stoppa wrote: > >>> Would it not be better to use compound page allocations here? [...] > Ok its compound_head(). See also the use in the SLAB and SLUB allocator. > >> During hardened user copy permission check, I need to confirm if the >> memory range that would be exposed to userspace is a legitimate >> sub-range of a pmalloc allocation. > > If you save the size in the head page struct then you could do that pretty > fast. Ok, now I get what you mean. But it doesn't seem to fit the intended use case, for other reasons (maybe the same, from 2 different POV): - compound pages are aggregates of regular pages, in numbers that are powers of 2, while the amount of pages to allocate is not known upfront. One *could* give a hint to pmalloc about how many pages to allocate every time there is a need to grow the pool. Iow it would be the size of a chunk. But I'm afraid the granularity would still be pretty low, so maybe it would be 2-4 times less. - the property of the compound page will affect the property of all the pages in the compound, so when one is write protected, it can generate a lot of wasted memory, if there is too much slack (because of the order) With vmalloc, I can allocate any number of pages, minimizing the waste. Finally, there was a discussion about optimization: http://www.openwall.com/lists/kernel-hardening/2017/08/07/2 The patch I sent does indeed take advantage of the new information, not just for pmalloc use. I have not measured if/where/what there is gain, but it does look like the extra info can be exploited also elsewhere. -- igor -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, 3 Feb 2018, Igor Stoppa wrote: > - the property of the compound page will affect the property of all the > pages in the compound, so when one is write protected, it can generate a > lot of wasted memory, if there is too much slack (because of the order) > With vmalloc, I can allocate any number of pages, minimizing the waste. I thought the intend here is to create a pool where the whole pool becomes RO? -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jan 30, 2018 at 05:14:43PM +0200, Igor Stoppa wrote: > @@ -1744,6 +1748,7 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, > const void *caller) > { > struct vm_struct *area; > + unsigned int page_counter; > void *addr; > unsigned long real_size = size; > > @@ -1769,6 +1774,9 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, > > kmemleak_vmalloc(area, size, gfp_mask); > > + for (page_counter = 0; page_counter < area->nr_pages; page_counter++) > + area->pages[page_counter]->area = area; > + > return addr; > LOCAL variable names should be short, and to the point. If you have some random integer loop counter, it should probably be called ``i``. Calling it ``loop_counter`` is non-productive, if there is no chance of it being mis-understood. Similarly, ``tmp`` can be just about any type of variable that is used to hold a temporary value. (Documentation/process/coding-style.rst) -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 05/02/18 17:33, Christopher Lameter wrote: > On Sat, 3 Feb 2018, Igor Stoppa wrote: > >> - the property of the compound page will affect the property of all the >> pages in the compound, so when one is write protected, it can generate a >> lot of wasted memory, if there is too much slack (because of the order) >> With vmalloc, I can allocate any number of pages, minimizing the waste. > > I thought the intend here is to create a pool where the whole pool becomes > RO? Yes, but why would I force the number of pages in the pool to be a power of 2, when it can be any number? If a need, say, 17 pages, I would have to allocate 32. But it can be worse than that. Since the size of the overall allocated memory is not known upfront, I wold have a problem to decide how many pages to allocate, every time there is need to grow the pool. Or push the problem to the user of the API, who might be equally unaware. Notice that there is already a function (prealloc) available to the user of the API, if the size is known upfront. So I do not really see how using compound pages would make memory utilization better or even not worse. -- igor -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 06/02/18 14:37, Matthew Wilcox wrote: [...] > LOCAL variable names should be short, and to the point. [...] > (Documentation/process/coding-style.rst) ok, will do, thanks for the pointer! -- igor -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index cfd0ac4..2abd540 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -56,6 +56,7 @@ struct page { void *s_mem; /* slab first object */ atomic_t compound_mapcount; /* first tail page */ /* page_deferred_list().next -- second tail page */ + struct vm_struct *area; }; /* Second double word */ diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 6739420..44c5dfc 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -1466,13 +1466,16 @@ struct vm_struct *get_vm_area_caller(unsigned long size, unsigned long flags, */ struct vm_struct *find_vm_area(const void *addr) { - struct vmap_area *va; + struct page *page; - va = find_vmap_area((unsigned long)addr); - if (va && va->flags & VM_VM_AREA) - return va->vm; + if (unlikely(!is_vmalloc_addr(addr))) + return NULL; - return NULL; + page = vmalloc_to_page(addr); + if (unlikely(!page)) + return NULL; + + return page->area; } /** @@ -1536,6 +1539,7 @@ static void __vunmap(const void *addr, int deallocate_pages) struct page *page = area->pages[i]; BUG_ON(!page); + page->area = NULL; __free_pages(page, 0); } @@ -1744,6 +1748,7 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, const void *caller) { struct vm_struct *area; + unsigned int page_counter; void *addr; unsigned long real_size = size; @@ -1769,6 +1774,9 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, kmemleak_vmalloc(area, size, gfp_mask); + for (page_counter = 0; page_counter < area->nr_pages; page_counter++) + area->pages[page_counter]->area = area; + return addr; fail:
When a page is used for virtual memory, it is often necessary to obtian a handler to the corresponding vm_struct, which refers to the virtually continuous area generated when invoking vmalloc. The struct page has a "mapping" field, which can be re-used, to store a pointer to the parent area. This will avoid more expensive searches. As example, the function find_vm_area is reimplemented, to take advantage of the newly introduced field. Signed-off-by: Igor Stoppa <igor.stoppa@huawei.com> --- include/linux/mm_types.h | 1 + mm/vmalloc.c | 18 +++++++++++++----- 2 files changed, 14 insertions(+), 5 deletions(-)