diff mbox

[3/6] struct page: add field for vm_struct

Message ID 20180211031920.3424-4-igor.stoppa@huawei.com (mailing list archive)
State New, archived
Headers show

Commit Message

Igor Stoppa Feb. 11, 2018, 3:19 a.m. UTC
When a page is used for virtual memory, it is often necessary to obtian
a handler to the corresponding vm_struct, which refers to the virtually
continuous area generated when invoking vmalloc.

The struct page has a "mapping" field, which can be re-used, to store a
pointer to the parent area. This will avoid more expensive searches.

As example, the function find_vm_area is reimplemented, to take advantage
of the newly introduced field.

Signed-off-by: Igor Stoppa <igor.stoppa@huawei.com>
---
 include/linux/mm_types.h |  1 +
 mm/vmalloc.c             | 18 +++++++++++++-----
 2 files changed, 14 insertions(+), 5 deletions(-)

Comments

Matthew Wilcox (Oracle) Feb. 11, 2018, 9:16 p.m. UTC | #1
On Sun, Feb 11, 2018 at 05:19:17AM +0200, Igor Stoppa wrote:
> The struct page has a "mapping" field, which can be re-used, to store a
> pointer to the parent area. This will avoid more expensive searches.
> 
> As example, the function find_vm_area is reimplemented, to take advantage
> of the newly introduced field.

Umm.  Is it more efficient?  You're replacing an rb-tree search with a
page-table walk.  You eliminate a spinlock, which is great, but is the
page-table walk more efficient?  I suppose it'll depend on the depth of
the rb-tree, and (at least on x86), the page tables should already be
in cache.

Unrelated to this patch, I'm working on a patch to give us page_type,
and I think I'll allocate a bit to mark pages which are vmalloced.
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Igor Stoppa Feb. 12, 2018, 4:24 p.m. UTC | #2
On 11/02/18 23:16, Matthew Wilcox wrote:
> On Sun, Feb 11, 2018 at 05:19:17AM +0200, Igor Stoppa wrote:
>> The struct page has a "mapping" field, which can be re-used, to store a
>> pointer to the parent area. This will avoid more expensive searches.
>>
>> As example, the function find_vm_area is reimplemented, to take advantage
>> of the newly introduced field.
> 
> Umm.  Is it more efficient?  You're replacing an rb-tree search with a
> page-table walk.  You eliminate a spinlock, which is great, but is the
> page-table walk more efficient?  I suppose it'll depend on the depth of
> the rb-tree, and (at least on x86), the page tables should already be
> in cache.

I thought the tradeoff favorable. How to verify it?

> Unrelated to this patch, I'm working on a patch to give us page_type,
> and I think I'll allocate a bit to mark pages which are vmalloced.

pmalloced too?

--
igor
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Igor Stoppa Feb. 20, 2018, 7:53 p.m. UTC | #3
On 12/02/18 18:24, Igor Stoppa wrote:
> 
> 
> On 11/02/18 23:16, Matthew Wilcox wrote:
>> On Sun, Feb 11, 2018 at 05:19:17AM +0200, Igor Stoppa wrote:
>>> The struct page has a "mapping" field, which can be re-used, to store a
>>> pointer to the parent area. This will avoid more expensive searches.
>>>
>>> As example, the function find_vm_area is reimplemented, to take advantage
>>> of the newly introduced field.
>>
>> Umm.  Is it more efficient?  You're replacing an rb-tree search with a
>> page-table walk.  You eliminate a spinlock, which is great, but is the
>> page-table walk more efficient?  I suppose it'll depend on the depth of
>> the rb-tree, and (at least on x86), the page tables should already be
>> in cache.
> 
> I thought the tradeoff favorable.

It turns out that it's probably not so favorable.
The patch relies on the function vmalloc_to_page ... which will return
NULL when applied to huge mappings, while the original implementation
will still work.

It was found while testing on a configuration with framebuffer.

So it seems unlikely that there is any gain to be had, from this
perspective.

The use of the field still makes sense from the perspective of adding
pmalloc support to hardened usercopy, but there is no more need for the
field to exist as separate patch.

This patch can be simplified and merged with the pmalloc patch.

--
igor
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Matthew Wilcox (Oracle) Feb. 20, 2018, 8:54 p.m. UTC | #4
On Tue, Feb 20, 2018 at 09:53:30PM +0200, Igor Stoppa wrote:
> The patch relies on the function vmalloc_to_page ... which will return
> NULL when applied to huge mappings, while the original implementation
> will still work.

Huh?  vmalloc_to_page() should work for huge mappings...

> It was found while testing on a configuration with framebuffer.

... ah.  You tried to use vmalloc_to_page() on something which wasn't
backed by a struct page.  That's *supposed* to return NULL, but my
guess is that after this patch it returned garbage.

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Igor Stoppa Feb. 21, 2018, 12:01 p.m. UTC | #5
On 20/02/18 22:54, Matthew Wilcox wrote:
> On Tue, Feb 20, 2018 at 09:53:30PM +0200, Igor Stoppa wrote:

[...]

>> It was found while testing on a configuration with framebuffer.
> 
> ... ah.  You tried to use vmalloc_to_page() on something which wasn't
> backed by a struct page.  That's *supposed* to return NULL, but my
> guess is that after this patch it returned garbage.

it seems to return garbage also without this patch, but I need to clean
up the code, try it again and possibly come up with a demo patch for
triggering the problem.

I'll investigate it more. However it doesn't seem to be related to the
functionality I need. So I plan to treat it as separate issue and leave
find_vm_area untouched, at least in pmalloc scope.

--
igor



--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Igor Stoppa Feb. 22, 2018, 2:20 p.m. UTC | #6
On 21/02/18 14:01, Igor Stoppa wrote:

> it seems to return garbage also without this patch, but I need to clean
> up the code, try it again and possibly come up with a demo patch for
> triggering the problem.
> 
> I'll investigate it more. However it doesn't seem to be related to the
> functionality I need. So I plan to treat it as separate issue and leave
> find_vm_area untouched, at least in pmalloc scope.


Follow-up:

https://lkml.org/lkml/2018/2/22/427

--
igor
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index fd1af6b9591d..c3a4825e10c0 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -84,6 +84,7 @@  struct page {
 		void *s_mem;			/* slab first object */
 		atomic_t compound_mapcount;	/* first tail page */
 		/* page_deferred_list().next	 -- second tail page */
+		struct vm_struct *area;
 	};
 
 	/* Second double word */
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 673942094328..9404ffd0ee98 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1466,13 +1466,16 @@  struct vm_struct *get_vm_area_caller(unsigned long size, unsigned long flags,
  */
 struct vm_struct *find_vm_area(const void *addr)
 {
-	struct vmap_area *va;
+	struct page *page;
 
-	va = find_vmap_area((unsigned long)addr);
-	if (va && va->flags & VM_VM_AREA)
-		return va->vm;
+	if (unlikely(!is_vmalloc_addr(addr)))
+		return NULL;
 
-	return NULL;
+	page = vmalloc_to_page(addr);
+	if (unlikely(!page))
+		return NULL;
+
+	return page->area;
 }
 
 /**
@@ -1536,6 +1539,7 @@  static void __vunmap(const void *addr, int deallocate_pages)
 			struct page *page = area->pages[i];
 
 			BUG_ON(!page);
+			page->area = NULL;
 			__free_pages(page, 0);
 		}
 
@@ -1744,6 +1748,7 @@  void *__vmalloc_node_range(unsigned long size, unsigned long align,
 			const void *caller)
 {
 	struct vm_struct *area;
+	unsigned int i;
 	void *addr;
 	unsigned long real_size = size;
 
@@ -1769,6 +1774,9 @@  void *__vmalloc_node_range(unsigned long size, unsigned long align,
 
 	kmemleak_vmalloc(area, size, gfp_mask);
 
+	for (i = 0; i < area->nr_pages; i++)
+		area->pages[i]->area = area;
+
 	return addr;
 
 fail: