Message ID | 20170331102424.11869-4-punit.agrawal@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Mar 31, 2017 at 11:24:24AM +0100, Punit Agrawal wrote: > populate_physmap() calls alloc_heap_pages() per requested extent. As > alloc_heap_pages() performs icache maintenance operations affecting the > entire instruction cache, this leads to redundant cache flushes when > allocating multiple extents in populate_physmap(). > > To alleviate this problem, introduce a new flag "MEMF_no_icache_flush" > which can be used prevent alloc_heap_pages() to perform unnecessary > icache maintenance operations. Use the flag in populate_physmap() and > perform the required icache maintenance function at the end of the > operation. > > Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> > --- > xen/common/memory.c | 6 ++++++ > xen/common/page_alloc.c | 2 +- > xen/include/asm-x86/page.h | 4 ++++ > xen/include/xen/mm.h | 2 ++ > 4 files changed, 13 insertions(+), 1 deletion(-) > > diff --git a/xen/common/memory.c b/xen/common/memory.c > index ad0b33ceb6..507f363924 100644 > --- a/xen/common/memory.c > +++ b/xen/common/memory.c > @@ -162,6 +162,8 @@ static void populate_physmap(struct memop_args *a) > if ( unlikely(!d->creation_finished) ) > a->memflags |= MEMF_no_tlbflush; > > + a->memflags |= MEMF_no_icache_flush; > + Not a good idea. I think what you need is probably to group this under !d->creation_finished. > for ( i = a->nr_done; i < a->nr_extents; i++ ) > { > if ( i != a->nr_done && hypercall_preempt_check() ) > @@ -253,6 +255,10 @@ static void populate_physmap(struct memop_args *a) > out: > if ( need_tlbflush ) > filtered_flush_tlb_mask(tlbflush_timestamp); > + > + if ( a->memflags & MEMF_no_icache_flush ) This should be inverted, right? The modification to this function is definitely wrong as-is. You end up always setting the no flush flag and later always flushing the cache. Wei.
Hi Wei, Thanks for taking a look at this RFC. Responses/questions below... Wei Liu <wei.liu2@citrix.com> writes: > On Fri, Mar 31, 2017 at 11:24:24AM +0100, Punit Agrawal wrote: >> populate_physmap() calls alloc_heap_pages() per requested extent. As >> alloc_heap_pages() performs icache maintenance operations affecting the >> entire instruction cache, this leads to redundant cache flushes when >> allocating multiple extents in populate_physmap(). >> >> To alleviate this problem, introduce a new flag "MEMF_no_icache_flush" >> which can be used prevent alloc_heap_pages() to perform unnecessary >> icache maintenance operations. Use the flag in populate_physmap() and >> perform the required icache maintenance function at the end of the >> operation. >> >> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> >> --- >> xen/common/memory.c | 6 ++++++ >> xen/common/page_alloc.c | 2 +- >> xen/include/asm-x86/page.h | 4 ++++ >> xen/include/xen/mm.h | 2 ++ >> 4 files changed, 13 insertions(+), 1 deletion(-) >> >> diff --git a/xen/common/memory.c b/xen/common/memory.c >> index ad0b33ceb6..507f363924 100644 >> --- a/xen/common/memory.c >> +++ b/xen/common/memory.c >> @@ -162,6 +162,8 @@ static void populate_physmap(struct memop_args *a) >> if ( unlikely(!d->creation_finished) ) >> a->memflags |= MEMF_no_tlbflush; >> >> + a->memflags |= MEMF_no_icache_flush; >> + > > Not a good idea. > > I think what you need is probably to group this under > !d->creation_finished. Can you explain why you think the above is not a good idea? Without additional synchronisation there is a race between d->creation_finished being set and used (Consider the case where populate_physmap() being called on a different cpu to where d->creation_finished is set to true). > >> for ( i = a->nr_done; i < a->nr_extents; i++ ) >> { >> if ( i != a->nr_done && hypercall_preempt_check() ) >> @@ -253,6 +255,10 @@ static void populate_physmap(struct memop_args *a) >> out: >> if ( need_tlbflush ) >> filtered_flush_tlb_mask(tlbflush_timestamp); >> + >> + if ( a->memflags & MEMF_no_icache_flush ) > > This should be inverted, right? The MEMF_no_icache_flush flag is used to indicate to alloc_heap_pages() (called via alloc_domheap_pages() here) that the caller is going to take care of synchronising the icache with the dcache. As such, when the flag is set, we want to perform icache maintenance operations here. > > The modification to this function is definitely wrong as-is. You end up > always setting the no flush flag and later always flushing the cache. Correct! invalidate_icache() flushes the entire instruction cache which ends up being called each time flush_page_to_ram() is invoked from alloc_heap_pages(). The patch prevents repeated calls to invalidate_icache() and moves the responsibility to populate_physmap() which was the intention. It would help if you can explain why you think the changes are wrong. Thanks, Punit > > > Wei.
On Fri, Mar 31, 2017 at 02:53:55PM +0100, Punit Agrawal wrote: [...] > > Correct! > > invalidate_icache() flushes the entire instruction cache which ends up > being called each time flush_page_to_ram() is invoked from > alloc_heap_pages(). The patch prevents repeated calls to > invalidate_icache() and moves the responsibility to populate_physmap() > which was the intention. > > It would help if you can explain why you think the changes are > wrong. > OK. So it turns out I misunderstood what you wanted to do here. You can do this in a less confusing way. You don't need to fiddle with a->memflags. Just add '| MEMF_no_icache_flush' to alloc_domheap_page. And then you always call invalidate_icache at the end. This would avoid people asking question whether the check should be inverted... Wei. > Thanks, > Punit > > > > > > > Wei.
Wei Liu <wei.liu2@citrix.com> writes: > On Fri, Mar 31, 2017 at 02:53:55PM +0100, Punit Agrawal wrote: > [...] >> >> Correct! >> >> invalidate_icache() flushes the entire instruction cache which ends up >> being called each time flush_page_to_ram() is invoked from >> alloc_heap_pages(). The patch prevents repeated calls to >> invalidate_icache() and moves the responsibility to populate_physmap() >> which was the intention. >> >> It would help if you can explain why you think the changes are >> wrong. >> > > OK. So it turns out I misunderstood what you wanted to do here. > > You can do this in a less confusing way. You don't need to fiddle with > a->memflags. Just add '| MEMF_no_icache_flush' to alloc_domheap_page. > And then you always call invalidate_icache at the end. Sure, I'll update it in the next version. Thanks for your comments. Punit > > This would avoid people asking question whether the check should be > inverted... > > Wei. > >> Thanks, >> Punit >> >> > >> > >> > Wei.
On Fri, 31 Mar 2017, Wei Liu wrote: > On Fri, Mar 31, 2017 at 02:53:55PM +0100, Punit Agrawal wrote: > [...] > > > > Correct! > > > > invalidate_icache() flushes the entire instruction cache which ends up > > being called each time flush_page_to_ram() is invoked from > > alloc_heap_pages(). The patch prevents repeated calls to > > invalidate_icache() and moves the responsibility to populate_physmap() > > which was the intention. > > > > It would help if you can explain why you think the changes are > > wrong. > > > > OK. So it turns out I misunderstood what you wanted to do here. > > You can do this in a less confusing way. You don't need to fiddle with > a->memflags. Just add '| MEMF_no_icache_flush' to alloc_domheap_page. > And then you always call invalidate_icache at the end. > > This would avoid people asking question whether the check should be > inverted... Yes, good suggestion.
Hi Punit, Sorry for the late answer. On 31/03/17 11:24, Punit Agrawal wrote: > populate_physmap() calls alloc_heap_pages() per requested extent. As > alloc_heap_pages() performs icache maintenance operations affecting the > entire instruction cache, this leads to redundant cache flushes when > allocating multiple extents in populate_physmap(). > > To alleviate this problem, introduce a new flag "MEMF_no_icache_flush" > which can be used prevent alloc_heap_pages() to perform unnecessary > icache maintenance operations. Use the flag in populate_physmap() and > perform the required icache maintenance function at the end of the > operation. > > Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> > --- > xen/common/memory.c | 6 ++++++ > xen/common/page_alloc.c | 2 +- > xen/include/asm-x86/page.h | 4 ++++ > xen/include/xen/mm.h | 2 ++ > 4 files changed, 13 insertions(+), 1 deletion(-) > > diff --git a/xen/common/memory.c b/xen/common/memory.c > index ad0b33ceb6..507f363924 100644 > --- a/xen/common/memory.c > +++ b/xen/common/memory.c > @@ -162,6 +162,8 @@ static void populate_physmap(struct memop_args *a) > if ( unlikely(!d->creation_finished) ) > a->memflags |= MEMF_no_tlbflush; > > + a->memflags |= MEMF_no_icache_flush; > + > for ( i = a->nr_done; i < a->nr_extents; i++ ) > { > if ( i != a->nr_done && hypercall_preempt_check() ) > @@ -253,6 +255,10 @@ static void populate_physmap(struct memop_args *a) > out: > if ( need_tlbflush ) > filtered_flush_tlb_mask(tlbflush_timestamp); > + > + if ( a->memflags & MEMF_no_icache_flush ) > + invalidate_icache(); I think it would be good to explain why it is always fine to defer the cache invalidation from a security point of view for future reference. This would help us in the future to know why we made this choice. Cheers,
Hi Julien, Julien Grall <julien.grall@arm.com> writes: > Hi Punit, > > Sorry for the late answer. > > On 31/03/17 11:24, Punit Agrawal wrote: >> populate_physmap() calls alloc_heap_pages() per requested extent. As >> alloc_heap_pages() performs icache maintenance operations affecting the >> entire instruction cache, this leads to redundant cache flushes when >> allocating multiple extents in populate_physmap(). >> >> To alleviate this problem, introduce a new flag "MEMF_no_icache_flush" >> which can be used prevent alloc_heap_pages() to perform unnecessary >> icache maintenance operations. Use the flag in populate_physmap() and >> perform the required icache maintenance function at the end of the >> operation. >> >> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> >> --- >> xen/common/memory.c | 6 ++++++ >> xen/common/page_alloc.c | 2 +- >> xen/include/asm-x86/page.h | 4 ++++ >> xen/include/xen/mm.h | 2 ++ >> 4 files changed, 13 insertions(+), 1 deletion(-) >> >> diff --git a/xen/common/memory.c b/xen/common/memory.c >> index ad0b33ceb6..507f363924 100644 >> --- a/xen/common/memory.c >> +++ b/xen/common/memory.c >> @@ -162,6 +162,8 @@ static void populate_physmap(struct memop_args *a) >> if ( unlikely(!d->creation_finished) ) >> a->memflags |= MEMF_no_tlbflush; >> >> + a->memflags |= MEMF_no_icache_flush; >> + >> for ( i = a->nr_done; i < a->nr_extents; i++ ) >> { >> if ( i != a->nr_done && hypercall_preempt_check() ) >> @@ -253,6 +255,10 @@ static void populate_physmap(struct memop_args *a) >> out: >> if ( need_tlbflush ) >> filtered_flush_tlb_mask(tlbflush_timestamp); >> + >> + if ( a->memflags & MEMF_no_icache_flush ) >> + invalidate_icache(); > > I think it would be good to explain why it is always fine to defer the > cache invalidation from a security point of view for future > reference. This would help us in the future to know why we made this > choice. Thanks for raising this. Discussing this with folks familiar with ARM, it turns out that deferring icache invalidations risks other VCPUs to execute from stale caches. We don't really want to debug issues from this. :) Based on your suggestion (offlist), I'm going to try and restrict delaying icache invalidations only during domain creation. I'll work on getting a new version out in the next day or so. > > Cheers,
diff --git a/xen/common/memory.c b/xen/common/memory.c index ad0b33ceb6..507f363924 100644 --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -162,6 +162,8 @@ static void populate_physmap(struct memop_args *a) if ( unlikely(!d->creation_finished) ) a->memflags |= MEMF_no_tlbflush; + a->memflags |= MEMF_no_icache_flush; + for ( i = a->nr_done; i < a->nr_extents; i++ ) { if ( i != a->nr_done && hypercall_preempt_check() ) @@ -253,6 +255,10 @@ static void populate_physmap(struct memop_args *a) out: if ( need_tlbflush ) filtered_flush_tlb_mask(tlbflush_timestamp); + + if ( a->memflags & MEMF_no_icache_flush ) + invalidate_icache(); + a->nr_done = i; } diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c index 15450a3b6d..1a51bc6b15 100644 --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -838,7 +838,7 @@ static struct page_info *alloc_heap_pages( /* Ensure cache and RAM are consistent for platforms where the * guest can control its own visibility of/through the cache. */ - flush_page_to_ram(page_to_mfn(&pg[i]), true); + flush_page_to_ram(page_to_mfn(&pg[i]), !(memflags & MEMF_no_icache_flush)); } spin_unlock(&heap_lock); diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h index bc5946b9d2..13dc9e2299 100644 --- a/xen/include/asm-x86/page.h +++ b/xen/include/asm-x86/page.h @@ -375,6 +375,10 @@ perms_strictly_increased(uint32_t old_flags, uint32_t new_flags) #define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & PAGE_MASK) +static inline void invalidate_icache(void) +{ +} + #endif /* __X86_PAGE_H__ */ /* diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h index 88de3c1fa6..ee50d4cd7b 100644 --- a/xen/include/xen/mm.h +++ b/xen/include/xen/mm.h @@ -224,6 +224,8 @@ struct npfec { #define MEMF_no_owner (1U<<_MEMF_no_owner) #define _MEMF_no_tlbflush 6 #define MEMF_no_tlbflush (1U<<_MEMF_no_tlbflush) +#define _MEMF_no_icache_flush 7 +#define MEMF_no_icache_flush (1U<<_MEMF_no_icache_flush) #define _MEMF_node 8 #define MEMF_node_mask ((1U << (8 * sizeof(nodeid_t))) - 1) #define MEMF_node(n) ((((n) + 1) & MEMF_node_mask) << _MEMF_node)
populate_physmap() calls alloc_heap_pages() per requested extent. As alloc_heap_pages() performs icache maintenance operations affecting the entire instruction cache, this leads to redundant cache flushes when allocating multiple extents in populate_physmap(). To alleviate this problem, introduce a new flag "MEMF_no_icache_flush" which can be used prevent alloc_heap_pages() to perform unnecessary icache maintenance operations. Use the flag in populate_physmap() and perform the required icache maintenance function at the end of the operation. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> --- xen/common/memory.c | 6 ++++++ xen/common/page_alloc.c | 2 +- xen/include/asm-x86/page.h | 4 ++++ xen/include/xen/mm.h | 2 ++ 4 files changed, 13 insertions(+), 1 deletion(-)