Message ID: 1368336956-6693-1-git-send-email-gmbnomis@gmail.com (mailing list archive)
State: New, archived
On Sun, May 12, 2013 at 06:35:56AM +0100, Simon Baatz wrote: > Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that the pages > it needs to handle are kernel mapped only. However, for example when doing > direct I/O, pages with user space mappings may occur. > > Thus, continue to do lazy flushing if there are no user space mappings. > Otherwise, flush the kernel cache lines directly. > > Signed-off-by: Simon Baatz <gmbnomis@gmail.com> > Cc: Catalin Marinas <catalin.marinas@arm.com> > Cc: Russell King <linux@arm.linux.org.uk> An issue is that for kunmap_atomic() && VIVT we flush the same page twice. I don't think we should remove the cache flushing in __kunmap_atomic() for VIVT since not all kunmap_atomic() calls require flush_kernel_dcache_page() (unless we just assume that highmem && vivt don't work together). Flushing the VIVT cache on kunmap is essential since we just lose the alias after that. > +void flush_kernel_dcache_page(struct page *page) > +{ > + if (cache_is_vivt() || cache_is_vipt_aliasing()) { So you could add the ((cache_is_vivt() || cache_is_vipt_aliasing()) && !PageHighMem(page)) check back (as it was prior to commit f8b63c184, actually reverting it) and the mapping tests as a possible optimisation. But in this case there is no need to split __flush_dcache_page() in two since we don't care about the PageHighMem() case. And we need to fix kunmap() with VIVT to flush the cache as well (in case anyone uses highmem on VIVT hardware).
On Thu, 23 May 2013, Catalin Marinas wrote: > An issue is that for kunmap_atomic() && VIVT we flush the same page > twice. I don't think we should remove the cache flushing in > __kunmap_atomic() for VIVT since not all kunmap_atomic() calls require > flush_kernel_dcache_page() (unless we just assume that highmem && vivt > don't work together). VIVT and highmem do work together. Highmem for ARM was in fact developed on such a platform. Nicolas
Hi Catalin, On Thu, May 23, 2013 at 11:43:03AM +0100, Catalin Marinas wrote: > On Sun, May 12, 2013 at 06:35:56AM +0100, Simon Baatz wrote: > > Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that the pages > > it needs to handle are kernel mapped only. However, for example when doing > > direct I/O, pages with user space mappings may occur. > > > > Thus, continue to do lazy flushing if there are no user space mappings. > > Otherwise, flush the kernel cache lines directly. > > > > Signed-off-by: Simon Baatz <gmbnomis@gmail.com> > > Cc: Catalin Marinas <catalin.marinas@arm.com> > > Cc: Russell King <linux@arm.linux.org.uk> > > An issue is that for kunmap_atomic() && VIVT we flush the same page > twice. I don't think we should remove the cache flushing in > __kunmap_atomic() for VIVT since not all kunmap_atomic() calls require > flush_kernel_dcache_page() (unless we just assume that highmem && vivt > don't work together). Flushing the VIVT cache on kunmap is essential > since we just lose the alias after that. Not sure if I am missing something here. From our previous discussion: > > > > We probably should reuse it in flush_kernel_dcache_page() to flush > > > > the kernel mapping. (The last else clause looks strange though) > > > > > > I think it makes sense to reuse this logic in > > > flush_kernel_dcache_page(). If the page is already mapped with kmap, > > > then kmap_high_get() should return the actual address. If it was mapped > > > with kmap_atomic, kmap_high_get() would return NULL, hence the 'else' > > > clause and the additional kmap_atomic(). The cache_is_vipt() check is > > > useful because kunmap_atomic() would flush VIVT caches anyway. As you say, this logic does not flush on VIVT if the page was mapped via kmap_atomic() (and kmap_atomic() could not reuse an existing mapping via kmap_high_get()) kmap_atomic() tries to reuse an existing mapping and only allocates a new dedicated kmap if it finds none. 
Consequently, __kunmap_atomic() only flushes in the latter case (because we lose the alias). Thus, no double flushing should occur here. If kunmap_atomic() will flush, __flush_kernel_dcache_page() will not and vice versa. > > +void flush_kernel_dcache_page(struct page *page) > > +{ > > + if (cache_is_vivt() || cache_is_vipt_aliasing()) { > > So you could add the ((cache_is_vivt() || cache_is_vipt_aliasing()) && > !PageHighMem(page)) check back (as it was prior to commit f8b63c184, > actually reverting it) and the mapping tests as a possible optimisation. > But in this case there is no need to split __flush_dcache_page() in two > since we don't care about the PageHighMem() case. > > And we need to fix kunmap() with VIVT to flush the cache as well (in > case anyone uses highmem on VIVT hardware). I still think your previous comment that we need to handle the highmem case as well was right and reusing the respective part of __flush_dcache_page() does the right thing. But as said, perhaps I am missing something here. - Simon
On Sat, May 25, 2013 at 04:53:19AM +0100, Nicolas Pitre wrote: > On Thu, 23 May 2013, Catalin Marinas wrote: > > > An issue is that for kunmap_atomic() && VIVT we flush the same page > > twice. I don't think we should remove the cache flushing in > > __kunmap_atomic() for VIVT since not all kunmap_atomic() calls require > > flush_kernel_dcache_page() (unless we just assume that highmem && vivt > > don't work together). > > VIVT and highmem do work together. Highmem for ARM was in fact > developed on such a platform. Thanks for confirming. Do you remember why kunmap() doesn't (need to) flush the cache alias (for VIVT)? Or if it does, where?
On Mon, May 27, 2013 at 10:42:45PM +0100, Simon Baatz wrote: > On Thu, May 23, 2013 at 11:43:03AM +0100, Catalin Marinas wrote: > > On Sun, May 12, 2013 at 06:35:56AM +0100, Simon Baatz wrote: > > > Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that the pages > > > it needs to handle are kernel mapped only. However, for example when doing > > > direct I/O, pages with user space mappings may occur. > > > > > > Thus, continue to do lazy flushing if there are no user space mappings. > > > Otherwise, flush the kernel cache lines directly. > > > > > > Signed-off-by: Simon Baatz <gmbnomis@gmail.com> > > > Cc: Catalin Marinas <catalin.marinas@arm.com> > > > Cc: Russell King <linux@arm.linux.org.uk> > > > > An issue is that for kunmap_atomic() && VIVT we flush the same page > > twice. I don't think we should remove the cache flushing in > > __kunmap_atomic() for VIVT since not all kunmap_atomic() calls require > > flush_kernel_dcache_page() (unless we just assume that highmem && vivt > > don't work together). Flushing the VIVT cache on kunmap is essential > > since we just lose the alias after that. > > Not sure if I am missing something here. From our previous > discussion: > > > > > > We probably should reuse it in flush_kernel_dcache_page() to flush > > > > > the kernel mapping. (The last else clause looks strange though) > > > > > > > > I think it makes sense to reuse this logic in > > > > flush_kernel_dcache_page(). If the page is already mapped with kmap, > > > > then kmap_high_get() should return the actual address. If it was mapped > > > > with kmap_atomic, kmap_high_get() would return NULL, hence the 'else' > > > > clause and the additional kmap_atomic(). The cache_is_vipt() check is > > > > useful because kunmap_atomic() would flush VIVT caches anyway. I think the context was that page_address() is not always valid after kmap_atomic(), so reusing the same logic would make sense. 
But the key point I am trying to understand is why kunmap() does not need to flush the VIVT cache. If it does need to, then we can simplify your patch. > As you say, this logic does not flush on VIVT if the page was mapped > via kmap_atomic() (and kmap_atomic() could not reuse an existing > mapping via kmap_high_get()) > > kmap_atomic() tries to reuse an existing mapping and only allocates a > new dedicated kmap if it finds none. Consequently, __kunmap_atomic() > only flushes in the latter case (because we lose the alias). > > Thus, no double flushing should occur here. If kunmap_atomic() will > flush, __flush_kernel_dcache_page() will not and vice versa. That's correct for the current code. Let's wait for some insight from Nico or Russell on whether kunmap() needs to flush the VIVT cache as well. If it doesn't need to, then my original comment made sense.
On Tue, 28 May 2013, Catalin Marinas wrote: > On Sat, May 25, 2013 at 04:53:19AM +0100, Nicolas Pitre wrote: > > On Thu, 23 May 2013, Catalin Marinas wrote: > > > > > An issue is that for kunmap_atomic() && VIVT we flush the same page > > > twice. I don't think we should remove the cache flushing in > > > __kunmap_atomic() for VIVT since not all kunmap_atomic() calls require > > > flush_kernel_dcache_page() (unless we just assume that highmem && vivt > > > don't work together). > > > > VIVT and highmem do work together. Highmem for ARM was in fact > > developed on such a platform. > > Thanks for confirming. Do you remember why kunmap() doesn't (need to) > flush the cache alias (for VIVT)? Or if it does, where? kunmap() calls kunmap_high() which actually does not unmap anything but only decrement a use count. The page is kept around in case it is kmap()'d again, in which case the kmap() will be almost free and the page still warm. It is only when the kmap() for a new page runs out of free pkmap entries that the unused but still mapped pages are unmapped. This unmapping is done in flush_all_zero_pkmaps() where all the unused pages are all unmapped at once. The cache for all those pages is flushed with flush_cache_kmaps() which on ARM maps to flush_cache_all() when the cache is VIVT. Nicolas
On Tue, 28 May 2013, Catalin Marinas wrote: > On Mon, May 27, 2013 at 10:42:45PM +0100, Simon Baatz wrote: > > On Thu, May 23, 2013 at 11:43:03AM +0100, Catalin Marinas wrote: > > > On Sun, May 12, 2013 at 06:35:56AM +0100, Simon Baatz wrote: > > > > Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that the pages > > > > it needs to handle are kernel mapped only. However, for example when doing > > > > direct I/O, pages with user space mappings may occur. > > > > > > > > Thus, continue to do lazy flushing if there are no user space mappings. > > > > Otherwise, flush the kernel cache lines directly. > > > > > > > > Signed-off-by: Simon Baatz <gmbnomis@gmail.com> > > > > Cc: Catalin Marinas <catalin.marinas@arm.com> > > > > Cc: Russell King <linux@arm.linux.org.uk> > > > > > > An issue is that for kunmap_atomic() && VIVT we flush the same page > > > twice. I don't think we should remove the cache flushing in > > > __kunmap_atomic() for VIVT since not all kunmap_atomic() calls require > > > flush_kernel_dcache_page() (unless we just assume that highmem && vivt > > > don't work together). Flushing the VIVT cache on kunmap is essential > > > since we just lose the alias after that. > > > > Not sure if I am missing something here. From our previous > > discussion: > > > > > > > > We probably should reuse it in flush_kernel_dcache_page() to flush > > > > > > the kernel mapping. (The last else clause looks strange though) > > > > > > > > > > I think it makes sense to reuse this logic in > > > > > flush_kernel_dcache_page(). If the page is already mapped with kmap, > > > > > then kmap_high_get() should return the actual address. If it was mapped > > > > > with kmap_atomic, kmap_high_get() would return NULL, hence the 'else' > > > > > clause and the additional kmap_atomic(). The cache_is_vipt() check is > > > > > useful because kunmap_atomic() would flush VIVT caches anyway. 
> > I think the context was that page_address() is not always valid after > kmap_atomic(), so reusing the same logic would make sense. Following my explanation of kunmap()... remember that kunmap()'d pages are not actually unmapped right away. So the only thing that kmap_high_get() does is to look for such a mapping and pin it with an increased use count if it exists, or return NULL if there is no such mapping. In the former case page_address() is valid, in the latter case it is NULL. Yet, after kunmap_atomic() (or kunmap() as well), even if page_address() returned an address, you never know when it might suddenly return NULL, unless you pin the page down with kmap_high_get(). > But the key point I try to understand is why kunmap() does not need to > flush the VIVT cache. If it does need to, then we can simplify your > patch. It does not and should not, for performance reasons. > > As you say, this logic does not flush on VIVT if the page was mapped > > via kmap_atomic() (and kmap_atomic() could not reuse an existing > > mapping via kmap_high_get()) > > > > kmap_atomic() tries to reuse an existing mapping and only allocates a > > new dedicated kmap if it finds none. Consequently, __kunmap_atomic() > > only flushes in the latter case (because we lose the alias). That is exact. > > > > Thus, no double flushing should occur here. If kunmap_atomic() will > > flush, __flush_kernel_dcache_page() will not and vice versa. > > That's correct for the current code. Let's wait for some insight from > Nico or Russell on whether kunmap() needs to flush the VIVT cache as > well. If it doesn't need to, then my original comment made sense. I hope I've been able to shed some light on the way highmem caching works. I'm not sure I fully understand what the patch in this thread does though, but I've never fully grokked the subtleties behind flush_dcache_page either. Nicolas
On Tue, May 28, 2013 at 06:52:27PM +0100, Nicolas Pitre wrote: > On Tue, 28 May 2013, Catalin Marinas wrote: > > > On Sat, May 25, 2013 at 04:53:19AM +0100, Nicolas Pitre wrote: > > > On Thu, 23 May 2013, Catalin Marinas wrote: > > > > > > > An issue is that for kunmap_atomic() && VIVT we flush the same page > > > > twice. I don't think we should remove the cache flushing in > > > > __kunmap_atomic() for VIVT since not all kunmap_atomic() calls require > > > > flush_kernel_dcache_page() (unless we just assume that highmem && vivt > > > > don't work together). > > > > > > VIVT and highmem do work together. Highmem for ARM was in fact > > > developed on such a platform. > > > > Thanks for confirming. Do you remember why kunmap() doesn't (need to) > > flush the cache alias (for VIVT)? Or if it does, where? > > kunmap() calls kunmap_high() which actually does not unmap anything but > only decrement a use count. The page is kept around in case it is > kmap()'d again, in which case the kmap() will be almost free and the > page still warm. > > It is only when the kmap() for a new page runs out of free pkmap > entries that the unused but still mapped pages are unmapped. This > unmapping is done in flush_all_zero_pkmaps() where all the unused pages > are all unmapped at once. The cache for all those pages is flushed with > flush_cache_kmaps() which on ARM maps to flush_cache_all() when the > cache is VIVT. Thanks for clarification Nico. So if kmap_high_get() returns NULL, there are no high aliases left in the cache.
On Sun, May 12, 2013 at 06:35:56AM +0100, Simon Baatz wrote: > Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that the pages > it needs to handle are kernel mapped only. However, for example when doing > direct I/O, pages with user space mappings may occur. After Nico's clarification, I think the original commit introducing this function was also incomplete (commit 73be1591 - [ARM] 5545/2: add flush_kernel_dcache_page() for ARM) since it ignores highmem pages and their flushing could be deferred for a long time. For my understanding (if I re-read this thread) - basically code like this should not leave the user mapping inconsistent: kmap() ... flush_kernel_dcache_page() kunmap() If we use the atomic variants, we get the cache flushing automatically but kunmap_high() does not flush the cache immediately, so we need to handle it in flush_kernel_dcache_page(). > Thus, continue to do lazy flushing if there are no user space mappings. > Otherwise, flush the kernel cache lines directly. ... > /* > + * Ensure cache coherency for kernel mapping of this page. > + * > + * If the page only exists in the page cache and there are no user > + * space mappings, this is a no-op since the page was already marked > + * dirty at creation. Otherwise, we need to flush the dirty kernel > + * cache lines directly. > + */ > +void flush_kernel_dcache_page(struct page *page) > +{ > + if (cache_is_vivt() || cache_is_vipt_aliasing()) { > + struct address_space *mapping; > + > + mapping = page_mapping(page); > + > + if (!mapping || mapping_mapped(mapping)) > + __flush_kernel_dcache_page(page); > + } > +} BTW, does the mapping check optimise anything for the flush_kernel_dcache_page() uses? Would we have a mapping anyway (or anonymous page) in most cases? Otherwise the patch looks good.
On Wed, 29 May 2013, Catalin Marinas wrote: > On Tue, May 28, 2013 at 06:52:27PM +0100, Nicolas Pitre wrote: > > On Tue, 28 May 2013, Catalin Marinas wrote: > > > > > On Sat, May 25, 2013 at 04:53:19AM +0100, Nicolas Pitre wrote: > > > > On Thu, 23 May 2013, Catalin Marinas wrote: > > > > > > > > > An issue is that for kunmap_atomic() && VIVT we flush the same page > > > > > twice. I don't think we should remove the cache flushing in > > > > > __kunmap_atomic() for VIVT since not all kunmap_atomic() calls require > > > > > flush_kernel_dcache_page() (unless we just assume that highmem && vivt > > > > > don't work together). > > > > > > > > VIVT and highmem do work together. Highmem for ARM was in fact > > > > developed on such a platform. > > > > > > Thanks for confirming. Do you remember why kunmap() doesn't (need to) > > > flush the cache alias (for VIVT)? Or if it does, where? > > > > kunmap() calls kunmap_high() which actually does not unmap anything but > > only decrement a use count. The page is kept around in case it is > > kmap()'d again, in which case the kmap() will be almost free and the > > page still warm. > > > > It is only when the kmap() for a new page runs out of free pkmap > > entries that the unused but still mapped pages are unmapped. This > > unmapping is done in flush_all_zero_pkmaps() where all the unused pages > > are all unmapped at once. The cache for all those pages is flushed with > > flush_cache_kmaps() which on ARM maps to flush_cache_all() when the > > cache is VIVT. > > Thanks for clarification Nico. So if kmap_high_get() returns NULL, there > are no high aliases left in the cache. Exact. Nicolas
On Wed, 29 May 2013, Nicolas Pitre wrote: > On Wed, 29 May 2013, Catalin Marinas wrote: > > > On Tue, May 28, 2013 at 06:52:27PM +0100, Nicolas Pitre wrote: > > > On Tue, 28 May 2013, Catalin Marinas wrote: > > > > > > > On Sat, May 25, 2013 at 04:53:19AM +0100, Nicolas Pitre wrote: > > > > > On Thu, 23 May 2013, Catalin Marinas wrote: > > > > > > > > > > > An issue is that for kunmap_atomic() && VIVT we flush the same page > > > > > > twice. I don't think we should remove the cache flushing in > > > > > > __kunmap_atomic() for VIVT since not all kunmap_atomic() calls require > > > > > > flush_kernel_dcache_page() (unless we just assume that highmem && vivt > > > > > > don't work together). > > > > > > > > > > VIVT and highmem do work together. Highmem for ARM was in fact > > > > > developed on such a platform. > > > > > > > > Thanks for confirming. Do you remember why kunmap() doesn't (need to) > > > > flush the cache alias (for VIVT)? Or if it does, where? > > > > > > kunmap() calls kunmap_high() which actually does not unmap anything but > > > only decrement a use count. The page is kept around in case it is > > > kmap()'d again, in which case the kmap() will be almost free and the > > > page still warm. > > > > > > It is only when the kmap() for a new page runs out of free pkmap > > > entries that the unused but still mapped pages are unmapped. This > > > unmapping is done in flush_all_zero_pkmaps() where all the unused pages > > > are all unmapped at once. The cache for all those pages is flushed with > > > flush_cache_kmaps() which on ARM maps to flush_cache_all() when the > > > cache is VIVT. > > > > Thanks for clarification Nico. So if kmap_high_get() returns NULL, there > > are no high aliases left in the cache. > > Exact. Just to be clear, this is true for VIVT only. Unmapped highmem pages can still be cached with VIPT. Nicolas
On Wed, May 29, 2013 at 05:32:50PM +0100, Nicolas Pitre wrote: > On Wed, 29 May 2013, Nicolas Pitre wrote: > > > On Wed, 29 May 2013, Catalin Marinas wrote: > > > > > On Tue, May 28, 2013 at 06:52:27PM +0100, Nicolas Pitre wrote: > > > > On Tue, 28 May 2013, Catalin Marinas wrote: > > > > > > > > > On Sat, May 25, 2013 at 04:53:19AM +0100, Nicolas Pitre wrote: > > > > > > On Thu, 23 May 2013, Catalin Marinas wrote: > > > > > > > > > > > > > An issue is that for kunmap_atomic() && VIVT we flush the same page > > > > > > > twice. I don't think we should remove the cache flushing in > > > > > > > __kunmap_atomic() for VIVT since not all kunmap_atomic() calls require > > > > > > > flush_kernel_dcache_page() (unless we just assume that highmem && vivt > > > > > > > don't work together). > > > > > > > > > > > > VIVT and highmem do work together. Highmem for ARM was in fact > > > > > > developed on such a platform. > > > > > > > > > > Thanks for confirming. Do you remember why kunmap() doesn't (need to) > > > > > flush the cache alias (for VIVT)? Or if it does, where? > > > > > > > > kunmap() calls kunmap_high() which actually does not unmap anything but > > > > only decrement a use count. The page is kept around in case it is > > > > kmap()'d again, in which case the kmap() will be almost free and the > > > > page still warm. > > > > > > > > It is only when the kmap() for a new page runs out of free pkmap > > > > entries that the unused but still mapped pages are unmapped. This > > > > unmapping is done in flush_all_zero_pkmaps() where all the unused pages > > > > are all unmapped at once. The cache for all those pages is flushed with > > > > flush_cache_kmaps() which on ARM maps to flush_cache_all() when the > > > > cache is VIVT. > > > > > > Thanks for clarification Nico. So if kmap_high_get() returns NULL, there > > > are no high aliases left in the cache. > > > > Exact. > > Just to be clear, this is true for VIVT only. 
> Unmapped highmem pages can still be cached with VIPT. Indeed.
Hi Catalin, On Wed, May 29, 2013 at 12:05:09PM +0100, Catalin Marinas wrote: > On Sun, May 12, 2013 at 06:35:56AM +0100, Simon Baatz wrote: > > Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that the pages > > it needs to handle are kernel mapped only. However, for example when doing > > direct I/O, pages with user space mappings may occur. > > After Nico's clarification, I think the original commit introducing this > function was also incomplete (commit 73be1591 - [ARM] 5545/2: add > flush_kernel_dcache_page() for ARM) since it ignores highmem pages and > their flushing could be deferred for a long time. Yes, I agree. > For my understanding (if I re-read this tread) - basically code like > this should not leave the user mapping inconsistent: > > kmap() > ... > flush_kernel_dcache_page() > kunmap() s/user mapping/kernel mapping/ The user mapping(s) can be assumed to be consistent when entering such code. Either there is none or the page was obtained by a mechanism like __get_user_pages(). Otherwise, we have a problem anyway when accessing the page via the kernel mapping. The visibility of a page to user space is just needed for optimization. If the page is not visible to user space, we can use our lazy flushing mechanism (PG_dcache_clean bit) and defer flushing until the page is mapped into user space (if ever). Otherwise, we need to flush the kernel mapping since only that can be dirty. And, we don't need to do anything in flush_kernel_dcache_page() for non-aliasing caches, of course. > If we use the atomic variants, we get the cache flushing automatically > but the kunmap_high() does not flush the cache immediately, so we need > to handle it in flush_kernel_dcache_page(). Exactly. But only if it is really needed. If we have no user space mapping and don't lose the kernel mapping, we just may do nothing. > > Thus, continue to do lazy flushing if there are no user space mappings. > > Otherwise, flush the kernel cache lines directly. > ... 
> > /* > > + * Ensure cache coherency for kernel mapping of this page. > > + * > > + * If the page only exists in the page cache and there are no user > > + * space mappings, this is a no-op since the page was already marked > > + * dirty at creation. Otherwise, we need to flush the dirty kernel > > + * cache lines directly. > > + */ > > +void flush_kernel_dcache_page(struct page *page) > > +{ > > + if (cache_is_vivt() || cache_is_vipt_aliasing()) { > > + struct address_space *mapping; > > + > > + mapping = page_mapping(page); > > + > > + if (!mapping || mapping_mapped(mapping)) > > + __flush_kernel_dcache_page(page); > > + } > > +} > > BTW, does the mapping check optimise anything for the > flush_kernel_dcache_page() uses? Would we have a mapping anyway (or > anonymous page) in most cases? Looking at the relevant uses in the kernel, we have: drivers/scsi/libsas/sas_host_smp.c drivers/mmc/host/mmc_spi.c drivers/block/ps3disk.c fs/exec.c lib/scatterlist.c That is not much (I omit my usual rant here that many uses of flush_dcache_page() in drivers are incorrect and should be uses of flush_kernel_dcache_page() ...). Without looking at the actual code, we seem to have two basic use cases: 1. Block drivers (the ones listed above + those using the memory scatterlist iterator in scatterlist.c) These will probably almost exclusively use page cache pages for which we can be lazy. Other pages only occur in special I/O situations like direct I/O. 2. fs/exec.c I think these are anonymous pages only Thus depending on the actual drivers used, usage can be dominated by page cache pages on one setup and anon pages on the other. I prefer the currently proposed way since it prevents massive overflushing if flush_kernel_dcache_page() is used in an I/O path. (Btw. I am still puzzled as to why making flush_kernel_dcache_page() the current nop apparently did not cause problems in fs/exec.c.) > Otherwise the patch looks good. Thanks. Especially, thanks for pointing out the highmem case. 
- Simon
On Wed, May 29, 2013 at 08:16:57PM +0100, Simon Baatz wrote: > On Wed, May 29, 2013 at 12:05:09PM +0100, Catalin Marinas wrote: > > On Sun, May 12, 2013 at 06:35:56AM +0100, Simon Baatz wrote: > > > Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that the pages > > > it needs to handle are kernel mapped only. However, for example when doing > > > direct I/O, pages with user space mappings may occur. > > > > After Nico's clarification, I think the original commit introducing this > > function was also incomplete (commit 73be1591 - [ARM] 5545/2: add > > flush_kernel_dcache_page() for ARM) since it ignores highmem pages and > > their flushing could be deferred for a long time. > > Yes, I agree. > > > For my understanding (if I re-read this tread) - basically code like > > this should not leave the user mapping inconsistent: > > > > kmap() > > ... > > flush_kernel_dcache_page() > > kunmap() > > s/user mapping/kernel mapping/ > The user mapping(s) can be assumed to be consistent when entering > such code. Either there is none or the page was obtained by a > mechanism like __get_user_pages(). Otherwise, we have a problem > anyway when accessing the page via the kernel mapping. Indeed. What I meant was user mapping inconsistent with the kernel mapping. It's the latter that needs flushing. > > > Thus, continue to do lazy flushing if there are no user space mappings. > > > Otherwise, flush the kernel cache lines directly. > > ... > > > /* > > > + * Ensure cache coherency for kernel mapping of this page. > > > + * > > > + * If the page only exists in the page cache and there are no user > > > + * space mappings, this is a no-op since the page was already marked > > > + * dirty at creation. Otherwise, we need to flush the dirty kernel > > > + * cache lines directly. 
> > > + */ > > > +void flush_kernel_dcache_page(struct page *page) > > > +{ > > > + if (cache_is_vivt() || cache_is_vipt_aliasing()) { > > > + struct address_space *mapping; > > > + > > > + mapping = page_mapping(page); > > > + > > > + if (!mapping || mapping_mapped(mapping)) > > > + __flush_kernel_dcache_page(page); > > > + } > > > +} > > > > BTW, does the mapping check optimise anything for the > > flush_kernel_dcache_page() uses? Would we have a mapping anyway (or > > anonymous page) in most cases? > > Looking at the relevant uses in the kernel, we have: > > drivers/scsi/libsas/sas_host_smp.c > drivers/mmc/host/mmc_spi.c > drivers/block/ps3disk.c > fs/exec.c > lib/scatterlist.c > > That is not much (I omit my usual rant here that many uses of > flush_dcache_page() in drivers are incorrect and should be uses of > flush_kernel_dcache_page() ...). > > Without looking at the actual code, we seem to have two basic use > cases: > > 1. Block drivers (the ones listed above + those using the memory > scatterlist iterator in scatterlist.c) > > These will probably almost exclusively use page cache pages for which > we can be lazy. Other pages only occur in special I/O situations > like direct I/O. > > 2. fs/exec.c > > I think these are anonymous pages only > > Thus depending on the actual drivers used, usage can be dominated by > page cache pages on one setup and anon pages on the other. > > I prefer the currently proposed way since it prevents massive > overflushing if flush_kernel_dcache_page() is used in an I/O path. > > (Btw. I am still puzzled as to why making flush_kernel_dcache_page() > the current nop apparently did not cause problems in fs/exec.c.) We get extra flushing via flush_old_exec() -> exec_mmap() -> mmput() -> exit_mmap() -> flush_cache_mm() before we actually start the new exec so this would flush the arg page as well. > > Otherwise the patch looks good. > > Thanks. Especially, thanks for pointing out the highmem case. 
You can add my ack (before I forget the whole discussion ;)) Acked-by: Catalin Marinas <catalin.marinas@arm.com>
On Sun, May 12, 2013 at 1:35 PM, Simon Baatz <gmbnomis@gmail.com> wrote: > +void flush_kernel_dcache_page(struct page *page) > +{ > + if (cache_is_vivt() || cache_is_vipt_aliasing()) { > + struct address_space *mapping; > + > + mapping = page_mapping(page); > + > + if (!mapping || mapping_mapped(mapping)) It is better to replace mapping_mapped(mapping) with page_mapped(page). > + __flush_kernel_dcache_page(page); I am wondering if I-cache should be flushed here since the user mapping on the page may be executable and I/D coherency should be maintained. Thanks,
On Thu, May 30, 2013 at 05:43:35PM +0100, Catalin Marinas wrote:
> On Wed, May 29, 2013 at 08:16:57PM +0100, Simon Baatz wrote:
> > On Wed, May 29, 2013 at 12:05:09PM +0100, Catalin Marinas wrote:
> > > On Sun, May 12, 2013 at 06:35:56AM +0100, Simon Baatz wrote:
> > > > Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that the pages
> > > > it needs to handle are kernel mapped only. However, for example when doing
> > > > direct I/O, pages with user space mappings may occur.
> > >
> > > After Nico's clarification, I think the original commit introducing this
> > > function was also incomplete (commit 73be1591 - [ARM] 5545/2: add
> > > flush_kernel_dcache_page() for ARM) since it ignores highmem pages and
> > > their flushing could be deferred for a long time.
> >
> > Yes, I agree.
> >
> > > For my understanding (if I re-read this thread) - basically code like
> > > this should not leave the user mapping inconsistent:
> > >
> > > kmap()
> > > ...
> > > flush_kernel_dcache_page()
> > > kunmap()
> >
> > s/user mapping/kernel mapping/
> > The user mapping(s) can be assumed to be consistent when entering
> > such code. Either there is none or the page was obtained by a
> > mechanism like __get_user_pages(). Otherwise, we have a problem
> > anyway when accessing the page via the kernel mapping.
>
> Indeed. What I meant was user mapping inconsistent with the kernel
> mapping. It's the latter that needs flushing.
>
> > > > Thus, continue to do lazy flushing if there are no user space mappings.
> > > > Otherwise, flush the kernel cache lines directly.
> > > ...
> > > > /*
> > > > + * Ensure cache coherency for kernel mapping of this page.
> > > > + *
> > > > + * If the page only exists in the page cache and there are no user
> > > > + * space mappings, this is a no-op since the page was already marked
> > > > + * dirty at creation. Otherwise, we need to flush the dirty kernel
> > > > + * cache lines directly.
> > > > + */
> > > > +void flush_kernel_dcache_page(struct page *page)
> > > > +{
> > > > +	if (cache_is_vivt() || cache_is_vipt_aliasing()) {
> > > > +		struct address_space *mapping;
> > > > +
> > > > +		mapping = page_mapping(page);
> > > > +
> > > > +		if (!mapping || mapping_mapped(mapping))
> > > > +			__flush_kernel_dcache_page(page);
> > > > +	}
> > > > +}
> > >
> > > BTW, does the mapping check optimise anything for the
> > > flush_kernel_dcache_page() uses? Would we have a mapping anyway (or
> > > anonymous page) in most cases?
> >
> > Looking at the relevant uses in the kernel, we have:
> >
> > drivers/scsi/libsas/sas_host_smp.c
> > drivers/mmc/host/mmc_spi.c
> > drivers/block/ps3disk.c
> > fs/exec.c
> > lib/scatterlist.c
> >
> > That is not much (I omit my usual rant here that many uses of
> > flush_dcache_page() in drivers are incorrect and should be uses of
> > flush_kernel_dcache_page() ...).
> >
> > Without looking at the actual code, we seem to have two basic use
> > cases:
> >
> > 1. Block drivers (the ones listed above + those using the memory
> > scatterlist iterator in scatterlist.c)
> >
> > These will probably almost exclusively use page cache pages for which
> > we can be lazy. Other pages only occur in special I/O situations
> > like direct I/O.
> >
> > 2. fs/exec.c
> >
> > I think these are anonymous pages only
> >
> > Thus depending on the actual drivers used, usage can be dominated by
> > page cache pages on one setup and anon pages on the other.
> >
> > I prefer the currently proposed way since it prevents massive
> > overflushing if flush_kernel_dcache_page() is used in an I/O path.
> >
> > (Btw. I am still puzzled as to why making flush_kernel_dcache_page()
> > the current nop apparently did not cause problems in fs/exec.c.)
>
> We get extra flushing via flush_old_exec() -> exec_mmap() -> mmput() ->
> exit_mmap() -> flush_cache_mm() before we actually start the new exec so
> this would flush the arg page as well.
>
> > > Otherwise the patch looks good.
> >
> > Thanks. Especially, thanks for pointing out the highmem case.
>
> You can add my ack (before I forget the whole discussion ;))
>
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>

wrt to 73be1591, this appears to go all the way back to v2.6.31.x... Can
Simon go ahead and put this in rmk's patch tracker and mention that it
should go to all -stable trees?

thx,

Jason.
On Fri, May 31, 2013 at 01:05:19PM +0100, Jason Cooper wrote: > On Thu, May 30, 2013 at 05:43:35PM +0100, Catalin Marinas wrote: > > On Wed, May 29, 2013 at 08:16:57PM +0100, Simon Baatz wrote: > > > On Wed, May 29, 2013 at 12:05:09PM +0100, Catalin Marinas wrote: > > > > On Sun, May 12, 2013 at 06:35:56AM +0100, Simon Baatz wrote: > > > > > Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that the pages > > > > > it needs to handle are kernel mapped only. However, for example when doing > > > > > direct I/O, pages with user space mappings may occur. > > > > > > > > After Nico's clarification, I think the original commit introducing this > > > > function was also incomplete (commit 73be1591 - [ARM] 5545/2: add > > > > flush_kernel_dcache_page() for ARM) since it ignores highmem pages and > > > > their flushing could be deferred for a long time. > > > > > > Yes, I agree. > > > > > > > For my understanding (if I re-read this tread) - basically code like > > > > this should not leave the user mapping inconsistent: > > > > > > > > kmap() > > > > ... > > > > flush_kernel_dcache_page() > > > > kunmap() > > > > > > s/user mapping/kernel mapping/ > > > The user mapping(s) can be assumed to be consistent when entering > > > such code. Either there is none or the page was obtained by a > > > mechanism like __get_user_pages(). Otherwise, we have a problem > > > anyway when accessing the page via the kernel mapping. > > > > Indeed. What I meant was user mapping inconsistent with the kernel > > mapping. It's the latter that needs flushing. > > > > > > > Thus, continue to do lazy flushing if there are no user space mappings. > > > > > Otherwise, flush the kernel cache lines directly. > > > > ... > > > > > /* > > > > > + * Ensure cache coherency for kernel mapping of this page. 
> > > > > + * > > > > > + * If the page only exists in the page cache and there are no user > > > > > + * space mappings, this is a no-op since the page was already marked > > > > > + * dirty at creation. Otherwise, we need to flush the dirty kernel > > > > > + * cache lines directly. > > > > > + */ > > > > > +void flush_kernel_dcache_page(struct page *page) > > > > > +{ > > > > > + if (cache_is_vivt() || cache_is_vipt_aliasing()) { > > > > > + struct address_space *mapping; > > > > > + > > > > > + mapping = page_mapping(page); > > > > > + > > > > > + if (!mapping || mapping_mapped(mapping)) > > > > > + __flush_kernel_dcache_page(page); > > > > > + } > > > > > +} > > > > > > > > BTW, does the mapping check optimise anything for the > > > > flush_kernel_dcache_page() uses? Would we have a mapping anyway (or > > > > anonymous page) in most cases? > > > > > > Looking at the relevant uses in the kernel, we have: > > > > > > drivers/scsi/libsas/sas_host_smp.c > > > drivers/mmc/host/mmc_spi.c > > > drivers/block/ps3disk.c > > > fs/exec.c > > > lib/scatterlist.c > > > > > > That is not much (I omit my usual rant here that many uses of > > > flush_dcache_page() in drivers are incorrect and should be uses of > > > flush_kernel_dcache_page() ...). > > > > > > Without looking at the actual code, we seem to have two basic use > > > cases: > > > > > > 1. Block drivers (the ones listed above + those using the memory > > > scatterlist iterator in scatterlist.c) > > > > > > These will probably almost exclusively use page cache pages for which > > > we can be lazy. Other pages only occur in special I/O situations > > > like direct I/O. > > > > > > 2. fs/exec.c > > > > > > I think these are anonymous pages only > > > > > > Thus depending on the actual drivers used, usage can be dominated by > > > page cache pages on one setup and anon pages on the other. 
> > > > > > I prefer the currently proposed way since it prevents massive > > > overflushing if flush_kernel_dcache_page() is used in an I/O path. > > > > > > (Btw. I am still puzzled as to why making flush_kernel_dcache_page() > > > the current nop apparently did not cause problems in fs/exec.c.) > > > > We get extra flushing via flush_old_exec() -> exec_mmap() -> mmput() -> > > exit_mmap() -> flush_cache_mm() before we actually start the new exec so > > this would flush the arg page as well. > > > > > > Otherwise the patch looks good. > > > > > > Thanks. Especially, thanks for pointing out the highmem case. > > > > You can add my ack (before I forget the whole discussion ;)) > > > > Acked-by: Catalin Marinas <catalin.marinas@arm.com> > > wrt to 73be1591, this appears to go all the way back to v2.6.31.x... Can > Simon go ahead and put this in rmk's patch tracker and mention that it > should go to all -stable trees? I'm not sure how easy it is to apply this patch on past stable versions. Maybe something simpler for stable like always flushing the cache in flush_kernel_dcache_page() with optimisation only in recent stable versions.
On Fri, May 31, 2013 at 03:15:08PM +0100, Catalin Marinas wrote: > On Fri, May 31, 2013 at 01:05:19PM +0100, Jason Cooper wrote: > > On Thu, May 30, 2013 at 05:43:35PM +0100, Catalin Marinas wrote: > > > On Wed, May 29, 2013 at 08:16:57PM +0100, Simon Baatz wrote: > > > > On Wed, May 29, 2013 at 12:05:09PM +0100, Catalin Marinas wrote: > > > > > On Sun, May 12, 2013 at 06:35:56AM +0100, Simon Baatz wrote: > > > > > > Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that the pages > > > > > > it needs to handle are kernel mapped only. However, for example when doing > > > > > > direct I/O, pages with user space mappings may occur. > > > > > > > > > > After Nico's clarification, I think the original commit introducing this > > > > > function was also incomplete (commit 73be1591 - [ARM] 5545/2: add > > > > > flush_kernel_dcache_page() for ARM) since it ignores highmem pages and > > > > > their flushing could be deferred for a long time. > > > > > > > > Yes, I agree. > > > > > > > > > For my understanding (if I re-read this tread) - basically code like > > > > > this should not leave the user mapping inconsistent: > > > > > > > > > > kmap() > > > > > ... > > > > > flush_kernel_dcache_page() > > > > > kunmap() > > > > > > > > s/user mapping/kernel mapping/ > > > > The user mapping(s) can be assumed to be consistent when entering > > > > such code. Either there is none or the page was obtained by a > > > > mechanism like __get_user_pages(). Otherwise, we have a problem > > > > anyway when accessing the page via the kernel mapping. > > > > > > Indeed. What I meant was user mapping inconsistent with the kernel > > > mapping. It's the latter that needs flushing. > > > > > > > > > Thus, continue to do lazy flushing if there are no user space mappings. > > > > > > Otherwise, flush the kernel cache lines directly. > > > > > ... > > > > > > /* > > > > > > + * Ensure cache coherency for kernel mapping of this page. 
> > > > > > + * > > > > > > + * If the page only exists in the page cache and there are no user > > > > > > + * space mappings, this is a no-op since the page was already marked > > > > > > + * dirty at creation. Otherwise, we need to flush the dirty kernel > > > > > > + * cache lines directly. > > > > > > + */ > > > > > > +void flush_kernel_dcache_page(struct page *page) > > > > > > +{ > > > > > > + if (cache_is_vivt() || cache_is_vipt_aliasing()) { > > > > > > + struct address_space *mapping; > > > > > > + > > > > > > + mapping = page_mapping(page); > > > > > > + > > > > > > + if (!mapping || mapping_mapped(mapping)) > > > > > > + __flush_kernel_dcache_page(page); > > > > > > + } > > > > > > +} > > > > > > > > > > BTW, does the mapping check optimise anything for the > > > > > flush_kernel_dcache_page() uses? Would we have a mapping anyway (or > > > > > anonymous page) in most cases? > > > > > > > > Looking at the relevant uses in the kernel, we have: > > > > > > > > drivers/scsi/libsas/sas_host_smp.c > > > > drivers/mmc/host/mmc_spi.c > > > > drivers/block/ps3disk.c > > > > fs/exec.c > > > > lib/scatterlist.c > > > > > > > > That is not much (I omit my usual rant here that many uses of > > > > flush_dcache_page() in drivers are incorrect and should be uses of > > > > flush_kernel_dcache_page() ...). > > > > > > > > Without looking at the actual code, we seem to have two basic use > > > > cases: > > > > > > > > 1. Block drivers (the ones listed above + those using the memory > > > > scatterlist iterator in scatterlist.c) > > > > > > > > These will probably almost exclusively use page cache pages for which > > > > we can be lazy. Other pages only occur in special I/O situations > > > > like direct I/O. > > > > > > > > 2. fs/exec.c > > > > > > > > I think these are anonymous pages only > > > > > > > > Thus depending on the actual drivers used, usage can be dominated by > > > > page cache pages on one setup and anon pages on the other. 
> > > > > > > > I prefer the currently proposed way since it prevents massive > > > > overflushing if flush_kernel_dcache_page() is used in an I/O path. > > > > > > > > (Btw. I am still puzzled as to why making flush_kernel_dcache_page() > > > > the current nop apparently did not cause problems in fs/exec.c.) > > > > > > We get extra flushing via flush_old_exec() -> exec_mmap() -> mmput() -> > > > exit_mmap() -> flush_cache_mm() before we actually start the new exec so > > > this would flush the arg page as well. > > > > > > > > Otherwise the patch looks good. > > > > > > > > Thanks. Especially, thanks for pointing out the highmem case. > > > > > > You can add my ack (before I forget the whole discussion ;)) > > > > > > Acked-by: Catalin Marinas <catalin.marinas@arm.com> > > > > wrt to 73be1591, this appears to go all the way back to v2.6.31.x... Can > > Simon go ahead and put this in rmk's patch tracker and mention that it > > should go to all -stable trees? > > I'm not sure how easy it is to apply this patch on past stable versions. > Maybe something simpler for stable like always flushing the cache in > flush_kernel_dcache_page() with optimisation only in recent stable > versions. Ok, then please don't let that hold this up. We can do something separate for stable later if it's desired. thx, Jason.
On Fri, May 31, 2013 at 05:07:22PM +0800, Ming Lei wrote:
> On Sun, May 12, 2013 at 1:35 PM, Simon Baatz <gmbnomis@gmail.com> wrote:
> > +void flush_kernel_dcache_page(struct page *page)
> > +{
> > +	if (cache_is_vivt() || cache_is_vipt_aliasing()) {
> > +		struct address_space *mapping;
> > +
> > +		mapping = page_mapping(page);
> > +
> > +		if (!mapping || mapping_mapped(mapping))
>
> It is better to replace mapping_mapped(mapping) with page_mapped(page).

Good point. I did not have the time to look into your proposed change
for flush_dcache_page() yet.

However, this patch here is a fix for a long standing bug, which
should also be applied to stable in one form or another. I would not
like to mix it with a separate performance improvement.

But, of course, if we decide to change the respective condition in
flush_dcache_page(), we should also change it in
flush_kernel_dcache_page() at the same time (provided this patch gets
accepted).

> > +			__flush_kernel_dcache_page(page);
>
> I am wondering if I-cache should be flushed here since the user mapping
> on the page may be executable and I/D coherency should be maintained.

Me too. In fact, I wondered so much that I proposed it myself in a
previous version of the patch (see [1]) :-) However, this has never
been part of flush_kernel_dcache_page() and it is not clear whether
the flush for I/D-cache coherency is really needed here (or in
flush_dcache_page() for that matter) or whether it is sufficient to
do it when mapping the page into user space. On top of that neither
Russell nor Catalin liked this idea very much.

- Simon

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2012-October/124292.html
On Sat, Jun 1, 2013 at 2:54 AM, Simon Baatz <gmbnomis@gmail.com> wrote:
> On Fri, May 31, 2013 at 05:07:22PM +0800, Ming Lei wrote:
>> On Sun, May 12, 2013 at 1:35 PM, Simon Baatz <gmbnomis@gmail.com> wrote:
>> > +void flush_kernel_dcache_page(struct page *page)
>> > +{
>> > +	if (cache_is_vivt() || cache_is_vipt_aliasing()) {
>> > +		struct address_space *mapping;
>> > +
>> > +		mapping = page_mapping(page);
>> > +
>> > +		if (!mapping || mapping_mapped(mapping))
>>
>> It is better to replace mapping_mapped(mapping) with page_mapped(page).
>
> Good point. I did not have the time to look into your proposed change
> for flush_dcache_page() yet.
>
> However, this patch here is a fix for a long standing bug, which
> should also be applied to stable in one form or another. I would not
> like to mix it with a separate performance improvement.

OK, got it, so you should have added "CC: stable", and it is better to
add the bug fix description in your change log, :-)

> But, of course, if we decide to change the respective condition in
> flush_dcache_page(), we should also change it in
> flush_kernel_dcache_page() at the same time (provided this patch gets
> accepted).
>
>> > +			__flush_kernel_dcache_page(page);
>>
>> I am wondering if I-cache should be flushed here since the user mapping
>> on the page may be executable and I/D coherency should be maintained.
>
> Me too. In fact, I wondered so much that I proposed it myself in a
> previous version of the patch (see [1]) :-) However, this has never
> been part of flush_kernel_dcache_page() and it is not clear whether
> the flush for I/D-cache coherency is really needed here (or in
> flush_dcache_page() for that matter) or whether it is sufficient to
> do it when mapping the page into user space.

After reading cachetlb.txt again, looks your patch is correct, because
this kernel API is called in the situations:

	It is assumed here that the user has no incoherent cached
	copies (i.e. the original page was obtained from a mechanism
	like get_user_pages()).

So only flushing kernel mapping for the page is correct if callers of
the API obey the rule.

Thanks,
On Sat, Jun 01, 2013 at 11:27:38AM +0100, Ming Lei wrote:
> On Sat, Jun 1, 2013 at 2:54 AM, Simon Baatz <gmbnomis@gmail.com> wrote:
> > On Fri, May 31, 2013 at 05:07:22PM +0800, Ming Lei wrote:
> >> On Sun, May 12, 2013 at 1:35 PM, Simon Baatz <gmbnomis@gmail.com> wrote:
> >> > +			__flush_kernel_dcache_page(page);
> >>
> >> I am wondering if I-cache should be flushed here since the user mapping
> >> on the page may be executable and I/D coherency should be maintained.
> >
> > Me too. In fact, I wondered so much that I proposed it myself in a
> > previous version of the patch (see [1]) :-) However, this has never
> > been part of flush_kernel_dcache_page() and it is not clear whether
> > the flush for I/D-cache coherency is really needed here (or in
> > flush_dcache_page() for that matter) or whether it is sufficient to
> > do it when mapping the page into user space.
>
> After reading cachetlb.txt again, looks your patch is correct, because
> this kernel API is called in the situations:
>
> 	It is assumed here that the user has no incoherent cached copies
> 	(i.e. the original page was obtained from a mechanism like
> 	get_user_pages()).
>
> So only flushing kernel mapping for the page is correct if callers of the API
> obey the rule.

I think we should stick with just the D-cache flushing here. We can
revisit when we see a clear scenario where I-cache maintenance is
needed.

Regarding the 'user has no incoherent cached copies', this would work
on ARMv5 but not on ARMv7 unless you unmap the page from user space
(because of speculative loads into both I and D caches). Since we only
target the D-cache, we just leave flush_kernel_dcache_page() as no-op
on ARMv7 (non-aliasing VIPT).
Hi Catalin,

On Fri, May 31, 2013 at 03:15:08PM +0100, Catalin Marinas wrote:
> On Fri, May 31, 2013 at 01:05:19PM +0100, Jason Cooper wrote:
> > On Thu, May 30, 2013 at 05:43:35PM +0100, Catalin Marinas wrote:
> > > On Wed, May 29, 2013 at 08:16:57PM +0100, Simon Baatz wrote:
> > > > On Wed, May 29, 2013 at 12:05:09PM +0100, Catalin Marinas wrote:
> > > > > On Sun, May 12, 2013 at 06:35:56AM +0100, Simon Baatz wrote:
> > > > > > Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that the pages
> > > > > > it needs to handle are kernel mapped only. However, for example when doing
> > > > > > direct I/O, pages with user space mappings may occur.
> > > > >
> > > > > After Nico's clarification, I think the original commit introducing this
> > > > > function was also incomplete (commit 73be1591 - [ARM] 5545/2: add
> > > > > flush_kernel_dcache_page() for ARM) since it ignores highmem pages and
> > > > > their flushing could be deferred for a long time.
> > > >
> > > > Yes, I agree.
> > > >
> > > > > For my understanding (if I re-read this thread) - basically code like
> > > > > this should not leave the user mapping inconsistent:
> > > > >
> > > > > kmap()
> > > > > ...
> > > > > flush_kernel_dcache_page()
> > > > > kunmap()
> ...
>
> > wrt to 73be1591, this appears to go all the way back to v2.6.31.x... Can
> > Simon go ahead and put this in rmk's patch tracker and mention that it
> > should go to all -stable trees?
>
> I'm not sure how easy it is to apply this patch on past stable versions.
> Maybe something simpler for stable like always flushing the cache in
> flush_kernel_dcache_page() with optimisation only in recent stable
> versions.

Looking at how to adapt the change to previous kernel versions, it
occurred to me that we could probably simplify the current patch
since we can make stronger assumptions than __flush_dcache_page()
can do. As discussed, __flush_dcache_page() does:

	addr = kmap_high_get(page);
	if (addr) {
		__cpuc_flush_dcache_area(addr, PAGE_SIZE);
		kunmap_high(page);
	}

The kmap_high_get() is used to ensure that the page is/remains pinned
during the flush. However, the API of flush_kernel_dcache_page()
ensures that the page is already kmapped when it is called (as in the
quoted example above).

Thus, it is already pinned and we can simply look at page_address().
If it is NULL, the page must have been obtained via kmap_atomic() and
we don't need to flush anyway since kunmap_atomic() will do this.

The actual flush covering both the lowmem and highmem cases actually
is as simple as this (inspired by __flush_dcache_page() in 2.6.32):

	addr = page_address(page);
#ifdef CONFIG_HIGHMEM
	/*
	 * kmap_atomic() doesn't set the page virtual address, and
	 * kunmap_atomic() takes care of cache flushing already.
	 */
	if (addr)
#endif
		__cpuc_flush_dcache_area(addr, PAGE_SIZE);

If you agree that this makes sense, I will send a new version.

If we change the patch this way, there is no need to split off
__flush_kernel_dcache_page() from __flush_dcache_page(), which also
makes it simpler to apply to past stable kernels.

The only thing we need to be aware of is the change of
flush_kernel_dcache_page() in 2.6.37 (commit f8b63c1). In earlier
versions, we would only need to fix the highmem case. However, I
wonder whether it makes sense to apply a fix to 2.6.32 and 2.6.34
without this ever being reported as a problem.

- Simon
All,

Adding gregkh for insight into -stable and applying fixes to previous
kernels when the bug was not against them (but the offending commit
exists in them).

Greg,

Some history: This bug stems from attempting to write O_DIRECT to a
dm-crypt device on ARM architectures. In my case, it was trying (years
ago!) to put an LVM into a dm-crypt container. This data corruption can
be more readily seen by using dd to write to a dm-crypt device with the
direct flag.

On Mon, Jun 03, 2013 at 07:38:18PM +0200, Simon Baatz wrote:
> Hi Catalin,
>
> On Fri, May 31, 2013 at 03:15:08PM +0100, Catalin Marinas wrote:
> > On Fri, May 31, 2013 at 01:05:19PM +0100, Jason Cooper wrote:
> > > On Thu, May 30, 2013 at 05:43:35PM +0100, Catalin Marinas wrote:
> > > > On Wed, May 29, 2013 at 08:16:57PM +0100, Simon Baatz wrote:
> > > > > On Wed, May 29, 2013 at 12:05:09PM +0100, Catalin Marinas wrote:
> > > > > > On Sun, May 12, 2013 at 06:35:56AM +0100, Simon Baatz wrote:
> > > > > > > Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that the pages
> > > > > > > it needs to handle are kernel mapped only. However, for example when doing
> > > > > > > direct I/O, pages with user space mappings may occur.
> > > > > >
> > > > > > After Nico's clarification, I think the original commit introducing this
> > > > > > function was also incomplete (commit 73be1591 - [ARM] 5545/2: add
> > > > > > flush_kernel_dcache_page() for ARM) since it ignores highmem pages and
> > > > > > their flushing could be deferred for a long time.
> > > > >
> > > > > Yes, I agree.
> > > > >
> > > > > > For my understanding (if I re-read this thread) - basically code like
> > > > > > this should not leave the user mapping inconsistent:
> > > > > >
> > > > > > kmap()
> > > > > > ...
> > > > > > flush_kernel_dcache_page()
> > > > > > kunmap()
> > ...
> >
> > > wrt to 73be1591, this appears to go all the way back to v2.6.31.x... Can
> > > Simon go ahead and put this in rmk's patch tracker and mention that it
> > > should go to all -stable trees?
> >
> > I'm not sure how easy it is to apply this patch on past stable versions.
> > Maybe something simpler for stable like always flushing the cache in
> > flush_kernel_dcache_page() with optimisation only in recent stable
> > versions.
>
> Looking at how to adapt the change to previous kernel versions, it
> occurred to me that we could probably simplify the current patch
> since we can make stronger assumptions than __flush_dcache_page()
> can do. As discussed, __flush_dcache_page() does:
>
> 	addr = kmap_high_get(page);
> 	if (addr) {
> 		__cpuc_flush_dcache_area(addr, PAGE_SIZE);
> 		kunmap_high(page);
> 	}
>
> The kmap_high_get() is used to ensure that the page is/remains pinned
> during the flush. However, the API of flush_kernel_dcache_page()
> ensures that the page is already kmapped when it is called (as in the
> quoted example above).
>
> Thus, it is already pinned and we can simply look at page_address().
> If it is NULL, the page must have been obtained via kmap_atomic() and
> we don't need to flush anyway since kunmap_atomic() will do this.
>
> The actual flush covering both the lowmem and highmem cases actually
> is as simple as this (inspired by __flush_dcache_page() in 2.6.32):
>
> 	addr = page_address(page);
> #ifdef CONFIG_HIGHMEM
> 	/*
> 	 * kmap_atomic() doesn't set the page virtual address, and
> 	 * kunmap_atomic() takes care of cache flushing already.
> 	 */
> 	if (addr)
> #endif
> 		__cpuc_flush_dcache_area(addr, PAGE_SIZE);
>
> If you agree that this makes sense, I will send a new version.
>
> If we change the patch this way, there is no need to split off
> __flush_kernel_dcache_page() from __flush_dcache_page(), which also
> makes it simpler to apply to past stable kernels.
>
> The only thing we need to be aware of is the change of
> flush_kernel_dcache_page() in 2.6.37 (commit f8b63c1). In earlier
> versions, we would only need to fix the highmem case. However, I
> wonder whether it makes sense to apply a fix to 2.6.32 and 2.6.34
> without this ever being reported as a problem.

imho, if you discover a problem while running kernel v3.9.x, and the
commit causing the problem goes all the way back to 2.6.32, then the fix
should go back that far as well (assuming those kernels are still
actively maintained). I don't think we would need a bug report for each
kernel version.

However, some more nuanced problems are not the result of a single
commit, but potentially the interaction of several commits. In this
case, a more conservative application may be in order.

Greg, any thoughts?

thx,

Jason.
Hi Jason, Greg,

On Mon, Jun 03, 2013 at 02:03:39PM -0400, Jason Cooper wrote:
> All,
>
> Adding gregkh for insight into -stable and applying fixes to previous
> kernels when the bug was not against them (but the offending commit
> exists in them).
>
> Greg,
>
> Some history: This bug stems from attempting to write O_DIRECT to a
> dm-crypt device on ARM architectures. In my case, it was trying (years
> ago!) to put an LVM into a dm-crypt container. This data corruption can
> be more readily seen by using dd to write to a dm-crypt device with the
> direct flag.

Some more history: The bug Jason describes is a regression that
appeared in 2.6.37. However, while fixing it, we think we found a bug
in conjunction with highmem that goes back to 2.6.31 or so. This may
occur on ARM machines with VIVT D-cache, highmem pages and a driver
using flush_kernel_dcache_page(), but not kmap_atomic().

> ...
>
> > The only thing we need to be aware of is the change of
> > flush_kernel_dcache_page() in 2.6.37 (commit f8b63c1). In earlier
> > versions, we would only need to fix the highmem case. However, I
> > wonder whether it makes sense to apply a fix to 2.6.32 and 2.6.34
> > without this ever being reported as a problem.
>
> imho, if you discover a problem while running kernel v3.9.x, and the
> commit causing the problem goes all the way back to 2.6.32, then the fix
> should go back that far as well (assuming those kernels are still
> actively maintained). I don't think we would need a bug report for each
> kernel version.

For the bug you described that's true. But 2.6.32 and 2.6.34 are not
affected. The question is whether the other bug requires such an
obscure set of conditions that there are simply no affected machines.

- Simon
It would be helpful if I actually did as I stated ;-) Adding Greg for real this time. On Mon, Jun 03, 2013 at 02:03:39PM -0400, Jason Cooper wrote: > All, > > Adding gregkh for insight into -stable and applying fixes to previous > kernels when the bug was not against them (but the offending commit > exists in them). > > Greg, > > Some history: This bug stems from attempting to write O_DIRECT to a > dm-crypt device on ARM architectures. In my case, it was trying (years > ago!) to put an LVM into a dm-crypt container. This data corruption can > be more readily seen by using dd to write to a dm-crypt device with the > direct flag. > > On Mon, Jun 03, 2013 at 07:38:18PM +0200, Simon Baatz wrote: > > Hi Catalin, > > > > On Fri, May 31, 2013 at 03:15:08PM +0100, Catalin Marinas wrote: > > > On Fri, May 31, 2013 at 01:05:19PM +0100, Jason Cooper wrote: > > > > On Thu, May 30, 2013 at 05:43:35PM +0100, Catalin Marinas wrote: > > > > > On Wed, May 29, 2013 at 08:16:57PM +0100, Simon Baatz wrote: > > > > > > On Wed, May 29, 2013 at 12:05:09PM +0100, Catalin Marinas wrote: > > > > > > > On Sun, May 12, 2013 at 06:35:56AM +0100, Simon Baatz wrote: > > > > > > > > Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that the pages > > > > > > > > it needs to handle are kernel mapped only. However, for example when doing > > > > > > > > direct I/O, pages with user space mappings may occur. > > > > > > > > > > > > > > After Nico's clarification, I think the original commit introducing this > > > > > > > function was also incomplete (commit 73be1591 - [ARM] 5545/2: add > > > > > > > flush_kernel_dcache_page() for ARM) since it ignores highmem pages and > > > > > > > their flushing could be deferred for a long time. > > > > > > > > > > > > Yes, I agree. > > > > > > > > > > > > > For my understanding (if I re-read this tread) - basically code like > > > > > > > this should not leave the user mapping inconsistent: > > > > > > > > > > > > > > kmap() > > > > > > > ... 
> > > > > > > flush_kernel_dcache_page() > > > > > > > kunmap() > > > ... > > > > > > > > wrt to 73be1591, this appears to go all the way back to v2.6.31.x... Can > > > > Simon go ahead and put this in rmk's patch tracker and mention that it > > > > should go to all -stable trees? > > > > > > I'm not sure how easy it is to apply this patch on past stable versions. > > > Maybe something simpler for stable like always flushing the cache in > > > flush_kernel_dcache_page() with optimisation only in recent stable > > > versions. > > > > Looking at how to adapt the change to previous kernel versions, it > > occurred to me that we could probably simplify the current patch > > since we can make stronger assumptions than __flush_dcache_page() > > can do. As discussed, __flush_dcache_page() does: > > > > addr = kmap_high_get(page); > > if (addr) { > > __cpuc_flush_dcache_area(addr, PAGE_SIZE); > > kunmap_high(page); > > } > > > > The kmap_high_get() is used to ensure that the page is/remains pinned > > during the flush. However, the API of flush_kernel_dcache_page() > > ensures that the page is already kmapped when it is called (as in the > > quoted example above). > > > > Thus, it is already pinned and we can simply look at page_address(). > > If it is NULL, the page must have been obtained via kmap_atomic() and > > we don't need to flush anyway since kunmap_atomic() will do this. > > > > The actual flush covering both the lowmem and highmem cases actually > > is as simple as this (inspired by __flush_dcache_page() in 2.6.32): > > > > addr = page_address(page); > > #ifdef CONFIG_HIGHMEM > > /* > > * kmap_atomic() doesn't set the page virtual address, and > > * kunmap_atomic() takes care of cache flushing already. > > */ > > if (addr) > > #endif > > __cpuc_flush_dcache_area(addr, PAGE_SIZE); > > > > If you agree that this makes sense, I will send a new version. 
> > > > If we change the patch this way, there is no need to split off > > __flush_kernel_dcache_page() from __flush_dcache_page(), which also > > makes it simpler to apply to past stable kernels. > > > > The only thing we need to be aware of is the change of > > flush_kernel_dcache_page() in 2.6.37 (commit f8b63c1). In earlier > > versions, we would only need to fix the highmem case. However, I > > wonder whether it makes sense to apply a fix to 2.6.32 and 2.6.34 > > without this ever being reported as a problem. > > imho, if you discover a problem while running kernel v3.9.x, and the > commit causing the problem goes all the way back to 2.6.32, then the fix > should go back that far as well (assuming those kernels are still > actively maintained). I don't think we would need a bug report for each > kernel version. > > However, some more nuanced problems are not the result of a single > commit, but potentially the interaction of several commits. In this > case, a more conservative application may be in order. > > Greg, any thoughts? > > thx, > > Jason. > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On Mon, Jun 03, 2013 at 03:22:30PM -0400, Jason Cooper wrote:
> It would be helpful if I actually did as I stated ;-)
>
> Adding Greg for real this time.
>
> On Mon, Jun 03, 2013 at 02:03:39PM -0400, Jason Cooper wrote:
> > All,
> >
> > Adding gregkh for insight into -stable and applying fixes to previous
> > kernels when the bug was not against them (but the offending commit
> > exists in them).
> >
> > Greg,
> >
> > Some history: This bug stems from attempting to write O_DIRECT to a
> > dm-crypt device on ARM architectures. In my case, it was trying (years
> > ago!) to put an LVM into a dm-crypt container. This data corruption can
> > be more readily seen by using dd to write to a dm-crypt device with the
> > direct flag.
> >
> > On Mon, Jun 03, 2013 at 07:38:18PM +0200, Simon Baatz wrote:
> > > Hi Catalin,
> > >
> > > On Fri, May 31, 2013 at 03:15:08PM +0100, Catalin Marinas wrote:
> > > > On Fri, May 31, 2013 at 01:05:19PM +0100, Jason Cooper wrote:
> > > > > On Thu, May 30, 2013 at 05:43:35PM +0100, Catalin Marinas wrote:
> > > > > > On Wed, May 29, 2013 at 08:16:57PM +0100, Simon Baatz wrote:
> > > > > > > On Wed, May 29, 2013 at 12:05:09PM +0100, Catalin Marinas wrote:
> > > > > > > > On Sun, May 12, 2013 at 06:35:56AM +0100, Simon Baatz wrote:
> > > > > > > > > Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that the pages
> > > > > > > > > it needs to handle are kernel mapped only. However, for example when doing
> > > > > > > > > direct I/O, pages with user space mappings may occur.
> > > > > > > >
> > > > > > > > After Nico's clarification, I think the original commit introducing this
> > > > > > > > function was also incomplete (commit 73be1591 - [ARM] 5545/2: add
> > > > > > > > flush_kernel_dcache_page() for ARM) since it ignores highmem pages and
> > > > > > > > their flushing could be deferred for a long time.
> > > > > > >
> > > > > > > Yes, I agree.
> > > > > > >
> > > > > > > > For my understanding (if I re-read this thread) - basically code like
> > > > > > > > this should not leave the user mapping inconsistent:
> > > > > > > >
> > > > > > > > 	kmap()
> > > > > > > > 	...
> > > > > > > > 	flush_kernel_dcache_page()
> > > > > > > > 	kunmap()
> > > > ...
> > > > >
> > > > > wrt to 73be1591, this appears to go all the way back to v2.6.31.x... Can
> > > > > Simon go ahead and put this in rmk's patch tracker and mention that it
> > > > > should go to all -stable trees?
> > > >
> > > > I'm not sure how easy it is to apply this patch on past stable versions.
> > > > Maybe something simpler for stable like always flushing the cache in
> > > > flush_kernel_dcache_page() with optimisation only in recent stable
> > > > versions.
> > >
> > > Looking at how to adapt the change to previous kernel versions, it
> > > occurred to me that we could probably simplify the current patch
> > > since we can make stronger assumptions than __flush_dcache_page()
> > > can do. As discussed, __flush_dcache_page() does:
> > >
> > > 	addr = kmap_high_get(page);
> > > 	if (addr) {
> > > 		__cpuc_flush_dcache_area(addr, PAGE_SIZE);
> > > 		kunmap_high(page);
> > > 	}
> > >
> > > The kmap_high_get() is used to ensure that the page is/remains pinned
> > > during the flush. However, the API of flush_kernel_dcache_page()
> > > ensures that the page is already kmapped when it is called (as in the
> > > quoted example above).
> > >
> > > Thus, it is already pinned and we can simply look at page_address().
> > > If it is NULL, the page must have been obtained via kmap_atomic() and
> > > we don't need to flush anyway since kunmap_atomic() will do this.
> > >
> > > The actual flush covering both the lowmem and highmem cases actually
> > > is as simple as this (inspired by __flush_dcache_page() in 2.6.32):
> > >
> > > 	addr = page_address(page);
> > > #ifdef CONFIG_HIGHMEM
> > > 	/*
> > > 	 * kmap_atomic() doesn't set the page virtual address, and
> > > 	 * kunmap_atomic() takes care of cache flushing already.
> > > 	 */
> > > 	if (addr)
> > > #endif
> > > 		__cpuc_flush_dcache_area(addr, PAGE_SIZE);
> > >
> > > If you agree that this makes sense, I will send a new version.
> > >
> > > If we change the patch this way, there is no need to split off
> > > __flush_kernel_dcache_page() from __flush_dcache_page(), which also
> > > makes it simpler to apply to past stable kernels.
> > >
> > > The only thing we need to be aware of is the change of
> > > flush_kernel_dcache_page() in 2.6.37 (commit f8b63c1). In earlier
> > > versions, we would only need to fix the highmem case. However, I
> > > wonder whether it makes sense to apply a fix to 2.6.32 and 2.6.34
> > > without this ever being reported as a problem.
> >
> > imho, if you discover a problem while running kernel v3.9.x, and the
> > commit causing the problem goes all the way back to 2.6.32, then the fix
> > should go back that far as well (assuming those kernels are still
> > actively maintained). I don't think we would need a bug report for each
> > kernel version.
> >
> > However, some more nuanced problems are not the result of a single
> > commit, but potentially the interaction of several commits. In this
> > case, a more conservative application may be in order.
> >
> > Greg, any thoughts?

If a patch needs to be backported to stable kernels, and it isn't
relevant at _all_ for Linus's tree due to it being fixed some other way
that is not backportable, then I can accept the patch into the stable
releases.

Otherwise, just send me the git commit ids of the fix in Linus's tree to
the stable@vger.kernel.org mailing list please.

thanks,

greg k-h
On Mon, Jun 03, 2013 at 06:38:18PM +0100, Simon Baatz wrote:
> On Fri, May 31, 2013 at 03:15:08PM +0100, Catalin Marinas wrote:
> > On Fri, May 31, 2013 at 01:05:19PM +0100, Jason Cooper wrote:
> > > On Thu, May 30, 2013 at 05:43:35PM +0100, Catalin Marinas wrote:
> > > > On Wed, May 29, 2013 at 08:16:57PM +0100, Simon Baatz wrote:
> > > > > On Wed, May 29, 2013 at 12:05:09PM +0100, Catalin Marinas wrote:
> > > > > > On Sun, May 12, 2013 at 06:35:56AM +0100, Simon Baatz wrote:
> > > > > > > Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that the pages
> > > > > > > it needs to handle are kernel mapped only. However, for example when doing
> > > > > > > direct I/O, pages with user space mappings may occur.
> > > > > >
> > > > > > After Nico's clarification, I think the original commit introducing this
> > > > > > function was also incomplete (commit 73be1591 - [ARM] 5545/2: add
> > > > > > flush_kernel_dcache_page() for ARM) since it ignores highmem pages and
> > > > > > their flushing could be deferred for a long time.
> > > > >
> > > > > Yes, I agree.
> > > > >
> > > > > > For my understanding (if I re-read this thread) - basically code like
> > > > > > this should not leave the user mapping inconsistent:
> > > > > >
> > > > > > 	kmap()
> > > > > > 	...
> > > > > > 	flush_kernel_dcache_page()
> > > > > > 	kunmap()
> > ...
> > >
> > > wrt to 73be1591, this appears to go all the way back to v2.6.31.x... Can
> > > Simon go ahead and put this in rmk's patch tracker and mention that it
> > > should go to all -stable trees?
> >
> > I'm not sure how easy it is to apply this patch on past stable versions.
> > Maybe something simpler for stable like always flushing the cache in
> > flush_kernel_dcache_page() with optimisation only in recent stable
> > versions.
>
> Looking at how to adapt the change to previous kernel versions, it
> occurred to me that we could probably simplify the current patch
> since we can make stronger assumptions than __flush_dcache_page()
> can do. As discussed, __flush_dcache_page() does:
>
> 	addr = kmap_high_get(page);
> 	if (addr) {
> 		__cpuc_flush_dcache_area(addr, PAGE_SIZE);
> 		kunmap_high(page);
> 	}
>
> The kmap_high_get() is used to ensure that the page is/remains pinned
> during the flush. However, the API of flush_kernel_dcache_page()
> ensures that the page is already kmapped when it is called (as in the
> quoted example above).
>
> Thus, it is already pinned and we can simply look at page_address().
> If it is NULL, the page must have been obtained via kmap_atomic() and
> we don't need to flush anyway since kunmap_atomic() will do this.

Yes, that's the behaviour for __flush_dcache_page() already (VIVT).

> The actual flush covering both the lowmem and highmem cases actually
> is as simple as this (inspired by __flush_dcache_page() in 2.6.32):
>
> 	addr = page_address(page);
> #ifdef CONFIG_HIGHMEM
> 	/*
> 	 * kmap_atomic() doesn't set the page virtual address, and
> 	 * kunmap_atomic() takes care of cache flushing already.
> 	 */
> 	if (addr)
> #endif
> 		__cpuc_flush_dcache_area(addr, PAGE_SIZE);
>
> If you agree that this makes sense, I will send a new version.

It makes sense (with the addition of the cache_is_vivt() check). Maybe
something like:

	if (IS_ENABLED(CONFIG_HIGHMEM) && addr)

to avoid the #ifdef.

> If we change the patch this way, there is no need to split off
> __flush_kernel_dcache_page() from __flush_dcache_page(), which also
> makes it simpler to apply to past stable kernels.
>
> The only thing we need to be aware of is the change of
> flush_kernel_dcache_page() in 2.6.37 (commit f8b63c1). In earlier
> versions, we would only need to fix the highmem case. However, I
> wonder whether it makes sense to apply a fix to 2.6.32 and 2.6.34
> without this ever being reported as a problem.

I think you can cc stable with a dependency for 2.6.37 onwards and send
a different patch for earlier versions directly to the
stable@vger.kernel.org alias, specifying the versions it needs to be
applied to.
On Wed, Jun 05, 2013 at 02:58:17PM +0100, Catalin Marinas wrote:
> On Mon, Jun 03, 2013 at 06:38:18PM +0100, Simon Baatz wrote:
...
> > The actual flush covering both the lowmem and highmem cases actually
> > is as simple as this (inspired by __flush_dcache_page() in 2.6.32):
> >
> > 	addr = page_address(page);
> > #ifdef CONFIG_HIGHMEM
> > 	/*
> > 	 * kmap_atomic() doesn't set the page virtual address, and
> > 	 * kunmap_atomic() takes care of cache flushing already.
> > 	 */
> > 	if (addr)
> > #endif
> > 		__cpuc_flush_dcache_area(addr, PAGE_SIZE);
> >
> > If you agree that this makes sense, I will send a new version.
>
> It makes sense (with the addition of the cache_is_vivt() check). Maybe
> something like:
>
> 	if (IS_ENABLED(CONFIG_HIGHMEM) && addr)
>
> to avoid the #ifdef.

Yes, IS_ENABLED is much nicer. However, IS_ENABLED apparently is not
available for <= 3.0.x. :-(

> > If we change the patch this way, there is no need to split off
> > __flush_kernel_dcache_page() from __flush_dcache_page(), which also
> > makes it simpler to apply to past stable kernels.
> >
> > The only thing we need to be aware of is the change of
> > flush_kernel_dcache_page() in 2.6.37 (commit f8b63c1). In earlier
> > versions, we would only need to fix the highmem case. However, I
> > wonder whether it makes sense to apply a fix to 2.6.32 and 2.6.34
> > without this ever being reported as a problem.
>
> I think you can cc stable with a dependency for 2.6.37 onwards and send
> a different patch for earlier versions directly to the
> stable@vger.kernel.org alias, specifying the versions it needs to be
> applied to.

Ok. Next version of the patch then for >= 3.2.x. 3.0.x is special
(#ifdef). 2.6.3[24].x are special anyway.

However, for the latter I still wonder whether this makes sense.
"stable_kernel_rules.txt" states:

	- It must fix a real bug that bothers people (not a, "This could
	  be a problem..." type thing).

I don't know anyone who is/was bothered by the highmem bug. (Btw. I
verified that the highmem problem is really there (using current
linux-next) and is fixed by the proposed patch.)

- Simon
diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index bff7138..17d0ae8 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -320,9 +320,7 @@ static inline void flush_anon_page(struct vm_area_struct *vma,
 }
 
 #define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
-static inline void flush_kernel_dcache_page(struct page *page)
-{
-}
+extern void flush_kernel_dcache_page(struct page *);
 
 #define flush_dcache_mmap_lock(mapping) \
 	spin_lock_irq(&(mapping)->tree_lock)
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 0d473cc..485ca96 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -160,13 +160,13 @@ void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
 #endif
 }
 
-void __flush_dcache_page(struct address_space *mapping, struct page *page)
+/*
+ * Writeback any data associated with the kernel mapping of this
+ * page.  This ensures that data in the physical page is mutually
+ * coherent with the kernel's mapping.
+ */
+static void __flush_kernel_dcache_page(struct page *page)
 {
-	/*
-	 * Writeback any data associated with the kernel mapping of this
-	 * page.  This ensures that data in the physical page is mutually
-	 * coherent with the kernels mapping.
-	 */
 	if (!PageHighMem(page)) {
 		__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
 	} else {
@@ -184,6 +184,11 @@ void __flush_dcache_page(struct address_space *mapping, struct page *page)
 		}
 	}
 }
+}
+
+void __flush_dcache_page(struct address_space *mapping, struct page *page)
+{
+	__flush_kernel_dcache_page(page);
 
 	/*
 	 * If this is a page cache page, and we have an aliasing VIPT cache,
@@ -301,6 +306,27 @@ void flush_dcache_page(struct page *page)
 EXPORT_SYMBOL(flush_dcache_page);
 
 /*
+ * Ensure cache coherency for kernel mapping of this page.
+ *
+ * If the page only exists in the page cache and there are no user
+ * space mappings, this is a no-op since the page was already marked
+ * dirty at creation.  Otherwise, we need to flush the dirty kernel
+ * cache lines directly.
+ */
+void flush_kernel_dcache_page(struct page *page)
+{
+	if (cache_is_vivt() || cache_is_vipt_aliasing()) {
+		struct address_space *mapping;
+
+		mapping = page_mapping(page);
+
+		if (!mapping || mapping_mapped(mapping))
+			__flush_kernel_dcache_page(page);
+	}
+}
+EXPORT_SYMBOL(flush_kernel_dcache_page);
+
+/*
  * Flush an anonymous page so that users of get_user_pages()
  * can safely access the data.  The expected sequence is:
  *
Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that the pages
it needs to handle are kernel mapped only.  However, for example when doing
direct I/O, pages with user space mappings may occur.

Thus, continue to do lazy flushing if there are no user space mappings.
Otherwise, flush the kernel cache lines directly.

Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
---

Changes:

in V4:
- get back to simpler version of flush_kernel_dcache_page() and use the
  logic from __flush_dcache_page() to flush the kernel mapping (which
  also takes care of highmem pages)

in V3:
- Followed Catalin's suggestion to reverse the order of the patches

in V2:
- flush_kernel_dcache_page() follows flush_dcache_page() now, except
  that it does not flush the user mappings

 arch/arm/include/asm/cacheflush.h |    4 +---
 arch/arm/mm/flush.c               |   38 +++++++++++++++++++++++++++++++------
 2 files changed, 33 insertions(+), 9 deletions(-)