| Message ID | 20240507062044.20399-1-yan.y.zhao@intel.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | Enforce CPU cache flush for non-coherent device assignment |
> From: Zhao, Yan Y <yan.y.zhao@intel.com>
> Sent: Tuesday, May 7, 2024 2:21 PM
>
> +
> +/*
> + * Flush a reserved page or !pfn_valid() PFN.
> + * Flush is not performed if the PFN is accessed in uncacheable type. i.e.
> + * - PAT type is UC/UC-/WC when PAT is enabled
> + * - MTRR type is UC/WC/WT/WP when PAT is not enabled.
> + *   (no need to do CLFLUSH though WT/WP is cacheable).
> + */

As long as a page is cacheable (being WB/WT/WP), a malicious guest can
always use non-coherent DMA to make cache/memory inconsistent, hence
clflush is still required after unmapping such a page from the IOMMU
page table to avoid leaking the inconsistent state back to the host.

> +
> +/**
> + * arch_clean_nonsnoop_dma - flush a cache range for non-coherent DMAs
> + *                           (DMAs that lack CPU cache snooping).
> + * @phys_addr: physical address start
> + * @length: number of bytes to flush
> + */
> +void arch_clean_nonsnoop_dma(phys_addr_t phys_addr, size_t length)
> +{
> +	unsigned long nrpages, pfn;
> +	unsigned long i;
> +
> +	pfn = PHYS_PFN(phys_addr);
> +	nrpages = PAGE_ALIGN((phys_addr & ~PAGE_MASK) + length) >> PAGE_SHIFT;
> +
> +	for (i = 0; i < nrpages; i++, pfn++)
> +		clflush_pfn(pfn);
> +}
> +EXPORT_SYMBOL_GPL(arch_clean_nonsnoop_dma);

This is not a good name. The code has nothing to do with the non-snoop
DMA aspect. It's just a general helper that accepts a physical PFN to
flush the CPU cache, with non-snoop DMA as one potential caller.

It would be clearer as arch_flush_cache_phys().

And drm_clflush_pages() could probably be converted to use this helper
too.
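[Editorial note: for illustration only, a minimal sketch of how drm_clflush_pages() might be layered on such a renamed helper on x86, as suggested above. The helper name arch_flush_cache_phys() follows the suggestion; the x86-only guard and omission of the existing non-x86 fallbacks are simplifying assumptions, not a description of the current drm_cache.c code.]

#include <linux/mm.h>
#include <asm/cacheflush.h>

void drm_clflush_pages(struct page *pages[], unsigned long num_pages)
{
#if defined(CONFIG_X86)
	unsigned long i;

	for (i = 0; i < num_pages; i++) {
		if (unlikely(!pages[i]))
			continue;
		/* Flush one page worth of cache lines by physical address. */
		arch_flush_cache_phys(page_to_phys(pages[i]), PAGE_SIZE);
	}
#endif
}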
On Tue, May 07, 2024 at 04:51:31PM +0800, Tian, Kevin wrote:
> > From: Zhao, Yan Y <yan.y.zhao@intel.com>
> > Sent: Tuesday, May 7, 2024 2:21 PM
> >
> > +
> > +/*
> > + * Flush a reserved page or !pfn_valid() PFN.
> > + * Flush is not performed if the PFN is accessed in uncacheable type. i.e.
> > + * - PAT type is UC/UC-/WC when PAT is enabled
> > + * - MTRR type is UC/WC/WT/WP when PAT is not enabled.
> > + *   (no need to do CLFLUSH though WT/WP is cacheable).
> > + */
>
> As long as a page is cacheable (being WB/WT/WP), a malicious guest can
> always use non-coherent DMA to make cache/memory inconsistent, hence
> clflush is still required after unmapping such a page from the IOMMU
> page table to avoid leaking the inconsistent state back to the host.

You are right. I should only check whether the MTRR type is UC or WC, as
below.

static void clflush_reserved_or_invalid_pfn(unsigned long pfn)
{
	const int size = boot_cpu_data.x86_clflush_size;
	unsigned int i;
	void *va;

	if (!pat_enabled()) {
		u64 start = PFN_PHYS(pfn), end = start + PAGE_SIZE;
		u8 mtrr_type, uniform;

		mtrr_type = mtrr_type_lookup(start, end, &uniform);
		if ((mtrr_type == MTRR_TYPE_UNCACHABLE) ||
		    (mtrr_type == MTRR_TYPE_WRCOMB))
			return;
	} else if (pat_pfn_immune_to_uc_mtrr(pfn)) {
		return;
	}
	...
}

Also, for the pat_enabled() case where pat_pfn_immune_to_uc_mtrr() is
called, maybe pat_x_mtrr_type() cannot be called in patch 1 for the
untracked PAT range, because pat_x_mtrr_type() will return UC- if the
MTRR type is WT/WP, which will cause pat_pfn_immune_to_uc_mtrr() to
return true and CLFLUSH to be skipped.

static unsigned long pat_x_mtrr_type(u64 start, u64 end,
				     enum page_cache_mode req_type)
{
	/*
	 * Look for MTRR hint to get the effective type in case where PAT
	 * request is for WB.
	 */
	if (req_type == _PAGE_CACHE_MODE_WB) {
		u8 mtrr_type, uniform;

		mtrr_type = mtrr_type_lookup(start, end, &uniform);
		if (mtrr_type != MTRR_TYPE_WRBACK)
			return _PAGE_CACHE_MODE_UC_MINUS;

		return _PAGE_CACHE_MODE_WB;
	}

	return req_type;
}

> > +
> > +/**
> > + * arch_clean_nonsnoop_dma - flush a cache range for non-coherent DMAs
> > + *                           (DMAs that lack CPU cache snooping).
> > + * @phys_addr: physical address start
> > + * @length: number of bytes to flush
> > + */
> > +void arch_clean_nonsnoop_dma(phys_addr_t phys_addr, size_t length)
> > +{
> > +	unsigned long nrpages, pfn;
> > +	unsigned long i;
> > +
> > +	pfn = PHYS_PFN(phys_addr);
> > +	nrpages = PAGE_ALIGN((phys_addr & ~PAGE_MASK) + length) >> PAGE_SHIFT;
> > +
> > +	for (i = 0; i < nrpages; i++, pfn++)
> > +		clflush_pfn(pfn);
> > +}
> > +EXPORT_SYMBOL_GPL(arch_clean_nonsnoop_dma);
>
> This is not a good name. The code has nothing to do with the non-snoop
> DMA aspect. It's just a general helper that accepts a physical PFN to
> flush the CPU cache, with non-snoop DMA as one potential caller.
>
> It would be clearer as arch_flush_cache_phys().
>
> And drm_clflush_pages() could probably be converted to use this helper
> too.

Yes, I agree, though arch_clean_nonsnoop_dma() might have its merit if
implementations on other platforms did something non-snoop-DMA specific.
On Tue, May 07, 2024 at 02:20:44PM +0800, Yan Zhao wrote:
> Introduce and export interface arch_clean_nonsnoop_dma() to flush CPU
> caches for memory involved in non-coherent DMAs (DMAs that lack CPU cache
> snooping).

Err, no. There should really be no exported cache manipulation macros,
as drivers are almost guaranteed to get this wrong. I've added Russell
to the Cc list, who has been extremely vocal about this at least for
arm.
On Mon, May 20, 2024 at 07:07:10AM -0700, Christoph Hellwig wrote:
> On Tue, May 07, 2024 at 02:20:44PM +0800, Yan Zhao wrote:
> > Introduce and export interface arch_clean_nonsnoop_dma() to flush CPU
> > caches for memory involved in non-coherent DMAs (DMAs that lack CPU cache
> > snooping).
>
> Err, no. There should really be no exported cache manipulation macros,
> as drivers are almost guaranteed to get this wrong. I've added Russell
> to the Cc list, who has been extremely vocal about this at least for
> arm.

We could possibly move this under some IOMMU core API (i.e. flush and
map, unmap and flush); the iommu APIs are non-modular, so this could
avoid the exported symbol.

Jason
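[Editorial note: to make the "flush and map" idea above concrete, a minimal sketch of what such a core-side wrapper could look like. The function name and its placement in the IOMMU core are assumptions, not an existing API.]

#include <linux/iommu.h>
#include <linux/cacheflush.h>

/*
 * Hypothetical "flush and map" helper in the IOMMU core, keeping the arch
 * cache-flush call out of modular driver code.
 */
int iommu_map_noncoherent(struct iommu_domain *domain, unsigned long iova,
			  phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
{
	/* Flush CPU caches before the device can issue non-coherent reads. */
	arch_clean_nonsnoop_dma(paddr, size);

	return iommu_map(domain, iova, paddr, size, prot, gfp);
}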
On Tue, May 21, 2024 at 12:49:39PM -0300, Jason Gunthorpe wrote:
> On Mon, May 20, 2024 at 07:07:10AM -0700, Christoph Hellwig wrote:
> > On Tue, May 07, 2024 at 02:20:44PM +0800, Yan Zhao wrote:
> > > Introduce and export interface arch_clean_nonsnoop_dma() to flush CPU
> > > caches for memory involved in non-coherent DMAs (DMAs that lack CPU cache
> > > snooping).
> >
> > Err, no. There should really be no exported cache manipulation macros,
> > as drivers are almost guaranteed to get this wrong. I've added Russell
> > to the Cc list, who has been extremely vocal about this at least for
> > arm.
>
> We could possibly move this under some IOMMU core API (i.e. flush and
> map, unmap and flush); the iommu APIs are non-modular, so this could
> avoid the exported symbol.

Though this would be pretty difficult for unmap, as we don't have the
PFNs in the core code to flush. I don't think we have a lot of good
options but to make iommufd & VFIO handle this directly, as they have
the list of pages to flush on the unmap side. Use a namespace?

Jason
On Tue, May 21, 2024 at 01:00:16PM -0300, Jason Gunthorpe wrote:
> On Tue, May 21, 2024 at 12:49:39PM -0300, Jason Gunthorpe wrote:
> > On Mon, May 20, 2024 at 07:07:10AM -0700, Christoph Hellwig wrote:
> > > On Tue, May 07, 2024 at 02:20:44PM +0800, Yan Zhao wrote:
> > > > Introduce and export interface arch_clean_nonsnoop_dma() to flush CPU
> > > > caches for memory involved in non-coherent DMAs (DMAs that lack CPU cache
> > > > snooping).
> > >
> > > Err, no. There should really be no exported cache manipulation macros,
> > > as drivers are almost guaranteed to get this wrong. I've added Russell
> > > to the Cc list, who has been extremely vocal about this at least for
> > > arm.
> >
> > We could possibly move this under some IOMMU core API (i.e. flush and
> > map, unmap and flush); the iommu APIs are non-modular, so this could
> > avoid the exported symbol.
>
> Though this would be pretty difficult for unmap, as we don't have the
> PFNs in the core code to flush. I don't think we have a lot of good
> options but to make iommufd & VFIO handle this directly, as they have
> the list of pages to flush on the unmap side. Use a namespace?

Given that we'll rename this function to arch_flush_cache_phys(), which
takes a physical address as input, and there are already
clflush_cache_range() and arch_invalidate_pmem() exported with a vaddr
as input, is this export still good?
On Tue, May 21, 2024 at 01:00:16PM -0300, Jason Gunthorpe wrote:
> > > Err, no. There should really be no exported cache manipulation macros,
> > > as drivers are almost guaranteed to get this wrong. I've added Russell
> > > to the Cc list, who has been extremely vocal about this at least for
> > > arm.
> >
> > We could possibly move this under some IOMMU core API (i.e. flush and
> > map, unmap and flush); the iommu APIs are non-modular, so this could
> > avoid the exported symbol.
>
> Though this would be pretty difficult for unmap, as we don't have the
> PFNs in the core code to flush. I don't think we have a lot of good
> options but to make iommufd & VFIO handle this directly, as they have
> the list of pages to flush on the unmap side. Use a namespace?

Just have an unmap version that also takes a list of PFNs that you'd
need for non-coherent mappings?
On Mon, May 27, 2024 at 11:37:34PM -0700, Christoph Hellwig wrote:
> On Tue, May 21, 2024 at 01:00:16PM -0300, Jason Gunthorpe wrote:
> > > > Err, no. There should really be no exported cache manipulation macros,
> > > > as drivers are almost guaranteed to get this wrong. I've added Russell
> > > > to the Cc list, who has been extremely vocal about this at least for
> > > > arm.
> > >
> > > We could possibly move this under some IOMMU core API (i.e. flush and
> > > map, unmap and flush); the iommu APIs are non-modular, so this could
> > > avoid the exported symbol.
> >
> > Though this would be pretty difficult for unmap, as we don't have the
> > PFNs in the core code to flush. I don't think we have a lot of good
> > options but to make iommufd & VFIO handle this directly, as they have
> > the list of pages to flush on the unmap side. Use a namespace?
>
> Just have an unmap version that also takes a list of PFNs that you'd
> need for non-coherent mappings?

VFIO has never supported that, so nothing like that exists yet. This is
sort of the first step to some very basic support for a non-coherent
cache flush, in the limited case of a VM that can do its own cache
flushing through KVM.

The PFN list is needed for unpin_user_pages(), and it has an ugly design
where vfio/iommufd read back the PFNs separately from unmap, and they
both do it differently, without a common range list data structure here.

So, we'd need to build some new unmap function that returns a PFN list
that it internally fetches via the read ops. Then it can do the read,
unmap, flush iotlb, flush cache in core code.

I've been working towards this very slowly, as I want to push this stuff
down into the IO page table walk and remove the significant
inefficiency, so it is not throw-away work, but it is certainly a
notable amount of work to do.

Jason
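[Editorial note: a very rough sketch of the core-code flow described above (read back the physical ranges, unmap, flush the IOTLB, then flush the CPU cache). Every name here is an assumption; no such helper exists yet, and the real version would come from the io page table walk rather than iova_to_phys lookups.]

#include <linux/iommu.h>
#include <linux/cacheflush.h>

static size_t iommu_unmap_noncoherent(struct iommu_domain *domain,
				      unsigned long iova, size_t size,
				      phys_addr_t *phys, size_t nr_phys)
{
	struct iommu_iotlb_gather gather;
	size_t unmapped;
	size_t i;

	/* 1. Read back the physical addresses while the mapping still exists. */
	for (i = 0; i < nr_phys && i * PAGE_SIZE < size; i++)
		phys[i] = iommu_iova_to_phys(domain, iova + i * PAGE_SIZE);

	/* 2. Unmap and 3. flush the IOTLB. */
	iommu_iotlb_gather_init(&gather);
	unmapped = iommu_unmap_fast(domain, iova, size, &gather);
	iommu_iotlb_sync(domain, &gather);

	/* 4. Flush the CPU cache so stale dirty lines cannot leak back. */
	for (i = 0; i < nr_phys && i * PAGE_SIZE < size; i++)
		arch_clean_nonsnoop_dma(phys[i], PAGE_SIZE);

	return unmapped;
}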
On Sat, Jun 01, 2024 at 04:46:14PM -0300, Jason Gunthorpe wrote:
> On Mon, May 27, 2024 at 11:37:34PM -0700, Christoph Hellwig wrote:
> > On Tue, May 21, 2024 at 01:00:16PM -0300, Jason Gunthorpe wrote:
> > > > > Err, no. There should really be no exported cache manipulation macros,
> > > > > as drivers are almost guaranteed to get this wrong. I've added Russell
> > > > > to the Cc list, who has been extremely vocal about this at least for
> > > > > arm.
> > > >
> > > > We could possibly move this under some IOMMU core API (i.e. flush and
> > > > map, unmap and flush); the iommu APIs are non-modular, so this could
> > > > avoid the exported symbol.
> > >
> > > Though this would be pretty difficult for unmap, as we don't have the
> > > PFNs in the core code to flush. I don't think we have a lot of good
> > > options but to make iommufd & VFIO handle this directly, as they have
> > > the list of pages to flush on the unmap side. Use a namespace?
> >
> > Just have an unmap version that also takes a list of PFNs that you'd
> > need for non-coherent mappings?
>
> VFIO has never supported that, so nothing like that exists yet. This is
> sort of the first step to some very basic support for a non-coherent
> cache flush, in the limited case of a VM that can do its own cache
> flushing through KVM.
>
> The PFN list is needed for unpin_user_pages(), and it has an ugly design
> where vfio/iommufd read back the PFNs separately from unmap, and they
> both do it differently, without a common range list data structure here.
>
> So, we'd need to build some new unmap function that returns a PFN list
> that it internally fetches via the read ops. Then it can do the read,
> unmap, flush iotlb, flush cache in core code.

Would the core code flush CPU caches by providing the page physical
address? If yes, do you think it's still necessary to export
arch_flush_cache_phys() (as implemented in this patch)?

> I've been working towards this very slowly, as I want to push this stuff
> down into the IO page table walk and remove the significant
> inefficiency, so it is not throw-away work, but it is certainly a
> notable amount of work to do.

Will VFIO also be switched to this new unmap interface? Do we need to
care about backporting? And is it possible for VFIO alone to implement
it in the currently proposed way in this series, as a first step for
easier backporting?
On Thu, Jun 06, 2024 at 10:48:10AM +0800, Yan Zhao wrote:
> On Sat, Jun 01, 2024 at 04:46:14PM -0300, Jason Gunthorpe wrote:
> > On Mon, May 27, 2024 at 11:37:34PM -0700, Christoph Hellwig wrote:
> > > On Tue, May 21, 2024 at 01:00:16PM -0300, Jason Gunthorpe wrote:
> > > > > > Err, no. There should really be no exported cache manipulation macros,
> > > > > > as drivers are almost guaranteed to get this wrong. I've added Russell
> > > > > > to the Cc list, who has been extremely vocal about this at least for
> > > > > > arm.
> > > > >
> > > > > We could possibly move this under some IOMMU core API (i.e. flush and
> > > > > map, unmap and flush); the iommu APIs are non-modular, so this could
> > > > > avoid the exported symbol.
> > > >
> > > > Though this would be pretty difficult for unmap, as we don't have the
> > > > PFNs in the core code to flush. I don't think we have a lot of good
> > > > options but to make iommufd & VFIO handle this directly, as they have
> > > > the list of pages to flush on the unmap side. Use a namespace?
> > >
> > > Just have an unmap version that also takes a list of PFNs that you'd
> > > need for non-coherent mappings?
> >
> > VFIO has never supported that, so nothing like that exists yet. This is
> > sort of the first step to some very basic support for a non-coherent
> > cache flush, in the limited case of a VM that can do its own cache
> > flushing through KVM.
> >
> > The PFN list is needed for unpin_user_pages(), and it has an ugly design
> > where vfio/iommufd read back the PFNs separately from unmap, and they
> > both do it differently, without a common range list data structure here.
> >
> > So, we'd need to build some new unmap function that returns a PFN list
> > that it internally fetches via the read ops. Then it can do the read,
> > unmap, flush iotlb, flush cache in core code.
>
> Would the core code flush CPU caches by providing the page physical
> address?

Physical address is all we will have in the core code.

> If yes, do you think it's still necessary to export
> arch_flush_cache_phys() (as implemented in this patch)?

Christoph is asking not to export it; that would mean relying on the
iommu core being non-modular and putting the arch calls there with a
more restricted exported API, i.e. one based on unmap.

> > I've been working towards this very slowly, as I want to push this stuff
> > down into the IO page table walk and remove the significant
> > inefficiency, so it is not throw-away work, but it is certainly a
> > notable amount of work to do.
>
> Will VFIO also be switched to this new unmap interface? Do we need to
> care about backporting?

I don't know :)

> And is it possible for VFIO alone to implement it in the currently
> proposed way in this series, as a first step for easier backporting?

I think this series is the best option we have right now, but make the
EXPORT a NS export to try to discourage abuse of it while we continue
working.

Jason
On Thu, Jun 06, 2024 at 08:55:03AM -0300, Jason Gunthorpe wrote:
> On Thu, Jun 06, 2024 at 10:48:10AM +0800, Yan Zhao wrote:
> > On Sat, Jun 01, 2024 at 04:46:14PM -0300, Jason Gunthorpe wrote:
> > > On Mon, May 27, 2024 at 11:37:34PM -0700, Christoph Hellwig wrote:
> > > > On Tue, May 21, 2024 at 01:00:16PM -0300, Jason Gunthorpe wrote:
> > > > > > > Err, no. There should really be no exported cache manipulation macros,
> > > > > > > as drivers are almost guaranteed to get this wrong. I've added Russell
> > > > > > > to the Cc list, who has been extremely vocal about this at least for
> > > > > > > arm.
> > > > > >
> > > > > > We could possibly move this under some IOMMU core API (i.e. flush and
> > > > > > map, unmap and flush); the iommu APIs are non-modular, so this could
> > > > > > avoid the exported symbol.
> > > > >
> > > > > Though this would be pretty difficult for unmap, as we don't have the
> > > > > PFNs in the core code to flush. I don't think we have a lot of good
> > > > > options but to make iommufd & VFIO handle this directly, as they have
> > > > > the list of pages to flush on the unmap side. Use a namespace?
> > > >
> > > > Just have an unmap version that also takes a list of PFNs that you'd
> > > > need for non-coherent mappings?
> > >
> > > VFIO has never supported that, so nothing like that exists yet. This is
> > > sort of the first step to some very basic support for a non-coherent
> > > cache flush, in the limited case of a VM that can do its own cache
> > > flushing through KVM.
> > >
> > > The PFN list is needed for unpin_user_pages(), and it has an ugly design
> > > where vfio/iommufd read back the PFNs separately from unmap, and they
> > > both do it differently, without a common range list data structure here.
> > >
> > > So, we'd need to build some new unmap function that returns a PFN list
> > > that it internally fetches via the read ops. Then it can do the read,
> > > unmap, flush iotlb, flush cache in core code.
> >
> > Would the core code flush CPU caches by providing the page physical
> > address?
>
> Physical address is all we will have in the core code.
>
> > If yes, do you think it's still necessary to export
> > arch_flush_cache_phys() (as implemented in this patch)?
>
> Christoph is asking not to export it; that would mean relying on the
> iommu core being non-modular and putting the arch calls there with a
> more restricted exported API, i.e. one based on unmap.

Got it. Thanks for the explanation!

> > > I've been working towards this very slowly, as I want to push this stuff
> > > down into the IO page table walk and remove the significant
> > > inefficiency, so it is not throw-away work, but it is certainly a
> > > notable amount of work to do.
> >
> > Will VFIO also be switched to this new unmap interface? Do we need to
> > care about backporting?
>
> I don't know :)
>
> > And is it possible for VFIO alone to implement it in the currently
> > proposed way in this series, as a first step for easier backporting?
>
> I think this series is the best option we have right now, but make the
> EXPORT a NS export to try to discourage abuse of it while we continue
> working.

Will do. Thanks!
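[Editorial note: for reference, the namespaced (NS) export agreed on above might look roughly like the following sketch; the namespace name is a placeholder assumption.]

/* In arch/x86/mm/pat/set_memory.c: export into a restricted namespace. */
EXPORT_SYMBOL_NS_GPL(arch_clean_nonsnoop_dma, CACHE_NONCOHERENT);

/* In the vfio/iommufd modules that are allowed to call it. */
MODULE_IMPORT_NS(CACHE_NONCOHERENT);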
diff --git a/arch/x86/include/asm/cacheflush.h b/arch/x86/include/asm/cacheflush.h
index b192d917a6d0..b63607994285 100644
--- a/arch/x86/include/asm/cacheflush.h
+++ b/arch/x86/include/asm/cacheflush.h
@@ -10,4 +10,7 @@
 
 void clflush_cache_range(void *addr, unsigned int size);
 
+void arch_clean_nonsnoop_dma(phys_addr_t phys, size_t length);
+#define arch_clean_nonsnoop_dma arch_clean_nonsnoop_dma
+
 #endif /* _ASM_X86_CACHEFLUSH_H */
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 80c9037ffadf..7ff08ad20369 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -34,6 +34,7 @@
 #include <asm/memtype.h>
 #include <asm/hyperv-tlfs.h>
 #include <asm/mshyperv.h>
+#include <asm/mtrr.h>
 
 #include "../mm_internal.h"
 
@@ -349,6 +350,93 @@ void arch_invalidate_pmem(void *addr, size_t size)
 EXPORT_SYMBOL_GPL(arch_invalidate_pmem);
 #endif
 
+/*
+ * Flush pfn_valid() and !PageReserved() page
+ */
+static void clflush_page(struct page *page)
+{
+	const int size = boot_cpu_data.x86_clflush_size;
+	unsigned int i;
+	void *va;
+
+	va = kmap_local_page(page);
+
+	/* CLFLUSHOPT is unordered and requires full memory barrier */
+	mb();
+	for (i = 0; i < PAGE_SIZE; i += size)
+		clflushopt(va + i);
+	/* CLFLUSHOPT is unordered and requires full memory barrier */
+	mb();
+
+	kunmap_local(va);
+}
+
+/*
+ * Flush a reserved page or !pfn_valid() PFN.
+ * Flush is not performed if the PFN is accessed in uncacheable type. i.e.
+ * - PAT type is UC/UC-/WC when PAT is enabled
+ * - MTRR type is UC/WC/WT/WP when PAT is not enabled.
+ *   (no need to do CLFLUSH though WT/WP is cacheable).
+ */
+static void clflush_reserved_or_invalid_pfn(unsigned long pfn)
+{
+	const int size = boot_cpu_data.x86_clflush_size;
+	unsigned int i;
+	void *va;
+
+	if (!pat_enabled()) {
+		u64 start = PFN_PHYS(pfn), end = start + PAGE_SIZE;
+		u8 mtrr_type, uniform;
+
+		mtrr_type = mtrr_type_lookup(start, end, &uniform);
+		if (mtrr_type != MTRR_TYPE_WRBACK)
+			return;
+	} else if (pat_pfn_immune_to_uc_mtrr(pfn)) {
+		return;
+	}
+
+	va = memremap(pfn << PAGE_SHIFT, PAGE_SIZE, MEMREMAP_WB);
+	if (!va)
+		return;
+
+	/* CLFLUSHOPT is unordered and requires full memory barrier */
+	mb();
+	for (i = 0; i < PAGE_SIZE; i += size)
+		clflushopt(va + i);
+	/* CLFLUSHOPT is unordered and requires full memory barrier */
+	mb();
+
+	memunmap(va);
+}
+
+static inline void clflush_pfn(unsigned long pfn)
+{
+	if (pfn_valid(pfn) &&
+	    (!PageReserved(pfn_to_page(pfn)) || is_zero_pfn(pfn)))
+		return clflush_page(pfn_to_page(pfn));
+
+	clflush_reserved_or_invalid_pfn(pfn);
+}
+
+/**
+ * arch_clean_nonsnoop_dma - flush a cache range for non-coherent DMAs
+ *                           (DMAs that lack CPU cache snooping).
+ * @phys_addr: physical address start
+ * @length: number of bytes to flush
+ */
+void arch_clean_nonsnoop_dma(phys_addr_t phys_addr, size_t length)
+{
+	unsigned long nrpages, pfn;
+	unsigned long i;
+
+	pfn = PHYS_PFN(phys_addr);
+	nrpages = PAGE_ALIGN((phys_addr & ~PAGE_MASK) + length) >> PAGE_SHIFT;
+
+	for (i = 0; i < nrpages; i++, pfn++)
+		clflush_pfn(pfn);
+}
+EXPORT_SYMBOL_GPL(arch_clean_nonsnoop_dma);
+
 #ifdef CONFIG_ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION
 bool cpu_cache_has_invalidate_memregion(void)
 {
diff --git a/include/linux/cacheflush.h b/include/linux/cacheflush.h
index 55f297b2c23f..0bfc6551c6d3 100644
--- a/include/linux/cacheflush.h
+++ b/include/linux/cacheflush.h
@@ -26,4 +26,10 @@ static inline void flush_icache_pages(struct vm_area_struct *vma,
 
 #define flush_icache_page(vma, page) flush_icache_pages(vma, page, 1)
 
+#ifndef arch_clean_nonsnoop_dma
+static inline void arch_clean_nonsnoop_dma(phys_addr_t phys, size_t length)
+{
+}
+#endif
+
 #endif /* _LINUX_CACHEFLUSH_H */
Introduce and export interface arch_clean_nonsnoop_dma() to flush CPU
caches for memory involved in non-coherent DMAs (DMAs that lack CPU cache
snooping).

When the IOMMU does not enforce cache coherency, devices are allowed to
perform non-coherent DMAs. This scenario poses a risk of information
leakage when the device is assigned to a VM. Specifically, a malicious
guest could potentially retrieve stale host data through non-coherent DMA
reads of physical memory, while data initialized by the host (e.g. zeros)
still resides in the cache.

Additionally, the host kernel (e.g. a ksm kthread) may read inconsistent
data from CPU cache/memory (left by a malicious guest) after a page is
unpinned for non-coherent DMA but before it's freed.

Therefore, VFIO/IOMMUFD must initiate a CPU cache flush for pages
involved in non-coherent DMAs prior to or following their mapping or
unmapping to or from the IOMMU.

Introduce and export an interface that accepts a contiguous physical
address range as input and flushes CPU caches in an architecture-specific
way for VFIO/IOMMUFD (currently x86 only).

CLFLUSH on MMIOs in x86 is generally undesired and sometimes causes an
MCE on certain platforms (e.g. executing CLFLUSH on VGA ranges
0xA0000-0xBFFFF causes an MCE on some platforms), while some MMIOs are
cacheable and demand CLFLUSH (e.g. certain MMIOs for PMEM). Hence, a
method of checking host PAT/MTRR for uncacheable memory is adopted.

This implementation always performs CLFLUSH on "pfn_valid() && !reserved"
pages (since they cannot be MMIOs). For the reserved or !pfn_valid()
cases, check host PAT/MTRR to bypass physical ranges that are uncacheable
on the host and do CLFLUSH on the remaining cacheable ranges.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
---
 arch/x86/include/asm/cacheflush.h |  3 ++
 arch/x86/mm/pat/set_memory.c      | 88 +++++++++++++++++++++++++++++++
 include/linux/cacheflush.h        |  6 +++
 3 files changed, 97 insertions(+)
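[Editorial note: for context, a hypothetical caller-side sketch of how VFIO/IOMMUFD might use this interface when tearing down a mapping in a domain that does not enforce cache coherency. The function and variable names are illustrative only and do not correspond to actual vfio/iommufd code.]

#include <linux/mm.h>
#include <linux/cacheflush.h>

/*
 * Flush each just-unmapped page before it is unpinned and returned to the
 * host, so stale dirty cache lines left by non-coherent DMA cannot leak
 * inconsistent data back into host use of the page.
 */
static void flush_noncoherent_pages(struct page **pages, unsigned long npages)
{
	unsigned long i;

	for (i = 0; i < npages; i++)
		arch_clean_nonsnoop_dma(page_to_phys(pages[i]), PAGE_SIZE);
}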