| Message ID | 20230513003600.818142-4-seanjc@google.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | drm/i915/gvt: KVM: KVMGT fixes and page-track cleanups |
Hi Sean,

Do you think it's necessary to double check that struct page pointers
are also contiguous?

And would you like to also include the fix below, which avoids the
warning in vfio_device_container_unpin_pages() when npage is 0?

@@ -169,7 +173,8 @@ static int gvt_pin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn,
 	*page = base_page;
 	return 0;
 err:
-	gvt_unpin_guest_page(vgpu, gfn, npage * PAGE_SIZE);
+	if (npage)
+		gvt_unpin_guest_page(vgpu, gfn, npage * PAGE_SIZE);
 	return ret;
 }

BTW, I've sent a separate fix for vfio_iommu_type1_pin_pages() to ensure
struct page is a valid address.
https://lore.kernel.org/lkml/20230516093007.15234-1-yan.y.zhao@intel.com/

On Fri, May 12, 2023 at 05:35:35PM -0700, Sean Christopherson wrote:
> When shadowing a GTT entry with a 2M page, verify that the pfns are
> contiguous, not just that the struct page pointers are contiguous.  The
> memory map is virtually contiguous if "CONFIG_FLATMEM=y ||
> CONFIG_SPARSEMEM_VMEMMAP=y", but not for "CONFIG_SPARSEMEM=y &&
> CONFIG_SPARSEMEM_VMEMMAP=n", so theoretically KVMGT could encounter struct
> pages that are virtually contiguous, but not physically contiguous.
>
> In practice, this flaw is likely a non-issue as it would cause functional
> problems iff a section isn't 2M aligned _and_ is directly adjacent to
> another section with discontiguous pfns.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  drivers/gpu/drm/i915/gvt/kvmgt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
> index de675d799c7d..429f0f993a13 100644
> --- a/drivers/gpu/drm/i915/gvt/kvmgt.c
> +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
> @@ -161,7 +161,7 @@ static int gvt_pin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn,
>
>  	if (npage == 0)
>  		base_page = cur_page;
> -	else if (base_page + npage != cur_page) {
> +	else if (page_to_pfn(base_page) + npage != page_to_pfn(cur_page)) {
>  		gvt_vgpu_err("The pages are not continuous\n");
>  		ret = -EINVAL;
>  		npage++;
> --
> 2.40.1.606.ga4b1b128d6-goog
>
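For reference, here is a hedged sketch of how the proposed guard fits into the full pinning routine. It is simplified from gvt_pin_guest_page() in kvmgt.c, not the verbatim upstream code; the vfio_pin_pages() call details are approximations, and the claim that the vfio unpin path warns on npage == 0 comes from this thread:

```c
/*
 * Hedged sketch of gvt_pin_guest_page() with the proposed guard, NOT
 * the verbatim upstream code.  KVMGT pins the 2M range one page at a
 * time; if the very first vfio_pin_pages() call fails, npage is still
 * 0 at the err label, and (per this thread) asking vfio to unpin zero
 * pages trips a WARN, hence the "if (npage)" check.
 */
static int gvt_pin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn,
			      unsigned long size, struct page **page)
{
	int total_pages = roundup(size, PAGE_SIZE) / PAGE_SIZE;
	struct page *base_page = NULL;
	int npage;
	int ret;

	for (npage = 0; npage < total_pages; npage++) {
		dma_addr_t cur_iova = (gfn + npage) << PAGE_SHIFT;
		struct page *cur_page;

		ret = vfio_pin_pages(&vgpu->vfio_device, cur_iova, 1,
				     IOMMU_READ | IOMMU_WRITE, &cur_page);
		if (ret != 1)
			goto err;	/* can fail with npage still 0 */

		if (npage == 0)
			base_page = cur_page;
		else if (page_to_pfn(base_page) + npage !=
			 page_to_pfn(cur_page)) {
			gvt_vgpu_err("The pages are not continuous\n");
			ret = -EINVAL;
			npage++;	/* unpin cur_page as well */
			goto err;
		}
	}

	*page = base_page;
	return 0;
err:
	if (npage)	/* nothing to unpin if the first pin failed */
		gvt_unpin_guest_page(vgpu, gfn, npage * PAGE_SIZE);
	return ret;
}
```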
On Tue, May 16, 2023, Yan Zhao wrote:
> Hi Sean,
>
> Do you think it's necessary to double check that struct page pointers
> are also contiguous?

No, the virtual address space should be irrelevant.  The only way it would be
problematic is if something in dma_map_page() expected to be able to access the
entire chunk of memory by getting the virtual address of only the first page,
but I can't imagine that code is reading or writing memory, let alone doing so
across a huge range of memory.

> And would you like to also include the fix below, which avoids the
> warning in vfio_device_container_unpin_pages() when npage is 0?
>
> @@ -169,7 +173,8 @@ static int gvt_pin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn,
>  	*page = base_page;
>  	return 0;
>  err:
> -	gvt_unpin_guest_page(vgpu, gfn, npage * PAGE_SIZE);
> +	if (npage)
> +		gvt_unpin_guest_page(vgpu, gfn, npage * PAGE_SIZE);
>  	return ret;
>  }

Sure.  Want to give your SoB?  I'll write a changelog.

Thanks again!
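To make Sean's point concrete: the DMA API takes a (page, offset, size) triple, so nothing forces the mapped range to be virtually contiguous. A minimal sketch (map_2m_range is a hypothetical helper, not KVMGT's actual gvt_dma_map_page(), whose bookkeeping is elided):

```c
#include <linux/dma-mapping.h>
#include <linux/sizes.h>

/*
 * Minimal sketch: the IOMMU mapping is driven entirely by the base
 * page's pfn and the size.  No kernel virtual address covering the
 * whole 2M range is ever formed here, which is why struct page
 * pointer contiguity is irrelevant as long as the pfns are contiguous.
 */
static int map_2m_range(struct device *dev, struct page *base_page,
			dma_addr_t *dma_addr)
{
	*dma_addr = dma_map_page(dev, base_page, 0, SZ_2M,
				 DMA_BIDIRECTIONAL);
	if (dma_mapping_error(dev, *dma_addr))
		return -ENOMEM;
	return 0;
}
```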
On Wed, May 17, 2023 at 07:50:26AM -0700, Sean Christopherson wrote:
> On Tue, May 16, 2023, Yan Zhao wrote:
> > Hi Sean,
> >
> > Do you think it's necessary to double check that struct page pointers
> > are also contiguous?
>
> No, the virtual address space should be irrelevant.  The only way it would be
> problematic is if something in dma_map_page() expected to be able to access the
> entire chunk of memory by getting the virtual address of only the first page,
> but I can't imagine that code is reading or writing memory, let alone doing so
> across a huge range of memory.

Yes, I do find that the arm_iommu version of dma_map_page() accesses the
memory via the virtual addresses of the pages passed in, but it does so
page by page, not only from the first page:

dma_map_page
  dma_map_page_attrs
    ops->map_page
      arm_iommu_map_page
        __dma_page_cpu_to_dev
          dma_cache_maint_page

I'm just a little worried about the condition where the pfns are
contiguous but the pages belong to different backends, e.g. one from
system memory and one from MMIO.  But I don't know how to avoid this
without complicated checks, and this condition might not happen in
practice.

> > And would you like to also include the fix below, which avoids the
> > warning in vfio_device_container_unpin_pages() when npage is 0?
> >
> > @@ -169,7 +173,8 @@ static int gvt_pin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn,
> >  	*page = base_page;
> >  	return 0;
> >  err:
> > -	gvt_unpin_guest_page(vgpu, gfn, npage * PAGE_SIZE);
> > +	if (npage)
> > +		gvt_unpin_guest_page(vgpu, gfn, npage * PAGE_SIZE);
> >  	return ret;
> >  }
>
> Sure.  Want to give your SoB?  I'll write a changelog.

Thanks!  It's just a small code piece.  Whatever is convenient for you :)
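A minimal sketch of the page-by-page pattern Yan describes, loosely modelled on dma_cache_maint_page() in the ARM DMA code; the function and callback names here are illustrative, not the actual kernel code:

```c
#include <linux/highmem.h>

/*
 * Illustrative sketch only: walk a physically contiguous range one
 * page at a time, re-deriving each struct page from its pfn.  Because
 * every iteration does its own pfn_to_page(), the loop never assumes
 * that consecutive struct pages sit next to each other in the
 * kernel's virtual memory map.
 */
static void maint_range_per_page(struct page *base_page, size_t size,
				 void (*op)(void *vaddr, size_t len))
{
	unsigned long pfn = page_to_pfn(base_page);
	size_t left = size;

	while (left) {
		struct page *page = pfn_to_page(pfn);	/* per-page lookup */
		size_t len = min_t(size_t, left, PAGE_SIZE);
		void *vaddr = kmap_local_page(page);	/* highmem-safe */

		op(vaddr, len);
		kunmap_local(vaddr);

		pfn++;
		left -= len;
	}
}
```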
On Thu, May 18, 2023, Yan Zhao wrote:
> On Wed, May 17, 2023 at 07:50:26AM -0700, Sean Christopherson wrote:
> > On Tue, May 16, 2023, Yan Zhao wrote:
> > > Hi Sean,
> > >
> > > Do you think it's necessary to double check that struct page pointers
> > > are also contiguous?
> >
> > No, the virtual address space should be irrelevant.  The only way it would be
> > problematic is if something in dma_map_page() expected to be able to access the
> > entire chunk of memory by getting the virtual address of only the first page,
> > but I can't imagine that code is reading or writing memory, let alone doing so
> > across a huge range of memory.
> Yes, I do find that the arm_iommu version of dma_map_page() accesses the
> memory via the virtual addresses of the pages passed in, but it does so
> page by page, not only from the first page:
>
> dma_map_page
>   dma_map_page_attrs
>     ops->map_page
>       arm_iommu_map_page

Heh, thankfully this is ARM specific, which presumably doesn't collide with KVMGT.

>         __dma_page_cpu_to_dev
>           dma_cache_maint_page
>
> I'm just a little worried about the condition where the pfns are
> contiguous but the pages belong to different backends, e.g. one from
> system memory and one from MMIO.  But I don't know how to avoid this
> without complicated checks, and this condition might not happen in
> practice.

IMO, assuming that contiguous pfns are virtually contiguous is wrong, i.e. would
be a bug in the other code.  The above dma_cache_maint_page() gets this right,
and even has a well written comment to boot.
On Thu, May 18, 2023 at 11:04:46AM -0700, Sean Christopherson wrote:
> On Thu, May 18, 2023, Yan Zhao wrote:
> > On Wed, May 17, 2023 at 07:50:26AM -0700, Sean Christopherson wrote:
> > > On Tue, May 16, 2023, Yan Zhao wrote:
> > > > Hi Sean,
> > > >
> > > > Do you think it's necessary to double check that struct page pointers
> > > > are also contiguous?
> > >
> > > No, the virtual address space should be irrelevant.  The only way it would be
> > > problematic is if something in dma_map_page() expected to be able to access the
> > > entire chunk of memory by getting the virtual address of only the first page,
> > > but I can't imagine that code is reading or writing memory, let alone doing so
> > > across a huge range of memory.
> > Yes, I do find that the arm_iommu version of dma_map_page() accesses the
> > memory via the virtual addresses of the pages passed in, but it does so
> > page by page, not only from the first page:
> >
> > dma_map_page
> >   dma_map_page_attrs
> >     ops->map_page
> >       arm_iommu_map_page
>
> Heh, thankfully this is ARM specific, which presumably doesn't collide with KVMGT.

Actually, this is fine with KVMGT (owing to the page-by-page access), isn't it? :)

> >         __dma_page_cpu_to_dev
> >           dma_cache_maint_page
> >
> > I'm just a little worried about the condition where the pfns are
> > contiguous but the pages belong to different backends, e.g. one from
> > system memory and one from MMIO.  But I don't know how to avoid this
> > without complicated checks, and this condition might not happen in
> > practice.
>
> IMO, assuming that contiguous pfns are virtually contiguous is wrong, i.e. would
> be a bug in the other code.  The above dma_cache_maint_page() gets this right,
> and even has a well written comment to boot.

Right.
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index de675d799c7d..429f0f993a13 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -161,7 +161,7 @@ static int gvt_pin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn,
 
 	if (npage == 0)
 		base_page = cur_page;
-	else if (base_page + npage != cur_page) {
+	else if (page_to_pfn(base_page) + npage != page_to_pfn(cur_page)) {
 		gvt_vgpu_err("The pages are not continuous\n");
 		ret = -EINVAL;
 		npage++;
When shadowing a GTT entry with a 2M page, verify that the pfns are
contiguous, not just that the struct page pointers are contiguous.  The
memory map is virtually contiguous if "CONFIG_FLATMEM=y ||
CONFIG_SPARSEMEM_VMEMMAP=y", but not for "CONFIG_SPARSEMEM=y &&
CONFIG_SPARSEMEM_VMEMMAP=n", so theoretically KVMGT could encounter struct
pages that are virtually contiguous, but not physically contiguous.

In practice, this flaw is likely a non-issue as it would cause functional
problems iff a section isn't 2M aligned _and_ is directly adjacent to
another section with discontiguous pfns.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 drivers/gpu/drm/i915/gvt/kvmgt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
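The config dependence the changelog describes falls out of how the pfn/page conversions are defined per memory model. The following is paraphrased from include/asm-generic/memory_model.h (simplified; the real header selects exactly one of these definitions per config, so treat this as a side-by-side sketch rather than compilable code):

```c
/*
 * CONFIG_FLATMEM: one global mem_map[], so contiguous pfns always
 * yield contiguous struct page pointers.
 */
#define __pfn_to_page(pfn)	(mem_map + ((pfn) - ARCH_PFN_OFFSET))

/*
 * CONFIG_SPARSEMEM_VMEMMAP: a single virtually contiguous vmemmap,
 * so pointer arithmetic across pfns is again safe.
 */
#define __pfn_to_page(pfn)	(vmemmap + (pfn))

/*
 * Classic CONFIG_SPARSEMEM without VMEMMAP: each section has its own
 * mem_map allocation found via a table lookup, so struct pages for
 * pfns in adjacent sections need not be adjacent in virtual memory.
 * This is why the patch compares page_to_pfn() values instead of
 * relying on "base_page + npage" pointer arithmetic.
 */
#define __pfn_to_page(pfn)					\
({	unsigned long __pfn = (pfn);				\
	struct mem_section *__sec = __pfn_to_section(__pfn);	\
	__section_mem_map_addr(__sec) + __pfn;			\
})
```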