mbox series

[v6,0/8] Emulated coherent graphics memory take 2

Message ID 20191014132204.7721-1-thomas_os@shipmail.org (mailing list archive)
Headers show
Series Emulated coherent graphics memory take 2 | expand

Message

Thomas Hellström (Intel) Oct. 14, 2019, 1:21 p.m. UTC
From: Thomas Hellström <thellstrom@vmware.com>

Graphics APIs like OpenGL 4.4 and Vulkan require the graphics driver
to provide coherent graphics memory, meaning that the GPU sees any
content written to the coherent memory on the next GPU operation that
touches that memory, and the CPU sees any content written by the GPU
to that memory immediately after any fence object trailing the GPU
operation is signaled.

Paravirtual drivers that otherwise require explicit synchronization
needs to do this by hooking up dirty tracking to pagefault handlers
and buffer object validation.

Provide mm helpers needed for this and that also allow for huge pmd-
and pud entries (patch 1-3), and the associated vmwgfx code (patch 4-7).

The code has been tested and exercised by a tailored version of mesa
where we disable all explicit synchronization and assume graphics memory
is coherent. The performance loss varies of course; a typical number is
around 5%.

I would like to merge this code through the DRM tree, so an ack to include
the new mm helpers in that merge would be greatly appreciated.

Changes since RFC:
- Merge conflict changes moved to the correct patch. Fixes intra-patchset
  compile errors.
- Be more aggressive when turning ttm vm code into helpers. This makes sure
  we can use a const qualifier on the vmwgfx vm_ops.
- Reinstate a lost comment an fix an error path that was broken when turning
  the ttm vm code into helpers.
- Remove explicit type-casts of struct vm_area_struct::vm_private_data
- Clarify the locking inversion that makes us not being able to use the mm
  pagewalk code.

Changes since v1:
- Removed the vmwgfx maintainer entry for as_dirty_helpers.c, updated
  commit message accordingly
- Removed the TTM patches from the series as they are merged separately
  through DRM.
Changes since v2:
- Split out the pagewalk code from as_dirty_helpers.c and document locking.
- Add pre_vma and post_vma callbacks to the pagewalk code.
- Remove huge pmd and -pud asserts that would trip when we protect vmas with
  struct address_space::i_mmap_rwsem rather than with
  struct vm_area_struct::mmap_sem.
- Do some naming cleanup in as_dirty_helpers.c
Changes since v3:
- Extensive renaming of the dirty helpers including the filename.
- Update walk_page_mapping() doc.
- Update the pagewalk code to not unconditionally split pmds if a pte_entry()
  callback is present. Update the dirty helper pmd_entry accordingly.
- Use separate walk ops for the dirty helpers.
- Update the pagewalk code to take the pagetable lock in walk_pte_range.
Changes since v4:
- Fix pte pointer confusion in patch 2/8
- Skip the pagewalk code conditional split patch for now, and update the
  mapping_dirty_helper accordingly. That problem will be solved in a cleaner
  way in a follow-up patchset.
Changes since v5:
- Fix tlb flushing when we have other pending tlb flushes.
  
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>

Comments

Thomas Hellstrom Oct. 21, 2019, 12:24 p.m. UTC | #1
On 10/14/19 3:22 PM, Thomas Hellström (VMware) wrote:
> From: Thomas Hellström <thellstrom@vmware.com>
>
> Graphics APIs like OpenGL 4.4 and Vulkan require the graphics driver
> to provide coherent graphics memory, meaning that the GPU sees any
> content written to the coherent memory on the next GPU operation that
> touches that memory, and the CPU sees any content written by the GPU
> to that memory immediately after any fence object trailing the GPU
> operation is signaled.
>
> Paravirtual drivers that otherwise require explicit synchronization
> needs to do this by hooking up dirty tracking to pagefault handlers
> and buffer object validation.
>
> Provide mm helpers needed for this and that also allow for huge pmd-
> and pud entries (patch 1-3), and the associated vmwgfx code (patch 4-7).
>
> The code has been tested and exercised by a tailored version of mesa
> where we disable all explicit synchronization and assume graphics memory
> is coherent. The performance loss varies of course; a typical number is
> around 5%.
>
> I would like to merge this code through the DRM tree, so an ack to include
> the new mm helpers in that merge would be greatly appreciated.
>
> Changes since RFC:
> - Merge conflict changes moved to the correct patch. Fixes intra-patchset
>   compile errors.
> - Be more aggressive when turning ttm vm code into helpers. This makes sure
>   we can use a const qualifier on the vmwgfx vm_ops.
> - Reinstate a lost comment an fix an error path that was broken when turning
>   the ttm vm code into helpers.
> - Remove explicit type-casts of struct vm_area_struct::vm_private_data
> - Clarify the locking inversion that makes us not being able to use the mm
>   pagewalk code.
>
> Changes since v1:
> - Removed the vmwgfx maintainer entry for as_dirty_helpers.c, updated
>   commit message accordingly
> - Removed the TTM patches from the series as they are merged separately
>   through DRM.
> Changes since v2:
> - Split out the pagewalk code from as_dirty_helpers.c and document locking.
> - Add pre_vma and post_vma callbacks to the pagewalk code.
> - Remove huge pmd and -pud asserts that would trip when we protect vmas with
>   struct address_space::i_mmap_rwsem rather than with
>   struct vm_area_struct::mmap_sem.
> - Do some naming cleanup in as_dirty_helpers.c
> Changes since v3:
> - Extensive renaming of the dirty helpers including the filename.
> - Update walk_page_mapping() doc.
> - Update the pagewalk code to not unconditionally split pmds if a pte_entry()
>   callback is present. Update the dirty helper pmd_entry accordingly.
> - Use separate walk ops for the dirty helpers.
> - Update the pagewalk code to take the pagetable lock in walk_pte_range.
> Changes since v4:
> - Fix pte pointer confusion in patch 2/8
> - Skip the pagewalk code conditional split patch for now, and update the
>   mapping_dirty_helper accordingly. That problem will be solved in a cleaner
>   way in a follow-up patchset.
> Changes since v5:
> - Fix tlb flushing when we have other pending tlb flushes.
>   
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Rik van Riel <riel@surriel.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Huang Ying <ying.huang@intel.com>
> Cc: Jérôme Glisse <jglisse@redhat.com>
> Cc: Kirill A. Shutemov <kirill@shutemov.name>
>
Kirill, Linus

I have a formal Ack for two of the four mm patches. Is there a chance I
can get an ack to merge the mm patches of this series through drm with
the vmwgfx patches?

Thanks,

Thomas
Thomas Hellström (Intel) Nov. 4, 2019, 9:21 a.m. UTC | #2
Hi, All,

On 10/14/19 3:22 PM, Thomas Hellström (VMware) wrote:
> From: Thomas Hellström <thellstrom@vmware.com>
>
> Graphics APIs like OpenGL 4.4 and Vulkan require the graphics driver
> to provide coherent graphics memory, meaning that the GPU sees any
> content written to the coherent memory on the next GPU operation that
> touches that memory, and the CPU sees any content written by the GPU
> to that memory immediately after any fence object trailing the GPU
> operation is signaled.
>
> Paravirtual drivers that otherwise require explicit synchronization
> needs to do this by hooking up dirty tracking to pagefault handlers
> and buffer object validation.
>
> Provide mm helpers needed for this and that also allow for huge pmd-
> and pud entries (patch 1-3), and the associated vmwgfx code (patch 4-7).
>
> The code has been tested and exercised by a tailored version of mesa
> where we disable all explicit synchronization and assume graphics memory
> is coherent. The performance loss varies of course; a typical number is
> around 5%.
>
> I would like to merge this code through the DRM tree, so an ack to include
> the new mm helpers in that merge would be greatly appreciated.
>

I'm a bit confused as how to get this merged? Is there an -mm maintainer 
or who is supposed to ack -mm patches and get them into the kernel?

Any input appreciated,

Thanks,

Thomas
Linus Torvalds Nov. 4, 2019, 4:36 p.m. UTC | #3
On Mon, Nov 4, 2019 at 1:21 AM Thomas Hellström (VMware)
<thomas_os@shipmail.org> wrote:
>
> I'm a bit confused as how to get this merged? Is there an -mm maintainer
> or who is supposed to ack -mm patches and get them into the kernel?

I was assuming they'd go through Andrew Morton.. Andrew?

            Linus