
[v5,2/4] x86/mem_sharing: copy a page_lock version to be internal to memshr

Message ID 20190516213752.1701-2-tamas@tklengyel.com (mailing list archive)
State New, archived
Series [v5,1/4] x86/mem_sharing: reorder when pages are unlocked and released

Commit Message

Tamas K Lengyel May 16, 2019, 9:37 p.m. UTC
Patch cf4b30dca0a "Add debug code to detect illegal page_lock and put_page_type
ordering" added extra sanity checking to page_lock/page_unlock for debug builds
with the assumption that no hypervisor path ever locks two pages at once.

This assumption doesn't hold during memory sharing, so we copy a version of
page_lock/unlock to be used exclusively in the memory sharing subsystem
without the sanity checks.
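
For illustration only, here is a minimal, hypothetical sketch of both sides of
the problem: a per-CPU tracker modelling the "at most one locked page"
assumption that the debug checks encode (the actual tracking added by
cf4b30dca0a differs in detail), and the two-page nesting that memory sharing
performs via the new private helpers. Names such as dbg_locked_pg and
share_pair are invented for this sketch, and error handling of the lock return
values is omitted.

/*
 * Hypothetical model of the debug-build assumption: at most one page is
 * ever held locked by page_lock() on a given CPU, so a single per-CPU
 * slot is enough to track it.
 */
static DEFINE_PER_CPU(const struct page_info *, dbg_locked_pg);

static void dbg_on_lock(const struct page_info *pg)
{
    ASSERT(this_cpu(dbg_locked_pg) == NULL);    /* no nesting expected */
    this_cpu(dbg_locked_pg) = pg;
}

static void dbg_on_unlock(const struct page_info *pg)
{
    ASSERT(this_cpu(dbg_locked_pg) == pg);
    this_cpu(dbg_locked_pg) = NULL;
}

/*
 * Sharing merges two pages and must hold both locks at once, which breaks
 * the assumption above -- hence the private copy without the checks.
 * Deadlock between concurrent sharers is avoided by locking in increasing
 * (MFN) order.
 */
static void share_pair(struct page_info *a, struct page_info *b)
{
    if ( mfn_x(page_to_mfn(a)) > mfn_x(page_to_mfn(b)) )
    {
        struct page_info *t = a; a = b; b = t;
    }

    mem_sharing_page_lock(a);
    mem_sharing_page_lock(b);
    /* ... move (gfn,domain) tuples from one backing list to the other ... */
    mem_sharing_page_unlock(b);
    mem_sharing_page_unlock(a);
}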

Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Roger Pau Monne <roger.pau@citrix.com>
---
v5: comments and TODOs
---
 xen/arch/x86/mm/mem_sharing.c | 54 ++++++++++++++++++++++++++++++++---
 xen/include/asm-x86/mm.h      | 15 ++--------
 2 files changed, 53 insertions(+), 16 deletions(-)

Comments

Jan Beulich May 17, 2019, 7:21 a.m. UTC | #1
>>> On 16.05.19 at 23:37, <tamas@tklengyel.com> wrote:
> --- a/xen/include/asm-x86/mm.h
> +++ b/xen/include/asm-x86/mm.h
> @@ -356,24 +356,15 @@ struct platform_bad_page {
>  const struct platform_bad_page *get_platform_badpages(unsigned int *array_size);
>  
>  /* Per page locks:
> - * page_lock() is used for two purposes: pte serialization, and memory sharing.
> + * page_lock() is used for pte serialization.
>   *
>   * All users of page lock for pte serialization live in mm.c, use it
>   * to lock a page table page during pte updates, do not take other locks within
>   * the critical section delimited by page_lock/unlock, and perform no
>   * nesting.
>   *
> - * All users of page lock for memory sharing live in mm/mem_sharing.c. Page_lock
> - * is used in memory sharing to protect addition (share) and removal (unshare)
> - * of (gfn,domain) tupples to a list of gfn's that the shared page is currently
> - * backing. Nesting may happen when sharing (and locking) two pages -- deadlock
> - * is avoided by locking pages in increasing order.
> - * All memory sharing code paths take the p2m lock of the affected gfn before
> - * taking the lock for the underlying page. We enforce ordering between page_lock
> - * and p2m_lock using an mm-locks.h construct.
> - *
> - * These two users (pte serialization and memory sharing) do not collide, since
> - * sharing is only supported for hvm guests, which do not perform pv pte updates.
> + * The use of PGT_locked in mem_sharing does not collide, since mem_sharing is
> + * only supported for hvm guests, which do not perform pv pte updates.

Hmm, I thought we had agreed on you also correcting the wording of
the sentence you now retain (as requested). As said before, an HVM
(PVH to be precise) Dom0 can very well perform PV PTE updates, just
not on itself. I had suggested the wording "which do not have PV PTEs
updated" - I'd be fine for this to be folded in while committing, to avoid
another round trip. With this
Acked-by: Jan Beulich <jbeulich@suse.com>

Jan
Tamas K Lengyel May 17, 2019, 8:04 p.m. UTC | #2
On Fri, May 17, 2019 at 1:21 AM Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 16.05.19 at 23:37, <tamas@tklengyel.com> wrote:
> > --- a/xen/include/asm-x86/mm.h
> > +++ b/xen/include/asm-x86/mm.h
> > @@ -356,24 +356,15 @@ struct platform_bad_page {
> >  const struct platform_bad_page *get_platform_badpages(unsigned int *array_size);
> >
> >  /* Per page locks:
> > - * page_lock() is used for two purposes: pte serialization, and memory sharing.
> > + * page_lock() is used for pte serialization.
> >   *
> >   * All users of page lock for pte serialization live in mm.c, use it
> >   * to lock a page table page during pte updates, do not take other locks within
> >   * the critical section delimited by page_lock/unlock, and perform no
> >   * nesting.
> >   *
> > - * All users of page lock for memory sharing live in mm/mem_sharing.c. Page_lock
> > - * is used in memory sharing to protect addition (share) and removal (unshare)
> > - * of (gfn,domain) tupples to a list of gfn's that the shared page is currently
> > - * backing. Nesting may happen when sharing (and locking) two pages -- deadlock
> > - * is avoided by locking pages in increasing order.
> > - * All memory sharing code paths take the p2m lock of the affected gfn before
> > - * taking the lock for the underlying page. We enforce ordering between page_lock
> > - * and p2m_lock using an mm-locks.h construct.
> > - *
> > - * These two users (pte serialization and memory sharing) do not collide, since
> > - * sharing is only supported for hvm guests, which do not perform pv pte updates.
> > + * The use of PGT_locked in mem_sharing does not collide, since mem_sharing is
> > + * only supported for hvm guests, which do not perform pv pte updates.
>
> Hmm, I thought we had agreed on you also correcting the wording of
> the sentence you now retain (as requested). As said before, a HVM
> (PVH to be precise) Dom0 can very well perform PV PTE updates, just
> not on itself. I had suggested the wording "which do not have PV PTEs
> updated" - I'd be fine for this to be folded in while committing, to avoid
> another round trip. With this

Thanks, I do seem to have missed that.

Tamas
Tamas K Lengyel June 17, 2019, 12:21 p.m. UTC | #3
On Thu, May 16, 2019 at 11:38 PM Tamas K Lengyel <tamas@tklengyel.com> wrote:
>
> Patch cf4b30dca0a "Add debug code to detect illegal page_lock and put_page_type
> ordering" added extra sanity checking to page_lock/page_unlock for debug builds
> with the assumption that no hypervisor path ever locks two pages at once.
>
> This assumption doesn't hold during memory sharing so we copy a version of
> page_lock/unlock to be used exclusively in the memory sharing subsystem
> without the sanity checks.
>
> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>

Patch ping. Current tally:

Acked-by: Jan Beulich <jbeulich@suse.com>

Thanks,
Tamas

Patch

diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 13b2f009d4..f2354d2d6f 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -112,13 +112,59 @@  static inline void page_sharing_dispose(struct page_info *page)
 
 #endif /* MEM_SHARING_AUDIT */
 
-static inline int mem_sharing_page_lock(struct page_info *pg)
+/*
+ * Private implementations of page_lock/unlock to bypass PV-only
+ * sanity checks not applicable to mem-sharing.
+ *
+ * _page_lock is used in memory sharing to protect addition (share) and removal
+ * (unshare) of (gfn,domain) tuples to a list of gfns that the shared page is
+ * currently backing.
+ * Nesting may happen when sharing (and locking) two pages.
+ * Deadlock is avoided by locking pages in increasing order.
+ * All memory sharing code paths take the p2m lock of the affected gfn before
+ * taking the lock for the underlying page. We enforce ordering between page_lock
+ * and p2m_lock using an mm-locks.h construct.
+ *
+ * TODO: Investigate if PGT_validated is necessary.
+ */
+static inline bool _page_lock(struct page_info *page)
 {
-    int rc;
+    unsigned long x, nx;
+
+    do {
+        while ( (x = page->u.inuse.type_info) & PGT_locked )
+            cpu_relax();
+        nx = x + (1 | PGT_locked);
+        if ( !(x & PGT_validated) ||
+             !(x & PGT_count_mask) ||
+             !(nx & PGT_count_mask) )
+            return false;
+    } while ( cmpxchg(&page->u.inuse.type_info, x, nx) != x );
+
+    return true;
+}
+
+static inline void _page_unlock(struct page_info *page)
+{
+    unsigned long x, nx, y = page->u.inuse.type_info;
+
+    do {
+        x = y;
+        ASSERT((x & PGT_count_mask) && (x & PGT_locked));
+
+        nx = x - (1 | PGT_locked);
+        /* We must not drop the last reference here. */
+        ASSERT(nx & PGT_count_mask);
+    } while ( (y = cmpxchg(&page->u.inuse.type_info, x, nx)) != x );
+}
+
+static inline bool mem_sharing_page_lock(struct page_info *pg)
+{
+    bool rc;
     pg_lock_data_t *pld = &(this_cpu(__pld));
 
     page_sharing_mm_pre_lock();
-    rc = page_lock(pg);
+    rc = _page_lock(pg);
     if ( rc )
     {
         preempt_disable();
@@ -135,7 +181,7 @@  static inline void mem_sharing_page_unlock(struct page_info *pg)
     page_sharing_mm_unlock(pld->mm_unlock_level, 
                            &pld->recurse_count);
     preempt_enable();
-    page_unlock(pg);
+    _page_unlock(pg);
 }
 
 static inline shr_handle_t get_next_handle(void)
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 6faa563167..24c4205ba7 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -356,24 +356,15 @@  struct platform_bad_page {
 const struct platform_bad_page *get_platform_badpages(unsigned int *array_size);
 
 /* Per page locks:
- * page_lock() is used for two purposes: pte serialization, and memory sharing.
+ * page_lock() is used for pte serialization.
  *
  * All users of page lock for pte serialization live in mm.c, use it
  * to lock a page table page during pte updates, do not take other locks within
  * the critical section delimited by page_lock/unlock, and perform no
  * nesting.
  *
- * All users of page lock for memory sharing live in mm/mem_sharing.c. Page_lock
- * is used in memory sharing to protect addition (share) and removal (unshare)
- * of (gfn,domain) tupples to a list of gfn's that the shared page is currently
- * backing. Nesting may happen when sharing (and locking) two pages -- deadlock
- * is avoided by locking pages in increasing order.
- * All memory sharing code paths take the p2m lock of the affected gfn before
- * taking the lock for the underlying page. We enforce ordering between page_lock
- * and p2m_lock using an mm-locks.h construct.
- *
- * These two users (pte serialization and memory sharing) do not collide, since
- * sharing is only supported for hvm guests, which do not perform pv pte updates.
+ * The use of PGT_locked in mem_sharing does not collide, since mem_sharing is
+ * only supported for hvm guests, which do not perform pv pte updates.
  */
 int page_lock(struct page_info *page);
 void page_unlock(struct page_info *page);
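
As a closing note on the _page_lock()/_page_unlock() pair above: a successful
lock both takes a type-count reference and sets PGT_locked in a single
cmpxchg, and unlock drops both in one step. Below is a small user-space model
that exercises the same arithmetic, with C11 atomics standing in for Xen's
cmpxchg. The bit layout is invented for the sketch (it does not match the real
PGT_* values) and it assumes a 64-bit unsigned long.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define PGT_locked      (1UL << 63)         /* hypothetical bit positions */
#define PGT_validated   (1UL << 62)
#define PGT_count_mask  0xffffffffUL

/* One "page": validated, with a single existing type reference. */
static _Atomic unsigned long type_info = PGT_validated | 1;

static bool model_lock(void)
{
    unsigned long x, nx;

    do {
        while ( (x = atomic_load(&type_info)) & PGT_locked )
            ;                               /* spin until the lock bit clears */
        nx = x + (1 | PGT_locked);          /* take a ref and set the lock bit */
        if ( !(x & PGT_validated) || !(x & PGT_count_mask) ||
             !(nx & PGT_count_mask) )
            return false;                   /* not a lockable page state */
    } while ( !atomic_compare_exchange_strong(&type_info, &x, nx) );

    return true;
}

static void model_unlock(void)
{
    unsigned long x = atomic_load(&type_info), nx;

    do {
        nx = x - (1 | PGT_locked);          /* drop the ref and clear the lock */
    } while ( !atomic_compare_exchange_strong(&type_info, &x, nx) );
}

int main(void)
{
    if ( model_lock() )
    {
        printf("locked:   type_info = %#lx\n", atomic_load(&type_info));
        model_unlock();
    }
    printf("unlocked: type_info = %#lx\n", atomic_load(&type_info));
    return 0;
}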