diff mbox series

[1/2] x86/mem_sharing: reorder when pages are unlocked and released

Message ID 20190412042930.2867-1-tamas@tklengyel.com (mailing list archive)
State Superseded
Series [1/2] x86/mem_sharing: reorder when pages are unlocked and released

Commit Message

Tamas K Lengyel April 12, 2019, 4:29 a.m. UTC
Patch 0502e0adae2 "x86: correct instances of PGC_allocated clearing" introduced
grabbing extra references for pages that drop references tied to PGC_allocated.
However, the way these extra references were grabbed was incorrect, resulting
in both share_pages and unshare_pages failing. There is actually no need to
grab extra references; only a reordering of when the existing references are
released was needed. This is in accordance with the XSA-242 recommendation of
not calling _put_page_type while also holding the page_lock for that page.

Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Roger Pau Monne <roger.pau@citrix.com>
---
 xen/arch/x86/mm/mem_sharing.c | 31 ++++++++-----------------------
 1 file changed, 8 insertions(+), 23 deletions(-)

Comments

Jan Beulich April 12, 2019, 11:55 a.m. UTC | #1
>>> On 12.04.19 at 06:29, <tamas@tklengyel.com> wrote:
> Patch 0502e0adae2 "x86: correct instances of PGC_allocated clearing" introduced
> grabbing extra references for pages that drop references tied to PGC_allocated.
> However, the way these extra references were grabbed were incorrect, resulting
> in both share_pages and unshare_pages failing.

I'm sorry for this.

> @@ -984,7 +976,7 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
>           * Don't change the type of rmap for the client page. */
>          rmap_del(gfn, cpage, 0);
>          rmap_add(gfn, spage);
> -        put_page_and_type(cpage);
> +        put_count++;

Since this sits in a while(), ...

> @@ -1002,7 +994,9 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
>      /* Free the client page */
>      if(test_and_clear_bit(_PGC_allocated, &cpage->count_info))
>          put_page(cpage);
> -    put_page(cpage);
> +
> +    while(put_count--)
> +        put_page_and_type(cpage);

... put_count may be zero before this while(). At which point the
earlier put_page() would still be unsafe.

Also, despite the if() in context suggesting the style you used,
the rest of the function looks to use proper Xen style, so would
you please add the three missing blanks to the while() you add?

Jan
Tamas K Lengyel April 12, 2019, 1:41 p.m. UTC | #2
On Fri, Apr 12, 2019 at 5:55 AM Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 12.04.19 at 06:29, <tamas@tklengyel.com> wrote:
> > Patch 0502e0adae2 "x86: correct instances of PGC_allocated clearing" introduced
> > grabbing extra references for pages that drop references tied to PGC_allocated.
> > However, the way these extra references were grabbed were incorrect, resulting
> > in both share_pages and unshare_pages failing.
>
> I'm sorry for this.

It's my bad for not catching it earlier when I acked it. Reading the
patch it looked fine and made sense but evidently that's no substitute
for actually testing it.

>
> > @@ -984,7 +976,7 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
> >           * Don't change the type of rmap for the client page. */
> >          rmap_del(gfn, cpage, 0);
> >          rmap_add(gfn, spage);
> > -        put_page_and_type(cpage);
> > +        put_count++;
>
> Since this sits in a while(), ...
>
> > @@ -1002,7 +994,9 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
> >      /* Free the client page */
> >      if(test_and_clear_bit(_PGC_allocated, &cpage->count_info))
> >          put_page(cpage);
> > -    put_page(cpage);
> > +
> > +    while(put_count--)
> > +        put_page_and_type(cpage);
>
> ... put_count may be zero before this while(). At which point the
> earlier put_page() would still be unsafe.

The page should still hold the reference it got earlier when it went
through the page_make_sharable function. We also verified that the page
is still sharable, so that reference is still there and this count is at
least 1. We could certainly add an ASSERT(put_count) before it though.

>
> Also, despite the if() in context suggesting the style you used,
> the rest of the function looks to use proper Xen style, so would
> you please add the three missing blanks to the while() you add?

Ack.

Thanks,
Tamas
Andrew Cooper April 12, 2019, 2 p.m. UTC | #3
On 12/04/2019 14:41, Tamas K Lengyel wrote:
> On Fri, Apr 12, 2019 at 5:55 AM Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 12.04.19 at 06:29, <tamas@tklengyel.com> wrote:
>>> Patch 0502e0adae2 "x86: correct instances of PGC_allocated clearing" introduced
>>> grabbing extra references for pages that drop references tied to PGC_allocated.
>>> However, the way these extra references were grabbed were incorrect, resulting
>>> in both share_pages and unshare_pages failing.
>> I'm sorry for this.
> It's my bad for not catching it earlier when I acked it. Reading the
> patch it looked fine and made sense but evidently that's no substitute
> for actually testing it.

As an aside, do you have a one-paragraph introduction to how you use
mem_sharing?

With any luck, we can try and avoid breaking it as often as we seem to.

~Andrew
Tamas K Lengyel April 12, 2019, 2:13 p.m. UTC | #4
On Fri, Apr 12, 2019 at 8:00 AM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>
> On 12/04/2019 14:41, Tamas K Lengyel wrote:
> > On Fri, Apr 12, 2019 at 5:55 AM Jan Beulich <JBeulich@suse.com> wrote:
> >>>>> On 12.04.19 at 06:29, <tamas@tklengyel.com> wrote:
> >>> Patch 0502e0adae2 "x86: correct instances of PGC_allocated clearing" introduced
> >>> grabbing extra references for pages that drop references tied to PGC_allocated.
> >>> However, the way these extra references were grabbed were incorrect, resulting
> >>> in both share_pages and unshare_pages failing.
> >> I'm sorry for this.
> > It's my bad for not catching it earlier when I acked it. Reading the
> > patch it looked fine and made sense but evidently that's no substitute
> > for actually testing it.
>
> As an aside, do you have a one-paragraph introduction to how you use
> mem_sharing?

The use-case is malware analysis, where we tend to deploy the same VM
image over and over. Mem_sharing saves resources as we don't have to
have, say, 4GB of memory available for each analysis session, since only
a fraction of that memory actually gets touched during the analysis.
Right now it's still a bit clunky since we first have to restore the
VM image and then proceed with mem_sharing, but there are other
improvements planned, such as VM forking
(https://xenproject.atlassian.net/projects/XEN/board?issue-key=XEN-89),
which would use mem_sharing while making the deployment of new VMs
more streamlined.

>
> With any luck, we can try and avoid breaking it as often as we seem to.

Ideally :)

Thanks,
Tamas
Jan Beulich April 12, 2019, 2:29 p.m. UTC | #5
>>> On 12.04.19 at 15:41, <tamas@tklengyel.com> wrote:
> On Fri, Apr 12, 2019 at 5:55 AM Jan Beulich <JBeulich@suse.com> wrote:
>> >>> On 12.04.19 at 06:29, <tamas@tklengyel.com> wrote:
>> > @@ -984,7 +976,7 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
>> >           * Don't change the type of rmap for the client page. */
>> >          rmap_del(gfn, cpage, 0);
>> >          rmap_add(gfn, spage);
>> > -        put_page_and_type(cpage);
>> > +        put_count++;
>>
>> Since this sits in a while(), ...
>>
>> > @@ -1002,7 +994,9 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
>> >      /* Free the client page */
>> >      if(test_and_clear_bit(_PGC_allocated, &cpage->count_info))
>> >          put_page(cpage);
>> > -    put_page(cpage);
>> > +
>> > +    while(put_count--)
>> > +        put_page_and_type(cpage);
>>
>> ... put_count may be zero before this while(). At which point the
>> earlier put_page() would still be unsafe.
> 
> The page should have at least one reference it got before when the
> page went through page_make_sharable function. We also verified that
> the page is still sharable so that reference is still there, so this
> count is at least 1. We could certainly add an ASSERT(put_count)
> before though.

I see - back when I created the change you're now trying to fix,
I was wondering but couldn't find any proof. So yes, adding an
ASSERT() would surely be helpful.

Jan

Patch

diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 5ac9d8f54c..345a1778f9 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -900,6 +900,7 @@  static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
     p2m_type_t smfn_type, cmfn_type;
     struct two_gfns tg;
     struct rmap_iterator ri;
+    unsigned long put_count = 0;
 
     get_two_gfns(sd, sgfn, &smfn_type, NULL, &smfn,
                  cd, cgfn, &cmfn_type, NULL, &cmfn, 0, &tg);
@@ -964,15 +965,6 @@  static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
         goto err_out;
     }
 
-    /* Acquire an extra reference, for the freeing below to be safe. */
-    if ( !get_page(cpage, cd) )
-    {
-        ret = -EOVERFLOW;
-        mem_sharing_page_unlock(secondpg);
-        mem_sharing_page_unlock(firstpg);
-        goto err_out;
-    }
-
     /* Merge the lists together */
     rmap_seed_iterator(cpage, &ri);
     while ( (gfn = rmap_iterate(cpage, &ri)) != NULL)
@@ -984,7 +976,7 @@  static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
          * Don't change the type of rmap for the client page. */
         rmap_del(gfn, cpage, 0);
         rmap_add(gfn, spage);
-        put_page_and_type(cpage);
+        put_count++;
         d = get_domain_by_id(gfn->domain);
         BUG_ON(!d);
         BUG_ON(set_shared_p2m_entry(d, gfn->gfn, smfn));
@@ -1002,7 +994,9 @@  static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
     /* Free the client page */
     if(test_and_clear_bit(_PGC_allocated, &cpage->count_info))
         put_page(cpage);
-    put_page(cpage);
+
+    while(put_count--)
+        put_page_and_type(cpage);
 
     /* We managed to free a domain page. */
     atomic_dec(&nr_shared_mfns);
@@ -1167,20 +1161,11 @@  int __mem_sharing_unshare_page(struct domain *d,
     {
         if ( !last_gfn )
             mem_sharing_gfn_destroy(page, d, gfn_info);
-        put_page_and_type(page);
         mem_sharing_page_unlock(page);
-        if ( last_gfn )
-        {
-            if ( !get_page(page, d) )
-            {
-                put_gfn(d, gfn);
-                domain_crash(d);
-                return -EOVERFLOW;
-            }
-            if ( test_and_clear_bit(_PGC_allocated, &page->count_info) )
-                put_page(page);
+        if ( last_gfn &&
+            test_and_clear_bit(_PGC_allocated, &page->count_info) )
             put_page(page);
-        }
+        put_page_and_type(page);
         put_gfn(d, gfn);
 
         return 0;