[v1] Fix p2m_set_suppress_ve

Message ID	20190403142620.1224-1-aisaila@bitdefender.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <xen-devel-bounces@lists.xenproject.org> From: Alexandru Stefan ISAILA <aisaila@bitdefender.com> To: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org> Thread-Topic: [PATCH v1] Fix p2m_set_suppress_ve Thread-Index: AQHU6imUrZ5W3bOnfUWQBYZ5hKyycw== Date: Wed, 3 Apr 2019 14:29:00 +0000 Message-ID: <20190403142620.1224-1-aisaila@bitdefender.com> Accept-Language: en-US Content-Language: en-US received-spf: None (protection.outlook.com: bitdefender.com does not designate permitted sender hosts) MIME-Version: 1.0 Subject: [Xen-devel] [PATCH v1] Fix p2m_set_suppress_ve Precedence: list Cc: "wei.liu2@citrix.com" <wei.liu2@citrix.com>, "george.dunlap@eu.citrix.com" <george.dunlap@eu.citrix.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, "jbeulich@suse.com" <jbeulich@suse.com>, Alexandru Stefan ISAILA <aisaila@bitdefender.com>, "roger.pau@citrix.com" <roger.pau@citrix.com> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: xen-devel-bounces@lists.xenproject.org Sender: "Xen-devel" <xen-devel-bounces@lists.xenproject.org>
Series	[v1] Fix p2m_set_suppress_ve \| expand [v1] Fix p2m_set_suppress_ve

Alexandru Stefan ISAILA April 3, 2019, 2:29 p.m. UTC

On a new altp2m view the p2m_set_suppress_ve() func will fail with invalid mfn
from p2m->get_entry() if the p2m->set_entry() was not called before.

This patch solves the problem by getting the mfn from __get_gfn_type_access()
and then returning error if the mfn is invalid.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
---
 xen/arch/x86/mm/p2m.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

Jan Beulich April 3, 2019, 2:58 p.m. UTC | #1

>>> On 03.04.19 at 16:29, <aisaila@bitdefender.com> wrote:
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -3011,8 +3011,16 @@ int p2m_set_suppress_ve(struct domain *d, gfn_t gfn, bool suppress_ve,
>      mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL, NULL);
>      if ( !mfn_valid(mfn) )
>      {
> -        rc = -ESRCH;
> -        goto out;
> +        unsigned int page_order;
> +
> +        mfn = __get_gfn_type_access(host_p2m, gfn_x(gfn), &t, &a,
> +                                    P2M_ALLOC | P2M_UNSHARE, &page_order, 0);

I'm not entirely certain about P2M_ALLOC, but I'm pretty sure that
at least P2M_UNSHARE is too heavy: Why would you want to force
un-sharing of a page when all you want to alter is #VE behavior?

A recent patch[1] of mine actually tries to move the other direction.

Additionally, when you add such a lookup as error handling attempt,
I think it is important to leave a code comment. But I wonder
whether this shouldn't be done before the call to ->get_entry(), or
whether in fact there's a bug here in how get_entry() behaves in
this case.

The description also doesn't clarify at all why you (need to?) use
host_p2m here, when get_entry() and set_entry() both use p2m.
I suppose that's because there's some sync-ing happening
between the p2m-s, but at least to me this isn't an obviously
necessary (side) effect of that call.

Also note that if you already need to call this lowest level function
directly here, the last argument should be "false".

Jan

[1] https://lists.xenproject.org/archives/html/xen-devel/2019-03/msg00847.html

Razvan Cojocaru April 3, 2019, 3:17 p.m. UTC | #2

On 4/3/19 5:58 PM, Jan Beulich wrote:
>>>> On 03.04.19 at 16:29, <aisaila@bitdefender.com> wrote:
>> --- a/xen/arch/x86/mm/p2m.c
>> +++ b/xen/arch/x86/mm/p2m.c
>> @@ -3011,8 +3011,16 @@ int p2m_set_suppress_ve(struct domain *d, gfn_t gfn, bool suppress_ve,
>>       mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL, NULL);
>>       if ( !mfn_valid(mfn) )
>>       {
>> -        rc = -ESRCH;
>> -        goto out;
>> +        unsigned int page_order;
>> +
>> +        mfn = __get_gfn_type_access(host_p2m, gfn_x(gfn), &t, &a,
>> +                                    P2M_ALLOC | P2M_UNSHARE, &page_order, 0);
> 
> I'm not entirely certain about P2M_ALLOC, but I'm pretty sure that
> at least P2M_UNSHARE is too heavy: Why would you want to force
> un-sharing of a page when all you want to alter is #VE behavior?

That logic was taken from p2m_set_altp2m_mem_access(), we thought the 
two cases are very similar.

269     mfn = ap2m->get_entry(ap2m, gfn, &t, &old_a, 0, NULL, NULL);
270
271     /* Check host p2m if no valid entry in alternate */
272     if ( !mfn_valid(mfn) )
273     {
274
275         mfn = __get_gfn_type_access(hp2m, gfn_l, &t, &old_a,
276                                     P2M_ALLOC | P2M_UNSHARE, 
&page_order, 0);
277
278         rc = -ESRCH;
279         if ( !mfn_valid(mfn) || t != p2m_ram_rw )
280             return rc;
281
282         /* If this is a superpage, copy that first */
283         if ( page_order != PAGE_ORDER_4K )
284         {
285             unsigned long mask = ~((1UL << page_order) - 1);
286             gfn_t gfn2 = _gfn(gfn_l & mask);
287             mfn_t mfn2 = _mfn(mfn_x(mfn) & mask);
288
289             rc = ap2m->set_entry(ap2m, gfn2, mfn2, page_order, t, 
old_a, 1);
290             if ( rc )
291                 return rc;
292         }
293     }
294
295     /*
296      * Inherit the old suppress #VE bit value if it is already set, 
or set it
297      * to 1 otherwise
298      */
299     return ap2m->set_entry(ap2m, gfn, mfn, PAGE_ORDER_4K, t, a, -1);
300 }

I wonder if we should put the whole logic in the "if ( !mfn_valid(mfn) 
)" body in its own function and reuse that for both functions - although 
it doesn't look like the extra superpage logic matters for setting the 
suppress #VE bit alone (since even the code above only sets it with 
PAGE_ORDER_4K).

> Additionally, when you add such a lookup as error handling attempt,
> I think it is important to leave a code comment. But I wonder
> whether this shouldn't be done before the call to ->get_entry(), or
> whether in fact there's a bug here in how get_entry() behaves in
> this case.

Changes to the hostp2m (also known as altp2m view 0) propagate to all 
existing altp2ms, but they do so in a lazy manner, and also that won't 
happen for altp2ms created after a while. So altp2ms will not 
necessarily know about a page that the hostp2m knows about, which should 
not stop us from setting mem access restrictions or the value of the SVE 
bit.


Thanks,
Razvan

Jan Beulich April 3, 2019, 3:30 p.m. UTC | #3

>>> On 03.04.19 at 17:17, <rcojocaru@bitdefender.com> wrote:
> On 4/3/19 5:58 PM, Jan Beulich wrote:
>>>>> On 03.04.19 at 16:29, <aisaila@bitdefender.com> wrote:
>>> --- a/xen/arch/x86/mm/p2m.c
>>> +++ b/xen/arch/x86/mm/p2m.c
>>> @@ -3011,8 +3011,16 @@ int p2m_set_suppress_ve(struct domain *d, gfn_t gfn, bool suppress_ve,
>>>       mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL, NULL);
>>>       if ( !mfn_valid(mfn) )
>>>       {
>>> -        rc = -ESRCH;
>>> -        goto out;
>>> +        unsigned int page_order;
>>> +
>>> +        mfn = __get_gfn_type_access(host_p2m, gfn_x(gfn), &t, &a,
>>> +                                    P2M_ALLOC | P2M_UNSHARE, &page_order, 0);
>> 
>> I'm not entirely certain about P2M_ALLOC, but I'm pretty sure that
>> at least P2M_UNSHARE is too heavy: Why would you want to force
>> un-sharing of a page when all you want to alter is #VE behavior?
> 
> That logic was taken from p2m_set_altp2m_mem_access(), we thought the 
> two cases are very similar.

I see.

> 269     mfn = ap2m->get_entry(ap2m, gfn, &t, &old_a, 0, NULL, NULL);
> 270
> 271     /* Check host p2m if no valid entry in alternate */

This comment was not cloned.

> 272     if ( !mfn_valid(mfn) )
> 273     {
> 274
> 275         mfn = __get_gfn_type_access(hp2m, gfn_l, &t, &old_a,
> 276                                     P2M_ALLOC | P2M_UNSHARE, &page_order, 0);
> 277
> 278         rc = -ESRCH;
> 279         if ( !mfn_valid(mfn) || t != p2m_ram_rw )
> 280             return rc;
> 281
> 282         /* If this is a superpage, copy that first */
> 283         if ( page_order != PAGE_ORDER_4K )
> 284         {
> 285             unsigned long mask = ~((1UL << page_order) - 1);
> 286             gfn_t gfn2 = _gfn(gfn_l & mask);
> 287             mfn_t mfn2 = _mfn(mfn_x(mfn) & mask);
> 288
> 289             rc = ap2m->set_entry(ap2m, gfn2, mfn2, page_order, t, old_a, 1);
> 290             if ( rc )
> 291                 return rc;
> 292         }
> 293     }
> 294
> 295     /*
> 296      * Inherit the old suppress #VE bit value if it is already set, or set it
> 297      * to 1 otherwise
> 298      */
> 299     return ap2m->set_entry(ap2m, gfn, mfn, PAGE_ORDER_4K, t, a, -1);
> 300 }
> 
> I wonder if we should put the whole logic in the "if ( !mfn_valid(mfn) )"
> body in its own function and reuse that for both functions - although 
> it doesn't look like the extra superpage logic matters for setting the 
> suppress #VE bit alone (since even the code above only sets it with 
> PAGE_ORDER_4K).

Indeed, this latter aspect doesn't make it look very attractive to
introduce a common helper. Otoh I wonder what other functions
might be affected.

>> Additionally, when you add such a lookup as error handling attempt,
>> I think it is important to leave a code comment. But I wonder
>> whether this shouldn't be done before the call to ->get_entry(), or
>> whether in fact there's a bug here in how get_entry() behaves in
>> this case.
> 
> Changes to the hostp2m (also known as altp2m view 0) propagate to all 
> existing altp2ms, but they do so in a lazy manner, and also that won't 
> happen for altp2ms created after a while. So altp2ms will not 
> necessarily know about a page that the hostp2m knows about, which should 
> not stop us from setting mem access restrictions or the value of the SVE 
> bit.

But even if the propagation is lazy, a "get" should then return the
propagated value, shouldn't it? Or else callers (like is the case here)
can be mislead. I do recognize that there may be callers who
intentionally _don't_ want the propagated value, but I'd expect this
to be the exception, not the rule. I wonder therefore whether there
shouldn't be a wrapper around ->get_entry() to take care of this for
callers which want the propagated value. Calling the bare hook would
remain as is.

Jan

Razvan Cojocaru April 3, 2019, 3:48 p.m. UTC | #4

On 4/3/19 6:30 PM, Jan Beulich wrote:
>>>> On 03.04.19 at 17:17, <rcojocaru@bitdefender.com> wrote:
>> Changes to the hostp2m (also known as altp2m view 0) propagate to all
>> existing altp2ms, but they do so in a lazy manner, and also that won't
>> happen for altp2ms created after a while. So altp2ms will not
>> necessarily know about a page that the hostp2m knows about, which should
>> not stop us from setting mem access restrictions or the value of the SVE
>> bit.
> 
> But even if the propagation is lazy, a "get" should then return the
> propagated value, shouldn't it? Or else callers (like is the case here)
> can be mislead. I do recognize that there may be callers who
> intentionally _don't_ want the propagated value, but I'd expect this
> to be the exception, not the rule. I wonder therefore whether there
> shouldn't be a wrapper around ->get_entry() to take care of this for
> callers which want the propagated value. Calling the bare hook would
> remain as is.

But I believe that some hostp2m entries will never be propagated. Sorry 
that I've not been clear about this. This is what I meant by "won't 
happen for altp2ms created after a while".

Take, for example, the case where there's only the hostp2m for the first 
30 minutes of a guest's life, and in this time somebody calls 
ept_set_entry(). This, in turn, calls p2m_altp2m_propagate_change() if ( 
entry_written && p2m_is_hostp2m(p2m) ).

But at this point there are no altp2ms, so there's nowhere to propagate 
these changes to.

Now, at minute 31 we create a new altp2m. The changes will never be 
propagated here, so altp2m->get_entry() will always (rightfully) return 
an invalid MFN.

We only want to make an exception for setting the SVE bit or mem access 
restrictions on an entry in the altp2m, but having get_entry() (or a 
wrapper) return the hostp2m values and MFN might not necessarily be what 
current callers expect.

Whether the described scenario _should_ work the way it does is 
debatable, but it appears to do so by design.

Hopefully I'm not missing anything.

Thanks,
Razvan

Jan Beulich April 3, 2019, 4:04 p.m. UTC | #5

>>> On 03.04.19 at 17:48, <rcojocaru@bitdefender.com> wrote:
> On 4/3/19 6:30 PM, Jan Beulich wrote:
>>>>> On 03.04.19 at 17:17, <rcojocaru@bitdefender.com> wrote:
>>> Changes to the hostp2m (also known as altp2m view 0) propagate to all
>>> existing altp2ms, but they do so in a lazy manner, and also that won't
>>> happen for altp2ms created after a while. So altp2ms will not
>>> necessarily know about a page that the hostp2m knows about, which should
>>> not stop us from setting mem access restrictions or the value of the SVE
>>> bit.
>> 
>> But even if the propagation is lazy, a "get" should then return the
>> propagated value, shouldn't it? Or else callers (like is the case here)
>> can be mislead. I do recognize that there may be callers who
>> intentionally _don't_ want the propagated value, but I'd expect this
>> to be the exception, not the rule. I wonder therefore whether there
>> shouldn't be a wrapper around ->get_entry() to take care of this for
>> callers which want the propagated value. Calling the bare hook would
>> remain as is.
> 
> But I believe that some hostp2m entries will never be propagated. Sorry 
> that I've not been clear about this. This is what I meant by "won't 
> happen for altp2ms created after a while".
> 
> Take, for example, the case where there's only the hostp2m for the first 
> 30 minutes of a guest's life, and in this time somebody calls 
> ept_set_entry(). This, in turn, calls p2m_altp2m_propagate_change() if ( 
> entry_written && p2m_is_hostp2m(p2m) ).
> 
> But at this point there are no altp2ms, so there's nowhere to propagate 
> these changes to.
> 
> Now, at minute 31 we create a new altp2m. The changes will never be 
> propagated here, so altp2m->get_entry() will always (rightfully) return 
> an invalid MFN.

I guess I'm missing something here: Why "rightfully"? If the guest
runs on this altp2m, it'll die a quick death then, won't it? Yet if
this is intended to be demand-populate, then why would there be
automatic propagation for existing altp2m-s (as you explain
further up, unless I'm getting this wrong)?

> We only want to make an exception for setting the SVE bit or mem access 
> restrictions on an entry in the altp2m, but having get_entry() (or a 
> wrapper) return the hostp2m values and MFN might not necessarily be what 
> current callers expect.

Then I still don't see why you would want to force propagation
in these cases: If it's generally demand-populate, it can't be right
to _fully_ populate that slot. You ought to be setting the access
type or the "suppress #VE" bit alone, without propagating the
MFN and alike. The later, when the entry actually gets propagated,
the entered values should be used as overrides to what comes
from the host p2m.

Jan

Tamas K Lengyel April 3, 2019, 4:16 p.m. UTC | #6

On Wed, Apr 3, 2019 at 10:06 AM Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 03.04.19 at 17:48, <rcojocaru@bitdefender.com> wrote:
> > On 4/3/19 6:30 PM, Jan Beulich wrote:
> >>>>> On 03.04.19 at 17:17, <rcojocaru@bitdefender.com> wrote:
> >>> Changes to the hostp2m (also known as altp2m view 0) propagate to all
> >>> existing altp2ms, but they do so in a lazy manner, and also that won't
> >>> happen for altp2ms created after a while. So altp2ms will not
> >>> necessarily know about a page that the hostp2m knows about, which should
> >>> not stop us from setting mem access restrictions or the value of the SVE
> >>> bit.
> >>
> >> But even if the propagation is lazy, a "get" should then return the
> >> propagated value, shouldn't it? Or else callers (like is the case here)
> >> can be mislead. I do recognize that there may be callers who
> >> intentionally _don't_ want the propagated value, but I'd expect this
> >> to be the exception, not the rule. I wonder therefore whether there
> >> shouldn't be a wrapper around ->get_entry() to take care of this for
> >> callers which want the propagated value. Calling the bare hook would
> >> remain as is.
> >
> > But I believe that some hostp2m entries will never be propagated. Sorry
> > that I've not been clear about this. This is what I meant by "won't
> > happen for altp2ms created after a while".
> >
> > Take, for example, the case where there's only the hostp2m for the first
> > 30 minutes of a guest's life, and in this time somebody calls
> > ept_set_entry(). This, in turn, calls p2m_altp2m_propagate_change() if (
> > entry_written && p2m_is_hostp2m(p2m) ).
> >
> > But at this point there are no altp2ms, so there's nowhere to propagate
> > these changes to.
> >
> > Now, at minute 31 we create a new altp2m. The changes will never be
> > propagated here, so altp2m->get_entry() will always (rightfully) return
> > an invalid MFN.
>
> I guess I'm missing something here: Why "rightfully"? If the guest
> runs on this altp2m, it'll die a quick death then, won't it? Yet if
> this is intended to be demand-populate, then why would there be
> automatic propagation for existing altp2m-s (as you explain
> further up, unless I'm getting this wrong)?
>
> > We only want to make an exception for setting the SVE bit or mem access
> > restrictions on an entry in the altp2m, but having get_entry() (or a
> > wrapper) return the hostp2m values and MFN might not necessarily be what
> > current callers expect.
>
> Then I still don't see why you would want to force propagation
> in these cases: If it's generally demand-populate, it can't be right
> to _fully_ populate that slot. You ought to be setting the access
> type or the "suppress #VE" bit alone, without propagating the
> MFN and alike. The later, when the entry actually gets propagated,
> the entered values should be used as overrides to what comes
> from the host p2m.

It is right to fully populate a slot when for example we want
different mem_access permissions in different altp2m views. We can't
wait for the domain to on-demand populate the altp2m view because we
don't know when (and if) that will actually happen. So
p2m_set_altp2m_mem_access already copies the entry over from the
hostp2m to the altp2m and then applies the requested mem_access
setting. This is done even if the mem_access setting is the same as it
was in the hostp2m.

Doing the same behavior for p2m_set_suppress_ve I think is consistent,
albeit you could easily overcome the issue by first simply setting a
mem_access permission on the gfn to trigger the aforementioned copy to
the altp2m before you try to set the ve settings.

Tamas

Razvan Cojocaru April 3, 2019, 5:07 p.m. UTC | #7

On 4/3/19 7:16 PM, Tamas K Lengyel wrote:
> It is right to fully populate a slot when for example we want
> different mem_access permissions in different altp2m views. We can't
> wait for the domain to on-demand populate the altp2m view because we
> don't know when (and if) that will actually happen. So
> p2m_set_altp2m_mem_access already copies the entry over from the
> hostp2m to the altp2m and then applies the requested mem_access
> setting. This is done even if the mem_access setting is the same as it
> was in the hostp2m.
> 
> Doing the same behavior for p2m_set_suppress_ve I think is consistent,
> albeit you could easily overcome the issue by first simply setting a
> mem_access permission on the gfn to trigger the aforementioned copy to
> the altp2m before you try to set the ve settings.

That's what we're doing now, but it wasn't at all intuitive to figure it
out. I'd expect that setting the SVE bit simply works without
maneuvering something else in place first.


Thanks,
Razvan

Tamas K Lengyel April 3, 2019, 5:12 p.m. UTC | #8

On Wed, Apr 3, 2019 at 11:08 AM Razvan Cojocaru
<rcojocaru@bitdefender.com> wrote:
>
> On 4/3/19 7:16 PM, Tamas K Lengyel wrote:
> > It is right to fully populate a slot when for example we want
> > different mem_access permissions in different altp2m views. We can't
> > wait for the domain to on-demand populate the altp2m view because we
> > don't know when (and if) that will actually happen. So
> > p2m_set_altp2m_mem_access already copies the entry over from the
> > hostp2m to the altp2m and then applies the requested mem_access
> > setting. This is done even if the mem_access setting is the same as it
> > was in the hostp2m.
> >
> > Doing the same behavior for p2m_set_suppress_ve I think is consistent,
> > albeit you could easily overcome the issue by first simply setting a
> > mem_access permission on the gfn to trigger the aforementioned copy to
> > the altp2m before you try to set the ve settings.
>
> That's what we're doing now, but it wasn't at all intuitive to figure it
> out. I'd expect that setting the SVE bit simply works without
> maneuvering something else in place first.

Yeap, makes sense to me :)

Tamas

Razvan Cojocaru April 3, 2019, 5:41 p.m. UTC | #9

On 4/3/19 7:04 PM, Jan Beulich wrote:
>>>> On 03.04.19 at 17:48, <rcojocaru@bitdefender.com> wrote:
>> On 4/3/19 6:30 PM, Jan Beulich wrote:
>>>>>> On 03.04.19 at 17:17, <rcojocaru@bitdefender.com> wrote:
>>>> Changes to the hostp2m (also known as altp2m view 0) propagate to all
>>>> existing altp2ms, but they do so in a lazy manner, and also that won't
>>>> happen for altp2ms created after a while. So altp2ms will not
>>>> necessarily know about a page that the hostp2m knows about, which should
>>>> not stop us from setting mem access restrictions or the value of the SVE
>>>> bit.
>>>
>>> But even if the propagation is lazy, a "get" should then return the
>>> propagated value, shouldn't it? Or else callers (like is the case here)
>>> can be mislead. I do recognize that there may be callers who
>>> intentionally _don't_ want the propagated value, but I'd expect this
>>> to be the exception, not the rule. I wonder therefore whether there
>>> shouldn't be a wrapper around ->get_entry() to take care of this for
>>> callers which want the propagated value. Calling the bare hook would
>>> remain as is.
>>
>> But I believe that some hostp2m entries will never be propagated. Sorry 
>> that I've not been clear about this. This is what I meant by "won't 
>> happen for altp2ms created after a while".
>>
>> Take, for example, the case where there's only the hostp2m for the first 
>> 30 minutes of a guest's life, and in this time somebody calls 
>> ept_set_entry(). This, in turn, calls p2m_altp2m_propagate_change() if ( 
>> entry_written && p2m_is_hostp2m(p2m) ).
>>
>> But at this point there are no altp2ms, so there's nowhere to propagate 
>> these changes to.
>>
>> Now, at minute 31 we create a new altp2m. The changes will never be 
>> propagated here, so altp2m->get_entry() will always (rightfully) return 
>> an invalid MFN.
> 
> I guess I'm missing something here: Why "rightfully"? If the guest
> runs on this altp2m, it'll die a quick death then, won't it? Yet if
> this is intended to be demand-populate, then why would there be
> automatic propagation for existing altp2m-s (as you explain
> further up, unless I'm getting this wrong)?
> 
>> We only want to make an exception for setting the SVE bit or mem access 
>> restrictions on an entry in the altp2m, but having get_entry() (or a 
>> wrapper) return the hostp2m values and MFN might not necessarily be what 
>> current callers expect.
> 
> Then I still don't see why you would want to force propagation
> in these cases: If it's generally demand-populate, it can't be right
> to _fully_ populate that slot. You ought to be setting the access
> type or the "suppress #VE" bit alone, without propagating the
> MFN and alike. The later, when the entry actually gets propagated,
> the entered values should be used as overrides to what comes
> from the host p2m.
To try to answer your question to the best of my knowledge, the altp2m
propagation appears to be a hybrid between lazy propagation, in
hvm_hap_nested_page_fault() (xen/arch/x86/hvm/hvm.c):

1748     ap2m_active = altp2m_active(currd);
1749
1750     /*
1751      * Take a lock on the host p2m speculatively, to avoid potential
1752      * locking order problems later and to handle unshare etc.
1753      */
1754     hostp2m = p2m_get_hostp2m(currd);
1755     mfn = get_gfn_type_access(hostp2m, gfn, &p2mt, &p2ma,
1756                               P2M_ALLOC | (npfec.write_access ?
P2M_UNSHARE : 0),
1757                               NULL);
1758
1759     if ( ap2m_active )
1760     {
1761         if ( p2m_altp2m_lazy_copy(curr, gpa, gla, npfec, &p2m) )
1762         {
1763             /* entry was lazily copied from host -- retry */
1764             __put_gfn(hostp2m, gfn);
1765             rc = 1;
1766             goto out;
1767         }
1768
1769         mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, 0, NULL);
1770     }
1771     else
1772         p2m = hostp2m;

and immediate propagation, in ept_set_entry() (xen/arch/x86/mm/p2m-ept.c):

 827     if ( entry_written && p2m_is_hostp2m(p2m) )
 828     {
 829         ret = p2m_altp2m_propagate_change(d, _gfn(gfn), mfn, order,
p2mt, p2ma);
 830         if ( !rc )
 831             rc = ret;
 832     }

Either way, doing what our patch does should pose no problem as far as I
can tell. For the lazy copy case, p2m_altp2m_lazy_copy() will do
nothing, as our function has already populated the slot correctly:

2378 bool_t p2m_altp2m_lazy_copy(struct vcpu *v, paddr_t gpa,
2379                             unsigned long gla, struct npfec npfec,
2380                             struct p2m_domain **ap2m)
2381 {
2382     struct p2m_domain *hp2m = p2m_get_hostp2m(v->domain);
2383     p2m_type_t p2mt;
2384     p2m_access_t p2ma;
2385     unsigned int page_order;
2386     gfn_t gfn = _gfn(paddr_to_pfn(gpa));
2387     unsigned long mask;
2388     mfn_t mfn;
2389     int rv;
2390
2391     *ap2m = p2m_get_altp2m(v);
2392
2393     mfn = get_gfn_type_access(*ap2m, gfn_x(gfn), &p2mt, &p2ma,
2394                               0, &page_order);
2395     __put_gfn(*ap2m, gfn_x(gfn));
2396
2397     if ( !mfn_eq(mfn, INVALID_MFN) )
2398         return 0;

And the other case is not a concern at all.

Am I still misunderstanding something?


Thanks,
Razvan

Jan Beulich April 4, 2019, 10:16 a.m. UTC | #10

>>> On 03.04.19 at 18:16, <tamas.k.lengyel@gmail.com> wrote:
> On Wed, Apr 3, 2019 at 10:06 AM Jan Beulich <JBeulich@suse.com> wrote:
>> Then I still don't see why you would want to force propagation
>> in these cases: If it's generally demand-populate, it can't be right
>> to _fully_ populate that slot. You ought to be setting the access
>> type or the "suppress #VE" bit alone, without propagating the
>> MFN and alike. The later, when the entry actually gets propagated,
>> the entered values should be used as overrides to what comes
>> from the host p2m.
> 
> It is right to fully populate a slot when for example we want
> different mem_access permissions in different altp2m views. We can't
> wait for the domain to on-demand populate the altp2m view because we
> don't know when (and if) that will actually happen.

But you don't say _why_ it matters whether / when that's going
to happen.

> So
> p2m_set_altp2m_mem_access already copies the entry over from the
> hostp2m to the altp2m and then applies the requested mem_access
> setting. This is done even if the mem_access setting is the same as it
> was in the hostp2m.

And again you describe only current code, without clarifying why
the behavior is (and needs to be) the way it is.

> Doing the same behavior for p2m_set_suppress_ve I think is consistent,
> albeit you could easily overcome the issue by first simply setting a
> mem_access permission on the gfn to trigger the aforementioned copy to
> the altp2m before you try to set the ve settings.

Well, whatever the set-access behavior, the set-suppress-ve behavior
should match it, I agree. What I'm putting under question is whether
the former is really correct, and hence whether the patch here wouldn't
end up proliferating wrong behavior.

Jan

Jan Beulich April 4, 2019, 10:47 a.m. UTC | #11

>>> On 03.04.19 at 19:41, <rcojocaru@bitdefender.com> wrote:
> On 4/3/19 7:04 PM, Jan Beulich wrote:
>>>>> On 03.04.19 at 17:48, <rcojocaru@bitdefender.com> wrote:
>>> On 4/3/19 6:30 PM, Jan Beulich wrote:
>>>>>>> On 03.04.19 at 17:17, <rcojocaru@bitdefender.com> wrote:
>>>>> Changes to the hostp2m (also known as altp2m view 0) propagate to all
>>>>> existing altp2ms, but they do so in a lazy manner, and also that won't
>>>>> happen for altp2ms created after a while. So altp2ms will not
>>>>> necessarily know about a page that the hostp2m knows about, which should
>>>>> not stop us from setting mem access restrictions or the value of the SVE
>>>>> bit.
>>>>
>>>> But even if the propagation is lazy, a "get" should then return the
>>>> propagated value, shouldn't it? Or else callers (like is the case here)
>>>> can be mislead. I do recognize that there may be callers who
>>>> intentionally _don't_ want the propagated value, but I'd expect this
>>>> to be the exception, not the rule. I wonder therefore whether there
>>>> shouldn't be a wrapper around ->get_entry() to take care of this for
>>>> callers which want the propagated value. Calling the bare hook would
>>>> remain as is.
>>>
>>> But I believe that some hostp2m entries will never be propagated. Sorry 
>>> that I've not been clear about this. This is what I meant by "won't 
>>> happen for altp2ms created after a while".
>>>
>>> Take, for example, the case where there's only the hostp2m for the first 
>>> 30 minutes of a guest's life, and in this time somebody calls 
>>> ept_set_entry(). This, in turn, calls p2m_altp2m_propagate_change() if ( 
>>> entry_written && p2m_is_hostp2m(p2m) ).
>>>
>>> But at this point there are no altp2ms, so there's nowhere to propagate 
>>> these changes to.
>>>
>>> Now, at minute 31 we create a new altp2m. The changes will never be 
>>> propagated here, so altp2m->get_entry() will always (rightfully) return 
>>> an invalid MFN.
>> 
>> I guess I'm missing something here: Why "rightfully"? If the guest
>> runs on this altp2m, it'll die a quick death then, won't it? Yet if
>> this is intended to be demand-populate, then why would there be
>> automatic propagation for existing altp2m-s (as you explain
>> further up, unless I'm getting this wrong)?
>> 
>>> We only want to make an exception for setting the SVE bit or mem access 
>>> restrictions on an entry in the altp2m, but having get_entry() (or a 
>>> wrapper) return the hostp2m values and MFN might not necessarily be what 
>>> current callers expect.
>> 
>> Then I still don't see why you would want to force propagation
>> in these cases: If it's generally demand-populate, it can't be right
>> to _fully_ populate that slot. You ought to be setting the access
>> type or the "suppress #VE" bit alone, without propagating the
>> MFN and alike. The later, when the entry actually gets propagated,
>> the entered values should be used as overrides to what comes
>> from the host p2m.
> To try to answer your question to the best of my knowledge, the altp2m
> propagation appears to be a hybrid between lazy propagation,
>[...]
> and immediate propagation,

And I've been questioning whether this is an appropriate model,
i.e. whether the result at all times is always the same as either of
the pure variants. Thinking about it some more, it looks like it
indeed is. Yet then the question is why both functions don't use
p2m_altp2m_lazy_copy(), but rather construct things by other
means. I realize the function's gla and npfec parameters might
make it look non-suitable here, but neither parameter is actually
used by the function afaics.

And of course the other point remains: For callers who want to
see the propagate value (irrespective of what the physical
tables say), there would better be a function giving them the
intended result, rather than making every such caller go through
extra hoops.

Jan

Razvan Cojocaru April 4, 2019, 12:46 p.m. UTC | #12

On 4/3/19 6:30 PM, Jan Beulich wrote:
>>>> On 03.04.19 at 17:17, <rcojocaru@bitdefender.com> wrote:
>> On 4/3/19 5:58 PM, Jan Beulich wrote:
>>>>>> On 03.04.19 at 16:29, <aisaila@bitdefender.com> wrote:
>>>> --- a/xen/arch/x86/mm/p2m.c
>>>> +++ b/xen/arch/x86/mm/p2m.c
>>>> @@ -3011,8 +3011,16 @@ int p2m_set_suppress_ve(struct domain *d, gfn_t gfn, bool suppress_ve,
>>>>        mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL, NULL);
>>>>        if ( !mfn_valid(mfn) )
>>>>        {
>>>> -        rc = -ESRCH;
>>>> -        goto out;
>>>> +        unsigned int page_order;
>>>> +
>>>> +        mfn = __get_gfn_type_access(host_p2m, gfn_x(gfn), &t, &a,
>>>> +                                    P2M_ALLOC | P2M_UNSHARE, &page_order, 0);
>>>
>>> I'm not entirely certain about P2M_ALLOC, but I'm pretty sure that
>>> at least P2M_UNSHARE is too heavy: Why would you want to force
>>> un-sharing of a page when all you want to alter is #VE behavior?
>>
>> That logic was taken from p2m_set_altp2m_mem_access(), we thought the
>> two cases are very similar.
> 
> I see.

On the UNSHARE observation, we don't know why the author originally 
requested the flag. We decided to keep it on the assumption that it 
_probably_ handles some corner-case that somebody has come accross.

We'll prepare a mini-series factoring out the code we've been discussing 
in separate functions: one for getting things out of the hostp2m if the 
entry is not present in the altp2m, and one for the special 
page-order-dependent code (which is duplicated in 
p2m_set_altp2m_mem_access() and p2m_change_altp2m_gfn()).

Before going into that, are we now certain that ALLOC is sufficient? I 
believe it should be for _our_ use-cases, but we don't want to break 
anyone's code. Maybe Tamas knows more about this.

Thanks,
Razvan

Andrew Cooper April 4, 2019, 12:49 p.m. UTC | #13

On 04/04/2019 13:46, Razvan Cojocaru wrote:
> On 4/3/19 6:30 PM, Jan Beulich wrote:
>>>>> On 03.04.19 at 17:17, <rcojocaru@bitdefender.com> wrote:
>>> On 4/3/19 5:58 PM, Jan Beulich wrote:
>>>>>>> On 03.04.19 at 16:29, <aisaila@bitdefender.com> wrote:
>>>>> --- a/xen/arch/x86/mm/p2m.c
>>>>> +++ b/xen/arch/x86/mm/p2m.c
>>>>> @@ -3011,8 +3011,16 @@ int p2m_set_suppress_ve(struct domain *d,
>>>>> gfn_t gfn, bool suppress_ve,
>>>>>        mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL, NULL);
>>>>>        if ( !mfn_valid(mfn) )
>>>>>        {
>>>>> -        rc = -ESRCH;
>>>>> -        goto out;
>>>>> +        unsigned int page_order;
>>>>> +
>>>>> +        mfn = __get_gfn_type_access(host_p2m, gfn_x(gfn), &t, &a,
>>>>> +                                    P2M_ALLOC | P2M_UNSHARE,
>>>>> &page_order, 0);
>>>>
>>>> I'm not entirely certain about P2M_ALLOC, but I'm pretty sure that
>>>> at least P2M_UNSHARE is too heavy: Why would you want to force
>>>> un-sharing of a page when all you want to alter is #VE behavior?
>>>
>>> That logic was taken from p2m_set_altp2m_mem_access(), we thought the
>>> two cases are very similar.
>>
>> I see.
>
> On the UNSHARE observation, we don't know why the author originally
> requested the flag. We decided to keep it on the assumption that it
> _probably_ handles some corner-case that somebody has come accross.
>
> We'll prepare a mini-series factoring out the code we've been
> discussing in separate functions: one for getting things out of the
> hostp2m if the entry is not present in the altp2m, and one for the
> special page-order-dependent code (which is duplicated in
> p2m_set_altp2m_mem_access() and p2m_change_altp2m_gfn()).
>
> Before going into that, are we now certain that ALLOC is sufficient? I
> believe it should be for _our_ use-cases, but we don't want to break
> anyone's code. Maybe Tamas knows more about this.

UNSHARE is only for CoW/Paging, which is some experimental code added to
Xen 4.2(?) and has been slowly bit-rotting ever since.

We try to use good judgement when considering the intentions, but it is
fully out of security support and hasn't seen any testing in years. 
You're not going to break anything by making a mistake here.

~Andrew

Razvan Cojocaru April 4, 2019, 12:50 p.m. UTC | #14

On 4/4/19 3:46 PM, Razvan Cojocaru wrote:
> On 4/3/19 6:30 PM, Jan Beulich wrote:
>>>>> On 03.04.19 at 17:17, <rcojocaru@bitdefender.com> wrote:
>>> On 4/3/19 5:58 PM, Jan Beulich wrote:
>>>>>>> On 03.04.19 at 16:29, <aisaila@bitdefender.com> wrote:
>>>>> --- a/xen/arch/x86/mm/p2m.c
>>>>> +++ b/xen/arch/x86/mm/p2m.c
>>>>> @@ -3011,8 +3011,16 @@ int p2m_set_suppress_ve(struct domain *d, 
>>>>> gfn_t gfn, bool suppress_ve,
>>>>>        mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL, NULL);
>>>>>        if ( !mfn_valid(mfn) )
>>>>>        {
>>>>> -        rc = -ESRCH;
>>>>> -        goto out;
>>>>> +        unsigned int page_order;
>>>>> +
>>>>> +        mfn = __get_gfn_type_access(host_p2m, gfn_x(gfn), &t, &a,
>>>>> +                                    P2M_ALLOC | P2M_UNSHARE, 
>>>>> &page_order, 0);
>>>>
>>>> I'm not entirely certain about P2M_ALLOC, but I'm pretty sure that
>>>> at least P2M_UNSHARE is too heavy: Why would you want to force
>>>> un-sharing of a page when all you want to alter is #VE behavior?
>>>
>>> That logic was taken from p2m_set_altp2m_mem_access(), we thought the
>>> two cases are very similar.
>>
>> I see.
> 
> On the UNSHARE observation, we don't know why the author originally 
> requested the flag. We decided to keep it on the assumption that it 
> _probably_ handles some corner-case that somebody has come accross.
> 
> We'll prepare a mini-series factoring out the code we've been discussing 
> in separate functions: one for getting things out of the hostp2m if the 
> entry is not present in the altp2m, and one for the special 
> page-order-dependent code (which is duplicated in 
> p2m_set_altp2m_mem_access() and p2m_change_altp2m_gfn()).
> 
> Before going into that, are we now certain that ALLOC is sufficient? I 
> believe it should be for _our_ use-cases, but we don't want to break 
> anyone's code. Maybe Tamas knows more about this.

Sorry, I forgot to mention that p2m_change_altp2m_gfn() only uses ALLOC:

2649     /* Check host p2m if no valid entry in alternate */
2650     if ( !mfn_valid(mfn) )
2651     {
2652         mfn = __get_gfn_type_access(hp2m, gfn_x(old_gfn), &t, &a,
2653                                     P2M_ALLOC, &page_order, 0);
2654
2655         if ( !mfn_valid(mfn) || t != p2m_ram_rw )
2656             goto out;
2657
2658         /* If this is a superpage, copy that first */
2659         if ( page_order != PAGE_ORDER_4K )
2660         {
2661             gfn_t gfn;
2662             unsigned long mask;
2663
2664             mask = ~((1UL << page_order) - 1);
2665             gfn = _gfn(gfn_x(old_gfn) & mask);
2666             mfn = _mfn(mfn_x(mfn) & mask);
2667
2668             if ( ap2m->set_entry(ap2m, gfn, mfn, page_order, t, a, 1) )
2669                 goto out;
2670         }
2671     }

Confusing...


Thanks,
Razvan

Tamas K Lengyel April 4, 2019, 1:09 p.m. UTC | #15

On Thu, Apr 4, 2019 at 6:50 AM Razvan Cojocaru
<rcojocaru@bitdefender.com> wrote:
>
> On 4/4/19 3:46 PM, Razvan Cojocaru wrote:
> > On 4/3/19 6:30 PM, Jan Beulich wrote:
> >>>>> On 03.04.19 at 17:17, <rcojocaru@bitdefender.com> wrote:
> >>> On 4/3/19 5:58 PM, Jan Beulich wrote:
> >>>>>>> On 03.04.19 at 16:29, <aisaila@bitdefender.com> wrote:
> >>>>> --- a/xen/arch/x86/mm/p2m.c
> >>>>> +++ b/xen/arch/x86/mm/p2m.c
> >>>>> @@ -3011,8 +3011,16 @@ int p2m_set_suppress_ve(struct domain *d,
> >>>>> gfn_t gfn, bool suppress_ve,
> >>>>>        mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL, NULL);
> >>>>>        if ( !mfn_valid(mfn) )
> >>>>>        {
> >>>>> -        rc = -ESRCH;
> >>>>> -        goto out;
> >>>>> +        unsigned int page_order;
> >>>>> +
> >>>>> +        mfn = __get_gfn_type_access(host_p2m, gfn_x(gfn), &t, &a,
> >>>>> +                                    P2M_ALLOC | P2M_UNSHARE,
> >>>>> &page_order, 0);
> >>>>
> >>>> I'm not entirely certain about P2M_ALLOC, but I'm pretty sure that
> >>>> at least P2M_UNSHARE is too heavy: Why would you want to force
> >>>> un-sharing of a page when all you want to alter is #VE behavior?
> >>>
> >>> That logic was taken from p2m_set_altp2m_mem_access(), we thought the
> >>> two cases are very similar.
> >>
> >> I see.
> >
> > On the UNSHARE observation, we don't know why the author originally
> > requested the flag. We decided to keep it on the assumption that it
> > _probably_ handles some corner-case that somebody has come accross.
> >
> > We'll prepare a mini-series factoring out the code we've been discussing
> > in separate functions: one for getting things out of the hostp2m if the
> > entry is not present in the altp2m, and one for the special
> > page-order-dependent code (which is duplicated in
> > p2m_set_altp2m_mem_access() and p2m_change_altp2m_gfn()).
> >
> > Before going into that, are we now certain that ALLOC is sufficient? I
> > believe it should be for _our_ use-cases, but we don't want to break
> > anyone's code. Maybe Tamas knows more about this.
>
> Sorry, I forgot to mention that p2m_change_altp2m_gfn() only uses ALLOC:
>
> 2649     /* Check host p2m if no valid entry in alternate */
> 2650     if ( !mfn_valid(mfn) )
> 2651     {
> 2652         mfn = __get_gfn_type_access(hp2m, gfn_x(old_gfn), &t, &a,
> 2653                                     P2M_ALLOC, &page_order, 0);
> 2654
> 2655         if ( !mfn_valid(mfn) || t != p2m_ram_rw )
> 2656             goto out;
> 2657
> 2658         /* If this is a superpage, copy that first */
> 2659         if ( page_order != PAGE_ORDER_4K )
> 2660         {
> 2661             gfn_t gfn;
> 2662             unsigned long mask;
> 2663
> 2664             mask = ~((1UL << page_order) - 1);
> 2665             gfn = _gfn(gfn_x(old_gfn) & mask);
> 2666             mfn = _mfn(mfn_x(mfn) & mask);
> 2667
> 2668             if ( ap2m->set_entry(ap2m, gfn, mfn, page_order, t, a, 1) )
> 2669                 goto out;
> 2670         }
> 2671     }
>
> Confusing...

I agree that it is confusing. It would be fine to UNSHARE here as well
to keep things consistent but otherwise it's not really an issue as
the entry type is checked later to ensure that this is a p2m_ram_rw
entry. We are simply trying to keep mem_sharing and _modified_ altp2m
entries exclusive. So it is fine to have mem_shared entries in the
hostp2m and have those entries be copied into altp2m tables lazily,
but for altp2m entries that have changed mem_access permissions or are
remapped we want the entries in the hostp2m to be of regular type.
This is not necessarily a technical requirement, it's mostly just to
reduce complexity. So it would be fine to add UNSHARE here as well, I
guess the only reason why I haven't done that is because I already
trigger the unshare and copy-to-altp2m before remapping by setting
dummy mem_access permission on the entries.

Tamas

Tamas K Lengyel April 4, 2019, 1:15 p.m. UTC | #16

On Thu, Apr 4, 2019 at 6:49 AM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>
> On 04/04/2019 13:46, Razvan Cojocaru wrote:
> > On 4/3/19 6:30 PM, Jan Beulich wrote:
> >>>>> On 03.04.19 at 17:17, <rcojocaru@bitdefender.com> wrote:
> >>> On 4/3/19 5:58 PM, Jan Beulich wrote:
> >>>>>>> On 03.04.19 at 16:29, <aisaila@bitdefender.com> wrote:
> >>>>> --- a/xen/arch/x86/mm/p2m.c
> >>>>> +++ b/xen/arch/x86/mm/p2m.c
> >>>>> @@ -3011,8 +3011,16 @@ int p2m_set_suppress_ve(struct domain *d,
> >>>>> gfn_t gfn, bool suppress_ve,
> >>>>>        mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL, NULL);
> >>>>>        if ( !mfn_valid(mfn) )
> >>>>>        {
> >>>>> -        rc = -ESRCH;
> >>>>> -        goto out;
> >>>>> +        unsigned int page_order;
> >>>>> +
> >>>>> +        mfn = __get_gfn_type_access(host_p2m, gfn_x(gfn), &t, &a,
> >>>>> +                                    P2M_ALLOC | P2M_UNSHARE,
> >>>>> &page_order, 0);
> >>>>
> >>>> I'm not entirely certain about P2M_ALLOC, but I'm pretty sure that
> >>>> at least P2M_UNSHARE is too heavy: Why would you want to force
> >>>> un-sharing of a page when all you want to alter is #VE behavior?
> >>>
> >>> That logic was taken from p2m_set_altp2m_mem_access(), we thought the
> >>> two cases are very similar.
> >>
> >> I see.
> >
> > On the UNSHARE observation, we don't know why the author originally
> > requested the flag. We decided to keep it on the assumption that it
> > _probably_ handles some corner-case that somebody has come accross.
> >
> > We'll prepare a mini-series factoring out the code we've been
> > discussing in separate functions: one for getting things out of the
> > hostp2m if the entry is not present in the altp2m, and one for the
> > special page-order-dependent code (which is duplicated in
> > p2m_set_altp2m_mem_access() and p2m_change_altp2m_gfn()).
> >
> > Before going into that, are we now certain that ALLOC is sufficient? I
> > believe it should be for _our_ use-cases, but we don't want to break
> > anyone's code. Maybe Tamas knows more about this.
>
> UNSHARE is only for CoW/Paging, which is some experimental code added to
> Xen 4.2(?) and has been slowly bit-rotting ever since.

At least mem_sharing still (suprisingly) works - albeit some recent
sanity-check restrictions on grabbing the lock for two gfn's at the
same time breaks it for debug builds. But that's a different story
that I hope to address in the near future.

>
> We try to use good judgement when considering the intentions, but it is
> fully out of security support and hasn't seen any testing in years.
> You're not going to break anything by making a mistake here.

Tsk, I still use mem_sharing and in fact have put in effort to make it
compatible with altp2m. But as far as triggering UNSHARE just to be
safe, I agree. That is fine and it won't break anything. It is
actually inline with what we agreed on last time I worked on it as a
general guide - keep altp2m and mem_sharing compatible but at least
exclusive for each frame.

Tamas

Jan Beulich April 4, 2019, 2:32 p.m. UTC | #17

>>> On 04.04.19 at 14:50, <rcojocaru@bitdefender.com> wrote:
> On 4/4/19 3:46 PM, Razvan Cojocaru wrote:
>> Before going into that, are we now certain that ALLOC is sufficient? I 
>> believe it should be for _our_ use-cases, but we don't want to break 
>> anyone's code. Maybe Tamas knows more about this.
> 
> Sorry, I forgot to mention that p2m_change_altp2m_gfn() only uses ALLOC:

I think this together with Andrew's reply and what I have said
earlier on should be sufficient to answer the question with "yes".
But I see Tamas is telling you the opposite, so I guess I'll reply
there as well.

Jan

Jan Beulich April 4, 2019, 2:36 p.m. UTC | #18

>>> On 04.04.19 at 15:09, <tamas@tklengyel.com> wrote:
> I agree that it is confusing. It would be fine to UNSHARE here as well
> to keep things consistent but otherwise it's not really an issue as
> the entry type is checked later to ensure that this is a p2m_ram_rw
> entry. We are simply trying to keep mem_sharing and _modified_ altp2m
> entries exclusive. So it is fine to have mem_shared entries in the
> hostp2m and have those entries be copied into altp2m tables lazily,
> but for altp2m entries that have changed mem_access permissions or are
> remapped we want the entries in the hostp2m to be of regular type.
> This is not necessarily a technical requirement, it's mostly just to
> reduce complexity. So it would be fine to add UNSHARE here as well, I
> guess the only reason why I haven't done that is because I already
> trigger the unshare and copy-to-altp2m before remapping by setting
> dummy mem_access permission on the entries.

I'm afraid I don't agree with this justification: mem-sharing is about
contents of pages, whereas altp2m is about meta data (permissions
etc). I don't see why one would want to unshare because of a meta
data adjustment other than a page becoming non-CoW-writable.
Eagerly un-sharing in the end undermines the intentions of sharing.

Jan

Tamas K Lengyel April 4, 2019, 2:43 p.m. UTC | #19

On Thu, Apr 4, 2019 at 8:36 AM Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 04.04.19 at 15:09, <tamas@tklengyel.com> wrote:
> > I agree that it is confusing. It would be fine to UNSHARE here as well
> > to keep things consistent but otherwise it's not really an issue as
> > the entry type is checked later to ensure that this is a p2m_ram_rw
> > entry. We are simply trying to keep mem_sharing and _modified_ altp2m
> > entries exclusive. So it is fine to have mem_shared entries in the
> > hostp2m and have those entries be copied into altp2m tables lazily,
> > but for altp2m entries that have changed mem_access permissions or are
> > remapped we want the entries in the hostp2m to be of regular type.
> > This is not necessarily a technical requirement, it's mostly just to
> > reduce complexity. So it would be fine to add UNSHARE here as well, I
> > guess the only reason why I haven't done that is because I already
> > trigger the unshare and copy-to-altp2m before remapping by setting
> > dummy mem_access permission on the entries.
>
> I'm afraid I don't agree with this justification: mem-sharing is about
> contents of pages,

That's incorrect. Mem sharing doesn't care about the contents of pages at all.

whereas altp2m is about meta data (permissions
> etc). I don't see why one would want to unshare because of a meta
> data adjustment other than a page becoming non-CoW-writable.
> Eagerly un-sharing in the end undermines the intentions of sharing.

We are unsharing to keep altp2m and mem_sharing compatible but
mutually exclusive. Even if technically they could co-exist, last time
I worked on this we came to this agreement on the mailinglist as to
reduce complexity and to make reviewing easier.

Tamas

Jan Beulich April 4, 2019, 2:51 p.m. UTC | #20

>>> On 04.04.19 at 16:43, <tamas@tklengyel.com> wrote:
> On Thu, Apr 4, 2019 at 8:36 AM Jan Beulich <JBeulich@suse.com> wrote:
>>
>> >>> On 04.04.19 at 15:09, <tamas@tklengyel.com> wrote:
>> > I agree that it is confusing. It would be fine to UNSHARE here as well
>> > to keep things consistent but otherwise it's not really an issue as
>> > the entry type is checked later to ensure that this is a p2m_ram_rw
>> > entry. We are simply trying to keep mem_sharing and _modified_ altp2m
>> > entries exclusive. So it is fine to have mem_shared entries in the
>> > hostp2m and have those entries be copied into altp2m tables lazily,
>> > but for altp2m entries that have changed mem_access permissions or are
>> > remapped we want the entries in the hostp2m to be of regular type.
>> > This is not necessarily a technical requirement, it's mostly just to
>> > reduce complexity. So it would be fine to add UNSHARE here as well, I
>> > guess the only reason why I haven't done that is because I already
>> > trigger the unshare and copy-to-altp2m before remapping by setting
>> > dummy mem_access permission on the entries.
>>
>> I'm afraid I don't agree with this justification: mem-sharing is about
>> contents of pages,
> 
> That's incorrect. Mem sharing doesn't care about the contents of pages at 
> all.

Would you mind clarifying this? It's about sharing of the contents
of the pages, isn't it? (Of course the me-sharing code doesn't care
what the contents are.)

> whereas altp2m is about meta data (permissions
>> etc). I don't see why one would want to unshare because of a meta
>> data adjustment other than a page becoming non-CoW-writable.
>> Eagerly un-sharing in the end undermines the intentions of sharing.
> 
> We are unsharing to keep altp2m and mem_sharing compatible but
> mutually exclusive. Even if technically they could co-exist, last time
> I worked on this we came to this agreement on the mailinglist as to
> reduce complexity and to make reviewing easier.

But "mutually exclusive" to me means you can do only one of the
two on a particular VM. Which then makes it even less necessary
to request un-share from altp2m code - such requests would then
simple be no-ops.

Jan

George Dunlap April 4, 2019, 2:54 p.m. UTC | #21

> On Apr 4, 2019, at 3:36 PM, Jan Beulich <JBeulich@suse.com> wrote:
> 
>>>> On 04.04.19 at 15:09, <tamas@tklengyel.com> wrote:
>> I agree that it is confusing. It would be fine to UNSHARE here as well
>> to keep things consistent but otherwise it's not really an issue as
>> the entry type is checked later to ensure that this is a p2m_ram_rw
>> entry. We are simply trying to keep mem_sharing and _modified_ altp2m
>> entries exclusive. So it is fine to have mem_shared entries in the
>> hostp2m and have those entries be copied into altp2m tables lazily,
>> but for altp2m entries that have changed mem_access permissions or are
>> remapped we want the entries in the hostp2m to be of regular type.
>> This is not necessarily a technical requirement, it's mostly just to
>> reduce complexity. So it would be fine to add UNSHARE here as well, I
>> guess the only reason why I haven't done that is because I already
>> trigger the unshare and copy-to-altp2m before remapping by setting
>> dummy mem_access permission on the entries.
> 
> I'm afraid I don't agree with this justification: mem-sharing is about
> contents of pages, whereas altp2m is about meta data (permissions
> etc). I don't see why one would want to unshare because of a meta
> data adjustment other than a page becoming non-CoW-writable.
> Eagerly un-sharing in the end undermines the intentions of sharing.

Remember also that altp2ms allow someone to set not just alternate views with different permissions, but also alternate views with different backing mfns.  Combining shared mfns with alternate views with different mfns on the same gfn means that you have to be very careful not to end up giving write permission to the shared page, which would be a security issue.  Unsharing when creating an altp2m entry means that any given gfn is *either* shared *or* duplicated across altp2ms, but not both; this simplifies the reasoning.

 -George

Tamas K Lengyel April 4, 2019, 3:06 p.m. UTC | #22

On Thu, Apr 4, 2019 at 8:51 AM Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 04.04.19 at 16:43, <tamas@tklengyel.com> wrote:
> > On Thu, Apr 4, 2019 at 8:36 AM Jan Beulich <JBeulich@suse.com> wrote:
> >>
> >> >>> On 04.04.19 at 15:09, <tamas@tklengyel.com> wrote:
> >> > I agree that it is confusing. It would be fine to UNSHARE here as well
> >> > to keep things consistent but otherwise it's not really an issue as
> >> > the entry type is checked later to ensure that this is a p2m_ram_rw
> >> > entry. We are simply trying to keep mem_sharing and _modified_ altp2m
> >> > entries exclusive. So it is fine to have mem_shared entries in the
> >> > hostp2m and have those entries be copied into altp2m tables lazily,
> >> > but for altp2m entries that have changed mem_access permissions or are
> >> > remapped we want the entries in the hostp2m to be of regular type.
> >> > This is not necessarily a technical requirement, it's mostly just to
> >> > reduce complexity. So it would be fine to add UNSHARE here as well, I
> >> > guess the only reason why I haven't done that is because I already
> >> > trigger the unshare and copy-to-altp2m before remapping by setting
> >> > dummy mem_access permission on the entries.
> >>
> >> I'm afraid I don't agree with this justification: mem-sharing is about
> >> contents of pages,
> >
> > That's incorrect. Mem sharing doesn't care about the contents of pages at
> > all.
>
> Would you mind clarifying this? It's about sharing of the contents
> of the pages, isn't it? (Of course the me-sharing code doesn't care
> what the contents are.)

That's what I mean. The contents of the pages are never checked - you
can share two pages with different contents in which case the "client"
pages' contents are discarded. Once pages are shared it means that the
pages have the same backing mfn. So it's not that the system ensures
that there *contents* are the same, it's only that while the type is
p2m_ram_shared, they have the same backing mfn.

>
> > whereas altp2m is about meta data (permissions
> >> etc). I don't see why one would want to unshare because of a meta
> >> data adjustment other than a page becoming non-CoW-writable.
> >> Eagerly un-sharing in the end undermines the intentions of sharing.
> >
> > We are unsharing to keep altp2m and mem_sharing compatible but
> > mutually exclusive. Even if technically they could co-exist, last time
> > I worked on this we came to this agreement on the mailinglist as to
> > reduce complexity and to make reviewing easier.
>
> But "mutually exclusive" to me means you can do only one of the
> two on a particular VM. Which then makes it even less necessary
> to request un-share from altp2m code - such requests would then
> simple be no-ops.

Which is exactly what I wanted to avoid. I want to be able use both on
the same domain. But that complexity was too much so we relaxed the
"exclusivity" to be per-page instead of global per VM.

Tamas

Jan Beulich April 5, 2019, 7:36 a.m. UTC | #23

>>> On 04.04.19 at 16:54, <George.Dunlap@citrix.com> wrote:
>> On Apr 4, 2019, at 3:36 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 04.04.19 at 15:09, <tamas@tklengyel.com> wrote:
>>> I agree that it is confusing. It would be fine to UNSHARE here as well
>>> to keep things consistent but otherwise it's not really an issue as
>>> the entry type is checked later to ensure that this is a p2m_ram_rw
>>> entry. We are simply trying to keep mem_sharing and _modified_ altp2m
>>> entries exclusive. So it is fine to have mem_shared entries in the
>>> hostp2m and have those entries be copied into altp2m tables lazily,
>>> but for altp2m entries that have changed mem_access permissions or are
>>> remapped we want the entries in the hostp2m to be of regular type.
>>> This is not necessarily a technical requirement, it's mostly just to
>>> reduce complexity. So it would be fine to add UNSHARE here as well, I
>>> guess the only reason why I haven't done that is because I already
>>> trigger the unshare and copy-to-altp2m before remapping by setting
>>> dummy mem_access permission on the entries.
>> 
>> I'm afraid I don't agree with this justification: mem-sharing is about
>> contents of pages, whereas altp2m is about meta data (permissions
>> etc). I don't see why one would want to unshare because of a meta
>> data adjustment other than a page becoming non-CoW-writable.
>> Eagerly un-sharing in the end undermines the intentions of sharing.
> 
> Remember also that altp2ms allow someone to set not just alternate views 
> with different permissions, but also alternate views with different backing 
> mfns.  Combining shared mfns with alternate views with different mfns on the 
> same gfn means that you have to be very careful not to end up giving write 
> permission to the shared page, which would be a security issue.  Unsharing 
> when creating an altp2m entry means that any given gfn is *either* shared 
> *or* duplicated across altp2ms, but not both; this simplifies the reasoning.

Hmm, yes, I can see how this gets complicated. But is this behavior
symmetric? I.e. will attempts to share a GFN fail when it has a non-
default setting in one of the alternate views? Looking at the code I
can't seem to recognize such behavior.

Furthermore I'm puzzled by p2m_altp2m_propagate_change()
apparently blindly overwriting (almost) everything. Is it really
intended in almost cases (there looks to be an exception when
the old entry holds INVALID_MFN; I wonder though whether its
condition isn't inverted) to discard special access and/or MFNs in
alternate views when the host p2m's respective slot changes?

Looking at the function I also wonder whether it doesn't
pointlessly call p2m_reset_altp2m() when old and new entry
both hold INVALID_MFN.

Jan

Tamas K Lengyel April 5, 2019, 1:59 p.m. UTC | #24

On Fri, Apr 5, 2019 at 1:36 AM Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 04.04.19 at 16:54, <George.Dunlap@citrix.com> wrote:
> >> On Apr 4, 2019, at 3:36 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >>>>> On 04.04.19 at 15:09, <tamas@tklengyel.com> wrote:
> >>> I agree that it is confusing. It would be fine to UNSHARE here as well
> >>> to keep things consistent but otherwise it's not really an issue as
> >>> the entry type is checked later to ensure that this is a p2m_ram_rw
> >>> entry. We are simply trying to keep mem_sharing and _modified_ altp2m
> >>> entries exclusive. So it is fine to have mem_shared entries in the
> >>> hostp2m and have those entries be copied into altp2m tables lazily,
> >>> but for altp2m entries that have changed mem_access permissions or are
> >>> remapped we want the entries in the hostp2m to be of regular type.
> >>> This is not necessarily a technical requirement, it's mostly just to
> >>> reduce complexity. So it would be fine to add UNSHARE here as well, I
> >>> guess the only reason why I haven't done that is because I already
> >>> trigger the unshare and copy-to-altp2m before remapping by setting
> >>> dummy mem_access permission on the entries.
> >>
> >> I'm afraid I don't agree with this justification: mem-sharing is about
> >> contents of pages, whereas altp2m is about meta data (permissions
> >> etc). I don't see why one would want to unshare because of a meta
> >> data adjustment other than a page becoming non-CoW-writable.
> >> Eagerly un-sharing in the end undermines the intentions of sharing.
> >
> > Remember also that altp2ms allow someone to set not just alternate views
> > with different permissions, but also alternate views with different backing
> > mfns.  Combining shared mfns with alternate views with different mfns on the
> > same gfn means that you have to be very careful not to end up giving write
> > permission to the shared page, which would be a security issue.  Unsharing
> > when creating an altp2m entry means that any given gfn is *either* shared
> > *or* duplicated across altp2ms, but not both; this simplifies the reasoning.
>
> Hmm, yes, I can see how this gets complicated. But is this behavior
> symmetric? I.e. will attempts to share a GFN fail when it has a non-
> default setting in one of the alternate views? Looking at the code I
> can't seem to recognize such behavior.

There are checks in place for that. Take a look at the nominate_page
function in mm/mem_sharing.c.

>
> Furthermore I'm puzzled by p2m_altp2m_propagate_change()
> apparently blindly overwriting (almost) everything. Is it really
> intended in almost cases (there looks to be an exception when
> the old entry holds INVALID_MFN; I wonder though whether its
> condition isn't inverted) to discard special access and/or MFNs in
> alternate views when the host p2m's respective slot changes?
>
> Looking at the function I also wonder whether it doesn't
> pointlessly call p2m_reset_altp2m() when old and new entry
> both hold INVALID_MFN.

It's not ideal for sure. Both that and the resetting of all altp2m
views completely when the hostp2m changes are troubling behaviors that
limit when altp2m can be used. It's up for debate though how hostp2m
changes should be handled, and handled safely. I think the current
implementation just tabled those hard questions by resetting
everything.

Tamas

[v1] Fix p2m_set_suppress_ve

Commit Message

Comments

Patch