diff mbox series

[PATCH-for-4.13] x86/mm: don't needlessly veto migration

Message ID 20191001082818.34233-1-paul.durrant@citrix.com (mailing list archive)
State Superseded
Headers show
Series [PATCH-for-4.13] x86/mm: don't needlessly veto migration | expand

Commit Message

Paul Durrant Oct. 1, 2019, 8:28 a.m. UTC
Now that xl.cfg has an option to explicitly enable IOMMU mappings for a
domain, migration may be needlessly vetoed due to the check of
is_iommu_enabled() in paging_log_dirty_enable().
There is actually no need to prevent logdirty from being enabled unless
devices are assigned to a domain and that domain is sharing HAP mappings
with the IOMMU (in which case disabling write permissions in the P2M may
cause DMA faults).

This patch therefore reverts commit 37201c62 "make logdirty and iommu
mutually exclusive" and replaces it with checks to ensure that, if
iommu_use_hap_pt() is true, that logdirty and device assignment are mutually
exclusive.

NOTE: While in the neighbourhood, the bool_t parameter type in
      paging_log_dirty_enable() is replaced with a bool and the format
      of the comment in assign_device() is fixed.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wl@xen.org>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Juergen Gross <jgross@suse.com>
---
 xen/arch/x86/mm/hap/hap.c     |  2 +-
 xen/arch/x86/mm/paging.c      |  8 ++++----
 xen/drivers/passthrough/pci.c | 10 +++++++---
 xen/include/asm-x86/paging.h  |  2 +-
 4 files changed, 13 insertions(+), 9 deletions(-)

Comments

Jürgen Groß Oct. 1, 2019, 8:31 a.m. UTC | #1
On 01.10.19 10:28, Paul Durrant wrote:
> Now that xl.cfg has an option to explicitly enable IOMMU mappings for a
> domain, migration may be needlessly vetoed due to the check of
> is_iommu_enabled() in paging_log_dirty_enable().
> There is actually no need to prevent logdirty from being enabled unless
> devices are assigned to a domain and that domain is sharing HAP mappings
> with the IOMMU (in which case disabling write permissions in the P2M may
> cause DMA faults).
> 
> This patch therefore reverts commit 37201c62 "make logdirty and iommu
> mutually exclusive" and replaces it with checks to ensure that, if
> iommu_use_hap_pt() is true, that logdirty and device assignment are mutually
> exclusive.
> 
> NOTE: While in the neighbourhood, the bool_t parameter type in
>        paging_log_dirty_enable() is replaced with a bool and the format
>        of the comment in assign_device() is fixed.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

Release-acked-by: Juergen Gross <jgross@suse.com>


Juergen
Jan Beulich Oct. 1, 2019, 8:46 a.m. UTC | #2
On 01.10.2019 10:28, Paul Durrant wrote:
> Now that xl.cfg has an option to explicitly enable IOMMU mappings for a
> domain, migration may be needlessly vetoed due to the check of
> is_iommu_enabled() in paging_log_dirty_enable().
> There is actually no need to prevent logdirty from being enabled unless
> devices are assigned to a domain and that domain is sharing HAP mappings
> with the IOMMU (in which case disabling write permissions in the P2M may
> cause DMA faults).

But that's taking into account only half of the reason of the
exclusion. The other half is that assigned devices may cause pages
to be dirtied behind the back of the log-dirty logic. Therefore ...

> --- a/xen/arch/x86/mm/paging.c
> +++ b/xen/arch/x86/mm/paging.c
> @@ -209,15 +209,15 @@ static int paging_free_log_dirty_bitmap(struct domain *d, int rc)
>      return rc;
>  }
>  
> -int paging_log_dirty_enable(struct domain *d, bool_t log_global)
> +int paging_log_dirty_enable(struct domain *d, bool log_global)
>  {
>      int ret;
>  
> -    if ( is_iommu_enabled(d) && log_global )
> +    if ( has_arch_pdevs(d) && iommu_use_hap_pt(d) && log_global )

... the iommu_use_hap_pt(d) needs to be dropped again, I think.

> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -1486,11 +1486,15 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
>      if ( !is_iommu_enabled(d) )
>          return 0;
>  
> -    /* Prevent device assign if mem paging or mem sharing have been 
> -     * enabled for this domain */
> +    /*
> +     * Prevent device assign if mem paging or mem sharing have been
> +     * enabled for this domain, or logdirty is enabled and the P2M is
> +     * shared with the IOMMU.
> +     */
>      if ( unlikely(d->arch.hvm.mem_sharing_enabled ||
>                    vm_event_check_ring(d->vm_event_paging) ||
> -                  p2m_get_hostp2m(d)->global_logdirty) )
> +                  (p2m_get_hostp2m(d)->global_logdirty &&
> +                   iommu_use_hap_pt(d))) )

This change wants dropping altogether then.

Jan
Paul Durrant Oct. 1, 2019, 8:52 a.m. UTC | #3
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 01 October 2019 09:46
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne
> <roger.pau@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Juergen Gross <jgross@suse.com>; Wei
> Liu <wl@xen.org>
> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
> 
> On 01.10.2019 10:28, Paul Durrant wrote:
> > Now that xl.cfg has an option to explicitly enable IOMMU mappings for a
> > domain, migration may be needlessly vetoed due to the check of
> > is_iommu_enabled() in paging_log_dirty_enable().
> > There is actually no need to prevent logdirty from being enabled unless
> > devices are assigned to a domain and that domain is sharing HAP mappings
> > with the IOMMU (in which case disabling write permissions in the P2M may
> > cause DMA faults).
> 
> But that's taking into account only half of the reason of the
> exclusion. The other half is that assigned devices may cause pages
> to be dirtied behind the back of the log-dirty logic.

But that's no reason to veto logdirty. Some devices have drivers (in dom0) which can extract DMA dirtying information and set dirty tracking information appropriately.

  Paul

> Therefore ...
> 
> > --- a/xen/arch/x86/mm/paging.c
> > +++ b/xen/arch/x86/mm/paging.c
> > @@ -209,15 +209,15 @@ static int paging_free_log_dirty_bitmap(struct domain *d, int rc)
> >      return rc;
> >  }
> >
> > -int paging_log_dirty_enable(struct domain *d, bool_t log_global)
> > +int paging_log_dirty_enable(struct domain *d, bool log_global)
> >  {
> >      int ret;
> >
> > -    if ( is_iommu_enabled(d) && log_global )
> > +    if ( has_arch_pdevs(d) && iommu_use_hap_pt(d) && log_global )
> 
> ... the iommu_use_hap_pt(d) needs to be dropped again, I think.
> 
> > --- a/xen/drivers/passthrough/pci.c
> > +++ b/xen/drivers/passthrough/pci.c
> > @@ -1486,11 +1486,15 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32
> flag)
> >      if ( !is_iommu_enabled(d) )
> >          return 0;
> >
> > -    /* Prevent device assign if mem paging or mem sharing have been
> > -     * enabled for this domain */
> > +    /*
> > +     * Prevent device assign if mem paging or mem sharing have been
> > +     * enabled for this domain, or logdirty is enabled and the P2M is
> > +     * shared with the IOMMU.
> > +     */
> >      if ( unlikely(d->arch.hvm.mem_sharing_enabled ||
> >                    vm_event_check_ring(d->vm_event_paging) ||
> > -                  p2m_get_hostp2m(d)->global_logdirty) )
> > +                  (p2m_get_hostp2m(d)->global_logdirty &&
> > +                   iommu_use_hap_pt(d))) )
> 
> This change wants dropping altogether then.
> 
> Jan
Jan Beulich Oct. 1, 2019, 9:19 a.m. UTC | #4
On 01.10.2019 10:52, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 01 October 2019 09:46
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: xen-devel@lists.xenproject.org; Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne
>> <roger.pau@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Juergen Gross <jgross@suse.com>; Wei
>> Liu <wl@xen.org>
>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
>>
>> On 01.10.2019 10:28, Paul Durrant wrote:
>>> Now that xl.cfg has an option to explicitly enable IOMMU mappings for a
>>> domain, migration may be needlessly vetoed due to the check of
>>> is_iommu_enabled() in paging_log_dirty_enable().
>>> There is actually no need to prevent logdirty from being enabled unless
>>> devices are assigned to a domain and that domain is sharing HAP mappings
>>> with the IOMMU (in which case disabling write permissions in the P2M may
>>> cause DMA faults).
>>
>> But that's taking into account only half of the reason of the
>> exclusion. The other half is that assigned devices may cause pages
>> to be dirtied behind the back of the log-dirty logic.
> 
> But that's no reason to veto logdirty. Some devices have drivers (in dom0)
> which can extract DMA dirtying information and set dirty tracking
> information appropriately.

It still needs a positive identification then: Such drivers should tell
Xen for which specific devices such information is going to be provided.
I also wonder what interface I would have forgot about that would allow
such a driver to propagate dirtying information in the first place:
XEN_DMOP_modified_memory is for emulators only aiui, and does not provide
a means for Xen to actively query dirtied state (or request updating
thereof) of pages owned by a domain (as would be needed at least on the
XEN_DOMCTL_SHADOW_LOGDIRTY_FINAL invocation).

Jan
Paul Durrant Oct. 1, 2019, 9:36 a.m. UTC | #5
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 01 October 2019 10:19
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Roger Pau
> Monne <roger.pau@citrix.com>; xen-devel@lists.xenproject.org; Juergen Gross <jgross@suse.com>; Wei Liu
> <wl@xen.org>
> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
> 
> On 01.10.2019 10:52, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 01 October 2019 09:46
> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> Cc: xen-devel@lists.xenproject.org; Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne
> >> <roger.pau@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Juergen Gross <jgross@suse.com>;
> Wei
> >> Liu <wl@xen.org>
> >> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
> >>
> >> On 01.10.2019 10:28, Paul Durrant wrote:
> >>> Now that xl.cfg has an option to explicitly enable IOMMU mappings for a
> >>> domain, migration may be needlessly vetoed due to the check of
> >>> is_iommu_enabled() in paging_log_dirty_enable().
> >>> There is actually no need to prevent logdirty from being enabled unless
> >>> devices are assigned to a domain and that domain is sharing HAP mappings
> >>> with the IOMMU (in which case disabling write permissions in the P2M may
> >>> cause DMA faults).
> >>
> >> But that's taking into account only half of the reason of the
> >> exclusion. The other half is that assigned devices may cause pages
> >> to be dirtied behind the back of the log-dirty logic.
> >
> > But that's no reason to veto logdirty. Some devices have drivers (in dom0)
> > which can extract DMA dirtying information and set dirty tracking
> > information appropriately.
> 
> It still needs a positive identification then: Such drivers should tell
> Xen for which specific devices such information is going to be provided.

Why does the hypervisor need have the right of veto though? Surely it is the toolstack that should decide whether a VM is migratable in the presence of assigned h/w. Xen need only be concerned with the integrity of the host, which is why the check for ETP sharing remains.

> I also wonder what interface I would have forgot about that would allow
> such a driver to propagate dirtying information in the first place:
> XEN_DMOP_modified_memory is for emulators only aiui, and does not provide
> a means for Xen to actively query dirtied state (or request updating
> thereof) of pages owned by a domain (as would be needed at least on the
> XEN_DOMCTL_SHADOW_LOGDIRTY_FINAL invocation).

XEN_DMOP_modified_memory is indeed the interface. After each round of live memory copy, the toolstack can use an 'emulator' process running in dom0 to query the assigned device for pages dirtied by DMA and then add those into the set of pages to be copied in the next round. Similarly, prior to final memory copy, the devices is quiesced (i.e. bus master it turned off) and then the final set of dirtied pages is determined.

 Paul

> 
> Jan
Jan Beulich Oct. 1, 2019, 10:15 a.m. UTC | #6
On 01.10.2019 11:36, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 01 October 2019 10:19
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Roger Pau
>> Monne <roger.pau@citrix.com>; xen-devel@lists.xenproject.org; Juergen Gross <jgross@suse.com>; Wei Liu
>> <wl@xen.org>
>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
>>
>> On 01.10.2019 10:52, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>> Sent: 01 October 2019 09:46
>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>> Cc: xen-devel@lists.xenproject.org; Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne
>>>> <roger.pau@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Juergen Gross <jgross@suse.com>;
>> Wei
>>>> Liu <wl@xen.org>
>>>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
>>>>
>>>> On 01.10.2019 10:28, Paul Durrant wrote:
>>>>> Now that xl.cfg has an option to explicitly enable IOMMU mappings for a
>>>>> domain, migration may be needlessly vetoed due to the check of
>>>>> is_iommu_enabled() in paging_log_dirty_enable().
>>>>> There is actually no need to prevent logdirty from being enabled unless
>>>>> devices are assigned to a domain and that domain is sharing HAP mappings
>>>>> with the IOMMU (in which case disabling write permissions in the P2M may
>>>>> cause DMA faults).
>>>>
>>>> But that's taking into account only half of the reason of the
>>>> exclusion. The other half is that assigned devices may cause pages
>>>> to be dirtied behind the back of the log-dirty logic.
>>>
>>> But that's no reason to veto logdirty. Some devices have drivers (in dom0)
>>> which can extract DMA dirtying information and set dirty tracking
>>> information appropriately.
>>
>> It still needs a positive identification then: Such drivers should tell
>> Xen for which specific devices such information is going to be provided.
> 
> Why does the hypervisor need have the right of veto though? Surely it is
> the toolstack that should decide whether a VM is migratable in the
> presence of assigned h/w. Xen need only be concerned with the integrity
> of the host, which is why the check for ETP sharing remains.

While the tool stack is to decide, the hypervisor is expected to guarantee
correct data coming back from XEN_DOMCTL_SHADOW_OP_{PEEK,CLEAN}.

Jan
Paul Durrant Oct. 1, 2019, 10:24 a.m. UTC | #7
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 01 October 2019 11:15
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Roger Pau
> Monne <roger.pau@citrix.com>; xen-devel@lists.xenproject.org; Juergen Gross <jgross@suse.com>; Wei Liu
> <wl@xen.org>
> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
> 
> On 01.10.2019 11:36, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 01 October 2019 10:19
> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Roger Pau
> >> Monne <roger.pau@citrix.com>; xen-devel@lists.xenproject.org; Juergen Gross <jgross@suse.com>; Wei
> Liu
> >> <wl@xen.org>
> >> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
> >>
> >> On 01.10.2019 10:52, Paul Durrant wrote:
> >>>> -----Original Message-----
> >>>> From: Jan Beulich <jbeulich@suse.com>
> >>>> Sent: 01 October 2019 09:46
> >>>> To: Paul Durrant <Paul.Durrant@citrix.com>
> >>>> Cc: xen-devel@lists.xenproject.org; Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne
> >>>> <roger.pau@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Juergen Gross
> <jgross@suse.com>;
> >> Wei
> >>>> Liu <wl@xen.org>
> >>>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
> >>>>
> >>>> On 01.10.2019 10:28, Paul Durrant wrote:
> >>>>> Now that xl.cfg has an option to explicitly enable IOMMU mappings for a
> >>>>> domain, migration may be needlessly vetoed due to the check of
> >>>>> is_iommu_enabled() in paging_log_dirty_enable().
> >>>>> There is actually no need to prevent logdirty from being enabled unless
> >>>>> devices are assigned to a domain and that domain is sharing HAP mappings
> >>>>> with the IOMMU (in which case disabling write permissions in the P2M may
> >>>>> cause DMA faults).
> >>>>
> >>>> But that's taking into account only half of the reason of the
> >>>> exclusion. The other half is that assigned devices may cause pages
> >>>> to be dirtied behind the back of the log-dirty logic.
> >>>
> >>> But that's no reason to veto logdirty. Some devices have drivers (in dom0)
> >>> which can extract DMA dirtying information and set dirty tracking
> >>> information appropriately.
> >>
> >> It still needs a positive identification then: Such drivers should tell
> >> Xen for which specific devices such information is going to be provided.
> >
> > Why does the hypervisor need have the right of veto though? Surely it is
> > the toolstack that should decide whether a VM is migratable in the
> > presence of assigned h/w. Xen need only be concerned with the integrity
> > of the host, which is why the check for ETP sharing remains.
> 
> While the tool stack is to decide, the hypervisor is expected to guarantee
> correct data coming back from XEN_DOMCTL_SHADOW_OP_{PEEK,CLEAN}.

For some definition of 'correct', yes, and I don't think that this change violates any definition I can find in the domctl header.

Note: there are already emulators that will be playing with the dirty map on an arbitrary and unsynchronized basis because they are emulating bus mastering h/w.

  Paul

> 
> Jan
George Dunlap Oct. 1, 2019, 10:34 a.m. UTC | #8
On 10/1/19 11:24 AM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 01 October 2019 11:15
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Roger Pau
>> Monne <roger.pau@citrix.com>; xen-devel@lists.xenproject.org; Juergen Gross <jgross@suse.com>; Wei Liu
>> <wl@xen.org>
>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
>>
>> On 01.10.2019 11:36, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>> Sent: 01 October 2019 10:19
>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Roger Pau
>>>> Monne <roger.pau@citrix.com>; xen-devel@lists.xenproject.org; Juergen Gross <jgross@suse.com>; Wei
>> Liu
>>>> <wl@xen.org>
>>>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
>>>>
>>>> On 01.10.2019 10:52, Paul Durrant wrote:
>>>>>> -----Original Message-----
>>>>>> From: Jan Beulich <jbeulich@suse.com>
>>>>>> Sent: 01 October 2019 09:46
>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>>>> Cc: xen-devel@lists.xenproject.org; Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne
>>>>>> <roger.pau@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Juergen Gross
>> <jgross@suse.com>;
>>>> Wei
>>>>>> Liu <wl@xen.org>
>>>>>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
>>>>>>
>>>>>> On 01.10.2019 10:28, Paul Durrant wrote:
>>>>>>> Now that xl.cfg has an option to explicitly enable IOMMU mappings for a
>>>>>>> domain, migration may be needlessly vetoed due to the check of
>>>>>>> is_iommu_enabled() in paging_log_dirty_enable().
>>>>>>> There is actually no need to prevent logdirty from being enabled unless
>>>>>>> devices are assigned to a domain and that domain is sharing HAP mappings
>>>>>>> with the IOMMU (in which case disabling write permissions in the P2M may
>>>>>>> cause DMA faults).
>>>>>>
>>>>>> But that's taking into account only half of the reason of the
>>>>>> exclusion. The other half is that assigned devices may cause pages
>>>>>> to be dirtied behind the back of the log-dirty logic.
>>>>>
>>>>> But that's no reason to veto logdirty. Some devices have drivers (in dom0)
>>>>> which can extract DMA dirtying information and set dirty tracking
>>>>> information appropriately.
>>>>
>>>> It still needs a positive identification then: Such drivers should tell
>>>> Xen for which specific devices such information is going to be provided.
>>>
>>> Why does the hypervisor need have the right of veto though? Surely it is
>>> the toolstack that should decide whether a VM is migratable in the
>>> presence of assigned h/w. Xen need only be concerned with the integrity
>>> of the host, which is why the check for ETP sharing remains.
>>
>> While the tool stack is to decide, the hypervisor is expected to guarantee
>> correct data coming back from XEN_DOMCTL_SHADOW_OP_{PEEK,CLEAN}.
> 
> For some definition of 'correct', yes, and I don't think that this change violates any definition I can find in the domctl header.
> 
> Note: there are already emulators that will be playing with the dirty map on an arbitrary and unsynchronized basis because they are emulating bus mastering h/w.

But the question is, do we want the toolstack to have to become an
expert in what hardware might have external dirty tracking, and whether
such tracking is active?  At the moment that would mean either 1)
putting that information inside of libxc, or 2) duplicating it across
xapi and libxl, for instance.

One thing we could imagine is that when specific devices have an active
emulator (or whatever) propagating the dirty information, for that code
to tell Xen, "I am implementing dirty tracking for this device".  Then
when the toolstack enables logdirty, the check can be, "Are there any
devices *that don't have external dirty tracking enabled* assigned to
the guest?"

 -George
Paul Durrant Oct. 1, 2019, 10:40 a.m. UTC | #9
-----Original Message-----
> From: George Dunlap <george.dunlap@citrix.com>
> Sent: 01 October 2019 11:34
> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich' <jbeulich@suse.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne <roger.pau@citrix.com>; xen-
> devel@lists.xenproject.org; Juergen Gross <jgross@suse.com>; Wei Liu <wl@xen.org>
> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
> 
> On 10/1/19 11:24 AM, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 01 October 2019 11:15
> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Roger Pau
> >> Monne <roger.pau@citrix.com>; xen-devel@lists.xenproject.org; Juergen Gross <jgross@suse.com>; Wei
> Liu
> >> <wl@xen.org>
> >> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
> >>
> >> On 01.10.2019 11:36, Paul Durrant wrote:
> >>>> -----Original Message-----
> >>>> From: Jan Beulich <jbeulich@suse.com>
> >>>> Sent: 01 October 2019 10:19
> >>>> To: Paul Durrant <Paul.Durrant@citrix.com>
> >>>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Roger
> Pau
> >>>> Monne <roger.pau@citrix.com>; xen-devel@lists.xenproject.org; Juergen Gross <jgross@suse.com>;
> Wei
> >> Liu
> >>>> <wl@xen.org>
> >>>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
> >>>>
> >>>> On 01.10.2019 10:52, Paul Durrant wrote:
> >>>>>> -----Original Message-----
> >>>>>> From: Jan Beulich <jbeulich@suse.com>
> >>>>>> Sent: 01 October 2019 09:46
> >>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
> >>>>>> Cc: xen-devel@lists.xenproject.org; Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne
> >>>>>> <roger.pau@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Juergen Gross
> >> <jgross@suse.com>;
> >>>> Wei
> >>>>>> Liu <wl@xen.org>
> >>>>>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
> >>>>>>
> >>>>>> On 01.10.2019 10:28, Paul Durrant wrote:
> >>>>>>> Now that xl.cfg has an option to explicitly enable IOMMU mappings for a
> >>>>>>> domain, migration may be needlessly vetoed due to the check of
> >>>>>>> is_iommu_enabled() in paging_log_dirty_enable().
> >>>>>>> There is actually no need to prevent logdirty from being enabled unless
> >>>>>>> devices are assigned to a domain and that domain is sharing HAP mappings
> >>>>>>> with the IOMMU (in which case disabling write permissions in the P2M may
> >>>>>>> cause DMA faults).
> >>>>>>
> >>>>>> But that's taking into account only half of the reason of the
> >>>>>> exclusion. The other half is that assigned devices may cause pages
> >>>>>> to be dirtied behind the back of the log-dirty logic.
> >>>>>
> >>>>> But that's no reason to veto logdirty. Some devices have drivers (in dom0)
> >>>>> which can extract DMA dirtying information and set dirty tracking
> >>>>> information appropriately.
> >>>>
> >>>> It still needs a positive identification then: Such drivers should tell
> >>>> Xen for which specific devices such information is going to be provided.
> >>>
> >>> Why does the hypervisor need have the right of veto though? Surely it is
> >>> the toolstack that should decide whether a VM is migratable in the
> >>> presence of assigned h/w. Xen need only be concerned with the integrity
> >>> of the host, which is why the check for ETP sharing remains.
> >>
> >> While the tool stack is to decide, the hypervisor is expected to guarantee
> >> correct data coming back from XEN_DOMCTL_SHADOW_OP_{PEEK,CLEAN}.
> >
> > For some definition of 'correct', yes, and I don't think that this change violates any definition I
> can find in the domctl header.
> >
> > Note: there are already emulators that will be playing with the dirty map on an arbitrary and
> unsynchronized basis because they are emulating bus mastering h/w.
> 
> But the question is, do we want the toolstack to have to become an
> expert in what hardware might have external dirty tracking, and whether
> such tracking is active?  At the moment that would mean either 1)
> putting that information inside of libxc, or 2) duplicating it across
> xapi and libxl, for instance.

Why not? The toolstack is in charge of migration so why can't it decide whether it is 'safe' or not?

> 
> One thing we could imagine is that when specific devices have an active
> emulator (or whatever) propagating the dirty information, for that code
> to tell Xen, "I am implementing dirty tracking for this device".  Then
> when the toolstack enables logdirty, the check can be, "Are there any
> devices *that don't have external dirty tracking enabled* assigned to
> the guest?"

And what about existing emulators setting pages dirty at the moment? I don't see why Xen's internal dirty page logging is considered definitive because AFAICT that is really not the case even now.

  Paul
Jan Beulich Oct. 1, 2019, 10:50 a.m. UTC | #10
On 01.10.2019 12:40, Paul Durrant wrote:
>> From: George Dunlap <george.dunlap@citrix.com>
>> Sent: 01 October 2019 11:34
>>
>> One thing we could imagine is that when specific devices have an active
>> emulator (or whatever) propagating the dirty information, for that code
>> to tell Xen, "I am implementing dirty tracking for this device".  Then
>> when the toolstack enables logdirty, the check can be, "Are there any
>> devices *that don't have external dirty tracking enabled* assigned to
>> the guest?"
> 
> And what about existing emulators setting pages dirty at the moment? I
> don't see why Xen's internal dirty page logging is considered definitive
> because AFAICT that is really not the case even now.

I don't think external emulators already setting pages dirty matter here.
All they want/need to do is advertise which device(s) they take care of.
These emulators actually _help_ Xen maintain a correct picture. What your
patch imo does though is (further) weaken the current model.

Jan
Paul Durrant Oct. 1, 2019, 10:57 a.m. UTC | #11
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 01 October 2019 11:50
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: George Dunlap <George.Dunlap@citrix.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau
> Monne <roger.pau@citrix.com>; xen-devel@lists.xenproject.org; Juergen Gross <jgross@suse.com>; Wei Liu
> <wl@xen.org>
> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
> 
> On 01.10.2019 12:40, Paul Durrant wrote:
> >> From: George Dunlap <george.dunlap@citrix.com>
> >> Sent: 01 October 2019 11:34
> >>
> >> One thing we could imagine is that when specific devices have an active
> >> emulator (or whatever) propagating the dirty information, for that code
> >> to tell Xen, "I am implementing dirty tracking for this device".  Then
> >> when the toolstack enables logdirty, the check can be, "Are there any
> >> devices *that don't have external dirty tracking enabled* assigned to
> >> the guest?"
> >
> > And what about existing emulators setting pages dirty at the moment? I
> > don't see why Xen's internal dirty page logging is considered definitive
> > because AFAICT that is really not the case even now.
> 
> I don't think external emulators already setting pages dirty matter here.
> All they want/need to do is advertise which device(s) they take care of.
> These emulators actually _help_ Xen maintain a correct picture. What your
> patch imo does though is (further) weaken the current model.
> 

Well that's where we disagree. I don't think the hypervisor currently is the authoritative source of information on the state of the domain. IMO that is what the toolstack is for and Xen should not be refusing to provide its input to the dirty page tracking information simply because it may not have the complete picture.

  Paul

> Jan
George Dunlap Oct. 1, 2019, 11:07 a.m. UTC | #12
On 10/1/19 11:40 AM, Paul Durrant wrote:
>  -----Original Message-----
>> From: George Dunlap <george.dunlap@citrix.com>
>> Sent: 01 October 2019 11:34
>> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich' <jbeulich@suse.com>
>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne <roger.pau@citrix.com>; xen-
>> devel@lists.xenproject.org; Juergen Gross <jgross@suse.com>; Wei Liu <wl@xen.org>
>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
>>
>> On 10/1/19 11:24 AM, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>> Sent: 01 October 2019 11:15
>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Roger Pau
>>>> Monne <roger.pau@citrix.com>; xen-devel@lists.xenproject.org; Juergen Gross <jgross@suse.com>; Wei
>> Liu
>>>> <wl@xen.org>
>>>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
>>>>
>>>> On 01.10.2019 11:36, Paul Durrant wrote:
>>>>>> -----Original Message-----
>>>>>> From: Jan Beulich <jbeulich@suse.com>
>>>>>> Sent: 01 October 2019 10:19
>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>>>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Roger
>> Pau
>>>>>> Monne <roger.pau@citrix.com>; xen-devel@lists.xenproject.org; Juergen Gross <jgross@suse.com>;
>> Wei
>>>> Liu
>>>>>> <wl@xen.org>
>>>>>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
>>>>>>
>>>>>> On 01.10.2019 10:52, Paul Durrant wrote:
>>>>>>>> -----Original Message-----
>>>>>>>> From: Jan Beulich <jbeulich@suse.com>
>>>>>>>> Sent: 01 October 2019 09:46
>>>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>>>>>> Cc: xen-devel@lists.xenproject.org; Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne
>>>>>>>> <roger.pau@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Juergen Gross
>>>> <jgross@suse.com>;
>>>>>> Wei
>>>>>>>> Liu <wl@xen.org>
>>>>>>>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
>>>>>>>>
>>>>>>>> On 01.10.2019 10:28, Paul Durrant wrote:
>>>>>>>>> Now that xl.cfg has an option to explicitly enable IOMMU mappings for a
>>>>>>>>> domain, migration may be needlessly vetoed due to the check of
>>>>>>>>> is_iommu_enabled() in paging_log_dirty_enable().
>>>>>>>>> There is actually no need to prevent logdirty from being enabled unless
>>>>>>>>> devices are assigned to a domain and that domain is sharing HAP mappings
>>>>>>>>> with the IOMMU (in which case disabling write permissions in the P2M may
>>>>>>>>> cause DMA faults).
>>>>>>>>
>>>>>>>> But that's taking into account only half of the reason of the
>>>>>>>> exclusion. The other half is that assigned devices may cause pages
>>>>>>>> to be dirtied behind the back of the log-dirty logic.
>>>>>>>
>>>>>>> But that's no reason to veto logdirty. Some devices have drivers (in dom0)
>>>>>>> which can extract DMA dirtying information and set dirty tracking
>>>>>>> information appropriately.
>>>>>>
>>>>>> It still needs a positive identification then: Such drivers should tell
>>>>>> Xen for which specific devices such information is going to be provided.
>>>>>
>>>>> Why does the hypervisor need have the right of veto though? Surely it is
>>>>> the toolstack that should decide whether a VM is migratable in the
>>>>> presence of assigned h/w. Xen need only be concerned with the integrity
>>>>> of the host, which is why the check for ETP sharing remains.
>>>>
>>>> While the tool stack is to decide, the hypervisor is expected to guarantee
>>>> correct data coming back from XEN_DOMCTL_SHADOW_OP_{PEEK,CLEAN}.
>>>
>>> For some definition of 'correct', yes, and I don't think that this change violates any definition I
>> can find in the domctl header.
>>>
>>> Note: there are already emulators that will be playing with the dirty map on an arbitrary and
>> unsynchronized basis because they are emulating bus mastering h/w.
>>
>> But the question is, do we want the toolstack to have to become an
>> expert in what hardware might have external dirty tracking, and whether
>> such tracking is active?  At the moment that would mean either 1)
>> putting that information inside of libxc, or 2) duplicating it across
>> xapi and libxl, for instance.
> 
> Why not? The toolstack is in charge of migration so why can't it decide whether it is 'safe' or not?

First of all, it's not about what the toolstack can decide; it's what it
knows.  It doesn't currently know anything about the details of devices
themselves or how they relate to other functionality, such as migration.
 Xen knows about specific devices, and could (for instance) special-case
specific device IDs or what-not; and Xen knows that device pass-through
interacts with logdirty.  You're proposing to teach the toolstack about
all of that.

Secondly, you haven't answered the question about duplication.  Where do
you propose to put this functionality?

I'm not 100% opposed to the idea of doing it in the toolstack; defining
"correct" for the logdirty functionality to be "collect changes made by
processors, I'll worry about changes made by devices" is certainly a
reasonable approach, particularly as *somewhere* in the toolstack will
eventually have to actually start these external emulators.

But knowing that emulator X goes with bdf Y is one thing; knowing that
1) devices interact with logdirty, and 2) emulator X knows how to
mitigate that is a different thing.

 -George
Andrew Cooper Oct. 1, 2019, 11:08 a.m. UTC | #13
On 01/10/2019 09:46, Jan Beulich wrote:
> On 01.10.2019 10:28, Paul Durrant wrote:
>> Now that xl.cfg has an option to explicitly enable IOMMU mappings for a
>> domain, migration may be needlessly vetoed due to the check of
>> is_iommu_enabled() in paging_log_dirty_enable().
>> There is actually no need to prevent logdirty from being enabled unless
>> devices are assigned to a domain and that domain is sharing HAP mappings
>> with the IOMMU (in which case disabling write permissions in the P2M may
>> cause DMA faults).
> But that's taking into account only half of the reason of the
> exclusion. The other half is that assigned devices may cause pages
> to be dirtied behind the back of the log-dirty logic.

No.  Very specifically not.

Xen cannot correctly evaluate whether a domain is migrateable or not,
and therefore is unable to make a correct veto decision.

The existing DOMCTL_disable_migrate is bogus and I'm in the process of
purging it from Xen.

Also, I seem to recall nacking this change originally, specifically on
the basis that it breaks live migration with GPUs, so I'm rather
confused to find the code present...

~Andrew
Jan Beulich Oct. 1, 2019, 12:01 p.m. UTC | #14
On 01.10.2019 13:08, Andrew Cooper wrote:
> On 01/10/2019 09:46, Jan Beulich wrote:
>> On 01.10.2019 10:28, Paul Durrant wrote:
>>> Now that xl.cfg has an option to explicitly enable IOMMU mappings for a
>>> domain, migration may be needlessly vetoed due to the check of
>>> is_iommu_enabled() in paging_log_dirty_enable().
>>> There is actually no need to prevent logdirty from being enabled unless
>>> devices are assigned to a domain and that domain is sharing HAP mappings
>>> with the IOMMU (in which case disabling write permissions in the P2M may
>>> cause DMA faults).
>> But that's taking into account only half of the reason of the
>> exclusion. The other half is that assigned devices may cause pages
>> to be dirtied behind the back of the log-dirty logic.
> 
> No.  Very specifically not.
> 
> Xen cannot correctly evaluate whether a domain is migrateable or not,
> and therefore is unable to make a correct veto decision.

Then it should be put in the position, shouldn't it? Like with the
suggested reporting of what devices have log-dirty handling taken care
of (which, to take into account a remark by Paul, I think should go
without saying for emulated devices).

> The existing DOMCTL_disable_migrate is bogus and I'm in the process of
> purging it from Xen.

That domctl is pretty much orthogonal I think.

> Also, I seem to recall nacking this change originally, specifically on
> the basis that it breaks live migration with GPUs, so I'm rather
> confused to find the code present...

I've gone and checked - there's no response by you that I could find in
the list archives on any of the v2 through v4 submissions by Roger (in
the April-June 2014 time frame; v1 looks to not have had this change yet).
Hence it became commit 37201c6203.

Jan
Paul Durrant Oct. 1, 2019, 12:32 p.m. UTC | #15
On Tue, 1 Oct 2019 at 12:09, George Dunlap <george.dunlap@citrix.com> wrote:
>
> On 10/1/19 11:40 AM, Paul Durrant wrote:
> >  -----Original Message-----
> >> From: George Dunlap <george.dunlap@citrix.com>
> >> Sent: 01 October 2019 11:34
> >> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich' <jbeulich@suse.com>
> >> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne <roger.pau@citrix.com>; xen-
> >> devel@lists.xenproject.org; Juergen Gross <jgross@suse.com>; Wei Liu <wl@xen.org>
> >> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
> >>
> >> On 10/1/19 11:24 AM, Paul Durrant wrote:
> >>>> -----Original Message-----
> >>>> From: Jan Beulich <jbeulich@suse.com>
> >>>> Sent: 01 October 2019 11:15
> >>>> To: Paul Durrant <Paul.Durrant@citrix.com>
> >>>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Roger Pau
> >>>> Monne <roger.pau@citrix.com>; xen-devel@lists.xenproject.org; Juergen Gross <jgross@suse.com>; Wei
> >> Liu
> >>>> <wl@xen.org>
> >>>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
> >>>>
> >>>> On 01.10.2019 11:36, Paul Durrant wrote:
> >>>>>> -----Original Message-----
> >>>>>> From: Jan Beulich <jbeulich@suse.com>
> >>>>>> Sent: 01 October 2019 10:19
> >>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
> >>>>>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Roger
> >> Pau
> >>>>>> Monne <roger.pau@citrix.com>; xen-devel@lists.xenproject.org; Juergen Gross <jgross@suse.com>;
> >> Wei
> >>>> Liu
> >>>>>> <wl@xen.org>
> >>>>>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
> >>>>>>
> >>>>>> On 01.10.2019 10:52, Paul Durrant wrote:
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Jan Beulich <jbeulich@suse.com>
> >>>>>>>> Sent: 01 October 2019 09:46
> >>>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
> >>>>>>>> Cc: xen-devel@lists.xenproject.org; Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne
> >>>>>>>> <roger.pau@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Juergen Gross
> >>>> <jgross@suse.com>;
> >>>>>> Wei
> >>>>>>>> Liu <wl@xen.org>
> >>>>>>>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
> >>>>>>>>
> >>>>>>>> On 01.10.2019 10:28, Paul Durrant wrote:
> >>>>>>>>> Now that xl.cfg has an option to explicitly enable IOMMU mappings for a
> >>>>>>>>> domain, migration may be needlessly vetoed due to the check of
> >>>>>>>>> is_iommu_enabled() in paging_log_dirty_enable().
> >>>>>>>>> There is actually no need to prevent logdirty from being enabled unless
> >>>>>>>>> devices are assigned to a domain and that domain is sharing HAP mappings
> >>>>>>>>> with the IOMMU (in which case disabling write permissions in the P2M may
> >>>>>>>>> cause DMA faults).
> >>>>>>>>
> >>>>>>>> But that's taking into account only half of the reason of the
> >>>>>>>> exclusion. The other half is that assigned devices may cause pages
> >>>>>>>> to be dirtied behind the back of the log-dirty logic.
> >>>>>>>
> >>>>>>> But that's no reason to veto logdirty. Some devices have drivers (in dom0)
> >>>>>>> which can extract DMA dirtying information and set dirty tracking
> >>>>>>> information appropriately.
> >>>>>>
> >>>>>> It still needs a positive identification then: Such drivers should tell
> >>>>>> Xen for which specific devices such information is going to be provided.
> >>>>>
> >>>>> Why does the hypervisor need have the right of veto though? Surely it is
> >>>>> the toolstack that should decide whether a VM is migratable in the
> >>>>> presence of assigned h/w. Xen need only be concerned with the integrity
> >>>>> of the host, which is why the check for ETP sharing remains.
> >>>>
> >>>> While the tool stack is to decide, the hypervisor is expected to guarantee
> >>>> correct data coming back from XEN_DOMCTL_SHADOW_OP_{PEEK,CLEAN}.
> >>>
> >>> For some definition of 'correct', yes, and I don't think that this change violates any definition I
> >> can find in the domctl header.
> >>>
> >>> Note: there are already emulators that will be playing with the dirty map on an arbitrary and
> >> unsynchronized basis because they are emulating bus mastering h/w.
> >>
> >> But the question is, do we want the toolstack to have to become an
> >> expert in what hardware might have external dirty tracking, and whether
> >> such tracking is active?  At the moment that would mean either 1)
> >> putting that information inside of libxc, or 2) duplicating it across
> >> xapi and libxl, for instance.
> >
> > Why not? The toolstack is in charge of migration so why can't it decide whether it is 'safe' or not?
>
> First of all, it's not about what the toolstack can decide; it's what it
> knows.  It doesn't currently know anything about the details of devices
> themselves or how they relate to other functionality, such as migration.

Doesn't it? Why? It can, in the face of an arbitrary device, use an
emulator such as QEMU to deal with the pass-through and having so
decided knows that it can't get dirty page information, and hence the
domain cannot be safely migrated. In the face of a device it knows
about though e.g. a GPU, it can run a dedicated emulator from which it
can get dirty page information and hence (providing shared EPT is not
in use) it knows the domain can be migrated.

>  Xen knows about specific devices, and could (for instance) special-case
> specific device IDs or what-not; and Xen knows that device pass-through
> interacts with logdirty.  You're proposing to teach the toolstack about
> all of that.
>

Indeed, because it is in a much better position to know that kind of
information and indeed XAPI already does know that kind of
informaiton.

> Secondly, you haven't answered the question about duplication.  Where do
> you propose to put this functionality?
>

Different toolstacks can have different capabilities. If libxl is
unaware of a devices capability to provide dirty page information, but
XAPI is aware, then why is that a problem?

> I'm not 100% opposed to the idea of doing it in the toolstack; defining
> "correct" for the logdirty functionality to be "collect changes made by
> processors, I'll worry about changes made by devices" is certainly a
> reasonable approach, particularly as *somewhere* in the toolstack will
> eventually have to actually start these external emulators.
>
> But knowing that emulator X goes with bdf Y is one thing; knowing that
> 1) devices interact with logdirty, and 2) emulator X knows how to
> mitigate that is a different thing.

It is, but I don't see that Xen should have any right of veto over
what a paticular toolstack wishes to do with the domains it has
created.

  Paul

>
>  -George
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel
George Dunlap Oct. 1, 2019, 1:06 p.m. UTC | #16
On 10/1/19 1:32 PM, Paul Durrant wrote:
> On Tue, 1 Oct 2019 at 12:09, George Dunlap <george.dunlap@citrix.com> wrote:
>>
>> On 10/1/19 11:40 AM, Paul Durrant wrote:
>>>  -----Original Message-----
>>>> From: George Dunlap <george.dunlap@citrix.com>
>>>> Sent: 01 October 2019 11:34
>>>> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich' <jbeulich@suse.com>
>>>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne <roger.pau@citrix.com>; xen-
>>>> devel@lists.xenproject.org; Juergen Gross <jgross@suse.com>; Wei Liu <wl@xen.org>
>>>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
>>>>
>>>> On 10/1/19 11:24 AM, Paul Durrant wrote:
>>>>>> -----Original Message-----
>>>>>> From: Jan Beulich <jbeulich@suse.com>
>>>>>> Sent: 01 October 2019 11:15
>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>>>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Roger Pau
>>>>>> Monne <roger.pau@citrix.com>; xen-devel@lists.xenproject.org; Juergen Gross <jgross@suse.com>; Wei
>>>> Liu
>>>>>> <wl@xen.org>
>>>>>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
>>>>>>
>>>>>> On 01.10.2019 11:36, Paul Durrant wrote:
>>>>>>>> -----Original Message-----
>>>>>>>> From: Jan Beulich <jbeulich@suse.com>
>>>>>>>> Sent: 01 October 2019 10:19
>>>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>>>>>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Roger
>>>> Pau
>>>>>>>> Monne <roger.pau@citrix.com>; xen-devel@lists.xenproject.org; Juergen Gross <jgross@suse.com>;
>>>> Wei
>>>>>> Liu
>>>>>>>> <wl@xen.org>
>>>>>>>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
>>>>>>>>
>>>>>>>> On 01.10.2019 10:52, Paul Durrant wrote:
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Jan Beulich <jbeulich@suse.com>
>>>>>>>>>> Sent: 01 October 2019 09:46
>>>>>>>>>> To: Paul Durrant <Paul.Durrant@citrix.com>
>>>>>>>>>> Cc: xen-devel@lists.xenproject.org; Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne
>>>>>>>>>> <roger.pau@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Juergen Gross
>>>>>> <jgross@suse.com>;
>>>>>>>> Wei
>>>>>>>>>> Liu <wl@xen.org>
>>>>>>>>>> Subject: Re: [Xen-devel] [PATCH-for-4.13] x86/mm: don't needlessly veto migration
>>>>>>>>>>
>>>>>>>>>> On 01.10.2019 10:28, Paul Durrant wrote:
>>>>>>>>>>> Now that xl.cfg has an option to explicitly enable IOMMU mappings for a
>>>>>>>>>>> domain, migration may be needlessly vetoed due to the check of
>>>>>>>>>>> is_iommu_enabled() in paging_log_dirty_enable().
>>>>>>>>>>> There is actually no need to prevent logdirty from being enabled unless
>>>>>>>>>>> devices are assigned to a domain and that domain is sharing HAP mappings
>>>>>>>>>>> with the IOMMU (in which case disabling write permissions in the P2M may
>>>>>>>>>>> cause DMA faults).
>>>>>>>>>>
>>>>>>>>>> But that's taking into account only half of the reason of the
>>>>>>>>>> exclusion. The other half is that assigned devices may cause pages
>>>>>>>>>> to be dirtied behind the back of the log-dirty logic.
>>>>>>>>>
>>>>>>>>> But that's no reason to veto logdirty. Some devices have drivers (in dom0)
>>>>>>>>> which can extract DMA dirtying information and set dirty tracking
>>>>>>>>> information appropriately.
>>>>>>>>
>>>>>>>> It still needs a positive identification then: Such drivers should tell
>>>>>>>> Xen for which specific devices such information is going to be provided.
>>>>>>>
>>>>>>> Why does the hypervisor need have the right of veto though? Surely it is
>>>>>>> the toolstack that should decide whether a VM is migratable in the
>>>>>>> presence of assigned h/w. Xen need only be concerned with the integrity
>>>>>>> of the host, which is why the check for ETP sharing remains.
>>>>>>
>>>>>> While the tool stack is to decide, the hypervisor is expected to guarantee
>>>>>> correct data coming back from XEN_DOMCTL_SHADOW_OP_{PEEK,CLEAN}.
>>>>>
>>>>> For some definition of 'correct', yes, and I don't think that this change violates any definition I
>>>> can find in the domctl header.
>>>>>
>>>>> Note: there are already emulators that will be playing with the dirty map on an arbitrary and
>>>> unsynchronized basis because they are emulating bus mastering h/w.
>>>>
>>>> But the question is, do we want the toolstack to have to become an
>>>> expert in what hardware might have external dirty tracking, and whether
>>>> such tracking is active?  At the moment that would mean either 1)
>>>> putting that information inside of libxc, or 2) duplicating it across
>>>> xapi and libxl, for instance.
>>>
>>> Why not? The toolstack is in charge of migration so why can't it decide whether it is 'safe' or not?
>>
>> First of all, it's not about what the toolstack can decide; it's what it
>> knows.  It doesn't currently know anything about the details of devices
>> themselves or how they relate to other functionality, such as migration.
> 
> Doesn't it? Why?
[snip]
> It is, but I don't see that Xen should have any right of veto over
> what a paticular toolstack wishes to do with the domains it has
> created.

I feel like the temperature of this conversation is really high, and I
can't really figure out why.  Could I ask that we try to turn down the
heat a bit, and perhaps help Jan and I figure out where you're coming from?

> It can, in the face of an arbitrary device, use an
> emulator such as QEMU to deal with the pass-through and having so
> decided knows that it can't get dirty page information, and hence the
> domain cannot be safely migrated. In the face of a device it knows
> about though e.g. a GPU, it can run a dedicated emulator from which it
> can get dirty page information and hence (providing shared EPT is not
> in use) it knows the domain can be migrated.

xapi knows what devices *it has asked Xen to pass through*.  Xen knows
*what devices it gave to the guest*.

xapi can *gather* specific information about devices (topology,
characteristics, &c) from Linux and Xen; Xen has it already.

It seems you've encoded in xapi information about how Xen is
implemented.  That could change: More features could begin to interact
with logdirty, or with devices which can implement their own
logdirty-like functionality.  If/when those changes happen, we can
update the rules for when logdirty works within the patch series itself
in Xen.  If you instead encode that knowlegde in xapi, then xapi needs
to be updated to keep in sync with the internal implementation of the
hypervisor.

In any case, having emulators which can handle logdirty externally
report their own capability seems a much better way of doing things than
hard-coding in xapi which emulators know how to do what.

>> Secondly, you haven't answered the question about duplication.  Where do
>> you propose to put this functionality?
>>
> 
> Different toolstacks can have different capabilities. If libxl is
> unaware of a devices capability to provide dirty page information, but
> XAPI is aware, then why is that a problem?

So it sounds like you've already done a lot of this work in xapi.  But
that only benefits XenServer and derivatives: all of Citrix's other
partners who want to do something similar will have to completely
duplicate all of that functionality.  It should be obvious why that's
sub-optimal.

 -George
Paul Durrant Oct. 1, 2019, 1:29 p.m. UTC | #17
On Tue, 1 Oct 2019 at 14:06, George Dunlap <george.dunlap@citrix.com> wrote:
>
[snip]
> >>>> But the question is, do we want the toolstack to have to become an
> >>>> expert in what hardware might have external dirty tracking, and whether
> >>>> such tracking is active?  At the moment that would mean either 1)
> >>>> putting that information inside of libxc, or 2) duplicating it across
> >>>> xapi and libxl, for instance.
> >>>
> >>> Why not? The toolstack is in charge of migration so why can't it decide whether it is 'safe' or not?
> >>
> >> First of all, it's not about what the toolstack can decide; it's what it
> >> knows.  It doesn't currently know anything about the details of devices
> >> themselves or how they relate to other functionality, such as migration.
> >
> > Doesn't it? Why?
> [snip]
> > It is, but I don't see that Xen should have any right of veto over
> > what a paticular toolstack wishes to do with the domains it has
> > created.
>
> I feel like the temperature of this conversation is really high, and I
> can't really figure out why.  Could I ask that we try to turn down the
> heat a bit, and perhaps help Jan and I figure out where you're coming from?
>

It is not my intention it appears this way, but text-based
communication is not great in this kind of case.

I'll explain the background, whilst trying to keep proprietary stuff
proprietary... We (Citrix) have some SR-IOV hardware where VFs are
passed through to a domain via a dedicated emulator. The PF can
potentially provide information about which guest pages have been
dirtied by DMA, and thus it is feasible to migrate domains with VFs
assigned. However, current development is frustrated by Xen's refusal
to enable logdirty and so we now have a patch (a prior version of the
one I posted) that rectifies this. I would now like to get this
upstream as I do not see that any harm can be caused to the system by
allowing a domain with IOMMU mappings to have logdirty enabled, unless
the P2M is shared.

> > It can, in the face of an arbitrary device, use an
> > emulator such as QEMU to deal with the pass-through and having so
> > decided knows that it can't get dirty page information, and hence the
> > domain cannot be safely migrated. In the face of a device it knows
> > about though e.g. a GPU, it can run a dedicated emulator from which it
> > can get dirty page information and hence (providing shared EPT is not
> > in use) it knows the domain can be migrated.
>
> xapi knows what devices *it has asked Xen to pass through*.  Xen knows
> *what devices it gave to the guest*.

I don't understand the difference.

>
> xapi can *gather* specific information about devices (topology,
> characteristics, &c) from Linux and Xen; Xen has it already.
>
> It seems you've encoded in xapi information about how Xen is
> implemented.  That could change: More features could begin to interact
> with logdirty, or with devices which can implement their own
> logdirty-like functionality.  If/when those changes happen, we can
> update the rules for when logdirty works within the patch series itself
> in Xen.  If you instead encode that knowlegde in xapi, then xapi needs
> to be updated to keep in sync with the internal implementation of the
> hypervisor.
>

I don't follow. I don't think we have encoded information about how
Xen works. What we have are buch of things involved in migration...
Xen is one, and all the others are emulators. So, we enable logdirty
in Xen and then ask it what pages the guest has touched, and we enable
dirty tracking in the emulators and ask them what pages they (or the
h/w they manage) as touched. That info. feeds into a combined dirty
page list and that then informs the migration stream.

> In any case, having emulators which can handle logdirty externally
> report their own capability seems a much better way of doing things than
> hard-coding in xapi which emulators know how to do what.

In the long run, yes, but I don't see why Xen should be making the
assumption that no current emulator can do that. It seems entirely the
wrong place to be doing that. At some level Xen is 'just another
emulator' and ought not to care what its peers are capable of.

>
> >> Secondly, you haven't answered the question about duplication.  Where do
> >> you propose to put this functionality?
> >>
> >
> > Different toolstacks can have different capabilities. If libxl is
> > unaware of a devices capability to provide dirty page information, but
> > XAPI is aware, then why is that a problem?
>
> So it sounds like you've already done a lot of this work in xapi.  But
> that only benefits XenServer and derivatives: all of Citrix's other
> partners who want to do something similar will have to completely
> duplicate all of that functionality.  It should be obvious why that's
> sub-optimal.

The changes in XAPI are not vast; the main complexity is in the device
emulator (to provide information during the live phase of migration)
but I still don't see why Citrix's choice of closed vs. open source
implementation of the emulator really has anything to do with this. It
is still my opinion that Xen's only valid reason for refusing to
enable logdirty for a domain is one of host safety and I still haven't
heard an argument as to why Xen *is* right to refuse in other
circumstances.

  Paul

>
>  -George
Jan Beulich Oct. 1, 2019, 2:24 p.m. UTC | #18
On 01.10.2019 15:29, Paul Durrant wrote:
> The changes in XAPI are not vast; the main complexity is in the device
> emulator (to provide information during the live phase of migration)
> but I still don't see why Citrix's choice of closed vs. open source
> implementation of the emulator really has anything to do with this. It
> is still my opinion that Xen's only valid reason for refusing to
> enable logdirty for a domain is one of host safety and I still haven't
> heard an argument as to why Xen *is* right to refuse in other
> circumstances.

Let me take a completely different example for comparison:
There's no risk to the host in assigning the same, say, USB
controller to two guests. Yet Xen refuses to do so, even if the
tool stack didn't already filter such attempts, and even if the
admin may know that the two domains are cooperating, and hence
wouldn't get in the way of one another. (There are, I think,
many other similar examples.)

That said I can certainly see the validity of your and Andrew's
argumentation. It's just that, as in various other cases, I
don't think that's the only reasonable way of arranging things.
Hence at the very least your change would imo need to come with
an extended description.

Jan
Paul Durrant Oct. 1, 2019, 2:32 p.m. UTC | #19
On Tue, 1 Oct 2019 at 15:24, Jan Beulich <jbeulich@suse.com> wrote:
>
> On 01.10.2019 15:29, Paul Durrant wrote:
> > The changes in XAPI are not vast; the main complexity is in the device
> > emulator (to provide information during the live phase of migration)
> > but I still don't see why Citrix's choice of closed vs. open source
> > implementation of the emulator really has anything to do with this. It
> > is still my opinion that Xen's only valid reason for refusing to
> > enable logdirty for a domain is one of host safety and I still haven't
> > heard an argument as to why Xen *is* right to refuse in other
> > circumstances.
>
> Let me take a completely different example for comparison:
> There's no risk to the host in assigning the same, say, USB
> controller to two guests. Yet Xen refuses to do so, even if the
> tool stack didn't already filter such attempts, and even if the
> admin may know that the two domains are cooperating, and hence
> wouldn't get in the way of one another. (There are, I think,
> many other similar examples.)

That sounds like a perfectly valid thing to do if the s/w running in
the domains is trusted to co-operate as you say. For NVIDIA vGPU
implementations different MMIO regions of the same PCI device are
mapped into different guests.

>
> That said I can certainly see the validity of your and Andrew's
> argumentation. It's just that, as in various other cases, I
> don't think that's the only reasonable way of arranging things.
> Hence at the very least your change would imo need to come with
> an extended description.
>

Ok, I will expand on the reasoning in the commit comment and post v2.

  Paul
diff mbox series

Patch

diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index 412a442b6a..3d93f3451c 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -71,7 +71,7 @@  int hap_track_dirty_vram(struct domain *d,
 
         if ( !paging_mode_log_dirty(d) )
         {
-            rc = paging_log_dirty_enable(d, 0);
+            rc = paging_log_dirty_enable(d, false);
             if ( rc )
                 goto out;
         }
diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
index d9a52c4db4..240f6f93fb 100644
--- a/xen/arch/x86/mm/paging.c
+++ b/xen/arch/x86/mm/paging.c
@@ -209,15 +209,15 @@  static int paging_free_log_dirty_bitmap(struct domain *d, int rc)
     return rc;
 }
 
-int paging_log_dirty_enable(struct domain *d, bool_t log_global)
+int paging_log_dirty_enable(struct domain *d, bool log_global)
 {
     int ret;
 
-    if ( is_iommu_enabled(d) && log_global )
+    if ( has_arch_pdevs(d) && iommu_use_hap_pt(d) && log_global )
     {
         /*
          * Refuse to turn on global log-dirty mode
-         * if the domain is using the IOMMU.
+         * if the domain is sharing the P2M with the IOMMU.
          */
         return -EINVAL;
     }
@@ -727,7 +727,7 @@  int paging_domctl(struct domain *d, struct xen_domctl_shadow_op *sc,
             break;
         /* Else fall through... */
     case XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY:
-        return paging_log_dirty_enable(d, 1);
+        return paging_log_dirty_enable(d, true);
 
     case XEN_DOMCTL_SHADOW_OP_OFF:
         if ( (rc = paging_log_dirty_disable(d, resuming)) != 0 )
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 7deef2f12b..9614dca8c1 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1486,11 +1486,15 @@  static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
     if ( !is_iommu_enabled(d) )
         return 0;
 
-    /* Prevent device assign if mem paging or mem sharing have been 
-     * enabled for this domain */
+    /*
+     * Prevent device assign if mem paging or mem sharing have been
+     * enabled for this domain, or logdirty is enabled and the P2M is
+     * shared with the IOMMU.
+     */
     if ( unlikely(d->arch.hvm.mem_sharing_enabled ||
                   vm_event_check_ring(d->vm_event_paging) ||
-                  p2m_get_hostp2m(d)->global_logdirty) )
+                  (p2m_get_hostp2m(d)->global_logdirty &&
+                   iommu_use_hap_pt(d))) )
         return -EXDEV;
 
     if ( !pcidevs_trylock() )
diff --git a/xen/include/asm-x86/paging.h b/xen/include/asm-x86/paging.h
index ab7887f23c..8c2027c791 100644
--- a/xen/include/asm-x86/paging.h
+++ b/xen/include/asm-x86/paging.h
@@ -157,7 +157,7 @@  void paging_log_dirty_range(struct domain *d,
                             uint8_t *dirty_bitmap);
 
 /* enable log dirty */
-int paging_log_dirty_enable(struct domain *d, bool_t log_global);
+int paging_log_dirty_enable(struct domain *d, bool log_global);
 
 /* log dirty initialization */
 void paging_log_dirty_init(struct domain *d, const struct log_dirty_ops *ops);