Message ID | 573D4D7E.3080208@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
>>> Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> 05/19/16 7:22 AM >>> >On 05/16/2016 03:19 AM, Paul Durrant wrote: >> >From:suravee.suthikulpanit@amd.com >> >Sent: 13 May 2016 20:37 >>> >The guest_iommu_init() is currently called by the following code path: >>> > >>> > arch/x86/domain.c: arch_domain_create() >>> > ]- drivers/passthrough/iommu.c: iommu_domain_init() >>> > |- drivers/passthrough/amd/pci_amd_iommu.c: >>> >amd_iommu_domain_init(); >>> > |- drivers/passthrough/amd/iommu_guest.c: guest_iommu_init() >>> > >>> >At this point, the hvm_domain_initialised() has not been >>> >called. So register_mmio_handler(), in guest_iommu_init(), silently fails. >>> >This patch moves the call to guest_iommu_init/destroy() into >>> >the svm_domain_intialise/_destroy() instead. >>> > >> That seems wrong. You're taking a call that currently comes via a jump table, i.e. an abstraction layer, and calling it directly. Is it possible, instead, to move the call to iommu_domain_init() later in arch_domain_create()? It seems odd, to me at least, that it's done before hvm_domain_initialise() anyway. > >Good point. I think I should be able to move iommu_domain_init() call to >go after hvm_domain_initialise() as the following. > >--- a/xen/arch/x86/domain.c >+++ b/xen/arch/x86/domain.c >@@ -625,24 +625,21 @@ int arch_domain_create(struct domain *d, unsigned >int domcr_flags, > >if ( (rc = init_domain_irq_mapping(d)) != 0 ) >goto fail; >- >- if ( (rc = iommu_domain_init(d)) != 0 ) >- goto fail; >} >spin_lock_init(&d->arch.e820_lock); > >if ( has_hvm_container_domain(d) ) >{ >if ( (rc = hvm_domain_initialise(d)) != 0 ) >- { >- iommu_domain_destroy(d); >goto fail; >- } >} >else >/* 64-bit PV guest by default. */ >d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0; > >+ if ( !is_idle_domain(d) && (rc = iommu_domain_init(d)) != 0 ) >+ goto fail; This would in the error case fail to undo what hvm_domain_initialise() did. There was a fix very recently dealing with a similar issue. There really shouldn't be anything that can fail after hvm_domain_initialise(). And I also can't see why both of you think iommu_domain_init() has to come later - that's a function affecting not just HVM guests. Jan
> -----Original Message----- > From: Jan Beulich [mailto:jbeulich@suse.com] > Sent: 19 May 2016 07:04 > To: Suravee.Suthikulpanit@amd.com; Paul Durrant > Cc: George Dunlap; xen-devel@lists.xen.org; Keir (Xen.org) > Subject: Re: [PATCH V2 2/2] svm: iommu: Only call guest_iommu_init() after > initialized HVM domain > > >>> Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> 05/19/16 7:22 > AM >>> > >On 05/16/2016 03:19 AM, Paul Durrant wrote: > >> >From:suravee.suthikulpanit@amd.com > >> >Sent: 13 May 2016 20:37 > >>> >The guest_iommu_init() is currently called by the following code path: > >>> > > >>> > arch/x86/domain.c: arch_domain_create() > >>> > ]- drivers/passthrough/iommu.c: iommu_domain_init() > >>> > |- drivers/passthrough/amd/pci_amd_iommu.c: > >>> >amd_iommu_domain_init(); > >>> > |- drivers/passthrough/amd/iommu_guest.c: guest_iommu_init() > >>> > > >>> >At this point, the hvm_domain_initialised() has not been > >>> >called. So register_mmio_handler(), in guest_iommu_init(), silently > fails. > >>> >This patch moves the call to guest_iommu_init/destroy() into > >>> >the svm_domain_intialise/_destroy() instead. > >>> > > >> That seems wrong. You're taking a call that currently comes via a jump > table, i.e. an abstraction layer, and calling it directly. Is it possible, instead, to > move the call to iommu_domain_init() later in arch_domain_create()? It > seems odd, to me at least, that it's done before hvm_domain_initialise() > anyway. > > > >Good point. I think I should be able to move iommu_domain_init() call to > >go after hvm_domain_initialise() as the following. > > > >--- a/xen/arch/x86/domain.c > >+++ b/xen/arch/x86/domain.c > >@@ -625,24 +625,21 @@ int arch_domain_create(struct domain *d, > unsigned > >int domcr_flags, > > > >if ( (rc = init_domain_irq_mapping(d)) != 0 ) > >goto fail; > >- > >- if ( (rc = iommu_domain_init(d)) != 0 ) > >- goto fail; > >} > >spin_lock_init(&d->arch.e820_lock); > > > >if ( has_hvm_container_domain(d) ) > >{ > >if ( (rc = hvm_domain_initialise(d)) != 0 ) > >- { > >- iommu_domain_destroy(d); > >goto fail; > >- } > >} > >else > >/* 64-bit PV guest by default. */ > >d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0; > > > >+ if ( !is_idle_domain(d) && (rc = iommu_domain_init(d)) != 0 ) > >+ goto fail; > > This would in the error case fail to undo what hvm_domain_initialise() did. > There was a fix very recently dealing with a similar issue. There really > shouldn't be anything that can fail after hvm_domain_initialise(). Why is that? There are many failure paths within hvm_domain_initialise(). What's wrong with calling hvm_domain_destroy() to undo the whole thing? (Although I do notice that the io_bitmap would appear to leak in that case... but that looks like a bug to me). > And I also > can't see why both of you think iommu_domain_init() has to come later - > that's a function affecting not just HVM guests. > Yes, I realise that. But the problem is that, in the HVM case, it calls functions that make use of infrastructure that's initialized by hvm_domain_initialise(). Paul > Jan
>>> On 19.05.16 at 09:52, <Paul.Durrant@citrix.com> wrote: >> >--- a/xen/arch/x86/domain.c >> >+++ b/xen/arch/x86/domain.c >> >@@ -625,24 +625,21 @@ int arch_domain_create(struct domain *d, >> unsigned >> >int domcr_flags, >> > >> >if ( (rc = init_domain_irq_mapping(d)) != 0 ) >> >goto fail; >> >- >> >- if ( (rc = iommu_domain_init(d)) != 0 ) >> >- goto fail; >> >} >> >spin_lock_init(&d->arch.e820_lock); >> > >> >if ( has_hvm_container_domain(d) ) >> >{ >> >if ( (rc = hvm_domain_initialise(d)) != 0 ) >> >- { >> >- iommu_domain_destroy(d); >> >goto fail; >> >- } >> >} >> >else >> >/* 64-bit PV guest by default. */ >> >d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0; >> > >> >+ if ( !is_idle_domain(d) && (rc = iommu_domain_init(d)) != 0 ) >> >+ goto fail; >> >> This would in the error case fail to undo what hvm_domain_initialise() did. >> There was a fix very recently dealing with a similar issue. There really >> shouldn't be anything that can fail after hvm_domain_initialise(). > > Why is that? There are many failure paths within hvm_domain_initialise(). > What's wrong with calling hvm_domain_destroy() to undo the whole thing? Well, I simply didn't check whether hvm_domain_destroy() would be suitable to be called in that case, but yes, it looks like it would be. > (Although I do notice that the io_bitmap would appear to leak in that case... > but that looks like a bug to me). That's a hardware domain only aspect, and failure to construct the hardware domain would be fatal to the system anyway. Jan
> -----Original Message----- > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: 19 May 2016 10:00 > To: Paul Durrant > Cc: Suravee.Suthikulpanit@amd.com; George Dunlap; xen- > devel@lists.xen.org; Keir (Xen.org) > Subject: RE: [PATCH V2 2/2] svm: iommu: Only call guest_iommu_init() after > initialized HVM domain > > >>> On 19.05.16 at 09:52, <Paul.Durrant@citrix.com> wrote: > >> >--- a/xen/arch/x86/domain.c > >> >+++ b/xen/arch/x86/domain.c > >> >@@ -625,24 +625,21 @@ int arch_domain_create(struct domain *d, > >> unsigned > >> >int domcr_flags, > >> > > >> >if ( (rc = init_domain_irq_mapping(d)) != 0 ) > >> >goto fail; > >> >- > >> >- if ( (rc = iommu_domain_init(d)) != 0 ) > >> >- goto fail; > >> >} > >> >spin_lock_init(&d->arch.e820_lock); > >> > > >> >if ( has_hvm_container_domain(d) ) > >> >{ > >> >if ( (rc = hvm_domain_initialise(d)) != 0 ) > >> >- { > >> >- iommu_domain_destroy(d); > >> >goto fail; > >> >- } > >> >} > >> >else > >> >/* 64-bit PV guest by default. */ > >> >d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0; > >> > > >> >+ if ( !is_idle_domain(d) && (rc = iommu_domain_init(d)) != 0 ) > >> >+ goto fail; > >> > >> This would in the error case fail to undo what hvm_domain_initialise() did. > >> There was a fix very recently dealing with a similar issue. There really > >> shouldn't be anything that can fail after hvm_domain_initialise(). > > > > Why is that? There are many failure paths within hvm_domain_initialise(). > > What's wrong with calling hvm_domain_destroy() to undo the whole > thing? > > Well, I simply didn't check whether hvm_domain_destroy() would be > suitable to be called in that case, but yes, it looks like it would be. > > > (Although I do notice that the io_bitmap would appear to leak in that > case... > > but that looks like a bug to me). > > That's a hardware domain only aspect, and failure to construct > the hardware domain would be fatal to the system anyway. > Ok. No problem then. Paul > Jan
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 9d43f7b..ac289b6 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -625,24 +625,21 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags, if ( (rc = init_domain_irq_mapping(d)) != 0 ) goto fail; - - if ( (rc = iommu_domain_init(d)) != 0 ) - goto fail; } spin_lock_init(&d->arch.e820_lock); if ( has_hvm_container_domain(d) ) { if ( (rc = hvm_domain_initialise(d)) != 0 ) - { - iommu_domain_destroy(d); goto fail; - } } else /* 64-bit PV guest by default. */ d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0; + if ( !is_idle_domain(d) && (rc = iommu_domain_init(d)) != 0 ) + goto fail; + if ( (rc = psr_domain_init(d)) != 0 ) goto fail;