Message ID | 5721F90402000078000E6B1B@prv-mh.provo.novell.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 28/04/16 10:50, Jan Beulich wrote: > ... as at least certain versions of Windows use such to update the > MSI-X table. However, to not overly complicate the logic for now > - only EFLAGS.DF=0 is being handled, > - only updates not crossing MSI-X table entry boundaries are handled. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> IMO, under those circumstances we should domain_crash() and be rather loud on the console. Perhaps in a debug build only, but this will be a very unpleasant issue to debug for whomever finds an OS which falls into of those unhandled situations. ~Andrew
>>> On 28.04.16 at 12:34, <andrew.cooper3@citrix.com> wrote: > On 28/04/16 10:50, Jan Beulich wrote: >> ... as at least certain versions of Windows use such to update the >> MSI-X table. However, to not overly complicate the logic for now >> - only EFLAGS.DF=0 is being handled, >> - only updates not crossing MSI-X table entry boundaries are handled. >> >> Signed-off-by: Jan Beulich <jbeulich@suse.com> > > IMO, under those circumstances we should domain_crash() and be rather > loud on the console. > > Perhaps in a debug build only, but this will be a very unpleasant issue > to debug for whomever finds an OS which falls into of those unhandled > situations. I disagree: At the time we snoop accesses, we don't know whether they're targeting any MSI-X table entry. And we shouldn't crash the guest just because it accessed _something else_ in a way the snoop logic doesn't support. If any unsupported access gets done by a guest, all that'll happen is that MSI-X again doesn't work inside the guest, i.e. the same situation as the previous and this patches are trying to deal with. (And yes, the debugging wasn't really pleasant, but that's more because the original authors didn't design the whole thing properly, and any other solution I could think of would have caused issues with the qemu/Xen interface logic.) Jan
> -----Original Message----- > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: 28 April 2016 10:50 > To: xen-devel > Cc: Andrew Cooper; Paul Durrant > Subject: [PATCH 3/3] x86/vMSI-X: also snoop REP MOVS > > ... as at least certain versions of Windows use such to update the > MSI-X table. However, to not overly complicate the logic for now > - only EFLAGS.DF=0 is being handled, > - only updates not crossing MSI-X table entry boundaries are handled. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > > --- a/xen/arch/x86/hvm/vmsi.c > +++ b/xen/arch/x86/hvm/vmsi.c > @@ -351,9 +351,10 @@ static int msixtbl_range(struct vcpu *v, > ASSERT(r->type == IOREQ_TYPE_COPY); > if ( r->dir == IOREQ_WRITE ) > { > + unsigned int size = r->size; > + > if ( !r->data_is_ptr ) > { > - unsigned int size = r->size; > uint64_t data = r->data; > > if ( size == 8 ) > @@ -366,7 +367,22 @@ static int msixtbl_range(struct vcpu *v, > ((addr & (PCI_MSIX_ENTRY_SIZE - 1)) == > PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET) && > !(data & PCI_MSIX_VECTOR_BITMASK) ) > + { > v->arch.hvm_vcpu.hvm_io.msix_snoop_address = addr; > + v->arch.hvm_vcpu.hvm_io.msix_snoop_gpa = 0; > + } > + } > + else if ( (size == 4 || size == 8) && !r->df && > + r->count && r->count <= PCI_MSIX_ENTRY_SIZE / size && > + !((addr + (size * r->count)) & (PCI_MSIX_ENTRY_SIZE - 1)) ) That's quite an if statement. Any chance of making it more decipherable? I also think it's worth putting the restrictions you outline in the commit in a comment here so that it's clear that the code is not trying to handle all corner cases. > + { > + BUILD_BUG_ON((PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET + 4) & > + (PCI_MSIX_ENTRY_SIZE - 1)); > + > + v->arch.hvm_vcpu.hvm_io.msix_snoop_address = > + addr + size * r->count - 4; > + v->arch.hvm_vcpu.hvm_io.msix_snoop_gpa = > + r->data + size * r->count - 4; Does there need to be any explicit type promotion here since r->data is uint64_t? Paul > } > } > > @@ -471,6 +487,7 @@ out: > for_each_vcpu ( d, v ) > { > if ( (v->pause_flags & VPF_blocked_in_xen) && > + !v->arch.hvm_vcpu.hvm_io.msix_snoop_gpa && > v->arch.hvm_vcpu.hvm_io.msix_snoop_address == > (gtable + msi_desc->msi_attrib.entry_nr * > PCI_MSIX_ENTRY_SIZE + > @@ -551,9 +568,29 @@ void msixtbl_pt_cleanup(struct domain *d > void msix_write_completion(struct vcpu *v) > { > unsigned long ctrl_address = v- > >arch.hvm_vcpu.hvm_io.msix_unmask_address; > + unsigned long snoop_addr = v- > >arch.hvm_vcpu.hvm_io.msix_snoop_address; > > v->arch.hvm_vcpu.hvm_io.msix_snoop_address = 0; > > + if ( !ctrl_address && snoop_addr && > + v->arch.hvm_vcpu.hvm_io.msix_snoop_gpa ) > + { > + const struct msi_desc *desc; > + uint32_t data; > + > + rcu_read_lock(&msixtbl_rcu_lock); > + desc = msixtbl_addr_to_desc(msixtbl_find_entry(v, snoop_addr), > + snoop_addr); > + rcu_read_unlock(&msixtbl_rcu_lock); > + > + if ( desc && > + hvm_copy_from_guest_phys(&data, > + v->arch.hvm_vcpu.hvm_io.msix_snoop_gpa, > + sizeof(data)) == HVMCOPY_okay && > + !(data & PCI_MSIX_VECTOR_BITMASK) ) > + ctrl_address = snoop_addr; > + } > + > if ( !ctrl_address ) > return; > > --- unstable.orig/xen/include/asm-x86/hvm/vcpu.h 2016-04-27 > 14:47:25.000000000 +0200 > +++ unstable/xen/include/asm-x86/hvm/vcpu.h 2016-04-25 > 16:04:48.000000000 +0200 > @@ -86,6 +86,7 @@ struct hvm_vcpu_io { > > unsigned long msix_unmask_address; > unsigned long msix_snoop_address; > + unsigned long msix_snoop_gpa; > > const struct g2m_ioport *g2m_ioport; > }; > >
>>> On 28.04.16 at 13:17, <Paul.Durrant@citrix.com> wrote: >> From: Jan Beulich [mailto:JBeulich@suse.com] >> Sent: 28 April 2016 10:50 >> @@ -366,7 +367,22 @@ static int msixtbl_range(struct vcpu *v, >> ((addr & (PCI_MSIX_ENTRY_SIZE - 1)) == >> PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET) && >> !(data & PCI_MSIX_VECTOR_BITMASK) ) >> + { >> v->arch.hvm_vcpu.hvm_io.msix_snoop_address = addr; >> + v->arch.hvm_vcpu.hvm_io.msix_snoop_gpa = 0; >> + } >> + } >> + else if ( (size == 4 || size == 8) && !r->df && >> + r->count && r->count <= PCI_MSIX_ENTRY_SIZE / size && >> + !((addr + (size * r->count)) & (PCI_MSIX_ENTRY_SIZE - 1)) ) > > That's quite an if statement. Any chance of making it more decipherable? I I don't see how I could be doing this. > also think it's worth putting the restrictions you outline in the commit in a > comment here so that it's clear that the code is not trying to handle all > corner cases. Sure. Question is whether by mixing code and comments things would get better readable (to at least somewhat address your request above), or whether that instead would make things worse. Thoughts? >> + { >> + BUILD_BUG_ON((PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET + 4) & >> + (PCI_MSIX_ENTRY_SIZE - 1)); >> + >> + v->arch.hvm_vcpu.hvm_io.msix_snoop_address = >> + addr + size * r->count - 4; >> + v->arch.hvm_vcpu.hvm_io.msix_snoop_gpa = >> + r->data + size * r->count - 4; > > Does there need to be any explicit type promotion here since r->data is > uint64_t? No, because both size and r->count did already get bounds checked to very narrow ranges. Jan
> -----Original Message----- > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: 28 April 2016 12:44 > To: Paul Durrant > Cc: Andrew Cooper; xen-devel > Subject: RE: [PATCH 3/3] x86/vMSI-X: also snoop REP MOVS > > >>> On 28.04.16 at 13:17, <Paul.Durrant@citrix.com> wrote: > >> From: Jan Beulich [mailto:JBeulich@suse.com] > >> Sent: 28 April 2016 10:50 > >> @@ -366,7 +367,22 @@ static int msixtbl_range(struct vcpu *v, > >> ((addr & (PCI_MSIX_ENTRY_SIZE - 1)) == > >> PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET) && > >> !(data & PCI_MSIX_VECTOR_BITMASK) ) > >> + { > >> v->arch.hvm_vcpu.hvm_io.msix_snoop_address = addr; > >> + v->arch.hvm_vcpu.hvm_io.msix_snoop_gpa = 0; > >> + } > >> + } > >> + else if ( (size == 4 || size == 8) && !r->df && > >> + r->count && r->count <= PCI_MSIX_ENTRY_SIZE / size && > >> + !((addr + (size * r->count)) & (PCI_MSIX_ENTRY_SIZE - 1)) ) > > > > That's quite an if statement. Any chance of making it more decipherable? I > > I don't see how I could be doing this. > > > also think it's worth putting the restrictions you outline in the commit in a > > comment here so that it's clear that the code is not trying to handle all > > corner cases. > > Sure. Question is whether by mixing code and comments things > would get better readable (to at least somewhat address your > request above), or whether that instead would make things > worse. Thoughts? I think mentioning why you're only tackling the !r->df case would be worth commenting on and if the && !r->df were on a separate line then the comment could go inline. Also, do you really need to check r->count (seems like a count of 0 should have been picked up before the code got here) and then TBH I'm not even sure what !((addr + (size * r->count)) & (PCI_MSIX_ENTRY_SIZE - 1)) is even checking so how about an illustratively named macro? Paul > > >> + { > >> + BUILD_BUG_ON((PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET + 4) & > >> + (PCI_MSIX_ENTRY_SIZE - 1)); > >> + > >> + v->arch.hvm_vcpu.hvm_io.msix_snoop_address = > >> + addr + size * r->count - 4; > >> + v->arch.hvm_vcpu.hvm_io.msix_snoop_gpa = > >> + r->data + size * r->count - 4; > > > > Does there need to be any explicit type promotion here since r->data is > > uint64_t? > > No, because both size and r->count did already get bounds > checked to very narrow ranges. > > Jan
>>> On 28.04.16 at 13:58, <Paul.Durrant@citrix.com> wrote: >> -----Original Message----- >> From: Jan Beulich [mailto:JBeulich@suse.com] >> Sent: 28 April 2016 12:44 >> To: Paul Durrant >> Cc: Andrew Cooper; xen-devel >> Subject: RE: [PATCH 3/3] x86/vMSI-X: also snoop REP MOVS >> >> >>> On 28.04.16 at 13:17, <Paul.Durrant@citrix.com> wrote: >> >> From: Jan Beulich [mailto:JBeulich@suse.com] >> >> Sent: 28 April 2016 10:50 >> >> @@ -366,7 +367,22 @@ static int msixtbl_range(struct vcpu *v, >> >> ((addr & (PCI_MSIX_ENTRY_SIZE - 1)) == >> >> PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET) && >> >> !(data & PCI_MSIX_VECTOR_BITMASK) ) >> >> + { >> >> v->arch.hvm_vcpu.hvm_io.msix_snoop_address = addr; >> >> + v->arch.hvm_vcpu.hvm_io.msix_snoop_gpa = 0; >> >> + } >> >> + } >> >> + else if ( (size == 4 || size == 8) && !r->df && >> >> + r->count && r->count <= PCI_MSIX_ENTRY_SIZE / size && >> >> + !((addr + (size * r->count)) & (PCI_MSIX_ENTRY_SIZE - 1)) ) >> > >> > That's quite an if statement. Any chance of making it more decipherable? I >> >> I don't see how I could be doing this. >> >> > also think it's worth putting the restrictions you outline in the commit in > a >> > comment here so that it's clear that the code is not trying to handle all >> > corner cases. >> >> Sure. Question is whether by mixing code and comments things >> would get better readable (to at least somewhat address your >> request above), or whether that instead would make things >> worse. Thoughts? > > I think mentioning why you're only tackling the !r->df case would be worth > commenting on and if the && !r->df were on a separate line then the comment could > go inline. That's what I did. > Also, do you really need to check r->count (seems like a count of 0 > should have been picked up before the code got here) I've tried to fine where r->count == 0 would be dealt with, but could spot the location (other than relying on x86_emulate.c never passing such down), so since I want to subtract 1 from it (or really 4 from its product with "size") I wanted to be on the safe side. If you prefer, I could replace this by a respective ASSERT()... > and then TBH I'm not > even sure what !((addr + (size * r->count)) & (PCI_MSIX_ENTRY_SIZE - 1)) is even > checking so how about an illustratively named macro? How about this (without macro)? else if ( (size == 4 || size == 8) && /* Only support forward REP MOVS for now. */ !r->df && /* * Only fully support accesses to a single table entry for * now (if multiple ones get written to in one go, only the * final one gets dealt with). */ r->count && r->count <= PCI_MSIX_ENTRY_SIZE / size && !((addr + (size * r->count)) & (PCI_MSIX_ENTRY_SIZE - 1)) ) { Jan
> -----Original Message----- > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: 28 April 2016 13:31 > To: Paul Durrant > Cc: Andrew Cooper; xen-devel > Subject: RE: [PATCH 3/3] x86/vMSI-X: also snoop REP MOVS > > >>> On 28.04.16 at 13:58, <Paul.Durrant@citrix.com> wrote: > >> -----Original Message----- > >> From: Jan Beulich [mailto:JBeulich@suse.com] > >> Sent: 28 April 2016 12:44 > >> To: Paul Durrant > >> Cc: Andrew Cooper; xen-devel > >> Subject: RE: [PATCH 3/3] x86/vMSI-X: also snoop REP MOVS > >> > >> >>> On 28.04.16 at 13:17, <Paul.Durrant@citrix.com> wrote: > >> >> From: Jan Beulich [mailto:JBeulich@suse.com] > >> >> Sent: 28 April 2016 10:50 > >> >> @@ -366,7 +367,22 @@ static int msixtbl_range(struct vcpu *v, > >> >> ((addr & (PCI_MSIX_ENTRY_SIZE - 1)) == > >> >> PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET) && > >> >> !(data & PCI_MSIX_VECTOR_BITMASK) ) > >> >> + { > >> >> v->arch.hvm_vcpu.hvm_io.msix_snoop_address = addr; > >> >> + v->arch.hvm_vcpu.hvm_io.msix_snoop_gpa = 0; > >> >> + } > >> >> + } > >> >> + else if ( (size == 4 || size == 8) && !r->df && > >> >> + r->count && r->count <= PCI_MSIX_ENTRY_SIZE / size && > >> >> + !((addr + (size * r->count)) & (PCI_MSIX_ENTRY_SIZE - 1)) ) > >> > > >> > That's quite an if statement. Any chance of making it more > decipherable? I > >> > >> I don't see how I could be doing this. > >> > >> > also think it's worth putting the restrictions you outline in the commit in > > a > >> > comment here so that it's clear that the code is not trying to handle all > >> > corner cases. > >> > >> Sure. Question is whether by mixing code and comments things > >> would get better readable (to at least somewhat address your > >> request above), or whether that instead would make things > >> worse. Thoughts? > > > > I think mentioning why you're only tackling the !r->df case would be worth > > commenting on and if the && !r->df were on a separate line then the > comment could > > go inline. > > That's what I did. > > > Also, do you really need to check r->count (seems like a count of 0 > > should have been picked up before the code got here) > > I've tried to fine where r->count == 0 would be dealt with, but > could spot the location (other than relying on x86_emulate.c > never passing such down), so since I want to subtract 1 from it > (or really 4 from its product with "size") I wanted to be on the > safe side. If you prefer, I could replace this by a respective > ASSERT()... > > > and then TBH I'm not > > even sure what !((addr + (size * r->count)) & (PCI_MSIX_ENTRY_SIZE - 1)) > is even > > checking so how about an illustratively named macro? > > How about this (without macro)? > > else if ( (size == 4 || size == 8) && > /* Only support forward REP MOVS for now. */ > !r->df && > /* > * Only fully support accesses to a single table entry for > * now (if multiple ones get written to in one go, only the > * final one gets dealt with). > */ > r->count && r->count <= PCI_MSIX_ENTRY_SIZE / size && > !((addr + (size * r->count)) & (PCI_MSIX_ENTRY_SIZE - 1)) ) > { > That looks a lot better to me :-) Paul > Jan
--- a/xen/arch/x86/hvm/vmsi.c +++ b/xen/arch/x86/hvm/vmsi.c @@ -351,9 +351,10 @@ static int msixtbl_range(struct vcpu *v, ASSERT(r->type == IOREQ_TYPE_COPY); if ( r->dir == IOREQ_WRITE ) { + unsigned int size = r->size; + if ( !r->data_is_ptr ) { - unsigned int size = r->size; uint64_t data = r->data; if ( size == 8 ) @@ -366,7 +367,22 @@ static int msixtbl_range(struct vcpu *v, ((addr & (PCI_MSIX_ENTRY_SIZE - 1)) == PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET) && !(data & PCI_MSIX_VECTOR_BITMASK) ) + { v->arch.hvm_vcpu.hvm_io.msix_snoop_address = addr; + v->arch.hvm_vcpu.hvm_io.msix_snoop_gpa = 0; + } + } + else if ( (size == 4 || size == 8) && !r->df && + r->count && r->count <= PCI_MSIX_ENTRY_SIZE / size && + !((addr + (size * r->count)) & (PCI_MSIX_ENTRY_SIZE - 1)) ) + { + BUILD_BUG_ON((PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET + 4) & + (PCI_MSIX_ENTRY_SIZE - 1)); + + v->arch.hvm_vcpu.hvm_io.msix_snoop_address = + addr + size * r->count - 4; + v->arch.hvm_vcpu.hvm_io.msix_snoop_gpa = + r->data + size * r->count - 4; } } @@ -471,6 +487,7 @@ out: for_each_vcpu ( d, v ) { if ( (v->pause_flags & VPF_blocked_in_xen) && + !v->arch.hvm_vcpu.hvm_io.msix_snoop_gpa && v->arch.hvm_vcpu.hvm_io.msix_snoop_address == (gtable + msi_desc->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE + @@ -551,9 +568,29 @@ void msixtbl_pt_cleanup(struct domain *d void msix_write_completion(struct vcpu *v) { unsigned long ctrl_address = v->arch.hvm_vcpu.hvm_io.msix_unmask_address; + unsigned long snoop_addr = v->arch.hvm_vcpu.hvm_io.msix_snoop_address; v->arch.hvm_vcpu.hvm_io.msix_snoop_address = 0; + if ( !ctrl_address && snoop_addr && + v->arch.hvm_vcpu.hvm_io.msix_snoop_gpa ) + { + const struct msi_desc *desc; + uint32_t data; + + rcu_read_lock(&msixtbl_rcu_lock); + desc = msixtbl_addr_to_desc(msixtbl_find_entry(v, snoop_addr), + snoop_addr); + rcu_read_unlock(&msixtbl_rcu_lock); + + if ( desc && + hvm_copy_from_guest_phys(&data, + v->arch.hvm_vcpu.hvm_io.msix_snoop_gpa, + sizeof(data)) == HVMCOPY_okay && + !(data & PCI_MSIX_VECTOR_BITMASK) ) + ctrl_address = snoop_addr; + } + if ( !ctrl_address ) return; --- unstable.orig/xen/include/asm-x86/hvm/vcpu.h 2016-04-27 14:47:25.000000000 +0200 +++ unstable/xen/include/asm-x86/hvm/vcpu.h 2016-04-25 16:04:48.000000000 +0200 @@ -86,6 +86,7 @@ struct hvm_vcpu_io { unsigned long msix_unmask_address; unsigned long msix_snoop_address; + unsigned long msix_snoop_gpa; const struct g2m_ioport *g2m_ioport; };