Message ID | 7b62d06c-1369-2857-81c0-45e2434357f4@suse.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | x86: refine guest_mode() | expand |
On Mon, Apr 27, 2020 at 10:03:05AM +0200, Jan Beulich wrote: > The 2nd of the assertions as well as the macro's return value have been > assuming we're on the primary stack. While for most IST exceptions we > eventually switch back to the main one, for #DF we intentionally never > do, and hence a #DF actually triggering on a user mode insn (which then > is still a Xen bug) would in turn trigger this assertion, rather than > cleanly logging state. > > Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> > Signed-off-by: Jan Beulich <jbeulich@suse.com> > --- > While we could go further and also assert we're on the correct IST > stack in an "else" ti the "if()" added, I'm not fully convinced this > would be generally helpful. I'll be happy to adjust accordingly if > others think differently; at such a point though I think this should > then no longer be a macro. > > --- a/xen/include/asm-x86/regs.h > +++ b/xen/include/asm-x86/regs.h > @@ -10,9 +10,10 @@ > /* Frame pointer must point into current CPU stack. */ \ > ASSERT(diff < STACK_SIZE); \ > /* If not a guest frame, it must be a hypervisor frame. */ \ > - ASSERT((diff == 0) || (r->cs == __HYPERVISOR_CS)); \ > + if ( diff < PRIMARY_STACK_SIZE ) \ > + ASSERT(!diff || ((r)->cs == __HYPERVISOR_CS)); \ Why not use: ASSERT(diff >= PRIMARY_STACK_SIZE || !diff || ((r)->cs == __HYPERVISOR_CS)); I'm not sure I fully understand this layout, is it possible that you also need to account for the size of cpu_info? Roger.
On 27.04.2020 11:59, Roger Pau Monné wrote: > On Mon, Apr 27, 2020 at 10:03:05AM +0200, Jan Beulich wrote: >> --- a/xen/include/asm-x86/regs.h >> +++ b/xen/include/asm-x86/regs.h >> @@ -10,9 +10,10 @@ >> /* Frame pointer must point into current CPU stack. */ \ >> ASSERT(diff < STACK_SIZE); \ >> /* If not a guest frame, it must be a hypervisor frame. */ \ >> - ASSERT((diff == 0) || (r->cs == __HYPERVISOR_CS)); \ >> + if ( diff < PRIMARY_STACK_SIZE ) \ >> + ASSERT(!diff || ((r)->cs == __HYPERVISOR_CS)); \ > > Why not use: > > ASSERT(diff >= PRIMARY_STACK_SIZE || !diff || ((r)->cs == __HYPERVISOR_CS)); Except for the longer (without being helpful imo) string reported if the assertion triggers, I see not difference. > I'm not sure I fully understand this layout, is it possible that you > also need to account for the size of cpu_info? Depends on how paranoid we want the checking here to be, but in going further I wouldn't want this to become sub-page fine-grained if we aren't first doing e.g. what I'm mentioning in the post-commit-message remark. Jan
On 27/04/2020 09:03, Jan Beulich wrote: > The 2nd of the assertions as well as the macro's return value have been > assuming we're on the primary stack. While for most IST exceptions we > eventually switch back to the main one, "we switch to the main one when interrupting user mode". "eventually" isn't accurate as it is before we enter C. > for #DF we intentionally never > do, and hence a #DF actually triggering on a user mode insn (which then > is still a Xen bug) would in turn trigger this assertion, rather than > cleanly logging state. > > Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> > Signed-off-by: Jan Beulich <jbeulich@suse.com> > --- > While we could go further and also assert we're on the correct IST > stack in an "else" ti the "if()" added, I'm not fully convinced this > would be generally helpful. I'll be happy to adjust accordingly if > others think differently; at such a point though I think this should > then no longer be a macro. > > --- a/xen/include/asm-x86/regs.h > +++ b/xen/include/asm-x86/regs.h > @@ -10,9 +10,10 @@ > /* Frame pointer must point into current CPU stack. */ \ > ASSERT(diff < STACK_SIZE); \ > /* If not a guest frame, it must be a hypervisor frame. */ \ > - ASSERT((diff == 0) || (r->cs == __HYPERVISOR_CS)); \ > + if ( diff < PRIMARY_STACK_SIZE ) \ > + ASSERT(!diff || ((r)->cs == __HYPERVISOR_CS)); \ > /* Return TRUE if it's a guest frame. */ \ > - (diff == 0); \ > + !diff || ((r)->cs != __HYPERVISOR_CS); \ The (diff == 0) already worried me before because it doesn't fail safe, but this makes things more problematic. Consider the case back when we had __HYPERVISOR_CS32. Guest mode is strictly "(r)->cs & 3". Everything else is expectations about how things ought to be laid out, but for safety in release builds, the final judgement should not depend on the expectations evaluating true. ~Andrew
On 27.04.2020 16:35, Andrew Cooper wrote: > On 27/04/2020 09:03, Jan Beulich wrote: >> The 2nd of the assertions as well as the macro's return value have been >> assuming we're on the primary stack. While for most IST exceptions we >> eventually switch back to the main one, > > "we switch to the main one when interrupting user mode". > > "eventually" isn't accurate as it is before we enter C. Right, will change. >> --- a/xen/include/asm-x86/regs.h >> +++ b/xen/include/asm-x86/regs.h >> @@ -10,9 +10,10 @@ >> /* Frame pointer must point into current CPU stack. */ \ >> ASSERT(diff < STACK_SIZE); \ >> /* If not a guest frame, it must be a hypervisor frame. */ \ >> - ASSERT((diff == 0) || (r->cs == __HYPERVISOR_CS)); \ >> + if ( diff < PRIMARY_STACK_SIZE ) \ >> + ASSERT(!diff || ((r)->cs == __HYPERVISOR_CS)); \ >> /* Return TRUE if it's a guest frame. */ \ >> - (diff == 0); \ >> + !diff || ((r)->cs != __HYPERVISOR_CS); \ > > The (diff == 0) already worried me before because it doesn't fail safe, > but this makes things more problematic. Consider the case back when we > had __HYPERVISOR_CS32. Yes - if __HYPERVISOR_CS32 would ever have been to be used for anything, it would have needed checking for here. > Guest mode is strictly "(r)->cs & 3". As long as CS (a) gets properly saved (it's a "manual" step for SYSCALL/SYSRET as well as #VMEXIT) and (b) didn't get clobbered. I didn't write this code, I don't think, so I can only guess that there were intentions behind this along these lines. > Everything else is expectations about how things ought to be laid out, > but for safety in release builds, the final judgement should not depend > on the expectations evaluating true. Well, I can switch to a purely CS.RPL based approach, as long as we're happy to live with the possible downside mentioned above. Of course this would then end up being a more intrusive change than originally intended ... Jan
On Mon, Apr 27, 2020 at 04:08:59PM +0200, Jan Beulich wrote: > On 27.04.2020 11:59, Roger Pau Monné wrote: > > On Mon, Apr 27, 2020 at 10:03:05AM +0200, Jan Beulich wrote: > >> --- a/xen/include/asm-x86/regs.h > >> +++ b/xen/include/asm-x86/regs.h > >> @@ -10,9 +10,10 @@ > >> /* Frame pointer must point into current CPU stack. */ \ > >> ASSERT(diff < STACK_SIZE); \ > >> /* If not a guest frame, it must be a hypervisor frame. */ \ > >> - ASSERT((diff == 0) || (r->cs == __HYPERVISOR_CS)); \ > >> + if ( diff < PRIMARY_STACK_SIZE ) \ > >> + ASSERT(!diff || ((r)->cs == __HYPERVISOR_CS)); \ > > > > Why not use: > > > > ASSERT(diff >= PRIMARY_STACK_SIZE || !diff || ((r)->cs == __HYPERVISOR_CS)); > > Except for the longer (without being helpful imo) string reported if > the assertion triggers, I see not difference. Wanted to avoid the empty if on non-debug builds, but I assume the compiler will already optimize it out. > > I'm not sure I fully understand this layout, is it possible that you > > also need to account for the size of cpu_info? > > Depends on how paranoid we want the checking here to be, but in going > further I wouldn't want this to become sub-page fine-grained if we > aren't first doing e.g. what I'm mentioning in the post-commit-message > remark. Right, leaving it as-is is fine, just wanted to be sure I fully understand the layout. Thanks, Roger.
On 27/04/2020 16:15, Jan Beulich wrote: > On 27.04.2020 16:35, Andrew Cooper wrote: >> On 27/04/2020 09:03, Jan Beulich wrote: >>> The 2nd of the assertions as well as the macro's return value have been >>> assuming we're on the primary stack. While for most IST exceptions we >>> eventually switch back to the main one, >> "we switch to the main one when interrupting user mode". >> >> "eventually" isn't accurate as it is before we enter C. > Right, will change. > >>> --- a/xen/include/asm-x86/regs.h >>> +++ b/xen/include/asm-x86/regs.h >>> @@ -10,9 +10,10 @@ >>> /* Frame pointer must point into current CPU stack. */ \ >>> ASSERT(diff < STACK_SIZE); \ >>> /* If not a guest frame, it must be a hypervisor frame. */ \ >>> - ASSERT((diff == 0) || (r->cs == __HYPERVISOR_CS)); \ >>> + if ( diff < PRIMARY_STACK_SIZE ) \ >>> + ASSERT(!diff || ((r)->cs == __HYPERVISOR_CS)); \ >>> /* Return TRUE if it's a guest frame. */ \ >>> - (diff == 0); \ >>> + !diff || ((r)->cs != __HYPERVISOR_CS); \ >> The (diff == 0) already worried me before because it doesn't fail safe, >> but this makes things more problematic. Consider the case back when we >> had __HYPERVISOR_CS32. > Yes - if __HYPERVISOR_CS32 would ever have been to be used for > anything, it would have needed checking for here. > >> Guest mode is strictly "(r)->cs & 3". > As long as CS (a) gets properly saved (it's a "manual" step for > SYSCALL/SYSRET as well as #VMEXIT) and (b) didn't get clobbered. I > didn't write this code, I don't think, so I can only guess that > there were intentions behind this along these lines. Hmm - the VMExit case might be problematic here, due to the variability in the poison used. > >> Everything else is expectations about how things ought to be laid out, >> but for safety in release builds, the final judgement should not depend >> on the expectations evaluating true. > Well, I can switch to a purely CS.RPL based approach, as long as > we're happy to live with the possible downside mentioned above. > Of course this would then end up being a more intrusive change > than originally intended ... I'd certainly prefer to go for something which is more robust, even if it is a larger change. ~Andrew
On 27.04.2020 22:11, Andrew Cooper wrote: > On 27/04/2020 16:15, Jan Beulich wrote: >> On 27.04.2020 16:35, Andrew Cooper wrote: >>> On 27/04/2020 09:03, Jan Beulich wrote: >>>> The 2nd of the assertions as well as the macro's return value have been >>>> assuming we're on the primary stack. While for most IST exceptions we >>>> eventually switch back to the main one, >>> "we switch to the main one when interrupting user mode". >>> >>> "eventually" isn't accurate as it is before we enter C. >> Right, will change. >> >>>> --- a/xen/include/asm-x86/regs.h >>>> +++ b/xen/include/asm-x86/regs.h >>>> @@ -10,9 +10,10 @@ >>>> /* Frame pointer must point into current CPU stack. */ \ >>>> ASSERT(diff < STACK_SIZE); \ >>>> /* If not a guest frame, it must be a hypervisor frame. */ \ >>>> - ASSERT((diff == 0) || (r->cs == __HYPERVISOR_CS)); \ >>>> + if ( diff < PRIMARY_STACK_SIZE ) \ >>>> + ASSERT(!diff || ((r)->cs == __HYPERVISOR_CS)); \ >>>> /* Return TRUE if it's a guest frame. */ \ >>>> - (diff == 0); \ >>>> + !diff || ((r)->cs != __HYPERVISOR_CS); \ >>> The (diff == 0) already worried me before because it doesn't fail safe, >>> but this makes things more problematic. Consider the case back when we >>> had __HYPERVISOR_CS32. >> Yes - if __HYPERVISOR_CS32 would ever have been to be used for >> anything, it would have needed checking for here. >> >>> Guest mode is strictly "(r)->cs & 3". >> As long as CS (a) gets properly saved (it's a "manual" step for >> SYSCALL/SYSRET as well as #VMEXIT) and (b) didn't get clobbered. I >> didn't write this code, I don't think, so I can only guess that >> there were intentions behind this along these lines. > > Hmm - the VMExit case might be problematic here, due to the variability > in the poison used. "Variability" is an understatement - there's no poisoning at all in release builds afaics (and to be honest it seems a somewhat pointless to write the same values over and over again in debug mode). With this, ... >>> Everything else is expectations about how things ought to be laid out, >>> but for safety in release builds, the final judgement should not depend >>> on the expectations evaluating true. >> Well, I can switch to a purely CS.RPL based approach, as long as >> we're happy to live with the possible downside mentioned above. >> Of course this would then end up being a more intrusive change >> than originally intended ... > > I'd certainly prefer to go for something which is more robust, even if > it is a larger change. ... what's your suggestion? Basing on _just_ CS.RPL obviously won't work. Not even if we put in place the guest's CS (albeit that somewhat depends on the meaning we assign to the macro's returned value). Using current inside the macro to determine whether the guest is HVM would also seem fragile to me - there are quite a few uses of guest_mode(). Which would leave passing in a const struct vcpu * (or domain *), requiring to touch all call sites, including Arm's. Compared to this it would seem to me that the change as presented is a clear improvement without becoming overly large of a change. Jan
On Tue, Apr 28, 2020 at 08:30:12AM +0200, Jan Beulich wrote: > On 27.04.2020 22:11, Andrew Cooper wrote: > > On 27/04/2020 16:15, Jan Beulich wrote: > >> On 27.04.2020 16:35, Andrew Cooper wrote: > >>> On 27/04/2020 09:03, Jan Beulich wrote: > >>>> The 2nd of the assertions as well as the macro's return value have been > >>>> assuming we're on the primary stack. While for most IST exceptions we > >>>> eventually switch back to the main one, > >>> "we switch to the main one when interrupting user mode". > >>> > >>> "eventually" isn't accurate as it is before we enter C. > >> Right, will change. > >> > >>>> --- a/xen/include/asm-x86/regs.h > >>>> +++ b/xen/include/asm-x86/regs.h > >>>> @@ -10,9 +10,10 @@ > >>>> /* Frame pointer must point into current CPU stack. */ \ > >>>> ASSERT(diff < STACK_SIZE); \ > >>>> /* If not a guest frame, it must be a hypervisor frame. */ \ > >>>> - ASSERT((diff == 0) || (r->cs == __HYPERVISOR_CS)); \ > >>>> + if ( diff < PRIMARY_STACK_SIZE ) \ > >>>> + ASSERT(!diff || ((r)->cs == __HYPERVISOR_CS)); \ > >>>> /* Return TRUE if it's a guest frame. */ \ > >>>> - (diff == 0); \ > >>>> + !diff || ((r)->cs != __HYPERVISOR_CS); \ > >>> The (diff == 0) already worried me before because it doesn't fail safe, > >>> but this makes things more problematic. Consider the case back when we > >>> had __HYPERVISOR_CS32. > >> Yes - if __HYPERVISOR_CS32 would ever have been to be used for > >> anything, it would have needed checking for here. > >> > >>> Guest mode is strictly "(r)->cs & 3". > >> As long as CS (a) gets properly saved (it's a "manual" step for > >> SYSCALL/SYSRET as well as #VMEXIT) and (b) didn't get clobbered. I > >> didn't write this code, I don't think, so I can only guess that > >> there were intentions behind this along these lines. > > > > Hmm - the VMExit case might be problematic here, due to the variability > > in the poison used. > > "Variability" is an understatement - there's no poisoning at all > in release builds afaics (and to be honest it seems a somewhat > pointless to write the same values over and over again in debug > mode). With this, ... > > >>> Everything else is expectations about how things ought to be laid out, > >>> but for safety in release builds, the final judgement should not depend > >>> on the expectations evaluating true. > >> Well, I can switch to a purely CS.RPL based approach, as long as > >> we're happy to live with the possible downside mentioned above. > >> Of course this would then end up being a more intrusive change > >> than originally intended ... > > > > I'd certainly prefer to go for something which is more robust, even if > > it is a larger change. > > ... what's your suggestion? Basing on _just_ CS.RPL obviously won't > work. Not even if we put in place the guest's CS (albeit that > somewhat depends on the meaning we assign to the macro's returned > value). Just to check I'm following this correctly, using CS.RPL won't work for HVM guests, as HVM can legitimately use a RPL of 0 (which is not the case for PV guests). Doesn't the same apply to the usage of __HYPERVISOR_CS? (A HVM guest could also use the same code segment value as Xen?) > Using current inside the macro to determine whether the > guest is HVM would also seem fragile to me - there are quite a few > uses of guest_mode(). Which would leave passing in a const struct > vcpu * (or domain *), requiring to touch all call sites, including > Arm's. Fragile or slow? Are there corner cases where guest_mode is used where current is not reliable? > Compared to this it would seem to me that the change as presented > is a clear improvement without becoming overly large of a change. Using the cs register is already part of the guest_mode code, even if just in debug mode, hence I don't see it as a regression from existing code. It however feels weird to me that the reporter of the issue doesn't agree with the fix, and hence would like to know if there's a way we could achieve consensus on this. Roger.
On 18.05.2020 16:51, Roger Pau Monné wrote: > On Tue, Apr 28, 2020 at 08:30:12AM +0200, Jan Beulich wrote: >> On 27.04.2020 22:11, Andrew Cooper wrote: >>> On 27/04/2020 16:15, Jan Beulich wrote: >>>> On 27.04.2020 16:35, Andrew Cooper wrote: >>>>> On 27/04/2020 09:03, Jan Beulich wrote: >>>>>> --- a/xen/include/asm-x86/regs.h >>>>>> +++ b/xen/include/asm-x86/regs.h >>>>>> @@ -10,9 +10,10 @@ >>>>>> /* Frame pointer must point into current CPU stack. */ \ >>>>>> ASSERT(diff < STACK_SIZE); \ >>>>>> /* If not a guest frame, it must be a hypervisor frame. */ \ >>>>>> - ASSERT((diff == 0) || (r->cs == __HYPERVISOR_CS)); \ >>>>>> + if ( diff < PRIMARY_STACK_SIZE ) \ >>>>>> + ASSERT(!diff || ((r)->cs == __HYPERVISOR_CS)); \ >>>>>> /* Return TRUE if it's a guest frame. */ \ >>>>>> - (diff == 0); \ >>>>>> + !diff || ((r)->cs != __HYPERVISOR_CS); \ >>>>> The (diff == 0) already worried me before because it doesn't fail safe, >>>>> but this makes things more problematic. Consider the case back when we >>>>> had __HYPERVISOR_CS32. >>>> Yes - if __HYPERVISOR_CS32 would ever have been to be used for >>>> anything, it would have needed checking for here. >>>> >>>>> Guest mode is strictly "(r)->cs & 3". >>>> As long as CS (a) gets properly saved (it's a "manual" step for >>>> SYSCALL/SYSRET as well as #VMEXIT) and (b) didn't get clobbered. I >>>> didn't write this code, I don't think, so I can only guess that >>>> there were intentions behind this along these lines. >>> >>> Hmm - the VMExit case might be problematic here, due to the variability >>> in the poison used. >> >> "Variability" is an understatement - there's no poisoning at all >> in release builds afaics (and to be honest it seems a somewhat >> pointless to write the same values over and over again in debug >> mode). With this, ... >> >>>>> Everything else is expectations about how things ought to be laid out, >>>>> but for safety in release builds, the final judgement should not depend >>>>> on the expectations evaluating true. >>>> Well, I can switch to a purely CS.RPL based approach, as long as >>>> we're happy to live with the possible downside mentioned above. >>>> Of course this would then end up being a more intrusive change >>>> than originally intended ... >>> >>> I'd certainly prefer to go for something which is more robust, even if >>> it is a larger change. >> >> ... what's your suggestion? Basing on _just_ CS.RPL obviously won't >> work. Not even if we put in place the guest's CS (albeit that >> somewhat depends on the meaning we assign to the macro's returned >> value). > > Just to check I'm following this correctly, using CS.RPL won't work > for HVM guests, as HVM can legitimately use a RPL of 0 (which is not > the case for PV guests). Doesn't the same apply to the usage of > __HYPERVISOR_CS? (A HVM guest could also use the same code segment > value as Xen?) Of course (and in particular Xen as a guest would). My "Basing on _just_ CS.RPL" wasn't meant to exclude the rest of the selector, but to contrast this to the case where "diff" also is involved in the calculation (which looks to be what Andrew would prefer to see go away). >> Using current inside the macro to determine whether the >> guest is HVM would also seem fragile to me - there are quite a few >> uses of guest_mode(). Which would leave passing in a const struct >> vcpu * (or domain *), requiring to touch all call sites, including >> Arm's. > > Fragile or slow? Are there corner cases where guest_mode is used where > current is not reliable? This question is why I said "there are quite a few uses of guest_mode()" - auditing them all is just one side of the medal. The other is to prevent a new use appearing in the future that can be reached by a call path in the time window where a lazy context switch is pending (i.e. when current has already been updated, but register state hasn't been yet). >> Compared to this it would seem to me that the change as presented >> is a clear improvement without becoming overly large of a change. > > Using the cs register is already part of the guest_mode code, even if > just in debug mode, hence I don't see it as a regression from existing > code. It however feels weird to me that the reporter of the issue > doesn't agree with the fix, and hence would like to know if there's a > way we could achieve consensus on this. Indeed. I'd be happy to make further adjustments, if only I had a clear understanding of what is wanted (or why leaving things as they are is better than a little bit of an improvement). Jan
On Wed, May 20, 2020 at 10:56:26AM +0200, Jan Beulich wrote: > On 18.05.2020 16:51, Roger Pau Monné wrote: > > On Tue, Apr 28, 2020 at 08:30:12AM +0200, Jan Beulich wrote: > >> On 27.04.2020 22:11, Andrew Cooper wrote: > >>> On 27/04/2020 16:15, Jan Beulich wrote: > >>>> On 27.04.2020 16:35, Andrew Cooper wrote: > >>>>> On 27/04/2020 09:03, Jan Beulich wrote: > >>>>>> --- a/xen/include/asm-x86/regs.h > >>>>>> +++ b/xen/include/asm-x86/regs.h > >>>>>> @@ -10,9 +10,10 @@ > >>>>>> /* Frame pointer must point into current CPU stack. */ \ > >>>>>> ASSERT(diff < STACK_SIZE); \ > >>>>>> /* If not a guest frame, it must be a hypervisor frame. */ \ > >>>>>> - ASSERT((diff == 0) || (r->cs == __HYPERVISOR_CS)); \ > >>>>>> + if ( diff < PRIMARY_STACK_SIZE ) \ > >>>>>> + ASSERT(!diff || ((r)->cs == __HYPERVISOR_CS)); \ > >>>>>> /* Return TRUE if it's a guest frame. */ \ > >>>>>> - (diff == 0); \ > >>>>>> + !diff || ((r)->cs != __HYPERVISOR_CS); \ > >>>>> The (diff == 0) already worried me before because it doesn't fail safe, > >>>>> but this makes things more problematic. Consider the case back when we > >>>>> had __HYPERVISOR_CS32. > >>>> Yes - if __HYPERVISOR_CS32 would ever have been to be used for > >>>> anything, it would have needed checking for here. > >>>> > >>>>> Guest mode is strictly "(r)->cs & 3". > >>>> As long as CS (a) gets properly saved (it's a "manual" step for > >>>> SYSCALL/SYSRET as well as #VMEXIT) and (b) didn't get clobbered. I > >>>> didn't write this code, I don't think, so I can only guess that > >>>> there were intentions behind this along these lines. > >>> > >>> Hmm - the VMExit case might be problematic here, due to the variability > >>> in the poison used. > >> > >> "Variability" is an understatement - there's no poisoning at all > >> in release builds afaics (and to be honest it seems a somewhat > >> pointless to write the same values over and over again in debug > >> mode). With this, ... > >> > >>>>> Everything else is expectations about how things ought to be laid out, > >>>>> but for safety in release builds, the final judgement should not depend > >>>>> on the expectations evaluating true. > >>>> Well, I can switch to a purely CS.RPL based approach, as long as > >>>> we're happy to live with the possible downside mentioned above. > >>>> Of course this would then end up being a more intrusive change > >>>> than originally intended ... > >>> > >>> I'd certainly prefer to go for something which is more robust, even if > >>> it is a larger change. > >> > >> ... what's your suggestion? Basing on _just_ CS.RPL obviously won't > >> work. Not even if we put in place the guest's CS (albeit that > >> somewhat depends on the meaning we assign to the macro's returned > >> value). > > > > Just to check I'm following this correctly, using CS.RPL won't work > > for HVM guests, as HVM can legitimately use a RPL of 0 (which is not > > the case for PV guests). Doesn't the same apply to the usage of > > __HYPERVISOR_CS? (A HVM guest could also use the same code segment > > value as Xen?) > > Of course (and in particular Xen as a guest would). My "Basing on > _just_ CS.RPL" wasn't meant to exclude the rest of the selector, > but to contrast this to the case where "diff" also is involved in > the calculation (which looks to be what Andrew would prefer to see > go away). > > >> Using current inside the macro to determine whether the > >> guest is HVM would also seem fragile to me - there are quite a few > >> uses of guest_mode(). Which would leave passing in a const struct > >> vcpu * (or domain *), requiring to touch all call sites, including > >> Arm's. > > > > Fragile or slow? Are there corner cases where guest_mode is used where > > current is not reliable? > > This question is why I said "there are quite a few uses of > guest_mode()" - auditing them all is just one side of the medal. > The other is to prevent a new use appearing in the future that > can be reached by a call path in the time window where a lazy > context switch is pending (i.e. when current has already been > updated, but register state hasn't been yet). > > >> Compared to this it would seem to me that the change as presented > >> is a clear improvement without becoming overly large of a change. > > > > Using the cs register is already part of the guest_mode code, even if > > just in debug mode, hence I don't see it as a regression from existing > > code. It however feels weird to me that the reporter of the issue > > doesn't agree with the fix, and hence would like to know if there's a > > way we could achieve consensus on this. > > Indeed. I'd be happy to make further adjustments, if only I had a > clear understanding of what is wanted (or why leaving things as > they are is better than a little bit of an improvement). OK, so I think I'm starting to understand this all. Sorry it's taken me so long. So it's my understanding that diff != 0 can only happen in Xen context, or when in an IST that has a different stack (ie: MCE, NMI or DF according to current.h) and running in PV mode? Wouldn't in then be fine to use (r)->cs & 3 to check we are in guest mode if diff != 0? I see a lot of other places where cs & 3 is already used to that effect AFAICT (like entry.S). Roger.
On 20.05.2020 17:13, Roger Pau Monné wrote: > On Wed, May 20, 2020 at 10:56:26AM +0200, Jan Beulich wrote: >> On 18.05.2020 16:51, Roger Pau Monné wrote: >>> On Tue, Apr 28, 2020 at 08:30:12AM +0200, Jan Beulich wrote: >>>> On 27.04.2020 22:11, Andrew Cooper wrote: >>>>> On 27/04/2020 16:15, Jan Beulich wrote: >>>>>> On 27.04.2020 16:35, Andrew Cooper wrote: >>>>>>> On 27/04/2020 09:03, Jan Beulich wrote: >>>>>>>> --- a/xen/include/asm-x86/regs.h >>>>>>>> +++ b/xen/include/asm-x86/regs.h >>>>>>>> @@ -10,9 +10,10 @@ >>>>>>>> /* Frame pointer must point into current CPU stack. */ \ >>>>>>>> ASSERT(diff < STACK_SIZE); \ >>>>>>>> /* If not a guest frame, it must be a hypervisor frame. */ \ >>>>>>>> - ASSERT((diff == 0) || (r->cs == __HYPERVISOR_CS)); \ >>>>>>>> + if ( diff < PRIMARY_STACK_SIZE ) \ >>>>>>>> + ASSERT(!diff || ((r)->cs == __HYPERVISOR_CS)); \ >>>>>>>> /* Return TRUE if it's a guest frame. */ \ >>>>>>>> - (diff == 0); \ >>>>>>>> + !diff || ((r)->cs != __HYPERVISOR_CS); \ >>>>>>> The (diff == 0) already worried me before because it doesn't fail safe, >>>>>>> but this makes things more problematic. Consider the case back when we >>>>>>> had __HYPERVISOR_CS32. >>>>>> Yes - if __HYPERVISOR_CS32 would ever have been to be used for >>>>>> anything, it would have needed checking for here. >>>>>> >>>>>>> Guest mode is strictly "(r)->cs & 3". >>>>>> As long as CS (a) gets properly saved (it's a "manual" step for >>>>>> SYSCALL/SYSRET as well as #VMEXIT) and (b) didn't get clobbered. I >>>>>> didn't write this code, I don't think, so I can only guess that >>>>>> there were intentions behind this along these lines. >>>>> >>>>> Hmm - the VMExit case might be problematic here, due to the variability >>>>> in the poison used. >>>> >>>> "Variability" is an understatement - there's no poisoning at all >>>> in release builds afaics (and to be honest it seems a somewhat >>>> pointless to write the same values over and over again in debug >>>> mode). With this, ... >>>> >>>>>>> Everything else is expectations about how things ought to be laid out, >>>>>>> but for safety in release builds, the final judgement should not depend >>>>>>> on the expectations evaluating true. >>>>>> Well, I can switch to a purely CS.RPL based approach, as long as >>>>>> we're happy to live with the possible downside mentioned above. >>>>>> Of course this would then end up being a more intrusive change >>>>>> than originally intended ... >>>>> >>>>> I'd certainly prefer to go for something which is more robust, even if >>>>> it is a larger change. >>>> >>>> ... what's your suggestion? Basing on _just_ CS.RPL obviously won't >>>> work. Not even if we put in place the guest's CS (albeit that >>>> somewhat depends on the meaning we assign to the macro's returned >>>> value). >>> >>> Just to check I'm following this correctly, using CS.RPL won't work >>> for HVM guests, as HVM can legitimately use a RPL of 0 (which is not >>> the case for PV guests). Doesn't the same apply to the usage of >>> __HYPERVISOR_CS? (A HVM guest could also use the same code segment >>> value as Xen?) >> >> Of course (and in particular Xen as a guest would). My "Basing on >> _just_ CS.RPL" wasn't meant to exclude the rest of the selector, >> but to contrast this to the case where "diff" also is involved in >> the calculation (which looks to be what Andrew would prefer to see >> go away). >> >>>> Using current inside the macro to determine whether the >>>> guest is HVM would also seem fragile to me - there are quite a few >>>> uses of guest_mode(). Which would leave passing in a const struct >>>> vcpu * (or domain *), requiring to touch all call sites, including >>>> Arm's. >>> >>> Fragile or slow? Are there corner cases where guest_mode is used where >>> current is not reliable? >> >> This question is why I said "there are quite a few uses of >> guest_mode()" - auditing them all is just one side of the medal. >> The other is to prevent a new use appearing in the future that >> can be reached by a call path in the time window where a lazy >> context switch is pending (i.e. when current has already been >> updated, but register state hasn't been yet). >> >>>> Compared to this it would seem to me that the change as presented >>>> is a clear improvement without becoming overly large of a change. >>> >>> Using the cs register is already part of the guest_mode code, even if >>> just in debug mode, hence I don't see it as a regression from existing >>> code. It however feels weird to me that the reporter of the issue >>> doesn't agree with the fix, and hence would like to know if there's a >>> way we could achieve consensus on this. >> >> Indeed. I'd be happy to make further adjustments, if only I had a >> clear understanding of what is wanted (or why leaving things as >> they are is better than a little bit of an improvement). > > OK, so I think I'm starting to understand this all. Sorry it's taken > me so long. So it's my understanding that diff != 0 can only happen in > Xen context, or when in an IST that has a different stack (ie: MCE, NMI > or DF according to current.h) and running in PV mode? > > Wouldn't in then be fine to use (r)->cs & 3 to check we are in guest > mode if diff != 0? I see a lot of other places where cs & 3 is already > used to that effect AFAICT (like entry.S). Technically this would be correct afaics, but the idea with all this is (or should I say "looks to be"?) to have the checks be as tight as possible, to make sure we don't mistakenly consider something "guest mode" which really isn't. IOW your suggestion would be fine with me if we could exclude bugs anywhere in the code. But since this isn't realistic, I consider your suggestion to be relaxing things by too much. Jan
On Fri, May 22, 2020 at 11:52:42AM +0200, Jan Beulich wrote: > On 20.05.2020 17:13, Roger Pau Monné wrote: > > OK, so I think I'm starting to understand this all. Sorry it's taken > > me so long. So it's my understanding that diff != 0 can only happen in > > Xen context, or when in an IST that has a different stack (ie: MCE, NMI > > or DF according to current.h) and running in PV mode? > > > > Wouldn't in then be fine to use (r)->cs & 3 to check we are in guest > > mode if diff != 0? I see a lot of other places where cs & 3 is already > > used to that effect AFAICT (like entry.S). > > Technically this would be correct afaics, but the idea with all this > is (or should I say "looks to be"?) to have the checks be as tight as > possible, to make sure we don't mistakenly consider something "guest > mode" which really isn't. IOW your suggestion would be fine with me > if we could exclude bugs anywhere in the code. But since this isn't > realistic, I consider your suggestion to be relaxing things by too > much. OK, so I take that (long time) we might also want to change the cs & 3 checks from entry.S to check against __HYPERVISOR_CS explicitly? What I would prefer is to have some kind of homogeneity in how guest mode vs Xen mode checks are performed, so that we don't confuse people. Thanks, Roger.
On 22.05.2020 12:48, Roger Pau Monné wrote: > On Fri, May 22, 2020 at 11:52:42AM +0200, Jan Beulich wrote: >> On 20.05.2020 17:13, Roger Pau Monné wrote: >>> OK, so I think I'm starting to understand this all. Sorry it's taken >>> me so long. So it's my understanding that diff != 0 can only happen in >>> Xen context, or when in an IST that has a different stack (ie: MCE, NMI >>> or DF according to current.h) and running in PV mode? >>> >>> Wouldn't in then be fine to use (r)->cs & 3 to check we are in guest >>> mode if diff != 0? I see a lot of other places where cs & 3 is already >>> used to that effect AFAICT (like entry.S). >> >> Technically this would be correct afaics, but the idea with all this >> is (or should I say "looks to be"?) to have the checks be as tight as >> possible, to make sure we don't mistakenly consider something "guest >> mode" which really isn't. IOW your suggestion would be fine with me >> if we could exclude bugs anywhere in the code. But since this isn't >> realistic, I consider your suggestion to be relaxing things by too >> much. > > OK, so I take that (long time) we might also want to change the cs & 3 > checks from entry.S to check against __HYPERVISOR_CS explicitly? I didn't think so, no (not the least because of there not being any guarantee afaik that EFI runtime calls couldn't play with segment registers; they shouldn't, yes, but there's a lot of other "should" many don't obey to). Those are guaranteed PV-only code paths. The main issue here is that ->cs cannot be relied upon when a frame points at HVM state. Jan
On Fri, May 22, 2020 at 02:00:22PM +0200, Jan Beulich wrote: > On 22.05.2020 12:48, Roger Pau Monné wrote: > > On Fri, May 22, 2020 at 11:52:42AM +0200, Jan Beulich wrote: > >> On 20.05.2020 17:13, Roger Pau Monné wrote: > >>> OK, so I think I'm starting to understand this all. Sorry it's taken > >>> me so long. So it's my understanding that diff != 0 can only happen in > >>> Xen context, or when in an IST that has a different stack (ie: MCE, NMI > >>> or DF according to current.h) and running in PV mode? > >>> > >>> Wouldn't in then be fine to use (r)->cs & 3 to check we are in guest > >>> mode if diff != 0? I see a lot of other places where cs & 3 is already > >>> used to that effect AFAICT (like entry.S). > >> > >> Technically this would be correct afaics, but the idea with all this > >> is (or should I say "looks to be"?) to have the checks be as tight as > >> possible, to make sure we don't mistakenly consider something "guest > >> mode" which really isn't. IOW your suggestion would be fine with me > >> if we could exclude bugs anywhere in the code. But since this isn't > >> realistic, I consider your suggestion to be relaxing things by too > >> much. > > > > OK, so I take that (long time) we might also want to change the cs & 3 > > checks from entry.S to check against __HYPERVISOR_CS explicitly? > > I didn't think so, no (not the least because of there not being any > guarantee afaik that EFI runtime calls couldn't play with segment > registers; they shouldn't, yes, but there's a lot of other "should" > many don't obey to). Those are guaranteed PV-only code paths. The > main issue here is that ->cs cannot be relied upon when a frame > points at HVM state. Well, if it points at HVM state it could equally have __HYPERVISOR_CS set by the guest. Will things work anyway if you get here from an exception generated by EFI code that has changed the code segment? You are going to hit the assert at least, since diff will be != 0 and cs != __HYPERVISOR_CS? I would prefer to keep things coherent by either using cs & 3 or cs == __HYPERVISOR_CS everywhere if possible, as I'm still unsure of the benefit of using __HYPERVISOR_CS. Thanks, Roger.
On 26.05.2020 12:56, Roger Pau Monné wrote: > On Fri, May 22, 2020 at 02:00:22PM +0200, Jan Beulich wrote: >> On 22.05.2020 12:48, Roger Pau Monné wrote: >>> On Fri, May 22, 2020 at 11:52:42AM +0200, Jan Beulich wrote: >>>> On 20.05.2020 17:13, Roger Pau Monné wrote: >>>>> OK, so I think I'm starting to understand this all. Sorry it's taken >>>>> me so long. So it's my understanding that diff != 0 can only happen in >>>>> Xen context, or when in an IST that has a different stack (ie: MCE, NMI >>>>> or DF according to current.h) and running in PV mode? >>>>> >>>>> Wouldn't in then be fine to use (r)->cs & 3 to check we are in guest >>>>> mode if diff != 0? I see a lot of other places where cs & 3 is already >>>>> used to that effect AFAICT (like entry.S). >>>> >>>> Technically this would be correct afaics, but the idea with all this >>>> is (or should I say "looks to be"?) to have the checks be as tight as >>>> possible, to make sure we don't mistakenly consider something "guest >>>> mode" which really isn't. IOW your suggestion would be fine with me >>>> if we could exclude bugs anywhere in the code. But since this isn't >>>> realistic, I consider your suggestion to be relaxing things by too >>>> much. >>> >>> OK, so I take that (long time) we might also want to change the cs & 3 >>> checks from entry.S to check against __HYPERVISOR_CS explicitly? >> >> I didn't think so, no (not the least because of there not being any >> guarantee afaik that EFI runtime calls couldn't play with segment >> registers; they shouldn't, yes, but there's a lot of other "should" >> many don't obey to). Those are guaranteed PV-only code paths. The >> main issue here is that ->cs cannot be relied upon when a frame >> points at HVM state. > > Well, if it points at HVM state it could equally have __HYPERVISOR_CS > set by the guest. No, that's not the point. ->cs will never be __HYPERVISOR_CS in that case, as we never store the guest's CS selector there. Instead hvm_invalidate_regs_fields() clobbers the field in debug builds (with a value resulting in RPL 3), but zero (i.e. a value implying RPL 0) remains in place in release builds. Instead of doing this clobbering in debug mode only, we could - as I think I did suggest before - clobber always, but just once during vCPU init rather than on every VM exit. In debug mode we could then instead check that the dummy values didn't themselves get clobbered. > Will things work anyway if you get here from an exception generated by > EFI code that has changed the code segment? You are going to hit the > assert at least, since diff will be != 0 and cs != __HYPERVISOR_CS? What would guarantee the latter? Additionally they could in principle also have switched stacks then, i.e. diff may then also be larger than PRIMARY_STACK_SIZE, in which case - with the patch in place - the assertion is bypassed altogether. > I would prefer to keep things coherent by either using cs & 3 or > cs == __HYPERVISOR_CS everywhere if possible, as I'm still unsure of > the benefit of using __HYPERVISOR_CS. See above. Jan
On Tue, May 26, 2020 at 03:55:39PM +0200, Jan Beulich wrote: > On 26.05.2020 12:56, Roger Pau Monné wrote: > > On Fri, May 22, 2020 at 02:00:22PM +0200, Jan Beulich wrote: > >> On 22.05.2020 12:48, Roger Pau Monné wrote: > >>> On Fri, May 22, 2020 at 11:52:42AM +0200, Jan Beulich wrote: > >>>> On 20.05.2020 17:13, Roger Pau Monné wrote: > >>>>> OK, so I think I'm starting to understand this all. Sorry it's taken > >>>>> me so long. So it's my understanding that diff != 0 can only happen in > >>>>> Xen context, or when in an IST that has a different stack (ie: MCE, NMI > >>>>> or DF according to current.h) and running in PV mode? > >>>>> > >>>>> Wouldn't in then be fine to use (r)->cs & 3 to check we are in guest > >>>>> mode if diff != 0? I see a lot of other places where cs & 3 is already > >>>>> used to that effect AFAICT (like entry.S). > >>>> > >>>> Technically this would be correct afaics, but the idea with all this > >>>> is (or should I say "looks to be"?) to have the checks be as tight as > >>>> possible, to make sure we don't mistakenly consider something "guest > >>>> mode" which really isn't. IOW your suggestion would be fine with me > >>>> if we could exclude bugs anywhere in the code. But since this isn't > >>>> realistic, I consider your suggestion to be relaxing things by too > >>>> much. > >>> > >>> OK, so I take that (long time) we might also want to change the cs & 3 > >>> checks from entry.S to check against __HYPERVISOR_CS explicitly? > >> > >> I didn't think so, no (not the least because of there not being any > >> guarantee afaik that EFI runtime calls couldn't play with segment > >> registers; they shouldn't, yes, but there's a lot of other "should" > >> many don't obey to). Those are guaranteed PV-only code paths. The > >> main issue here is that ->cs cannot be relied upon when a frame > >> points at HVM state. > > > > Well, if it points at HVM state it could equally have __HYPERVISOR_CS > > set by the guest. > > No, that's not the point. ->cs will never be __HYPERVISOR_CS in that > case, as we never store the guest's CS selector there. Instead > hvm_invalidate_regs_fields() clobbers the field in debug builds (with > a value resulting in RPL 3), but zero (i.e. a value implying RPL 0) > remains in place in release builds. > > Instead of doing this clobbering in debug mode only, we could - as I > think I did suggest before - clobber always, but just once during vCPU > init rather than on every VM exit. In debug mode we could then instead > check that the dummy values didn't themselves get clobbered. It would make sense to clobber it always with a value that has RPL > 0, so that it's consistent with PV state. > > Will things work anyway if you get here from an exception generated by > > EFI code that has changed the code segment? You are going to hit the > > assert at least, since diff will be != 0 and cs != __HYPERVISOR_CS? > > What would guarantee the latter? Additionally they could in principle > also have switched stacks then, i.e. diff may then also be larger than > PRIMARY_STACK_SIZE, in which case - with the patch in place - the > assertion is bypassed altogether. > > > I would prefer to keep things coherent by either using cs & 3 or > > cs == __HYPERVISOR_CS everywhere if possible, as I'm still unsure of > > the benefit of using __HYPERVISOR_CS. > > See above. Well, I think it's an improvement overall, as it allows to properly handle the case where a PV guest could manage to trigger an exception that uses a stack different than the primary one. Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Roger.
--- a/xen/include/asm-x86/regs.h +++ b/xen/include/asm-x86/regs.h @@ -10,9 +10,10 @@ /* Frame pointer must point into current CPU stack. */ \ ASSERT(diff < STACK_SIZE); \ /* If not a guest frame, it must be a hypervisor frame. */ \ - ASSERT((diff == 0) || (r->cs == __HYPERVISOR_CS)); \ + if ( diff < PRIMARY_STACK_SIZE ) \ + ASSERT(!diff || ((r)->cs == __HYPERVISOR_CS)); \ /* Return TRUE if it's a guest frame. */ \ - (diff == 0); \ + !diff || ((r)->cs != __HYPERVISOR_CS); \ }) #endif /* __X86_REGS_H__ */
The 2nd of the assertions as well as the macro's return value have been assuming we're on the primary stack. While for most IST exceptions we eventually switch back to the main one, for #DF we intentionally never do, and hence a #DF actually triggering on a user mode insn (which then is still a Xen bug) would in turn trigger this assertion, rather than cleanly logging state. Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> --- While we could go further and also assert we're on the correct IST stack in an "else" ti the "if()" added, I'm not fully convinced this would be generally helpful. I'll be happy to adjust accordingly if others think differently; at such a point though I think this should then no longer be a macro.