Message ID | 56D97F2102000078000D955D@prv-mh.provo.novell.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 04/03/16 11:27, Jan Beulich wrote: > Make sure the near JMP/CALL check doesn't consume uninitialized > data, not even in a benign way. And relax the length check at once. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > > --- a/xen/arch/x86/alternative.c > +++ b/xen/arch/x86/alternative.c > @@ -174,7 +174,7 @@ static void __init apply_alternatives(st > memcpy(insnbuf, replacement, a->replacementlen); > > /* 0xe8/0xe9 are relative branches; fix the offset. */ > - if ( (*insnbuf & 0xfe) == 0xe8 && a->replacementlen == 5 ) > + if ( a->replacementlen >= 5 && (*insnbuf & 0xfe) == 0xe8 ) > *(s32 *)(insnbuf + 1) += replacement - instr; > > add_nops(insnbuf + a->replacementlen, > > > Swapping the order is definitely a good thing. However, relaxing the length check seems less so. `E8 rel32` or `E9 rel32` encodings are strictly 5 bytes long. There are complications with the `67 E{8,9} rel16` encodings, but those are not catered for anyway, and the manual warns about undefined behaviour if used in long mode. What is your usecase for relaxing the check? IMO, if it isn't exactly 5 bytes long, there is some corruption somewhere and the relocation should't happen. ~Andrew
>>> On 07.03.16 at 16:43, <andrew.cooper3@citrix.com> wrote: > On 04/03/16 11:27, Jan Beulich wrote: >> Make sure the near JMP/CALL check doesn't consume uninitialized >> data, not even in a benign way. And relax the length check at once. >> >> Signed-off-by: Jan Beulich <jbeulich@suse.com> >> >> --- a/xen/arch/x86/alternative.c >> +++ b/xen/arch/x86/alternative.c >> @@ -174,7 +174,7 @@ static void __init apply_alternatives(st >> memcpy(insnbuf, replacement, a->replacementlen); >> >> /* 0xe8/0xe9 are relative branches; fix the offset. */ >> - if ( (*insnbuf & 0xfe) == 0xe8 && a->replacementlen == 5 ) >> + if ( a->replacementlen >= 5 && (*insnbuf & 0xfe) == 0xe8 ) >> *(s32 *)(insnbuf + 1) += replacement - instr; >> >> add_nops(insnbuf + a->replacementlen, >> >> >> > > Swapping the order is definitely a good thing. > > However, relaxing the length check seems less so. `E8 rel32` or `E9 > rel32` encodings are strictly 5 bytes long. > > There are complications with the `67 E{8,9} rel16` encodings, but those > are not catered for anyway, and the manual warns about undefined > behaviour if used in long mode. > > What is your usecase for relaxing the check? IMO, if it isn't exactly 5 > bytes long, there is some corruption somewhere and the relocation > should't happen. The relaxation is solely because at least CALL could validly be followed by further instructions. Jan
On 07/03/16 15:56, Jan Beulich wrote: >>>> On 07.03.16 at 16:43, <andrew.cooper3@citrix.com> wrote: >> On 04/03/16 11:27, Jan Beulich wrote: >>> Make sure the near JMP/CALL check doesn't consume uninitialized >>> data, not even in a benign way. And relax the length check at once. >>> >>> Signed-off-by: Jan Beulich <jbeulich@suse.com> >>> >>> --- a/xen/arch/x86/alternative.c >>> +++ b/xen/arch/x86/alternative.c >>> @@ -174,7 +174,7 @@ static void __init apply_alternatives(st >>> memcpy(insnbuf, replacement, a->replacementlen); >>> >>> /* 0xe8/0xe9 are relative branches; fix the offset. */ >>> - if ( (*insnbuf & 0xfe) == 0xe8 && a->replacementlen == 5 ) >>> + if ( a->replacementlen >= 5 && (*insnbuf & 0xfe) == 0xe8 ) >>> *(s32 *)(insnbuf + 1) += replacement - instr; >>> >>> add_nops(insnbuf + a->replacementlen, >>> >>> >>> >> Swapping the order is definitely a good thing. >> >> However, relaxing the length check seems less so. `E8 rel32` or `E9 >> rel32` encodings are strictly 5 bytes long. >> >> There are complications with the `67 E{8,9} rel16` encodings, but those >> are not catered for anyway, and the manual warns about undefined >> behaviour if used in long mode. >> >> What is your usecase for relaxing the check? IMO, if it isn't exactly 5 >> bytes long, there is some corruption somewhere and the relocation >> should't happen. > The relaxation is solely because at least CALL could validly > be followed by further instructions. But without scanning the entire replacement buffer, there might be other relocations needing to happen. That would require decoding the instructions, which is an extreme faff. It would be better to leave it currently as-is to effectively disallow mixing a jmp/call replacement with other code, to avoid the subtle failure of a second relocation not taking effect ~Andrew
>>> On 07.03.16 at 17:11, <andrew.cooper3@citrix.com> wrote: > On 07/03/16 15:56, Jan Beulich wrote: >>>>> On 07.03.16 at 16:43, <andrew.cooper3@citrix.com> wrote: >>> On 04/03/16 11:27, Jan Beulich wrote: >>>> Make sure the near JMP/CALL check doesn't consume uninitialized >>>> data, not even in a benign way. And relax the length check at once. >>>> >>>> Signed-off-by: Jan Beulich <jbeulich@suse.com> >>>> >>>> --- a/xen/arch/x86/alternative.c >>>> +++ b/xen/arch/x86/alternative.c >>>> @@ -174,7 +174,7 @@ static void __init apply_alternatives(st >>>> memcpy(insnbuf, replacement, a->replacementlen); >>>> >>>> /* 0xe8/0xe9 are relative branches; fix the offset. */ >>>> - if ( (*insnbuf & 0xfe) == 0xe8 && a->replacementlen == 5 ) >>>> + if ( a->replacementlen >= 5 && (*insnbuf & 0xfe) == 0xe8 ) >>>> *(s32 *)(insnbuf + 1) += replacement - instr; >>>> >>>> add_nops(insnbuf + a->replacementlen, >>>> >>>> >>>> >>> Swapping the order is definitely a good thing. >>> >>> However, relaxing the length check seems less so. `E8 rel32` or `E9 >>> rel32` encodings are strictly 5 bytes long. >>> >>> There are complications with the `67 E{8,9} rel16` encodings, but those >>> are not catered for anyway, and the manual warns about undefined >>> behaviour if used in long mode. >>> >>> What is your usecase for relaxing the check? IMO, if it isn't exactly 5 >>> bytes long, there is some corruption somewhere and the relocation >>> should't happen. >> The relaxation is solely because at least CALL could validly >> be followed by further instructions. > > But without scanning the entire replacement buffer, there might be other > relocations needing to happen. > > That would require decoding the instructions, which is an extreme faff. > It would be better to leave it currently as-is to effectively disallow > mixing a jmp/call replacement with other code, to avoid the subtle > failure of a second relocation not taking effect Well, such missing further fixup would be noticed immediately by someone trying (unless the patch code path never gets executed). Whereas a simply adjustment to register state would seem quite reasonable to follow a call. While right now the subsequent patches don't depend on this being >= or ==, I think it was wrong to be == from the beginning. Plus - there are endless other possibilities of instructions needing fixups (most notably such with RIP-relative memory operands), none of which are even remotely reasonable to deal with here. I.e. namely in the absence of a CALL/JMP the same issue would exist anyway, which is why I'm not overly concerned of those. All we want is a specific special case to be treated correctly. Jan
On 07/03/16 16:21, Jan Beulich wrote: >>>> On 07.03.16 at 17:11, <andrew.cooper3@citrix.com> wrote: >> On 07/03/16 15:56, Jan Beulich wrote: >>>>>> On 07.03.16 at 16:43, <andrew.cooper3@citrix.com> wrote: >>>> On 04/03/16 11:27, Jan Beulich wrote: >>>>> Make sure the near JMP/CALL check doesn't consume uninitialized >>>>> data, not even in a benign way. And relax the length check at once. >>>>> >>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com> >>>>> >>>>> --- a/xen/arch/x86/alternative.c >>>>> +++ b/xen/arch/x86/alternative.c >>>>> @@ -174,7 +174,7 @@ static void __init apply_alternatives(st >>>>> memcpy(insnbuf, replacement, a->replacementlen); >>>>> >>>>> /* 0xe8/0xe9 are relative branches; fix the offset. */ >>>>> - if ( (*insnbuf & 0xfe) == 0xe8 && a->replacementlen == 5 ) >>>>> + if ( a->replacementlen >= 5 && (*insnbuf & 0xfe) == 0xe8 ) >>>>> *(s32 *)(insnbuf + 1) += replacement - instr; >>>>> >>>>> add_nops(insnbuf + a->replacementlen, >>>>> >>>>> >>>>> >>>> Swapping the order is definitely a good thing. >>>> >>>> However, relaxing the length check seems less so. `E8 rel32` or `E9 >>>> rel32` encodings are strictly 5 bytes long. >>>> >>>> There are complications with the `67 E{8,9} rel16` encodings, but those >>>> are not catered for anyway, and the manual warns about undefined >>>> behaviour if used in long mode. >>>> >>>> What is your usecase for relaxing the check? IMO, if it isn't exactly 5 >>>> bytes long, there is some corruption somewhere and the relocation >>>> should't happen. >>> The relaxation is solely because at least CALL could validly >>> be followed by further instructions. >> But without scanning the entire replacement buffer, there might be other >> relocations needing to happen. >> >> That would require decoding the instructions, which is an extreme faff. >> It would be better to leave it currently as-is to effectively disallow >> mixing a jmp/call replacement with other code, to avoid the subtle >> failure of a second relocation not taking effect > Well, such missing further fixup would be noticed immediately by > someone trying (unless the patch code path never gets executed). > Whereas a simply adjustment to register state would seem quite > reasonable to follow a call. While right now the subsequent > patches don't depend on this being >= or ==, I think it was wrong > to be == from the beginning. > > Plus - there are endless other possibilities of instructions needing > fixups (most notably such with RIP-relative memory operands), > none of which are even remotely reasonable to deal with here. > I.e. namely in the absence of a CALL/JMP the same issue would > exist anyway, which is why I'm not overly concerned of those. > All we want is a specific special case to be treated correctly. Fair enough. Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
--- a/xen/arch/x86/alternative.c +++ b/xen/arch/x86/alternative.c @@ -174,7 +174,7 @@ static void __init apply_alternatives(st memcpy(insnbuf, replacement, a->replacementlen); /* 0xe8/0xe9 are relative branches; fix the offset. */ - if ( (*insnbuf & 0xfe) == 0xe8 && a->replacementlen == 5 ) + if ( a->replacementlen >= 5 && (*insnbuf & 0xfe) == 0xe8 ) *(s32 *)(insnbuf + 1) += replacement - instr; add_nops(insnbuf + a->replacementlen,