
[v2,3/7] x86/altcall: Optimise away endbr64 instruction where possible

Message ID 20220214125632.24563-4-andrew.cooper3@citrix.com (mailing list archive)
State New, archived
Series x86: Further harden function pointers

Commit Message

Andrew Cooper Feb. 14, 2022, 12:56 p.m. UTC
With altcall, we convert indirect branches into direct ones.  With that
complete, none of the potential targets need an endbr64 instruction.

Furthermore, removing the endbr64 instructions is a security defence-in-depth
improvement, because it limits the options available to an attacker who has
managed to hijack a function pointer.

Introduce new .init.{ro,}data.cf_clobber sections.  Have _apply_alternatives()
walk over this, looking for any pointers into .text, and clobber an endbr64
instruction if found.  This is some minor structure (ab)use but it works
alarmingly well.
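
For illustration, the effect at a call site looks like this (sketch
only, not taken from the patch):

    call  *ops+8(%rip)      # before: indirect call through a pointer
    call  example_hook      # after altcall: direct call

Once every caller is patched, the endbr64 at example_hook() is nothing
but a spare legal indirect branch target, and can be overwritten with
NOPs.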

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>

It would be nice for the printk() to say "optimised away %u of %u", but the
latter number can only feasibly come from post-processing of xen-syms during
the build.
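
For illustration, a user of the new annotations might look like this
(the structure and function names below are hypothetical, not taken
from this series):

    struct example_ops {
        void (*setup)(void);
        void (*resume)(void);
    };

    /* Placed in .init.rodata.cf_clobber, so _apply_alternatives() will
     * walk these pointers and NOP out the endbr64 at each target. */
    static const struct example_ops example_ops __initconst_cf_clobber = {
        .setup  = example_setup,
        .resume = example_resume,
    };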

v2:
 * Drop hard tabs
 * Add __initconst_cf_clobber too
 * Change types to reduce casting
---
 xen/arch/x86/alternative.c | 38 ++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/xen.lds.S     |  6 ++++++
 xen/include/xen/init.h     |  3 +++
 3 files changed, 47 insertions(+)

Comments

Jan Beulich Feb. 14, 2022, 1:06 p.m. UTC | #1
On 14.02.2022 13:56, Andrew Cooper wrote:
> With altcall, we convert indirect branches into direct ones.  With that
> complete, none of the potential targets need an endbr64 instruction.
> 
> Furthermore, removing the endbr64 instructions is a security defence-in-depth
> improvement, because it limits the options available to an attacker who has
> managed to hijack a function pointer.
> 
> Introduce new .init.{ro,}data.cf_clobber sections.  Have _apply_alternatives()
> walk over this, looking for any pointers into .text, and clobber an endbr64
> instruction if found.  This is some minor structure (ab)use but it works
> alarmingly well.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with two remarks, which ideally would be addressed by respective
small adjustments:

> @@ -330,6 +333,41 @@ static void init_or_livepatch _apply_alternatives(struct alt_instr *start,
>          add_nops(buf + a->repl_len, total_len - a->repl_len);
>          text_poke(orig, buf, total_len);
>      }
> +
> +    /*
> +     * Clobber endbr64 instructions now that altcall has finished optimising
> +     * all indirect branches to direct ones.
> +     */
> +    if ( force && cpu_has_xen_ibt )
> +    {
> +        void *const *val;
> +        unsigned int clobbered = 0;
> +
> +        /*
> +         * This is some minor structure (ab)use.  We walk the entire contents
> +         * of .init.{ro,}data.cf_clobber as if it were an array of pointers.
> +         *
> +         * If the pointer points into .text, and at an endbr64 instruction,
> +         * nop out the endbr64.  This causes the pointer to no longer be a
> +         * legal indirect branch target under CET-IBT.  This is a
> +         * defence-in-depth measure, to reduce the options available to an
> +         * adversary who has managed to hijack a function pointer.
> +         */
> +        for ( val = __initdata_cf_clobber_start;
> +              val < __initdata_cf_clobber_end;
> +              val++ )
> +        {
> +            void *ptr = *val;
> +
> +            if ( !is_kernel_text(ptr) || !is_endbr64(ptr) )
> +                continue;
> +
> +            add_nops(ptr, 4);

It would be nice for this literal 4 to have a #define next to where
the ENDBR64 encoding has its central place.

> --- a/xen/arch/x86/xen.lds.S
> +++ b/xen/arch/x86/xen.lds.S
> @@ -221,6 +221,12 @@ SECTIONS
>         *(.initcall1.init)
>         __initcall_end = .;
>  
> +       . = ALIGN(POINTER_ALIGN);
> +       __initdata_cf_clobber_start = .;
> +       *(.init.data.cf_clobber)
> +       *(.init.rodata.cf_clobber)
> +       __initdata_cf_clobber_end = .;
> +
>         *(.init.data)
>         *(.init.data.rel)
>         *(.init.data.rel.*)

With r/o data ahead and r/w data following, may I suggest flipping the
order of the two section specifiers you add?

Jan
Andrew Cooper Feb. 14, 2022, 1:31 p.m. UTC | #2
On 14/02/2022 13:06, Jan Beulich wrote:
> On 14.02.2022 13:56, Andrew Cooper wrote:
>> With altcall, we convert indirect branches into direct ones.  With that
>> complete, none of the potential targets need an endbr64 instruction.
>>
>> Furthermore, removing the endbr64 instructions is a security defence-in-depth
>> improvement, because it limits the options available to an attacker who has
>> managed to hijack a function pointer.
>>
>> Introduce new .init.{ro,}data.cf_clobber sections.  Have _apply_alternatives()
>> walk over this, looking for any pointers into .text, and clobber an endbr64
>> instruction if found.  This is some minor structure (ab)use but it works
>> alarmingly well.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks,

> with two remarks, which ideally would be addressed by respective
> small adjustments:
>
>> @@ -330,6 +333,41 @@ static void init_or_livepatch _apply_alternatives(struct alt_instr *start,
>>          add_nops(buf + a->repl_len, total_len - a->repl_len);
>>          text_poke(orig, buf, total_len);
>>      }
>> +
>> +    /*
>> +     * Clobber endbr64 instructions now that altcall has finished optimising
>> +     * all indirect branches to direct ones.
>> +     */
>> +    if ( force && cpu_has_xen_ibt )
>> +    {
>> +        void *const *val;
>> +        unsigned int clobbered = 0;
>> +
>> +        /*
>> +         * This is some minor structure (ab)use.  We walk the entire contents
>> +         * of .init.{ro,}data.cf_clobber as if it were an array of pointers.
>> +         *
>> +         * If the pointer points into .text, and at an endbr64 instruction,
>> +         * nop out the endbr64.  This causes the pointer to no longer be a
>> +         * legal indirect branch target under CET-IBT.  This is a
>> +         * defence-in-depth measure, to reduce the options available to an
>> +         * adversary who has managed to hijack a function pointer.
>> +         */
>> +        for ( val = __initdata_cf_clobber_start;
>> +              val < __initdata_cf_clobber_end;
>> +              val++ )
>> +        {
>> +            void *ptr = *val;
>> +
>> +            if ( !is_kernel_text(ptr) || !is_endbr64(ptr) )
>> +                continue;
>> +
>> +            add_nops(ptr, 4);
> It would be nice for this literal 4 to have a #define next to where
> the ENDBR64 encoding has its central place.

We don't have an encoding of ENDBR64 in a central place.

The best you can probably have is

#define ENDBR64_LEN 4

in endbr.h ?

>
>> --- a/xen/arch/x86/xen.lds.S
>> +++ b/xen/arch/x86/xen.lds.S
>> @@ -221,6 +221,12 @@ SECTIONS
>>         *(.initcall1.init)
>>         __initcall_end = .;
>>  
>> +       . = ALIGN(POINTER_ALIGN);
>> +       __initdata_cf_clobber_start = .;
>> +       *(.init.data.cf_clobber)
>> +       *(.init.rodata.cf_clobber)
>> +       __initdata_cf_clobber_end = .;
>> +
>>         *(.init.data)
>>         *(.init.data.rel)
>>         *(.init.data.rel.*)
> With r/o data ahead and r/w data following, may I suggest flipping the
> order of the two section specifiers you add?

I don't follow.  This is all initdata which is merged together into a
single section.

The only reason const data is split out in the first place is to appease
the toolchains, not because it makes a difference.

~Andrew
Jan Beulich Feb. 14, 2022, 1:51 p.m. UTC | #3
On 14.02.2022 14:31, Andrew Cooper wrote:
> On 14/02/2022 13:06, Jan Beulich wrote:
>> On 14.02.2022 13:56, Andrew Cooper wrote:
>>> @@ -330,6 +333,41 @@ static void init_or_livepatch _apply_alternatives(struct alt_instr *start,
>>>          add_nops(buf + a->repl_len, total_len - a->repl_len);
>>>          text_poke(orig, buf, total_len);
>>>      }
>>> +
>>> +    /*
>>> +     * Clobber endbr64 instructions now that altcall has finished optimising
>>> +     * all indirect branches to direct ones.
>>> +     */
>>> +    if ( force && cpu_has_xen_ibt )
>>> +    {
>>> +        void *const *val;
>>> +        unsigned int clobbered = 0;
>>> +
>>> +        /*
>>> +         * This is some minor structure (ab)use.  We walk the entire contents
>>> +         * of .init.{ro,}data.cf_clobber as if it were an array of pointers.
>>> +         *
>>> +         * If the pointer points into .text, and at an endbr64 instruction,
>>> +         * nop out the endbr64.  This causes the pointer to no longer be a
>>> +         * legal indirect branch target under CET-IBT.  This is a
>>> +         * defence-in-depth measure, to reduce the options available to an
>>> +         * adversary who has managed to hijack a function pointer.
>>> +         */
>>> +        for ( val = __initdata_cf_clobber_start;
>>> +              val < __initdata_cf_clobber_end;
>>> +              val++ )
>>> +        {
>>> +            void *ptr = *val;
>>> +
>>> +            if ( !is_kernel_text(ptr) || !is_endbr64(ptr) )
>>> +                continue;
>>> +
>>> +            add_nops(ptr, 4);
>> It would be nice for this literal 4 to have a #define next to where
>> the ENDBR64 encoding has its central place.
> 
> We don't have an encoding of ENDBR64 in a central place.
> 
> The best you can probably have is
> 
> #define ENDBR64_LEN 4
> 
> in endbr.h ?

Perhaps. That's neither in this series nor in staging already, so it's a
little hard to check. By "central place" I really meant is_endbr64() if
that's the only place where the encoding actually appears.

>>> --- a/xen/arch/x86/xen.lds.S
>>> +++ b/xen/arch/x86/xen.lds.S
>>> @@ -221,6 +221,12 @@ SECTIONS
>>>         *(.initcall1.init)
>>>         __initcall_end = .;
>>>  
>>> +       . = ALIGN(POINTER_ALIGN);
>>> +       __initdata_cf_clobber_start = .;
>>> +       *(.init.data.cf_clobber)
>>> +       *(.init.rodata.cf_clobber)
>>> +       __initdata_cf_clobber_end = .;
>>> +
>>>         *(.init.data)
>>>         *(.init.data.rel)
>>>         *(.init.data.rel.*)
>> With r/o data ahead and r/w data following, may I suggest flipping the
>> order of the two section specifiers you add?
> 
> I don't follow.  This is all initdata which is merged together into a
> single section.
> 
> The only reason const data is split out in the first place is to appease
> the toolchains, not because it makes a difference.

It's marginal, I agree, but it would still seem cleaner to me if all
(pseudo) r/o init data lived side by side.

Jan
Andrew Cooper Feb. 14, 2022, 4:03 p.m. UTC | #4
On 14/02/2022 13:51, Jan Beulich wrote:
> On 14.02.2022 14:31, Andrew Cooper wrote:
>> On 14/02/2022 13:06, Jan Beulich wrote:
>>> On 14.02.2022 13:56, Andrew Cooper wrote:
>>>> @@ -330,6 +333,41 @@ static void init_or_livepatch _apply_alternatives(struct alt_instr *start,
>>>>          add_nops(buf + a->repl_len, total_len - a->repl_len);
>>>>          text_poke(orig, buf, total_len);
>>>>      }
>>>> +
>>>> +    /*
>>>> +     * Clobber endbr64 instructions now that altcall has finished optimising
>>>> +     * all indirect branches to direct ones.
>>>> +     */
>>>> +    if ( force && cpu_has_xen_ibt )
>>>> +    {
>>>> +        void *const *val;
>>>> +        unsigned int clobbered = 0;
>>>> +
>>>> +        /*
>>>> +         * This is some minor structure (ab)use.  We walk the entire contents
>>>> +         * of .init.{ro,}data.cf_clobber as if it were an array of pointers.
>>>> +         *
>>>> +         * If the pointer points into .text, and at an endbr64 instruction,
>>>> +         * nop out the endbr64.  This causes the pointer to no longer be a
>>>> +         * legal indirect branch target under CET-IBT.  This is a
>>>> +         * defence-in-depth measure, to reduce the options available to an
>>>> +         * adversary who has managed to hijack a function pointer.
>>>> +         */
>>>> +        for ( val = __initdata_cf_clobber_start;
>>>> +              val < __initdata_cf_clobber_end;
>>>> +              val++ )
>>>> +        {
>>>> +            void *ptr = *val;
>>>> +
>>>> +            if ( !is_kernel_text(ptr) || !is_endbr64(ptr) )
>>>> +                continue;
>>>> +
>>>> +            add_nops(ptr, 4);
>>> It would be nice for this literal 4 to have a #define next to where
>>> the ENDBR64 encoding has its central place.
>> We don't have an encoding of ENDBR64 in a central place.
>>
>> The best you can probably have is
>>
>> #define ENDBR64_LEN 4
>>
>> in endbr.h ?
> Perhaps. That's neither in this series nor in staging already, so it's a
> little hard to check. By "central place" I really meant is_endbr64() if
> that's the only place where the encoding actually appears.

endbr.h is the header which contains is_endbr64(), and deliberately does
not contain the raw encoding.
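
(For reference, the trick is to store the constant inverted and
recompute it at runtime, so the byte pattern never appears anywhere
other than at genuine function entry points.  A sketch of the idea,
illustrative rather than quoted from endbr.h:)

    static inline uint32_t gen_endbr64(void)
    {
        uint32_t res;

        /* endbr64 encodes as F3 0F 1E FA, i.e. 0xfa1e0ff3 read as a
         * little-endian uint32_t.  Materialise it via an inverted
         * immediate so the pattern isn't in the instruction stream. */
        asm ( "mov $~0xfa1e0ff3, %[res]\n\t"
              "not %[res]"
              : [res] "=&r" (res) );

        return res;
    }

    static inline bool is_endbr64(const void *ptr)
    {
        uint32_t val;

        memcpy(&val, ptr, sizeof(val));

        return val == gen_endbr64();
    }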

>
>>>> --- a/xen/arch/x86/xen.lds.S
>>>> +++ b/xen/arch/x86/xen.lds.S
>>>> @@ -221,6 +221,12 @@ SECTIONS
>>>>         *(.initcall1.init)
>>>>         __initcall_end = .;
>>>>  
>>>> +       . = ALIGN(POINTER_ALIGN);
>>>> +       __initdata_cf_clobber_start = .;
>>>> +       *(.init.data.cf_clobber)
>>>> +       *(.init.rodata.cf_clobber)
>>>> +       __initdata_cf_clobber_end = .;
>>>> +
>>>>         *(.init.data)
>>>>         *(.init.data.rel)
>>>>         *(.init.data.rel.*)
>>> With r/o data ahead and r/w data following, may I suggest flipping the
>>> order of the two section specifiers you add?
>> I don't follow.  This is all initdata which is merged together into a
>> single section.
>>
>> The only reason const data is split out in the first place is to appease
>> the toolchains, not because it makes a difference.
> It's marginal, I agree, but it would still seem cleaner to me if all
> (pseudo) r/o init data lived side by side.

I still don't understand what you're asking.

There is no such thing as actually read-only init data.

Wherever the .init.rodata goes in here, it's bounded by .init.data.

~Andrew
Jan Beulich Feb. 14, 2022, 4:16 p.m. UTC | #5
On 14.02.2022 17:03, Andrew Cooper wrote:
> On 14/02/2022 13:51, Jan Beulich wrote:
>> On 14.02.2022 14:31, Andrew Cooper wrote:
>>> On 14/02/2022 13:06, Jan Beulich wrote:
>>>> On 14.02.2022 13:56, Andrew Cooper wrote:
>>>>> @@ -330,6 +333,41 @@ static void init_or_livepatch _apply_alternatives(struct alt_instr *start,
>>>>>          add_nops(buf + a->repl_len, total_len - a->repl_len);
>>>>>          text_poke(orig, buf, total_len);
>>>>>      }
>>>>> +
>>>>> +    /*
>>>>> +     * Clobber endbr64 instructions now that altcall has finished optimising
>>>>> +     * all indirect branches to direct ones.
>>>>> +     */
>>>>> +    if ( force && cpu_has_xen_ibt )
>>>>> +    {
>>>>> +        void *const *val;
>>>>> +        unsigned int clobbered = 0;
>>>>> +
>>>>> +        /*
>>>>> +         * This is some minor structure (ab)use.  We walk the entire contents
>>>>> +         * of .init.{ro,}data.cf_clobber as if it were an array of pointers.
>>>>> +         *
>>>>> +         * If the pointer points into .text, and at an endbr64 instruction,
>>>>> +         * nop out the endbr64.  This causes the pointer to no longer be a
>>>>> +         * legal indirect branch target under CET-IBT.  This is a
>>>>> +         * defence-in-depth measure, to reduce the options available to an
>>>>> +         * adversary who has managed to hijack a function pointer.
>>>>> +         */
>>>>> +        for ( val = __initdata_cf_clobber_start;
>>>>> +              val < __initdata_cf_clobber_end;
>>>>> +              val++ )
>>>>> +        {
>>>>> +            void *ptr = *val;
>>>>> +
>>>>> +            if ( !is_kernel_text(ptr) || !is_endbr64(ptr) )
>>>>> +                continue;
>>>>> +
>>>>> +            add_nops(ptr, 4);
>>>> It would be nice for this literal 4 to have a #define next to where
>>>> the ENDBR64 encoding has its central place.
>>> We don't have an encoding of ENDBR64 in a central place.
>>>
>>> The best you can probably have is
>>>
>>> #define ENDBR64_LEN 4
>>>
>>> in endbr.h ?
>> Perhaps. That's neither in this series nor in staging already, so it's a
>> little hard to check. By "central place" I really meant is_endbr64() if
>> that's the only place where the encoding actually appears.
> 
> endbr.h is the header which contains is_endbr64(), and deliberately does
> not contain the raw encoding.

Well, yes, it's intentionally the inverted encoding, but I thought
you would get the point.

>>>>> --- a/xen/arch/x86/xen.lds.S
>>>>> +++ b/xen/arch/x86/xen.lds.S
>>>>> @@ -221,6 +221,12 @@ SECTIONS
>>>>>         *(.initcall1.init)
>>>>>         __initcall_end = .;
>>>>>  
>>>>> +       . = ALIGN(POINTER_ALIGN);
>>>>> +       __initdata_cf_clobber_start = .;
>>>>> +       *(.init.data.cf_clobber)
>>>>> +       *(.init.rodata.cf_clobber)
>>>>> +       __initdata_cf_clobber_end = .;
>>>>> +
>>>>>         *(.init.data)
>>>>>         *(.init.data.rel)
>>>>>         *(.init.data.rel.*)
>>>> With r/o data ahead and r/w data following, may I suggest flipping the
>>>> order of the two section specifiers you add?
>>> I don't follow.  This is all initdata which is merged together into a
>>> single section.
>>>
>>> The only reason const data is split out in the first place is to appease
>>> the toolchains, not because it makes a difference.
>> It's marginal, I agree, but it would still seem cleaner to me if all
>> (pseudo) r/o init data lived side by side.
> 
> I still don't understand what you're asking.
> 
> There is no such thing as actually read-only init data.
> 
> Wherever the .init.rodata goes in here, it's bounded by .init.data.

Well, looking at the linker script again I notice that while r/o items
like .init.setup and .initcall*.init come first, some further ones
(.init_array etc) come quite late. Personally I'd prefer if all r/o
items sat side by side, even though we currently munge them all into a
single section. Then, if we decided to stop this practice, all it would
take would be to close and re-open the output section at that point.
(Or rather, that is all it would have taken until now; with your
addition it wouldn't be quite as simple anymore.)

But anyway, if at this point I still didn't get my point across, then
please leave things as you have them.

Jan
Jan Beulich March 1, 2022, 11:59 a.m. UTC | #6
On 14.02.2022 13:56, Andrew Cooper wrote:
> @@ -330,6 +333,41 @@ static void init_or_livepatch _apply_alternatives(struct alt_instr *start,
>          add_nops(buf + a->repl_len, total_len - a->repl_len);
>          text_poke(orig, buf, total_len);
>      }
> +
> +    /*
> +     * Clobber endbr64 instructions now that altcall has finished optimising
> +     * all indirect branches to direct ones.
> +     */
> +    if ( force && cpu_has_xen_ibt )

Btw, this is now also entered when the function is called from
apply_alternatives() (i.e. when livepatching), but ...

> +    {
> +        void *const *val;
> +        unsigned int clobbered = 0;
> +
> +        /*
> +         * This is some minor structure (ab)use.  We walk the entire contents
> +         * of .init.{ro,}data.cf_clobber as if it were an array of pointers.
> +         *
> +         * If the pointer points into .text, and at an endbr64 instruction,
> +         * nop out the endbr64.  This causes the pointer to no longer be a
> +         * legal indirect branch target under CET-IBT.  This is a
> +         * defence-in-depth measure, to reduce the options available to an
> +         * adversary who has managed to hijack a function pointer.
> +         */
> +        for ( val = __initdata_cf_clobber_start;
> +              val < __initdata_cf_clobber_end;

... these being the main binary's boundaries, no action would be taken
on the livepatch binary. Hence (also due to having been here before
during boot), all that I understand will happen is ...

> +              val++ )
> +        {
> +            void *ptr = *val;
> +
> +            if ( !is_kernel_text(ptr) || !is_endbr64(ptr) )
> +                continue;
> +
> +            add_nops(ptr, 4);
> +            clobbered++;
> +        }
> +
> +        printk("altcall: Optimised away %u endbr64 instructions\n", clobbered);

... that this message gets logged once per patch load (with a count
of 0). I think the enclosing if() wants to be amended by
"&& system_state < SYS_STATE_active". If you agree, I can easily
make a patch.
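
Something like this (sketch only):

    /* Only clobber on the boot-time pass, not when re-entered while
     * loading a livepatch. */
    if ( force && cpu_has_xen_ibt && system_state < SYS_STATE_active )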

Jan
Andrew Cooper March 1, 2022, 2:51 p.m. UTC | #7
On 01/03/2022 11:59, Jan Beulich wrote:
> On 14.02.2022 13:56, Andrew Cooper wrote:
>> @@ -330,6 +333,41 @@ static void init_or_livepatch _apply_alternatives(struct alt_instr *start,
>>          add_nops(buf + a->repl_len, total_len - a->repl_len);
>>          text_poke(orig, buf, total_len);
>>      }
>> +
>> +    /*
>> +     * Clobber endbr64 instructions now that altcall has finished optimising
>> +     * all indirect branches to direct ones.
>> +     */
>> +    if ( force && cpu_has_xen_ibt )
> Btw, this is now also entered when the function is called from
> apply_alternatives() (i.e. when livepatching), but ...
>
>> +    {
>> +        void *const *val;
>> +        unsigned int clobbered = 0;
>> +
>> +        /*
>> +         * This is some minor structure (ab)use.  We walk the entire contents
>> +         * of .init.{ro,}data.cf_clobber as if it were an array of pointers.
>> +         *
>> +         * If the pointer points into .text, and at an endbr64 instruction,
>> +         * nop out the endbr64.  This causes the pointer to no longer be a
>> +         * legal indirect branch target under CET-IBT.  This is a
>> +         * defence-in-depth measure, to reduce the options available to an
>> +         * adversary who has managed to hijack a function pointer.
>> +         */
>> +        for ( val = __initdata_cf_clobber_start;
>> +              val < __initdata_cf_clobber_end;
> ... these being the main binary's boundaries, no action would be taken
> on the livepatch binary. Hence (also due to having been here before
> during boot), all that I understand will happen is ...
>
>> +              val++ )
>> +        {
>> +            void *ptr = *val;
>> +
>> +            if ( !is_kernel_text(ptr) || !is_endbr64(ptr) )
>> +                continue;
>> +
>> +            add_nops(ptr, 4);
>> +            clobbered++;
>> +        }
>> +
>> +        printk("altcall: Optimised away %u endbr64 instructions\n", clobbered);
> ... that this message gets logged once per patch load (with a count
> of 0). I think the enclosing if() wants to be amended by
> "&& system_state < SYS_STATE_active". If you agree, I can easily
> make a patch.

Hmm.  There are other livepatching fixes going on, but they're starting
with fixing the build system breakage.  (The major livepatching fix is
to adjust how we patch an old function that has an ENDBR64 at the start.)

That said, a livepatch needs to contain a section equivalent to
__initdata_cf_clobber, to be processed during load, dependent on
cpu_has_xen_ibt.

Perhaps the best option is to break the clobber out into a helper that
takes a start/end pair and returns the number clobbered.  That way, it
can be reused by the livepatch logic, and independently of this printk().
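
Roughly (the name and exact shape below are illustrative):

    static unsigned int init_or_livepatch
    seal_endbr64(void *const *start, void *const *end)
    {
        void *const *val;
        unsigned int done = 0;

        for ( val = start; val < end; val++ )
        {
            void *ptr = *val;

            /* For a livepatch payload, the is_kernel_text() check
             * would need extending to cover the payload's own .text. */
            if ( !is_kernel_text(ptr) || !is_endbr64(ptr) )
                continue;

            add_nops(ptr, 4);
            done++;
        }

        return done;
    }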

~Andrew
Jan Beulich March 1, 2022, 2:58 p.m. UTC | #8
On 01.03.2022 15:51, Andrew Cooper wrote:
> On 01/03/2022 11:59, Jan Beulich wrote:
>> On 14.02.2022 13:56, Andrew Cooper wrote:
>>> @@ -330,6 +333,41 @@ static void init_or_livepatch _apply_alternatives(struct alt_instr *start,
>>>          add_nops(buf + a->repl_len, total_len - a->repl_len);
>>>          text_poke(orig, buf, total_len);
>>>      }
>>> +
>>> +    /*
>>> +     * Clobber endbr64 instructions now that altcall has finished optimising
>>> +     * all indirect branches to direct ones.
>>> +     */
>>> +    if ( force && cpu_has_xen_ibt )
>> Btw, this is now also entered when the function is called from
>> apply_alternatives() (i.e. when livepatching), but ...
>>
>>> +    {
>>> +        void *const *val;
>>> +        unsigned int clobbered = 0;
>>> +
>>> +        /*
>>> +         * This is some minor structure (ab)use.  We walk the entire contents
>>> +         * of .init.{ro,}data.cf_clobber as if it were an array of pointers.
>>> +         *
>>> +         * If the pointer points into .text, and at an endbr64 instruction,
>>> +         * nop out the endbr64.  This causes the pointer to no longer be a
>>> +         * legal indirect branch target under CET-IBT.  This is a
>>> +         * defence-in-depth measure, to reduce the options available to an
>>> +         * adversary who has managed to hijack a function pointer.
>>> +         */
>>> +        for ( val = __initdata_cf_clobber_start;
>>> +              val < __initdata_cf_clobber_end;
>> ... these being the main binary's boundaries, no action would be taken
>> on the livepatch binary. Hence (also due to having been here before
>> during boot), all that I understand will happen is ...
>>
>>> +              val++ )
>>> +        {
>>> +            void *ptr = *val;
>>> +
>>> +            if ( !is_kernel_text(ptr) || !is_endbr64(ptr) )
>>> +                continue;
>>> +
>>> +            add_nops(ptr, 4);
>>> +            clobbered++;
>>> +        }
>>> +
>>> +        printk("altcall: Optimised away %u endbr64 instructions\n", clobbered);
>> ... that this message gets logged once per patch load (with a count
>> of 0). I think the enclosing if() wants to be amended by
>> "&& system_state < SYS_STATE_active". If you agree, I can easily
>> make a patch.
> 
> Hmm.  There are other livepatching fixes going on, but they're starting
> with fixing the build system breakage.  (The major livepatching fix is
> to adjust how we patch an old function that has an ENDBR64 at the start.)
> 
> That said, a livepatch needs to contain a section equivalent to
> __initdata_cf_clobber, to be processed during load, dependent on
> cpu_has_xen_ibt.

IOW you say altcall patching can occur in live patches? If so, then ...

> Perhaps the best option is to break the clobber out into a helper that
> takes a start/end pair and returns the number clobbered.  That way, it
> can be reused by the livepatch logic, and independently of this printk().

... yes, parametrizing would be necessary.

Jan

Patch

diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
index 65537fe1f0bd..dd4609070001 100644
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -173,6 +173,9 @@  text_poke(void *addr, const void *opcode, size_t len)
     return memcpy(addr, opcode, len);
 }
 
+extern void *const __initdata_cf_clobber_start[];
+extern void *const __initdata_cf_clobber_end[];
+
 /*
  * Replace instructions with better alternatives for this CPU type.
  * This runs before SMP is initialized to avoid SMP problems with
@@ -330,6 +333,41 @@  static void init_or_livepatch _apply_alternatives(struct alt_instr *start,
         add_nops(buf + a->repl_len, total_len - a->repl_len);
         text_poke(orig, buf, total_len);
     }
+
+    /*
+     * Clobber endbr64 instructions now that altcall has finished optimising
+     * all indirect branches to direct ones.
+     */
+    if ( force && cpu_has_xen_ibt )
+    {
+        void *const *val;
+        unsigned int clobbered = 0;
+
+        /*
+         * This is some minor structure (ab)use.  We walk the entire contents
+         * of .init.{ro,}data.cf_clobber as if it were an array of pointers.
+         *
+         * If the pointer points into .text, and at an endbr64 instruction,
+         * nop out the endbr64.  This causes the pointer to no longer be a
+         * legal indirect branch target under CET-IBT.  This is a
+         * defence-in-depth measure, to reduce the options available to an
+         * adversary who has managed to hijack a function pointer.
+         */
+        for ( val = __initdata_cf_clobber_start;
+              val < __initdata_cf_clobber_end;
+              val++ )
+        {
+            void *ptr = *val;
+
+            if ( !is_kernel_text(ptr) || !is_endbr64(ptr) )
+                continue;
+
+            add_nops(ptr, 4);
+            clobbered++;
+        }
+
+        printk("altcall: Optimised away %u endbr64 instructions\n", clobbered);
+    }
 }
 
 void init_or_livepatch apply_alternatives(struct alt_instr *start,
diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
index ca22e984f807..c399178ac123 100644
--- a/xen/arch/x86/xen.lds.S
+++ b/xen/arch/x86/xen.lds.S
@@ -221,6 +221,12 @@  SECTIONS
        *(.initcall1.init)
        __initcall_end = .;
 
+       . = ALIGN(POINTER_ALIGN);
+       __initdata_cf_clobber_start = .;
+       *(.init.data.cf_clobber)
+       *(.init.rodata.cf_clobber)
+       __initdata_cf_clobber_end = .;
+
        *(.init.data)
        *(.init.data.rel)
        *(.init.data.rel.*)
diff --git a/xen/include/xen/init.h b/xen/include/xen/init.h
index bfe789e93f6b..0af0e234ec80 100644
--- a/xen/include/xen/init.h
+++ b/xen/include/xen/init.h
@@ -18,6 +18,9 @@ 
 #define __init_call(lvl)  __used_section(".initcall" lvl ".init")
 #define __exit_call       __used_section(".exitcall.exit")
 
+#define __initdata_cf_clobber  __section(".init.data.cf_clobber")
+#define __initconst_cf_clobber __section(".init.rodata.cf_clobber")
+
 /* These macros are used to mark some functions or 
  * initialized data (doesn't apply to uninitialized data)
  * as `initialization' functions. The kernel can take this