mbox series

[RFC,v3,00/22] arm64: livepatch: Use ORC for dynamic frame pointer validation

Message ID 20230202074036.507249-1-madvenka@linux.microsoft.com (mailing list archive)
Headers show
Series arm64: livepatch: Use ORC for dynamic frame pointer validation | expand

Message

Madhavan T. Venkataraman Feb. 2, 2023, 7:40 a.m. UTC
From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

Introduction
============

The livepatch feature requires an unwinder that can provide a reliable stack
trace. General requirements for a reliable unwinder are described in this
document from Mark Rutland:

	Documentation/livepatch/reliable-stacktrace.rst

The requirements have two parts:

1. The unwinder must be enhanced with certain features. E.g.,

	- Identifying successful termination of stack trace
	- Identifying unwindable and non-unwindable code
	- Identifying interrupts and exceptions occurring in the frame pointer
	  prolog and epilog
	- Identifying features such as kretprobe and ftrace graph tracing
	  that can modify the return address stored on the stack
	- Identifying corrupted/unreliable stack contents
	- Architecture-specific items that can render a stack trace unreliable
	  at certain points in code

2. Validation of the frame pointer

	This assumes that the unwinder is based on the frame pointer (FP).
	The actual frame pointer that the unwinder uses cannot just be
	assumed to be correct. It needs to be validated somehow.

This patch series is to address the following:

	- Identifying unwindable and non-unwindable code
	- Identifying interrupts and exceptions occurring in the frame pointer
	  prolog and epilog
	- Validation of the frame pointer

The rest are already in place AFAICT.

Validation of the FP (aka FRAME_POINTER_VALIDATION)
====================

The current approach in Linux is to use objtool, a build time tool, for this
purpose. When configured, objtool is invoked on every relocatable object file
during kernel build. It performs static analysis of the code in each file. It
walks the instructions in every function and notes the changes to the stack
pointer (SP) and the frame pointer (FP). It makes sure that the changes are in
accordance with the ABI rules. There are also a lot of other checks that
Objtool performs. Once objtool completes successfully, the kernel can then be
used for livepatch purposes.

Objtool can have uses other than just FP validation. For instance, it can check
control flow integrity during its analysis.

Problem
=======

Objtool is complex and highly architecture-dependent. There are a lot of
different checks in objtool that all of the code in the kernel must pass
before livepatch can be enabled. If a check fails, it must be corrected
before we can proceed. Sometimes, the kernel code needs to be fixed.
Sometimes, it is a compiler bug that needs to be fixed. The challenge is
also to prove that all the work is complete for an architecture.

As such, it presents a great challenge to enable livepatch for an
architecture.

A different approach
====================

I would like to propose a different approach for FP validation. I would
like to be able to enable livepatch for an architecture as is. That is,
without "fixing" the kernel or the compiler for it:

There are three steps in this:

1. Objtool walks all the functions as usual. It computes the stack and
   frame pointer offsets at each instruction as usual. It generates ORC
   records and stores them in special sections as usual. This is simple
   enough to do.

2. Objtool performs validation of the offsets (see below) and checks
   if the frame is properly set up according to ABI rules. But the set
   of checks performed are a whole lot simpler than the existing Objtool
   checks for X86.
   
3. The unwinder in the kernel retrieves the ORC record for each PC in a
   stack trace. If there is no ORC record or if the ORC data indicates that
   a frame pointer has not been set up, the unwinder considers the stack
   frame unreliable. Otherwise, the unwinder computes a frame pointer from
   the ORC data. It compares the computed frame pointer with the actual
   frame pointer. If there is a match, the frame is reliable. If not, it
   isn't. A stack trace is reliable if every single frame in it is reliable.

   To summarize, the frame pointer validation is done dynamically instead
   of statically.

Using this scheme, the unwinder can always know what kernel code is reliable
for unwind and what is not. This is the requirement for livepatch.

Instruction decoder
===================

To do this, an instruction decoder needs to be implemented. I have implemented
a simple, table-driven decoder for ARM64. Only a subset of the instructions
needs to be fully decoded for this purpose:

	- Load-Store instructions
	- Add/Sub/Mov instructions
	- Branch instructions
	- Call instructions
	- Return instructions
	- Stack pointer authentication instruction

The rest of the instructions are either dont-care from an unwind perspective
or unexpected from the compiler. I have added checks for the unexpected ones
to catch them if the compiler ever generates them.

This decoder is simpler than a full-fledged one. But if a full-fledged one
is ever implemented, my decoder can be subsumed by it.

Code reorganization and reuse
=============================

Stack validation scheme
-----------------------

Currently, the stack validation scheme supported in Objtool is static stack
validation. Static stack validation is performed in check.c. There is a lot
of code in check.c that should be shared with other validation schemes such
as the dynamic FP validation scheme that I am proposing. Accordingly, I have
moved that code into separate files:

	- Code that walks instructions and decodes them
	- Code that manages instructions
	- CFI related code
	- Code that handles unwind hints

So, all of this is shared across all architectures and validation schemes.

Architecture-dependent code
---------------------------

Currently, the ORC definitions and code are X86-specific. I have separated
out the architecture-specific stuff from the generic stuff and placed
them in appropriate files so other architectures can share.

So, these are the architecture-specific parts that need to be supplied for
a new architecture for my proposal. Everything else is shared.

	- Instruction decoder as mentioned above
	- ORC register definitions
	- ORC support functions
	- Unwind hint support
	- Invoke ORC init from kernel initialization code
	- Invoke ORC init from module initialization code
	- Add ORC_UNWIND_TABLE to kernel data in vmlinux.lds.S
	- Modify the unwinder to use ORC data to validate the frame pointer
	- Add kernel config definitions for reliable stack trace and livepatch

Other than the decoder, all of this is very simple to do. Just follow the
example in ARM64.

For ARM64, the decoder turned out to be fairly simple. I cannot speak to
other architectures.

sorttable
---------

At build time, the ORC tables in special sections are sorted so that the
kernel does not have to spend time sorting them. The tables need to be
sorted for binary search. The sorttable program works without any change
in my proposal as well.

FP prolog, epilog, leaf functions, generated code, etc
======================================================

If the unwinder is not able to find an ORC record for a given instruction
address, it considers the code to be unreliable from an unwind perspective.
This enables the unwinder to deal with:

	- Generated code that will not have any ORC records.

If the unwinder finds an ORC record, but the record indicates that a frame
pointer has not been properly set up at that instruction, then the unwinder
considers that instruction unreliable from an unwind perspective. This enables
the unwinder to deal with:

	- Low level assembly code (SYM_CODE) that is not walked by Objtool.
	  See below.

	- Interrupts/exceptions in frame pointer prologs and epilogs.

	- Interrupts/exceptions in leaf functions that don't have a frame
	  setup.

	- Compiler not setting up the frame pointer properly before calling
	  a function. E.g., if inline assembly code occurs at the beginning
	  of a function and it contains a call.

If the unwinder finds an ORC record and the record indicates that a frame
pointer has been properly set up, then it computes a frame pointer from the
ORC data and compares it with the actual frame pointer. If the computed frame
pointer does not match the actual one, it considers the code to be unreliable
from an unwind perspective. This enables the unwinder to detect:

	- Cases where runtime patching of the kernel resulted in a change in
	  the ORC for an instruction

	- A corrupted frame pointer

Assembly functions
==================

Objtool does not walk SYM_CODE functions as they are low-level functions
that don't follow ABI rules or functions that manipulate register state
in such a way that unwind is unreliable. For these the ORC records will
show that the frame offset is 0. So, the unwinder will be able to tell that
they are unreliable for unwind.

As for SYM_FUNC functions, Objtool will walk them and compute ORC. However,
currently, most of the SYM_FUNC functions in ARM64 do not setup a frame.
So, these will look unreliable to the unwinder. While this will not impact
the ability to do livepatch, I plan to submit a separate patch series to add
a frame pointer prolog and epilog to many of these functions. This is to
reduce the number of retries during the livepatch process.

Unwind hints
============

Now, there are certain points in assembly code that we would like to unwind
through reliably. Like interrupt and exception handlers. This is mainly for
getting reliable stack traces in these cases and reducing the number of
retries during the livepatch process. For these, unwind hints can be placed
at strategic points in assembly code. Only a small number of these hints
should be needed.

In this work, I have defined the following unwind hints so stack traces that
contain these can be done reliably:

	- Exception handlers
	- Interrupt handlers

Unwind hints are collected in a special section. Objtool converts unwind hints
to ORC data. The unwinder processes unwind hints to handle special cases
mentioned above.

Now, unwind hints are generally a problem to maintain. So, I have only
defined them for the above cases.

Size of the memory consumed within the kernel for this feature
==============================================================

This depends on the amount of code in the kernel which, in turn, depends on
the number of configs turned on. E.g., on the kernel on my arm64 system, the
ORC data size for vmlinux is about 2MB.

GitHub repository
=================

My github repo for this version is here:

https://github.com/madvenka786/linux/tree/orc_v3

Please feel free to clone and check it out. And, please let me know if you
find any issues.

Testing
=======

- I have run all of the livepatch selftests successfully. I have written a
  couple of extra selftests myself which I will be posting separately.

- I have a test driver to induce a NULL pointer exception to make sure
  that unwinding through exception handlers is reliable.

- I use the test driver to create a timer to make sure that unwinding through
  the timer IRQ is reliable.

- I call the unwinder from different places during boot to make sure that
  the unwinding in each of those cases is reliable.

TBD
===

- I need to perform more rigorous testing with different scenarios. This
  is work in progress. Any ideas or suggestions are welcome.

- I plan to add a return address check in the unwinder. The unwinder will
  decode the instruction at the call site for each frame and make sure that
  it is a valid call instruction. This is just a paranoid check to catch it
  if Objtool generates an incorrect ORC entry or if the FP is corrupted.
---
Changelog:

v3:
	From Mark Brown <broonie@kernel.org>
	====================================

	Objtool no longer uses sub commands.

		I have addressed this. Objtool calls check() to perform
		all the validation. I have defined a check() function for
		the Dynamic FP Validation feature. Based on the config,
		the correct files will be included so that there is only
		one check() defined within Objtool.

	From Chen Zhongjin <chenzhongjin@huawei.com>
	============================================

	No need to rewrite the decoder. Merge the decoder with my patch.

		My decoder is table based. So, it is very compact. Also,
		in that table, I have included checks for the instructions
		that alter the stack or frame pointers that the compiler
		should not be using. This catches all the cases where the
		compiler generates unexpected code. So, for now, I am
		keeping my decoder.

		But your point is a good one. May be, in the future, our
		decoders can be merged. Currently, both patchsets are
		relatively new. It is not as if either has received
		enough review.

	From Peter Zijlstra <peterz@infradead.org>
	==========================================================

	Why can't you use the validate_branch() defined currently?
	Why do you want to define your own?

		Originally, I responded to this comment by saying that
		I will use validate_branch(). I studied it. It will probably
		take a long time for all the pieces to be in place for the
		current validate_branch() to work for ARM64. So, I have taken
		a different approach to try to shorten the time to market.

		In my approach, I generate the ORC data based on the actual
		code by walking all the instructions and following all the
		code paths statically.

		Now, a hidden code path may not be followed during my
		static analysis. E.g.,
		
			- a retpoline is obscuring an actual branch

			- runtime patching can potentially change a code path

		In most of the cases, even if these things occur, the ORC data
		will not change. In the unlikely event that ORC data can be
		different for an instruction because of the above, the unwinder
		will handle that. There are two cases:

		1) The SP offset or the FP offset is zero in the generated ORC.

		2) The SP offset and the FP offset are both non-zero in the
		   generated ORC.

		In case (1), the unwinder will return an error right away
		and the stack trace will be considered unreliable. So,
		livepatch has to retry.

		In case (2), the unwinder will detect the problem because the
		computed frame pointer will not match the actual one. So,
		livepatch has to retry.

		As long as the number of such retries is small, livepatch can
		easily be supported with this approach.

		AFAICT, the only way this will not work is if the actual frame
		pointer gets corrupted because of buggy kernel code and just
		happens to match the computed frame pointer. According to an
		earlier comment from Josh (IIRC), corruption of the frame
		pointer in the kernel is a super rare event. Even if it does
		occur, the frame pointer getting corrupted in a fashion that
		it exactly matches the computed frame pointer should be even
		rarer.

		Even if that were to happen, the unwinder checks must pass for
		every frame in the stack trace. This will not happen if the
		frame pointer is corrupted along the way.

		Also, if the kernel has buggy code that can corrupt the frame
		pointer, then all bets are off anyway.

		If you find any holes here, please let me know. I will work
		on it some more. Appreciate your feedback.

		I have added a number of checks to validate the CFI since
		version 2. They are described in:

[RFC PATCH v3 13/21] objtool: arm64: Walk instructions and compute CFI for each instruction

		FWIW, I have also compared the CFI I am generating with DWARF
		information that the compiler generates. The CFIs match a
		100% for Clang. In the case of gcc, the comparison fails
		in 1.7% of the cases. I have analyzed those cases and found
		the DWARF information generated by gcc is incorrect. The
		ORC generated by my Objtool is correct.

	From Miroslav Benes <mbenes@suse.cz>
	====================================

	klp_arch_set_pc() has been replaced by ftrace_instruction_pointer_set().
	klp_get_ftrace_location() is not needed either.

		I have addressed this in version 3.

	From me Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
	================================================================

	- I have removed the unwind hints for FTrace and the Kretprobe
	  trampoline. These can be added later. I have retained only
	  the unwind hints for exceptions and interrupts.

	- I have enhanced the decoder to recognize the paciasp instruction.
	  Both gcc and Clang use this to begin a frame pointer prolog. This
	  is really useful for working out the CFI.

v2:
	From Josh Poimboeuf <jpoimboe@redhat.com>:
	==========================================

	DWARF is not proven to be reliable. So, depending on it for livepatch
	is a problem.

		I have removed the DWARF part from the patch series. Instead,
		I have implemented the minimum ARM64 instruction decoder
		required for this work. I have implemented code to walk all
		the instructions in an object file and generate ORC data.
		The ORC data will be used by the unwinder to compute a
		frame pointer and validate the actual frame pointer with it.

	Unwind hints are a problem from a maintenance perspective.

		This is true. But there are only a few unwind hints that I
		have introduced in this work. Also, if an unwind hint becomes
		outdated, the dynamic frame pointer check will catch it so
		that the unwinder will know that it is unreliable.
	
	Inline ASM code can cause problems that DWARF cannot catch.

		Now that I walk all of the instructions, this problem is solved.
		In version 2, the ORC data is generated based on the actual
		machine code in the object file. So, the data reflects the
		actual code. Unreliability in any part of the code will be
		caught by the unwinder when it looks up the ORC data and
		performs a frame pointer check.

	Rename kernel code and data that currently contains the name dwarf
	to avoid confusion.

		This problem is solved as I have dropped DWARF altogether.

	Try to reuse the existing ORC data format and code as much as possible.

		I have reorganized the code in the following ways:

		- I have placed code that was in check.c in separate files
		  so that different stack validation schemes can share the code.

		- I have separated architecture-specific code and structures
		  from generic ones so that different architectures can share
		  common stuff.

		- I am using the ORC structure as it is currently defined. The
		  only cosmetic change I have done is to rename the fields
		  bp_* to fp_* (FP for frame pointer).

		- I completely reuse the ORC definitions and code. E.g., in
		  the kernel the ORC lookup code is shared across architectures.

	Objtool contains other features which other architectures are looking
	into. So, should we just implement static stack validation for other
	architectures or use dynamic FP validation just for the livepatch
	feature?

		For one thing, it will take a long time before the static
		validation scheme can even be proved to be complete on ARM64.
		Livepatching is an immediate need for security fixes.

		Also, since I am using the traditional approach in v2 of
		walking the instructions, computing CFI, generating ORC, etc,
		my current approach can be combined with the traditional
		approach. Dynamic FP validation can be offered either as an
		alternative to static stack validation or as something that
		can be combined with static stack validation to make the
		feature even more robust. Objtool can always have bugs and
		there can be bad ORC data.

	From Peter Zijlstra <peterz@infradead.org>:
	===========================================

	Please use/extend ORC.

		Done. Please see my description above.

	Why deviate from the traditional approach of static stack validation?

		I have given the answer above.

	Mandating DWARF sucks. Compile times are so much worse with DWARVES.

		I have removed DWARF from the work. I use the traditional
		approach of decoding the instructions and computing the
		same data as DWARF but in ORC format.

	DWARF does not cover assembly code.

		In v2, I walk all of the functions including assembly functions.
		SYM_CODE functions are not walked by Objtool anyway. So, I
		don't do that. But I walk all the SYM_FUNC functions. Currently,
		only a few SYM_FUNC functions in ARM64 have a proper FP prolog
		and epilog. So, I plan to submit a separate patch series to
		add an FP prolog and epilog for other SYM_FUNC functions.
		But this is only to reduce retries during the livepatch process.
		It is not absolutely required for livepatch to work. But I
		plan to address this separately.

	Compilers don't consider DWARF generation to be a correctness issue.

		I totally agree. I have myself found 4 bugs that I have had
		to compensate for. So, I have dropped DWARF.

	From Chen Zhongjin <chenzhongjin@huawei.com>:
	=============================================

	One cannot depend on compilers to generate correct DWARF info.

		Agreed. I have dropped DWARF.

	DWARF does not cover assembly. So, what if too many assembly
	functions exist so that the livepatch process can encounter too
	many retries?

		DWARF has been dropped. The code in version 2 walks all the
		functions including assembly functions.

	There is a corner case where an interrupt or an exception can happen
	in FP prologs/epilogs. The stack trace would be unreliable.

		Yes. This will be caught by the reliable unwinder in the kernel
		in my scheme when it retrieves the ORC data and validates
		the actual frame pointer. The validation will fail and the
		stack trace will be considered unreliable.

v1:
	- Introduced the livepatch feature based on DWARF Call Frame
	  Information generated by the compilers.

Previous versions and discussion
================================

v2: https://lore.kernel.org/linux-arm-kernel/20220524001637.1707472-1-madvenka@linux.microsoft.com/T/#t
v1: https://lore.kernel.org/linux-arm-kernel/20220407202518.19780-1-madvenka@linux.microsoft.com/T/#t

Madhavan T. Venkataraman (22):
  objtool: Reorganize CFI code
  objtool: Reorganize instruction-related code
  objtool: Move decode_instructions() to a separate file
  objtool: Reorganize Unwind hint code
  objtool: Reorganize ORC types
  objtool: Reorganize ORC code
  objtool: Reorganize ORC kernel code
  objtool: Introduce STATIC_CHECK
  objtool: arm64: Add basic definitions and compile
  objtool: arm64: Implement decoder for Dynamic FP validation
  objtool: arm64: Invoke the decoder
  objtool: arm64: Compute destinations for call and jump instructions
  objtool: arm64: Walk instructions and compute CFI for each instruction
  objtool: arm64: Generate ORC data from CFI for object files
  objtool: arm64: Add unwind hint support
  arm64: Add unwind hints to exception handlers
  arm64: Add kernel and module support for ORC
  arm64: Build the kernel with ORC information
  arm64: unwinder: Add a reliability check in the unwinder based on ORC
  arm64: Define HAVE_DYNAMIC_FTRACE_WITH_ARGS
  arm64: Define TIF_PATCH_PENDING for livepatch
  arm64: Enable livepatch for ARM64

 arch/arm64/Kconfig                            |   5 +
 arch/arm64/Kconfig.debug                      |  33 +
 arch/arm64/include/asm/ftrace.h               |  20 +
 arch/arm64/include/asm/module.h               |  12 +-
 arch/arm64/include/asm/orc_types.h            |  35 ++
 arch/arm64/include/asm/stacktrace/common.h    |  15 +
 arch/arm64/include/asm/thread_info.h          |   4 +-
 arch/arm64/include/asm/unwind_hints.h         | 104 ++++
 arch/arm64/kernel/entry.S                     |   3 +
 arch/arm64/kernel/module.c                    |  13 +-
 arch/arm64/kernel/setup.c                     |   2 +
 arch/arm64/kernel/signal.c                    |   4 +
 arch/arm64/kernel/stacktrace.c                | 167 ++++-
 arch/arm64/kernel/vmlinux.lds.S               |   3 +
 arch/x86/include/asm/orc_types.h              |  37 +-
 arch/x86/include/asm/unwind.h                 |   5 -
 arch/x86/include/asm/unwind_hints.h           |  83 +++
 arch/x86/kernel/module.c                      |   7 +-
 arch/x86/kernel/unwind_orc.c                  | 258 +-------
 arch/x86/kernel/vmlinux.lds.S                 |   2 +-
 .../asm => include/asm-generic}/orc_lookup.h  |  42 ++
 include/linux/objtool.h                       |  72 +--
 include/linux/orc_entry.h                     |  39 ++
 kernel/Makefile                               |   2 +
 kernel/orc_lookup.c                           | 261 ++++++++
 scripts/Makefile                              |   4 +-
 scripts/Makefile.lib                          |   9 +
 tools/arch/arm64/include/asm/orc_types.h      |  35 ++
 tools/arch/arm64/include/asm/unwind_hints.h   | 104 ++++
 tools/arch/x86/include/asm/orc_types.h        |  37 +-
 tools/arch/x86/include/asm/unwind_hints.h     | 157 +++++
 tools/include/linux/objtool.h                 |  72 +--
 tools/include/linux/orc_entry.h               |  39 ++
 tools/objtool/Build                           |   9 +-
 tools/objtool/Makefile                        |   8 +-
 tools/objtool/arch/arm64/Build                |   2 +
 tools/objtool/arch/arm64/decode.c             | 573 ++++++++++++++++++
 .../arch/arm64/include/arch/cfi_regs.h        |  13 +
 tools/objtool/arch/arm64/include/arch/elf.h   |   9 +
 .../arch/arm64/include/arch/endianness.h      |   9 +
 tools/objtool/arch/arm64/orc.c                |  90 +++
 tools/objtool/arch/x86/Build                  |   1 +
 tools/objtool/arch/x86/include/arch/elf.h     |   1 +
 tools/objtool/arch/x86/orc.c                  | 150 +++++
 tools/objtool/cfi.c                           | 108 ++++
 tools/objtool/check.c                         | 492 +--------------
 tools/objtool/dcheck.c                        | 360 +++++++++++
 tools/objtool/decode.c                        | 107 ++++
 tools/objtool/include/objtool/arch.h          |   2 +
 tools/objtool/include/objtool/cfi.h           |  12 +
 tools/objtool/include/objtool/check.h         |  77 +--
 tools/objtool/include/objtool/endianness.h    |   1 +
 tools/objtool/include/objtool/insn.h          | 130 ++++
 tools/objtool/include/objtool/objtool.h       |   1 +
 tools/objtool/include/objtool/orc.h           |  18 +
 tools/objtool/insn.c                          | 215 +++++++
 tools/objtool/orc_dump.c                      |  63 +-
 tools/objtool/orc_gen.c                       |  89 +--
 tools/objtool/sync-check.sh                   |  10 +
 tools/objtool/unwind_hints.c                  | 110 ++++
 60 files changed, 3176 insertions(+), 1169 deletions(-)
 create mode 100644 arch/arm64/include/asm/orc_types.h
 create mode 100644 arch/arm64/include/asm/unwind_hints.h
 rename {arch/x86/include/asm => include/asm-generic}/orc_lookup.h (51%)
 create mode 100644 include/linux/orc_entry.h
 create mode 100644 kernel/orc_lookup.c
 create mode 100644 tools/arch/arm64/include/asm/orc_types.h
 create mode 100644 tools/arch/arm64/include/asm/unwind_hints.h
 create mode 100644 tools/arch/x86/include/asm/unwind_hints.h
 create mode 100644 tools/include/linux/orc_entry.h
 create mode 100644 tools/objtool/arch/arm64/Build
 create mode 100644 tools/objtool/arch/arm64/decode.c
 create mode 100644 tools/objtool/arch/arm64/include/arch/cfi_regs.h
 create mode 100644 tools/objtool/arch/arm64/include/arch/elf.h
 create mode 100644 tools/objtool/arch/arm64/include/arch/endianness.h
 create mode 100644 tools/objtool/arch/arm64/orc.c
 create mode 100644 tools/objtool/arch/x86/orc.c
 create mode 100644 tools/objtool/cfi.c
 create mode 100644 tools/objtool/dcheck.c
 create mode 100644 tools/objtool/decode.c
 create mode 100644 tools/objtool/include/objtool/insn.h
 create mode 100644 tools/objtool/include/objtool/orc.h
 create mode 100644 tools/objtool/insn.c
 create mode 100644 tools/objtool/unwind_hints.c


base-commit: 830b3c68c1fb1e9176028d02ef86f3cf76aa2476

Comments

Tomohiro Misono (Fujitsu) March 1, 2023, 3:12 a.m. UTC | #1
<snip>
> Testing
> =======
> 
> - I have run all of the livepatch selftests successfully. I have written a
>   couple of extra selftests myself which I will be posting separately
Hi,

What test configuration/environment you are using for test?
When I tried kselftest with fedora based config on VM, I got errors
because livepatch transition won't finish until signal is sent
(i.e. it takes 15s for every transition).

[excerpt from test result]
  ```
  $ sudo ./test-livepatch.sh
  TEST: basic function patching ... not ok
  
  --- expected
  +++ result
  @@ -2,11 +2,13 @@
   livepatch: enabling patch 'test_klp_livepatch'
   livepatch: 'test_klp_livepatch': initializing patching transition
   livepatch: 'test_klp_livepatch': starting patching transition
  +livepatch: signaling remaining tasks
   livepatch: 'test_klp_livepatch': completing patching transition
  ```

Thanks,
Tomohiro

> 
> - I have a test driver to induce a NULL pointer exception to make sure
>   that unwinding through exception handlers is reliable.
> 
> - I use the test driver to create a timer to make sure that unwinding through
>   the timer IRQ is reliable.
> 
> - I call the unwinder from different places during boot to make sure that
>   the unwinding in each of those cases is reliable.
>
Petr Mladek March 2, 2023, 4:23 p.m. UTC | #2
On Wed 2023-03-01 03:12:08, Tomohiro Misono (Fujitsu) wrote:
> <snip>
> > Testing
> > =======
> > 
> > - I have run all of the livepatch selftests successfully. I have written a
> >   couple of extra selftests myself which I will be posting separately
> Hi,
> 
> What test configuration/environment you are using for test?
> When I tried kselftest with fedora based config on VM, I got errors
> because livepatch transition won't finish until signal is sent
> (i.e. it takes 15s for every transition).
> 
> [excerpt from test result]
>   ```
>   $ sudo ./test-livepatch.sh
>   TEST: basic function patching ... not ok
>   
>   --- expected
>   +++ result
>   @@ -2,11 +2,13 @@
>    livepatch: enabling patch 'test_klp_livepatch'
>    livepatch: 'test_klp_livepatch': initializing patching transition
>    livepatch: 'test_klp_livepatch': starting patching transition
>   +livepatch: signaling remaining tasks
>    livepatch: 'test_klp_livepatch': completing patching transition
>   ```

It might be interesting to see what process is blocking the
transition. The transition state is visible in
/proc/<pid>/patch_state.

The transition is blocked when a process is in KLP_UNPATCHED state.
It is defined in include/linux/livepatch.h:

#define KLP_UNPATCHED	 0

Well, the timing against the transition is important. The following
might help to see the blocking processes:

$> modprobe livepatch-sample ; \
   sleep 1; \
   for proc_path in \
       `grep "\-1"  /proc/*/patch_state | cut -d '/'  -f-3` ; \
   do \
       cat $proc_path/comm ; \
       cat $proc_path/stack ; \
       echo ===  ; \
   done

After this the livepatch has to be manualy disabled and removed

$> echo 0 >/sys/kernel/livepatch/livepatch_sample/enabled
$> rmmod livepatch_sample

Best Regards,
Petr
Tomohiro Misono (Fujitsu) March 3, 2023, 9:40 a.m. UTC | #3
> > <snip>
> > > Testing
> > > =======
> > >
> > > - I have run all of the livepatch selftests successfully. I have written a
> > >   couple of extra selftests myself which I will be posting separately
> > Hi,
> >
> > What test configuration/environment you are using for test?
> > When I tried kselftest with fedora based config on VM, I got errors
> > because livepatch transition won't finish until signal is sent
> > (i.e. it takes 15s for every transition).
> >
> > [excerpt from test result]
> >   ```
> >   $ sudo ./test-livepatch.sh
> >   TEST: basic function patching ... not ok
> >
> >   --- expected
> >   +++ result
> >   @@ -2,11 +2,13 @@
> >    livepatch: enabling patch 'test_klp_livepatch'
> >    livepatch: 'test_klp_livepatch': initializing patching transition
> >    livepatch: 'test_klp_livepatch': starting patching transition
> >   +livepatch: signaling remaining tasks
> >    livepatch: 'test_klp_livepatch': completing patching transition
> >   ```
> 
> It might be interesting to see what process is blocking the
> transition. The transition state is visible in
> /proc/<pid>/patch_state.
> 
> The transition is blocked when a process is in KLP_UNPATCHED state.
> It is defined in include/linux/livepatch.h:
> 
> #define KLP_UNPATCHED	 0
> 
> Well, the timing against the transition is important. The following
> might help to see the blocking processes:
> 
> $> modprobe livepatch-sample ; \
>    sleep 1; \
>    for proc_path in \
>        `grep "\-1"  /proc/*/patch_state | cut -d '/'  -f-3` ; \
>    do \
>        cat $proc_path/comm ; \
>        cat $proc_path/stack ; \
>        echo ===  ; \
>    done
> 
> After this the livepatch has to be manualy disabled and removed
> 
> $> echo 0 >/sys/kernel/livepatch/livepatch_sample/enabled
> $> rmmod livepatch_sample

Thanks for the suggestion. This is quite helpful for debug.
I did some tests and in short, I could run all livepatch selftest successfully
on clang15-built kernel when RANDOMIZE_KSTACK_OFFSET=n.

Below is my analysis. Please let me know if I'm wrong.

When I checked the stack state while being live-patched, I saw some tasks
sleeping after system call are not transitioned. For example, I saw a task with
following stack:
```
sshd
[<0>] do_select+0x5cc/0x64c
[<0>] core_sys_select+0x174/0x210
[<0>] __arm64_sys_pselect6+0x11c/0x384
[<0>] invoke_syscall+0x78/0x108
[<0>] el0_svc_common+0xc0/0xfc
[<0>] do_el0_svc+0x38/0xd0
[<0>] el0_svc+0x34/0x110
[<0>] el0t_64_sync_handler+0x84/0xf0
[<0>] el0t_64_sync+0x190/0x194
```

Then, I noticed that invoke_syscall generates instructions to add random offset
in sp when RANDOMIZE_KSTACK_OFFSET=y, which is true in the above case.
Actually I see that sp can be modified in the binary:
```
$ objdump -d vmlinux --disassemble=invoke_syscall
...
ffff80000803076c <invoke_syscall>:
...
ffff8000080307b4:       9100011f        mov     sp, x8
...
ffff80000803085c:       d65f03c0        ret 
```
This will set the instruction UNRELIABLE as sp value is not deterministic:
  https://github.com/madvenka786/linux/blob/orc_v3/tools/objtool/arch/arm64/decode.c#L173
and in turn will skip the generation of orc data:
  https://github.com/madvenka786/linux/blob/orc_v3/tools/objtool/dcheck.c#L313
I can confirm the orc result in vmlinux:
```
./tools/objtool/objtool --dump vmlinux
...
# no entry in range of invoke_syscall (ffff80000803076c - ffff80000803085c)
ffff800008030764: cfa:sp+0 x29:cfa+0 type:call end:0
ffff800008030874: cfa:(und) x29:(und) type:call end:0
ffff800008030874: cfa:sp+0 x29:cfa+0 type:call end:0
...
```
So, when live-patch is performed, stacktrace of task containing invoke_syscall
cannot be validated in arch_stack_walk_reliable() and transition won't happen
until the fake signal is delivered (unless task's state changes).

It seems that stack validation itself works as intended.
As I said, when RANDOMIZE_KSTACK_OFFSET=n, selftests run fine.
Or am I misunderstood something completely?

Regards,
Tomohiro
Madhavan T. Venkataraman March 6, 2023, 4:57 p.m. UTC | #4
On 2/28/23 21:12, Tomohiro Misono (Fujitsu) wrote:
> <snip>
>> Testing
>> =======
>>
>> - I have run all of the livepatch selftests successfully. I have written a
>>   couple of extra selftests myself which I will be posting separately
> Hi,
> 
> What test configuration/environment you are using for test?
> When I tried kselftest with fedora based config on VM, I got errors
> because livepatch transition won't finish until signal is sent
> (i.e. it takes 15s for every transition).
> 

Sorry for not responding earlier. I was out sick.

I tested on a bare metal system (thunderx) running Ubuntu. I will try to reproduce
the error you are seeing on a VM running fedora.

Madhavan

> [excerpt from test result]
>   ```
>   $ sudo ./test-livepatch.sh
>   TEST: basic function patching ... not ok
>   
>   --- expected
>   +++ result
>   @@ -2,11 +2,13 @@
>    livepatch: enabling patch 'test_klp_livepatch'
>    livepatch: 'test_klp_livepatch': initializing patching transition
>    livepatch: 'test_klp_livepatch': starting patching transition
>   +livepatch: signaling remaining tasks
>    livepatch: 'test_klp_livepatch': completing patching transition
>   ```
> 
> Thanks,
> Tomohiro
> 
>>
>> - I have a test driver to induce a NULL pointer exception to make sure
>>   that unwinding through exception handlers is reliable.
>>
>> - I use the test driver to create a timer to make sure that unwinding through
>>   the timer IRQ is reliable.
>>
>> - I call the unwinder from different places during boot to make sure that
>>   the unwinding in each of those cases is reliable.
>>
Madhavan T. Venkataraman March 6, 2023, 4:58 p.m. UTC | #5
On 3/2/23 10:23, Petr Mladek wrote:
> On Wed 2023-03-01 03:12:08, Tomohiro Misono (Fujitsu) wrote:
>> <snip>
>>> Testing
>>> =======
>>>
>>> - I have run all of the livepatch selftests successfully. I have written a
>>>   couple of extra selftests myself which I will be posting separately
>> Hi,
>>
>> What test configuration/environment you are using for test?
>> When I tried kselftest with fedora based config on VM, I got errors
>> because livepatch transition won't finish until signal is sent
>> (i.e. it takes 15s for every transition).
>>
>> [excerpt from test result]
>>   ```
>>   $ sudo ./test-livepatch.sh
>>   TEST: basic function patching ... not ok
>>   
>>   --- expected
>>   +++ result
>>   @@ -2,11 +2,13 @@
>>    livepatch: enabling patch 'test_klp_livepatch'
>>    livepatch: 'test_klp_livepatch': initializing patching transition
>>    livepatch: 'test_klp_livepatch': starting patching transition
>>   +livepatch: signaling remaining tasks
>>    livepatch: 'test_klp_livepatch': completing patching transition
>>   ```
> 
> It might be interesting to see what process is blocking the
> transition. The transition state is visible in
> /proc/<pid>/patch_state.
> 
> The transition is blocked when a process is in KLP_UNPATCHED state.
> It is defined in include/linux/livepatch.h:
> 
> #define KLP_UNPATCHED	 0
> 
> Well, the timing against the transition is important. The following
> might help to see the blocking processes:
> 
> $> modprobe livepatch-sample ; \
>    sleep 1; \
>    for proc_path in \
>        `grep "\-1"  /proc/*/patch_state | cut -d '/'  -f-3` ; \
>    do \
>        cat $proc_path/comm ; \
>        cat $proc_path/stack ; \
>        echo ===  ; \
>    done
> 
> After this the livepatch has to be manualy disabled and removed
> 
> $> echo 0 >/sys/kernel/livepatch/livepatch_sample/enabled
> $> rmmod livepatch_sample
> 
> Best Regards,
> Petr

Thanks for the suggestion. I will try to reproduce the problem and look at what process(es) are holding up
the livepatch.

Madhavan
Mark Rutland March 23, 2023, 5:17 p.m. UTC | #6
Hi Madhavan,

At a high-level, I think this still falls afoul of our desire to not reverse
engineer control flow from the binary, and so I do not think this is the right
approach. I've expanded a bit on that below.

I do think it would be nice to have *some* of the objtool changes, as I do
think we will want to use objtool for some things in future (e.g. some
build-time binary patching such as table sorting).

On Thu, Feb 02, 2023 at 01:40:14AM -0600, madvenka@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
> 
> Introduction
> ============
> 
> The livepatch feature requires an unwinder that can provide a reliable stack
> trace. General requirements for a reliable unwinder are described in this
> document from Mark Rutland:
> 
> 	Documentation/livepatch/reliable-stacktrace.rst
> 
> The requirements have two parts:
> 
> 1. The unwinder must be enhanced with certain features. E.g.,
> 
> 	- Identifying successful termination of stack trace
> 	- Identifying unwindable and non-unwindable code
> 	- Identifying interrupts and exceptions occurring in the frame pointer
> 	  prolog and epilog
> 	- Identifying features such as kretprobe and ftrace graph tracing
> 	  that can modify the return address stored on the stack
> 	- Identifying corrupted/unreliable stack contents
> 	- Architecture-specific items that can render a stack trace unreliable
> 	  at certain points in code
> 
> 2. Validation of the frame pointer
> 
> 	This assumes that the unwinder is based on the frame pointer (FP).
> 	The actual frame pointer that the unwinder uses cannot just be
> 	assumed to be correct. It needs to be validated somehow.
> 
> This patch series is to address the following:
> 
> 	- Identifying unwindable and non-unwindable code
> 	- Identifying interrupts and exceptions occurring in the frame pointer
> 	  prolog and epilog
> 	- Validation of the frame pointer
> 
> The rest are already in place AFAICT.

Just as a note: there are a few issues remaining (e.g. the kretprobe and fgraph
PC recovery both have windows where they lose the original return address), and
there are a few compiler-generated trampoline functions with non-AAPCS calling
conventions that will need special care.

> Validation of the FP (aka FRAME_POINTER_VALIDATION)
> ====================
> 
> The current approach in Linux is to use objtool, a build time tool, for this
> purpose. When configured, objtool is invoked on every relocatable object file
> during kernel build. It performs static analysis of the code in each file. It
> walks the instructions in every function and notes the changes to the stack
> pointer (SP) and the frame pointer (FP). It makes sure that the changes are in
> accordance with the ABI rules. There are also a lot of other checks that
> Objtool performs. Once objtool completes successfully, the kernel can then be
> used for livepatch purposes.
> 
> Objtool can have uses other than just FP validation. For instance, it can check
> control flow integrity during its analysis.
> 
> Problem
> =======
> 
> Objtool is complex and highly architecture-dependent. There are a lot of
> different checks in objtool that all of the code in the kernel must pass
> before livepatch can be enabled. If a check fails, it must be corrected
> before we can proceed. Sometimes, the kernel code needs to be fixed.
> Sometimes, it is a compiler bug that needs to be fixed. The challenge is
> also to prove that all the work is complete for an architecture.
> 
> As such, it presents a great challenge to enable livepatch for an
> architecture.

There's a more fundamental issue here in that objtool has to reverse-engineer
control flow, and so even if the kernel code and compiled code generation is
*perfect*, it's possible that objtool won't recognise the structure of the
generated code, and won't be able to reverse-engineer the correct control flow.

We've seen issues where objtool didn't understand jump tables, so support for
that got disabled on x86. A key objection from the arm64 side is that we don't
want to disable compile code generation strategies like this. Further, as
compiles evolve, their code generation strategies will change, and it's likely
there will be other cases that crop up. This is inherently fragile.

The key objections from the arm64 side is that we don't want to
reverse-engineer details from the binary, as this is complex, fragile, and
unstable. This is why we've previously suggested that we should work with
compiler folk to get what we need.

I'll note that at the last Linux Plumbers Conference, there was a discussion
about what is now called SFrame, which *might* give us sufficient information,
but I have not had the time to dig into that as I have been chasing other
problems and trying to get other infrastructure in place.

> A different approach
> ====================
> 
> I would like to propose a different approach for FP validation. I would
> like to be able to enable livepatch for an architecture as is. That is,
> without "fixing" the kernel or the compiler for it:
> 
> There are three steps in this:
> 
> 1. Objtool walks all the functions as usual. It computes the stack and
>    frame pointer offsets at each instruction as usual. It generates ORC
>    records and stores them in special sections as usual. This is simple
>    enough to do.

This still requires reverse-engineering the forward-edge control flow in order
to compute those offets, so the same objections apply with this approach. I do
not think this is the right approach.

I would *strongly* prefer that we work with compiler folk to get the
information that we need.

[...]

> 		FWIW, I have also compared the CFI I am generating with DWARF
> 		information that the compiler generates. The CFIs match a
> 		100% for Clang. In the case of gcc, the comparison fails
> 		in 1.7% of the cases. I have analyzed those cases and found
> 		the DWARF information generated by gcc is incorrect. The
> 		ORC generated by my Objtool is correct.


Have you reported this to the GCC folk, and can you give any examples?
I'm sure they would be interested in fixing this, regardless of whether we end
up using it.

Thanks,
Mark.
Madhavan T. Venkataraman April 8, 2023, 3:40 a.m. UTC | #7
Hi Mark,

Sorry for the long delay in responding. Was caught up in many things.
My responses inline..

On 3/23/23 12:17, Mark Rutland wrote:
> Hi Madhavan,
> 
> At a high-level, I think this still falls afoul of our desire to not reverse
> engineer control flow from the binary, and so I do not think this is the right
> approach. I've expanded a bit on that below.
> 
> I do think it would be nice to have *some* of the objtool changes, as I do
> think we will want to use objtool for some things in future (e.g. some
> build-time binary patching such as table sorting).
> 

OK. I have been under the impression that the arm64 folks are basically OK with
Objtool's approach of reverse engineering from the binary. I did not see
any specific objections to previously submitted patches based on this approach
including mine.

So, if the community is not in agreement with this approach, I will go back to the
drawing board for this one.

Are there any other opinions on this subject from others?

> On Thu, Feb 02, 2023 at 01:40:14AM -0600, madvenka@linux.microsoft.com wrote:
>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>
>> Introduction
>> ============
>>
>> The livepatch feature requires an unwinder that can provide a reliable stack
>> trace. General requirements for a reliable unwinder are described in this
>> document from Mark Rutland:
>>
>> 	Documentation/livepatch/reliable-stacktrace.rst
>>
>> The requirements have two parts:
>>
>> 1. The unwinder must be enhanced with certain features. E.g.,
>>
>> 	- Identifying successful termination of stack trace
>> 	- Identifying unwindable and non-unwindable code
>> 	- Identifying interrupts and exceptions occurring in the frame pointer
>> 	  prolog and epilog
>> 	- Identifying features such as kretprobe and ftrace graph tracing
>> 	  that can modify the return address stored on the stack
>> 	- Identifying corrupted/unreliable stack contents
>> 	- Architecture-specific items that can render a stack trace unreliable
>> 	  at certain points in code
>>
>> 2. Validation of the frame pointer
>>
>> 	This assumes that the unwinder is based on the frame pointer (FP).
>> 	The actual frame pointer that the unwinder uses cannot just be
>> 	assumed to be correct. It needs to be validated somehow.
>>
>> This patch series is to address the following:
>>
>> 	- Identifying unwindable and non-unwindable code
>> 	- Identifying interrupts and exceptions occurring in the frame pointer
>> 	  prolog and epilog
>> 	- Validation of the frame pointer
>>
>> The rest are already in place AFAICT.
> 
> Just as a note: there are a few issues remaining (e.g. the kretprobe and fgraph
> PC recovery both have windows where they lose the original return address), and
> there are a few compiler-generated trampoline functions with non-AAPCS calling
> conventions that will need special care.
> 

OK.

>> Validation of the FP (aka FRAME_POINTER_VALIDATION)
>> ====================
>>
>> The current approach in Linux is to use objtool, a build time tool, for this
>> purpose. When configured, objtool is invoked on every relocatable object file
>> during kernel build. It performs static analysis of the code in each file. It
>> walks the instructions in every function and notes the changes to the stack
>> pointer (SP) and the frame pointer (FP). It makes sure that the changes are in
>> accordance with the ABI rules. There are also a lot of other checks that
>> Objtool performs. Once objtool completes successfully, the kernel can then be
>> used for livepatch purposes.
>>
>> Objtool can have uses other than just FP validation. For instance, it can check
>> control flow integrity during its analysis.
>>
>> Problem
>> =======
>>
>> Objtool is complex and highly architecture-dependent. There are a lot of
>> different checks in objtool that all of the code in the kernel must pass
>> before livepatch can be enabled. If a check fails, it must be corrected
>> before we can proceed. Sometimes, the kernel code needs to be fixed.
>> Sometimes, it is a compiler bug that needs to be fixed. The challenge is
>> also to prove that all the work is complete for an architecture.
>>
>> As such, it presents a great challenge to enable livepatch for an
>> architecture.
> 
> There's a more fundamental issue here in that objtool has to reverse-engineer
> control flow, and so even if the kernel code and compiled code generation is
> *perfect*, it's possible that objtool won't recognise the structure of the
> generated code, and won't be able to reverse-engineer the correct control flow.
> 
> We've seen issues where objtool didn't understand jump tables, so support for
> that got disabled on x86. A key objection from the arm64 side is that we don't
> want to disable compile code generation strategies like this. Further, as
> compiles evolve, their code generation strategies will change, and it's likely
> there will be other cases that crop up. This is inherently fragile.
> 
> The key objections from the arm64 side is that we don't want to
> reverse-engineer details from the binary, as this is complex, fragile, and
> unstable. This is why we've previously suggested that we should work with
> compiler folk to get what we need.
> 

So, what exactly do you have in mind? What help can the compiler folk provide?
By your own argument, we cannot rely on the compiler as compiler implementations,
optimization strategies, etc can change in ways that are incompatible with any
livepatch implementation. Also, there can always be bugs in the compiler
implementations.

Can you please elaborate? Are we looking for a way for the compiler folks to
provide us with something that we can use to implement reliable stack trace?

> I'll note that at the last Linux Plumbers Conference, there was a discussion
> about what is now called SFrame, which *might* give us sufficient information,
> but I have not had the time to dig into that as I have been chasing other
> problems and trying to get other infrastructure in place.
> 

I will try to locate the link. If you can provide me a link, that would be greatly
appreciated. I will study their SFrame proposal.

>> A different approach
>> ====================
>>
>> I would like to propose a different approach for FP validation. I would
>> like to be able to enable livepatch for an architecture as is. That is,
>> without "fixing" the kernel or the compiler for it:
>>
>> There are three steps in this:
>>
>> 1. Objtool walks all the functions as usual. It computes the stack and
>>    frame pointer offsets at each instruction as usual. It generates ORC
>>    records and stores them in special sections as usual. This is simple
>>    enough to do.
> 
> This still requires reverse-engineering the forward-edge control flow in order
> to compute those offets, so the same objections apply with this approach. I do
> not think this is the right approach.
> 
> I would *strongly* prefer that we work with compiler folk to get the
> information that we need.
> 

I am willing to do this. But I am not clear on the kind of features we want
from the compiler. Are you suggesting something for getting a reliable
stack trace? Is there any kind of proposal out there that I need to study?

> [...]
> 
>> 		FWIW, I have also compared the CFI I am generating with DWARF
>> 		information that the compiler generates. The CFIs match a
>> 		100% for Clang. In the case of gcc, the comparison fails
>> 		in 1.7% of the cases. I have analyzed those cases and found
>> 		the DWARF information generated by gcc is incorrect. The
>> 		ORC generated by my Objtool is correct.
> 
> 
> Have you reported this to the GCC folk, and can you give any examples?
> I'm sure they would be interested in fixing this, regardless of whether we end
> up using it.
> 

I will try to get the data again and put something together and send it to the
gcc folks.

Thanks for the suggestions.

Madhavan
Mark Rutland April 11, 2023, 1:25 p.m. UTC | #8
On Fri, Apr 07, 2023 at 10:40:07PM -0500, Madhavan T. Venkataraman wrote:
> Hi Mark,
> 
> Sorry for the long delay in responding. Was caught up in many things.
> My responses inline..
> 
> On 3/23/23 12:17, Mark Rutland wrote:
> > Hi Madhavan,
> > 
> > At a high-level, I think this still falls afoul of our desire to not reverse
> > engineer control flow from the binary, and so I do not think this is the right
> > approach. I've expanded a bit on that below.
> > 
> > I do think it would be nice to have *some* of the objtool changes, as I do
> > think we will want to use objtool for some things in future (e.g. some
> > build-time binary patching such as table sorting).
> 
> OK. I have been under the impression that the arm64 folks are basically OK with
> Objtool's approach of reverse engineering from the binary. I did not see
> any specific objections to previously submitted patches based on this approach
> including mine.

This has admittedly changed over time, but the preference to avoid
reverse-engineering control flow has been around for a while. For example,
during LPC 2021's "objtool on arm64" session:

  https://lpc.events/event/11/contributions/971/

... where Will and I expressed strong desires to get the compiler to help,
whether that's compiler-generated metadata, (agreed upon) restrictions on code
generation, or something else.

[...]

> > There's a more fundamental issue here in that objtool has to reverse-engineer
> > control flow, and so even if the kernel code and compiled code generation is
> > *perfect*, it's possible that objtool won't recognise the structure of the
> > generated code, and won't be able to reverse-engineer the correct control flow.
> > 
> > We've seen issues where objtool didn't understand jump tables, so support for
> > that got disabled on x86. A key objection from the arm64 side is that we don't
> > want to disable compile code generation strategies like this. Further, as
> > compiles evolve, their code generation strategies will change, and it's likely
> > there will be other cases that crop up. This is inherently fragile.
> > 
> > The key objections from the arm64 side is that we don't want to
> > reverse-engineer details from the binary, as this is complex, fragile, and
> > unstable. This is why we've previously suggested that we should work with
> > compiler folk to get what we need.
> > 
> 
> So, what exactly do you have in mind? What help can the compiler folk provide?

There are several possibilities, e.g.

* Generate some simple metadata that tells us for each PC whether to start an
  unwind from the LR or FP. My understanding was that SFrame *might* be
  sufficient for this.

  We might need some custom metadata for assembly (e.g. exception entry,
  trampolines), but it'd be ok for that to be different.

* Agree upon some restricted patterns for code generation (e.g. fixed
  prologues/epilogues), so that we can identify whether to use LR or FP based
  on the PC and a symbol lookup.

> By your own argument, we cannot rely on the compiler as compiler implementations,
> optimization strategies, etc can change in ways that are incompatible with any
> livepatch implementation.

That's not quite my argument.

My argument is that if we assume some set of properties that compiler folk
never agreed to (and were never made aware of), then compiler folk are well
within their rights to change the compiler such that it doesn't provide those
properties, and it's very likely that such expectation will be broken. We've
seen that happen before (e.g. with jump tables).

Consequently I think we should be working with compiler folk to agree upon some
solution, where compiler folk will actually try to maintain the properties we
depend upon (and e.g. they could have tests for). That sort of co-design has
worked well so far (e.g. with things like kCFI).

Ideally we'd have people in the same room to have a discussion (e.g. at LPC).

> Also, there can always be bugs in the compiler implementations.

I don't disagree with that.

> Can you please elaborate? Are we looking for a way for the compiler folks to
> provide us with something that we can use to implement reliable stack trace?

I tried to do so a bit above.

I'm looking for some agreement between kernel folk and compiler folk on a
reliable mechanism. That might be something that already exists, or something
new. It might be metadata or some restrictions on code generation.

> > I'll note that at the last Linux Plumbers Conference, there was a discussion
> > about what is now called SFrame, which *might* give us sufficient information,
> > but I have not had the time to dig into that as I have been chasing other
> > problems and trying to get other infrastructure in place.
> 
> I will try to locate the link. If you can provide me a link, that would be greatly
> appreciated. I will study their SFrame proposal.

From looking around, that session was:

  https://lpc.events/event/16/contributions/1177/

At the time it was called CTF Frame, but got renamed to SFrame.

I'm not sure where to find the most recent documentation. As I mentioned above
I have not had the time to look in detail.

> >> 		FWIW, I have also compared the CFI I am generating with DWARF
> >> 		information that the compiler generates. The CFIs match a
> >> 		100% for Clang. In the case of gcc, the comparison fails
> >> 		in 1.7% of the cases. I have analyzed those cases and found
> >> 		the DWARF information generated by gcc is incorrect. The
> >> 		ORC generated by my Objtool is correct.
> > 
> > Have you reported this to the GCC folk, and can you give any examples?
> > I'm sure they would be interested in fixing this, regardless of whether we end
> > up using it.
> 
> I will try to get the data again and put something together and send it to the
> gcc folks.

Thanks for doing so; that's much appreciated!

Thanks, 
Mark.
Josh Poimboeuf April 12, 2023, 4:17 a.m. UTC | #9
On Tue, Apr 11, 2023 at 02:25:11PM +0100, Mark Rutland wrote:
> > By your own argument, we cannot rely on the compiler as compiler implementations,
> > optimization strategies, etc can change in ways that are incompatible with any
> > livepatch implementation.
> 
> That's not quite my argument.
> 
> My argument is that if we assume some set of properties that compiler folk
> never agreed to (and were never made aware of), then compiler folk are well
> within their rights to change the compiler such that it doesn't provide those
> properties, and it's very likely that such expectation will be broken. We've
> seen that happen before (e.g. with jump tables).
> 
> Consequently I think we should be working with compiler folk to agree upon some
> solution, where compiler folk will actually try to maintain the properties we
> depend upon (and e.g. they could have tests for). That sort of co-design has
> worked well so far (e.g. with things like kCFI).
> 
> Ideally we'd have people in the same room to have a discussion (e.g. at LPC).

That was the goal of my talk at LPC last year:

  https://lpc.events/event/16/contributions/1392/

We discussed having the compiler annotate the tricky bits of control
flow, mainly jump tables and noreturns.  It's still on my TODO list to
prototype that.

Another alternative which has been suggested in the past by Indu and
others is for objtool to use DWARF/sframe as an input to help guide it
through the tricky bits.

That seems more fragile -- as Madhavan mentioned, GCC-generated DWARF
has some reliability issues -- and also defeats some of the benefits of
reverse-engineering in the first place (we've found many compiler bugs
and other surprising kernel-compiler interactions over the years).

Objtool's understanding of the control flow graph has been really
valuable for reasons beyond live patching (e.g., noinstr and uaccess
validation), it's definitely worth finding a way to make that more
sustainable.
Madhavan T. Venkataraman April 12, 2023, 4:48 a.m. UTC | #10
On 4/11/23 23:17, Josh Poimboeuf wrote:
> On Tue, Apr 11, 2023 at 02:25:11PM +0100, Mark Rutland wrote:
>>> By your own argument, we cannot rely on the compiler as compiler implementations,
>>> optimization strategies, etc can change in ways that are incompatible with any
>>> livepatch implementation.
>>
>> That's not quite my argument.
>>
>> My argument is that if we assume some set of properties that compiler folk
>> never agreed to (and were never made aware of), then compiler folk are well
>> within their rights to change the compiler such that it doesn't provide those
>> properties, and it's very likely that such expectation will be broken. We've
>> seen that happen before (e.g. with jump tables).
>>
>> Consequently I think we should be working with compiler folk to agree upon some
>> solution, where compiler folk will actually try to maintain the properties we
>> depend upon (and e.g. they could have tests for). That sort of co-design has
>> worked well so far (e.g. with things like kCFI).
>>
>> Ideally we'd have people in the same room to have a discussion (e.g. at LPC).
> 
> That was the goal of my talk at LPC last year:
> 
>   https://lpc.events/event/16/contributions/1392/
> 
> We discussed having the compiler annotate the tricky bits of control
> flow, mainly jump tables and noreturns.  It's still on my TODO list to
> prototype that.
> 
> Another alternative which has been suggested in the past by Indu and
> others is for objtool to use DWARF/sframe as an input to help guide it
> through the tricky bits.
> 

I read through the SFrame spec file briefly. It looks like I can easily adapt my
version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
folks agree to properly support and maintain SFrame, then I could send the next version
of the patchset based on SFrame.

But I kinda need a clear path forward before I implement anything. I request the arm64
folks to comment on the above approach. Would it be useful to initiate an email discussion
with the compiler folks on what they plan to do to support SFrame? Or, should this all
happen face to face in some forum like LPC?

Madhavan

> That seems more fragile -- as Madhavan mentioned, GCC-generated DWARF
> has some reliability issues -- and also defeats some of the benefits of
> reverse-engineering in the first place (we've found many compiler bugs
> and other surprising kernel-compiler interactions over the years).
> 
> Objtool's understanding of the control flow graph has been really
> valuable for reasons beyond live patching (e.g., noinstr and uaccess
> validation), it's definitely worth finding a way to make that more
> sustainable.
>
Madhavan T. Venkataraman April 12, 2023, 4:50 a.m. UTC | #11
On 4/11/23 23:48, Madhavan T. Venkataraman wrote:
> 
> 
> On 4/11/23 23:17, Josh Poimboeuf wrote:
>> On Tue, Apr 11, 2023 at 02:25:11PM +0100, Mark Rutland wrote:
>>>> By your own argument, we cannot rely on the compiler as compiler implementations,
>>>> optimization strategies, etc can change in ways that are incompatible with any
>>>> livepatch implementation.
>>>
>>> That's not quite my argument.
>>>
>>> My argument is that if we assume some set of properties that compiler folk
>>> never agreed to (and were never made aware of), then compiler folk are well
>>> within their rights to change the compiler such that it doesn't provide those
>>> properties, and it's very likely that such expectation will be broken. We've
>>> seen that happen before (e.g. with jump tables).
>>>
>>> Consequently I think we should be working with compiler folk to agree upon some
>>> solution, where compiler folk will actually try to maintain the properties we
>>> depend upon (and e.g. they could have tests for). That sort of co-design has
>>> worked well so far (e.g. with things like kCFI).
>>>
>>> Ideally we'd have people in the same room to have a discussion (e.g. at LPC).
>>
>> That was the goal of my talk at LPC last year:
>>
>>   https://lpc.events/event/16/contributions/1392/
>>
>> We discussed having the compiler annotate the tricky bits of control
>> flow, mainly jump tables and noreturns.  It's still on my TODO list to
>> prototype that.
>>
>> Another alternative which has been suggested in the past by Indu and
>> others is for objtool to use DWARF/sframe as an input to help guide it
>> through the tricky bits.
>>
> 
> I read through the SFrame spec file briefly. It looks like I can easily adapt my
> version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
> folks agree to properly support and maintain SFrame, then I could send the next version
> of the patchset based on SFrame.
> 
> But I kinda need a clear path forward before I implement anything. I request the arm64
> folks to comment on the above approach. Would it be useful to initiate an email discussion
> with the compiler folks on what they plan to do to support SFrame? Or, should this all
> happen face to face in some forum like LPC?
> 
> Madhavan
> 

Just to be clear. This is not to replace Objtool as it has other uses as well, not just
reliable stack trace. I am trying to solve the reliable stack trace issue alone with
SFrame.

Madhavan

>> That seems more fragile -- as Madhavan mentioned, GCC-generated DWARF
>> has some reliability issues -- and also defeats some of the benefits of
>> reverse-engineering in the first place (we've found many compiler bugs
>> and other surprising kernel-compiler interactions over the years).
>>
>> Objtool's understanding of the control flow graph has been really
>> valuable for reasons beyond live patching (e.g., noinstr and uaccess
>> validation), it's definitely worth finding a way to make that more
>> sustainable.
>>
Josh Poimboeuf April 12, 2023, 5:01 a.m. UTC | #12
On Tue, Apr 11, 2023 at 11:48:21PM -0500, Madhavan T. Venkataraman wrote:
> 
> 
> On 4/11/23 23:17, Josh Poimboeuf wrote:
> > On Tue, Apr 11, 2023 at 02:25:11PM +0100, Mark Rutland wrote:
> >>> By your own argument, we cannot rely on the compiler as compiler implementations,
> >>> optimization strategies, etc can change in ways that are incompatible with any
> >>> livepatch implementation.
> >>
> >> That's not quite my argument.
> >>
> >> My argument is that if we assume some set of properties that compiler folk
> >> never agreed to (and were never made aware of), then compiler folk are well
> >> within their rights to change the compiler such that it doesn't provide those
> >> properties, and it's very likely that such expectation will be broken. We've
> >> seen that happen before (e.g. with jump tables).
> >>
> >> Consequently I think we should be working with compiler folk to agree upon some
> >> solution, where compiler folk will actually try to maintain the properties we
> >> depend upon (and e.g. they could have tests for). That sort of co-design has
> >> worked well so far (e.g. with things like kCFI).
> >>
> >> Ideally we'd have people in the same room to have a discussion (e.g. at LPC).
> > 
> > That was the goal of my talk at LPC last year:
> > 
> >   https://lpc.events/event/16/contributions/1392/
> > 
> > We discussed having the compiler annotate the tricky bits of control
> > flow, mainly jump tables and noreturns.  It's still on my TODO list to
> > prototype that.
> > 
> > Another alternative which has been suggested in the past by Indu and
> > others is for objtool to use DWARF/sframe as an input to help guide it
> > through the tricky bits.
> > 
> 
> I read through the SFrame spec file briefly. It looks like I can easily adapt my
> version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
> folks agree to properly support and maintain SFrame, then I could send the next version
> of the patchset based on SFrame.
> 
> But I kinda need a clear path forward before I implement anything. I request the arm64
> folks to comment on the above approach. Would it be useful to initiate an email discussion
> with the compiler folks on what they plan to do to support SFrame? Or, should this all
> happen face to face in some forum like LPC?

SFrame is basically a simplified version of DWARF unwind, using it as an
input to objtool is going to have the same issues I mentioned below (and
as was discussed with your v1).

> > That seems more fragile -- as Madhavan mentioned, GCC-generated DWARF
> > has some reliability issues -- and also defeats some of the benefits of
> > reverse-engineering in the first place (we've found many compiler bugs
> > and other surprising kernel-compiler interactions over the years).
> > 
> > Objtool's understanding of the control flow graph has been really
> > valuable for reasons beyond live patching (e.g., noinstr and uaccess
> > validation), it's definitely worth finding a way to make that more
> > sustainable.
Madhavan T. Venkataraman April 12, 2023, 2:50 p.m. UTC | #13
On 4/12/23 00:01, Josh Poimboeuf wrote:
> On Tue, Apr 11, 2023 at 11:48:21PM -0500, Madhavan T. Venkataraman wrote:
>>
>>
>> On 4/11/23 23:17, Josh Poimboeuf wrote:
>>> On Tue, Apr 11, 2023 at 02:25:11PM +0100, Mark Rutland wrote:
>>>>> By your own argument, we cannot rely on the compiler as compiler implementations,
>>>>> optimization strategies, etc can change in ways that are incompatible with any
>>>>> livepatch implementation.
>>>>
>>>> That's not quite my argument.
>>>>
>>>> My argument is that if we assume some set of properties that compiler folk
>>>> never agreed to (and were never made aware of), then compiler folk are well
>>>> within their rights to change the compiler such that it doesn't provide those
>>>> properties, and it's very likely that such expectation will be broken. We've
>>>> seen that happen before (e.g. with jump tables).
>>>>
>>>> Consequently I think we should be working with compiler folk to agree upon some
>>>> solution, where compiler folk will actually try to maintain the properties we
>>>> depend upon (and e.g. they could have tests for). That sort of co-design has
>>>> worked well so far (e.g. with things like kCFI).
>>>>
>>>> Ideally we'd have people in the same room to have a discussion (e.g. at LPC).
>>>
>>> That was the goal of my talk at LPC last year:
>>>
>>>   https://lpc.events/event/16/contributions/1392/
>>>
>>> We discussed having the compiler annotate the tricky bits of control
>>> flow, mainly jump tables and noreturns.  It's still on my TODO list to
>>> prototype that.
>>>
>>> Another alternative which has been suggested in the past by Indu and
>>> others is for objtool to use DWARF/sframe as an input to help guide it
>>> through the tricky bits.
>>>
>>
>> I read through the SFrame spec file briefly. It looks like I can easily adapt my
>> version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
>> folks agree to properly support and maintain SFrame, then I could send the next version
>> of the patchset based on SFrame.
>>
>> But I kinda need a clear path forward before I implement anything. I request the arm64
>> folks to comment on the above approach. Would it be useful to initiate an email discussion
>> with the compiler folks on what they plan to do to support SFrame? Or, should this all
>> happen face to face in some forum like LPC?
> 
> SFrame is basically a simplified version of DWARF unwind, using it as an
> input to objtool is going to have the same issues I mentioned below (and
> as was discussed with your v1).
> 

Yes. It is a much simplified version of DWARF. So, I am hoping that the compiler folks
can provide the feature with a reliability guarantee. DWARF is too complex.

Madhavan

>>> That seems more fragile -- as Madhavan mentioned, GCC-generated DWARF
>>> has some reliability issues -- and also defeats some of the benefits of
>>> reverse-engineering in the first place (we've found many compiler bugs
>>> and other surprising kernel-compiler interactions over the years).
>>>
>>> Objtool's understanding of the control flow graph has been really
>>> valuable for reasons beyond live patching (e.g., noinstr and uaccess
>>> validation), it's definitely worth finding a way to make that more
>>> sustainable.
>
Josh Poimboeuf April 12, 2023, 3:52 p.m. UTC | #14
On Wed, Apr 12, 2023 at 09:50:23AM -0500, Madhavan T. Venkataraman wrote:
> >> I read through the SFrame spec file briefly. It looks like I can easily adapt my
> >> version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
> >> folks agree to properly support and maintain SFrame, then I could send the next version
> >> of the patchset based on SFrame.
> >>
> >> But I kinda need a clear path forward before I implement anything. I request the arm64
> >> folks to comment on the above approach. Would it be useful to initiate an email discussion
> >> with the compiler folks on what they plan to do to support SFrame? Or, should this all
> >> happen face to face in some forum like LPC?
> > 
> > SFrame is basically a simplified version of DWARF unwind, using it as an
> > input to objtool is going to have the same issues I mentioned below (and
> > as was discussed with your v1).
> > 
> 
> Yes. It is a much simplified version of DWARF. So, I am hoping that the compiler folks
> can provide the feature with a reliability guarantee. DWARF is too complex.

I don't see what the complexity (or lack thereof) of the unwinding data
format has to do with it.  The unreliability comes from the underlying
data source, not the formatting of the data.
Madhavan T. Venkataraman April 13, 2023, 2:59 p.m. UTC | #15
On 4/12/23 10:52, Josh Poimboeuf wrote:
> On Wed, Apr 12, 2023 at 09:50:23AM -0500, Madhavan T. Venkataraman wrote:
>>>> I read through the SFrame spec file briefly. It looks like I can easily adapt my
>>>> version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
>>>> folks agree to properly support and maintain SFrame, then I could send the next version
>>>> of the patchset based on SFrame.
>>>>
>>>> But I kinda need a clear path forward before I implement anything. I request the arm64
>>>> folks to comment on the above approach. Would it be useful to initiate an email discussion
>>>> with the compiler folks on what they plan to do to support SFrame? Or, should this all
>>>> happen face to face in some forum like LPC?
>>>
>>> SFrame is basically a simplified version of DWARF unwind, using it as an
>>> input to objtool is going to have the same issues I mentioned below (and
>>> as was discussed with your v1).
>>>
>>
>> Yes. It is a much simplified version of DWARF. So, I am hoping that the compiler folks
>> can provide the feature with a reliability guarantee. DWARF is too complex.
> 
> I don't see what the complexity (or lack thereof) of the unwinding data
> format has to do with it.  The unreliability comes from the underlying
> data source, not the formatting of the data.
> 

What I meant is - if SFrame is implemented by simply extracting unwind info from
DWARF data and placing it in a separate section (as it is probably implemented now),
then what you say is totally true. But if the compiler folks agree to make SFrame reliable,
then either they have to make DWARF reliable. Or, they have to implement SFrame as a
separate feature and make it reliable. The former is tough to do as DWARF has a lot of complexity.
The latter is a lot easier to do.

Sorry if that was not clear.

Madhavan
Josh Poimboeuf April 13, 2023, 4:30 p.m. UTC | #16
On Thu, Apr 13, 2023 at 09:59:31AM -0500, Madhavan T. Venkataraman wrote:
> On 4/12/23 10:52, Josh Poimboeuf wrote:
> > On Wed, Apr 12, 2023 at 09:50:23AM -0500, Madhavan T. Venkataraman wrote:
> >>>> I read through the SFrame spec file briefly. It looks like I can easily adapt my
> >>>> version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
> >>>> folks agree to properly support and maintain SFrame, then I could send the next version
> >>>> of the patchset based on SFrame.
> >>>>
> >>>> But I kinda need a clear path forward before I implement anything. I request the arm64
> >>>> folks to comment on the above approach. Would it be useful to initiate an email discussion
> >>>> with the compiler folks on what they plan to do to support SFrame? Or, should this all
> >>>> happen face to face in some forum like LPC?
> >>>
> >>> SFrame is basically a simplified version of DWARF unwind, using it as an
> >>> input to objtool is going to have the same issues I mentioned below (and
> >>> as was discussed with your v1).
> >>>
> >>
> >> Yes. It is a much simplified version of DWARF. So, I am hoping that the compiler folks
> >> can provide the feature with a reliability guarantee. DWARF is too complex.
> > 
> > I don't see what the complexity (or lack thereof) of the unwinding data
> > format has to do with it.  The unreliability comes from the underlying
> > data source, not the formatting of the data.
> > 
> 
> What I meant is - if SFrame is implemented by simply extracting unwind info from
> DWARF data and placing it in a separate section (as it is probably implemented now),
> then what you say is totally true. But if the compiler folks agree to make SFrame reliable,
> then either they have to make DWARF reliable. Or, they have to implement SFrame as a
> separate feature and make it reliable. The former is tough to do as DWARF has a lot of complexity.
> The latter is a lot easier to do.

[ adding linux-toolchains ]

I don't think ensuring reliability is an easy task, regardless of the
complexity of the unwinding format.

Whether it's SFrame or DWARF/eh_frame, the question would be how to
ensure it's always reliable for a compiler "power user" like the kernel
which has many edge cases (including lots of inline asm which the
compiler has no visibility to) and which uses unwinding for more than
just debugging.

It would need some kind of black-box testing on a complex code base.
(hint: kind of like what objtool already does today)
Nick Desaulniers April 13, 2023, 5:04 p.m. UTC | #17
On Thu, Mar 23, 2023 at 05:17:14PM +0000, Mark Rutland wrote:
> Hi Madhavan,
> 
> At a high-level, I think this still falls afoul of our desire to not reverse
> engineer control flow from the binary, and so I do not think this is the right
> approach. I've expanded a bit on that below.
> 
> I do think it would be nice to have *some* of the objtool changes, as I do
> think we will want to use objtool for some things in future (e.g. some
> build-time binary patching such as table sorting).
> 
> > Problem
> > =======
> > 
> > Objtool is complex and highly architecture-dependent. There are a lot of
> > different checks in objtool that all of the code in the kernel must pass
> > before livepatch can be enabled. If a check fails, it must be corrected
> > before we can proceed. Sometimes, the kernel code needs to be fixed.
> > Sometimes, it is a compiler bug that needs to be fixed. The challenge is
> > also to prove that all the work is complete for an architecture.
> > 
> > As such, it presents a great challenge to enable livepatch for an
> > architecture.
> 
> There's a more fundamental issue here in that objtool has to reverse-engineer
> control flow, and so even if the kernel code and compiled code generation is
> *perfect*, it's possible that objtool won't recognise the structure of the
> generated code, and won't be able to reverse-engineer the correct control flow.
> 
> We've seen issues where objtool didn't understand jump tables, so support for
> that got disabled on x86. A key objection from the arm64 side is that we don't
> want to disable compile code generation strategies like this. Further, as
> compiles evolve, their code generation strategies will change, and it's likely
> there will be other cases that crop up. This is inherently fragile.
> 
> The key objections from the arm64 side is that we don't want to
> reverse-engineer details from the binary, as this is complex, fragile, and
> unstable. This is why we've previously suggested that we should work with
> compiler folk to get what we need.

> This still requires reverse-engineering the forward-edge control flow in order
> to compute those offets, so the same objections apply with this approach. I do
> not think this is the right approach.
> 
> I would *strongly* prefer that we work with compiler folk to get the
> information that we need.

IDK if it's relevant here, but I did see a commit go by to LLVM that
seemed to include such info in a custom ELF section (for the purposes of
improving fuzzing, IIUC). Maybe such an encoding scheme could be tested
to see if it's reliable or usable?
- https://github.com/llvm/llvm-project/commit/3e52c0926c22575d918e7ca8369522b986635cd3
- https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-control-flow

> 
> [...]
> 
> > 		FWIW, I have also compared the CFI I am generating with DWARF
> > 		information that the compiler generates. The CFIs match a
> > 		100% for Clang. In the case of gcc, the comparison fails
> > 		in 1.7% of the cases. I have analyzed those cases and found
> > 		the DWARF information generated by gcc is incorrect. The
> > 		ORC generated by my Objtool is correct.
> 
> 
> Have you reported this to the GCC folk, and can you give any examples?
> I'm sure they would be interested in fixing this, regardless of whether we end
> up using it.

Yeah, at least a bug report is good. "See something, say something."
Jose E. Marchesi April 13, 2023, 6:15 p.m. UTC | #18
> On Thu, Mar 23, 2023 at 05:17:14PM +0000, Mark Rutland wrote:
>> Hi Madhavan,
>> 
>> At a high-level, I think this still falls afoul of our desire to not reverse
>> engineer control flow from the binary, and so I do not think this is the right
>> approach. I've expanded a bit on that below.
>> 
>> I do think it would be nice to have *some* of the objtool changes, as I do
>> think we will want to use objtool for some things in future (e.g. some
>> build-time binary patching such as table sorting).
>> 
>> > Problem
>> > =======
>> > 
>> > Objtool is complex and highly architecture-dependent. There are a lot of
>> > different checks in objtool that all of the code in the kernel must pass
>> > before livepatch can be enabled. If a check fails, it must be corrected
>> > before we can proceed. Sometimes, the kernel code needs to be fixed.
>> > Sometimes, it is a compiler bug that needs to be fixed. The challenge is
>> > also to prove that all the work is complete for an architecture.
>> > 
>> > As such, it presents a great challenge to enable livepatch for an
>> > architecture.
>> 
>> There's a more fundamental issue here in that objtool has to reverse-engineer
>> control flow, and so even if the kernel code and compiled code generation is
>> *perfect*, it's possible that objtool won't recognise the structure of the
>> generated code, and won't be able to reverse-engineer the correct control flow.
>> 
>> We've seen issues where objtool didn't understand jump tables, so support for
>> that got disabled on x86. A key objection from the arm64 side is that we don't
>> want to disable compile code generation strategies like this. Further, as
>> compiles evolve, their code generation strategies will change, and it's likely
>> there will be other cases that crop up. This is inherently fragile.
>> 
>> The key objections from the arm64 side is that we don't want to
>> reverse-engineer details from the binary, as this is complex, fragile, and
>> unstable. This is why we've previously suggested that we should work with
>> compiler folk to get what we need.
>
>> This still requires reverse-engineering the forward-edge control flow in order
>> to compute those offets, so the same objections apply with this approach. I do
>> not think this is the right approach.
>> 
>> I would *strongly* prefer that we work with compiler folk to get the
>> information that we need.
>
> IDK if it's relevant here, but I did see a commit go by to LLVM that
> seemed to include such info in a custom ELF section (for the purposes of
> improving fuzzing, IIUC). Maybe such an encoding scheme could be tested
> to see if it's reliable or usable?
> - https://github.com/llvm/llvm-project/commit/3e52c0926c22575d918e7ca8369522b986635cd3
> - https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-control-flow
>
>> 
>> [...]
>> 
>> > 		FWIW, I have also compared the CFI I am generating with DWARF
>> > 		information that the compiler generates. The CFIs match a
>> > 		100% for Clang. In the case of gcc, the comparison fails
>> > 		in 1.7% of the cases. I have analyzed those cases and found
>> > 		the DWARF information generated by gcc is incorrect. The
>> > 		ORC generated by my Objtool is correct.
>> 
>> 
>> Have you reported this to the GCC folk, and can you give any examples?
>> I'm sure they would be interested in fixing this, regardless of whether we end
>> up using it.
>
> Yeah, at least a bug report is good. "See something, say something."

By all means, please.  If you guys report these issues on CFI
divergences in the GCC bugzilla, we will look into fixing them.

https://gcc.gnu.org/bugzilla
Madhavan T. Venkataraman April 15, 2023, 4:14 a.m. UTC | #19
On 4/13/23 13:15, Jose E. Marchesi wrote:
> 
>> On Thu, Mar 23, 2023 at 05:17:14PM +0000, Mark Rutland wrote:
>>> Hi Madhavan,
>>>
>>> At a high-level, I think this still falls afoul of our desire to not reverse
>>> engineer control flow from the binary, and so I do not think this is the right
>>> approach. I've expanded a bit on that below.
>>>
>>> I do think it would be nice to have *some* of the objtool changes, as I do
>>> think we will want to use objtool for some things in future (e.g. some
>>> build-time binary patching such as table sorting).
>>>
>>>> Problem
>>>> =======
>>>>
>>>> Objtool is complex and highly architecture-dependent. There are a lot of
>>>> different checks in objtool that all of the code in the kernel must pass
>>>> before livepatch can be enabled. If a check fails, it must be corrected
>>>> before we can proceed. Sometimes, the kernel code needs to be fixed.
>>>> Sometimes, it is a compiler bug that needs to be fixed. The challenge is
>>>> also to prove that all the work is complete for an architecture.
>>>>
>>>> As such, it presents a great challenge to enable livepatch for an
>>>> architecture.
>>>
>>> There's a more fundamental issue here in that objtool has to reverse-engineer
>>> control flow, and so even if the kernel code and compiled code generation is
>>> *perfect*, it's possible that objtool won't recognise the structure of the
>>> generated code, and won't be able to reverse-engineer the correct control flow.
>>>
>>> We've seen issues where objtool didn't understand jump tables, so support for
>>> that got disabled on x86. A key objection from the arm64 side is that we don't
>>> want to disable compile code generation strategies like this. Further, as
>>> compiles evolve, their code generation strategies will change, and it's likely
>>> there will be other cases that crop up. This is inherently fragile.
>>>
>>> The key objections from the arm64 side is that we don't want to
>>> reverse-engineer details from the binary, as this is complex, fragile, and
>>> unstable. This is why we've previously suggested that we should work with
>>> compiler folk to get what we need.
>>
>>> This still requires reverse-engineering the forward-edge control flow in order
>>> to compute those offets, so the same objections apply with this approach. I do
>>> not think this is the right approach.
>>>
>>> I would *strongly* prefer that we work with compiler folk to get the
>>> information that we need.
>>
>> IDK if it's relevant here, but I did see a commit go by to LLVM that
>> seemed to include such info in a custom ELF section (for the purposes of
>> improving fuzzing, IIUC). Maybe such an encoding scheme could be tested
>> to see if it's reliable or usable?
>> - https://github.com/llvm/llvm-project/commit/3e52c0926c22575d918e7ca8369522b986635cd3
>> - https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-control-flow
>>
>>>
>>> [...]
>>>
>>>> 		FWIW, I have also compared the CFI I am generating with DWARF
>>>> 		information that the compiler generates. The CFIs match a
>>>> 		100% for Clang. In the case of gcc, the comparison fails
>>>> 		in 1.7% of the cases. I have analyzed those cases and found
>>>> 		the DWARF information generated by gcc is incorrect. The
>>>> 		ORC generated by my Objtool is correct.
>>>
>>>
>>> Have you reported this to the GCC folk, and can you give any examples?
>>> I'm sure they would be interested in fixing this, regardless of whether we end
>>> up using it.
>>
>> Yeah, at least a bug report is good. "See something, say something."
> 
> By all means, please.  If you guys report these issues on CFI
> divergences in the GCC bugzilla, we will look into fixing them.
> 
> https://gcc.gnu.org/bugzilla

I will try to get the data again and report the problems that I see.

Thanks.

Madhavan
Madhavan T. Venkataraman April 15, 2023, 4:27 a.m. UTC | #20
On 4/13/23 11:30, Josh Poimboeuf wrote:
> On Thu, Apr 13, 2023 at 09:59:31AM -0500, Madhavan T. Venkataraman wrote:
>> On 4/12/23 10:52, Josh Poimboeuf wrote:
>>> On Wed, Apr 12, 2023 at 09:50:23AM -0500, Madhavan T. Venkataraman wrote:
>>>>>> I read through the SFrame spec file briefly. It looks like I can easily adapt my
>>>>>> version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
>>>>>> folks agree to properly support and maintain SFrame, then I could send the next version
>>>>>> of the patchset based on SFrame.
>>>>>>
>>>>>> But I kinda need a clear path forward before I implement anything. I request the arm64
>>>>>> folks to comment on the above approach. Would it be useful to initiate an email discussion
>>>>>> with the compiler folks on what they plan to do to support SFrame? Or, should this all
>>>>>> happen face to face in some forum like LPC?
>>>>>
>>>>> SFrame is basically a simplified version of DWARF unwind, using it as an
>>>>> input to objtool is going to have the same issues I mentioned below (and
>>>>> as was discussed with your v1).
>>>>>
>>>>
>>>> Yes. It is a much simplified version of DWARF. So, I am hoping that the compiler folks
>>>> can provide the feature with a reliability guarantee. DWARF is too complex.
>>>
>>> I don't see what the complexity (or lack thereof) of the unwinding data
>>> format has to do with it.  The unreliability comes from the underlying
>>> data source, not the formatting of the data.
>>>
>>
>> What I meant is - if SFrame is implemented by simply extracting unwind info from
>> DWARF data and placing it in a separate section (as it is probably implemented now),
>> then what you say is totally true. But if the compiler folks agree to make SFrame reliable,
>> then either they have to make DWARF reliable. Or, they have to implement SFrame as a
>> separate feature and make it reliable. The former is tough to do as DWARF has a lot of complexity.
>> The latter is a lot easier to do.
> 
> [ adding linux-toolchains ]
> 
> I don't think ensuring reliability is an easy task, regardless of the
> complexity of the unwinding format.
> 
> Whether it's SFrame or DWARF/eh_frame, the question would be how to
> ensure it's always reliable for a compiler "power user" like the kernel
> which has many edge cases (including lots of inline asm which the
> compiler has no visibility to) and which uses unwinding for more than
> just debugging.
> 
> It would need some kind of black-box testing on a complex code base.
> (hint: kind of like what objtool already does today)
> 

I could use the ORC data I generate by using the decoder against the SFrame data.
A function is reliable only if both data sources agree for the whole function.

Also, in my approach, the actual frame pointer is dynamically checked against the
frame pointer computed from the unwind data. Any mismatch indicates an unreliable stack trace.

IMHO, this is sufficient to provide livepatch. Do you agree?

Madhavan
Josh Poimboeuf April 15, 2023, 5:05 a.m. UTC | #21
On Fri, Apr 14, 2023 at 11:27:44PM -0500, Madhavan T. Venkataraman wrote:
> >> What I meant is - if SFrame is implemented by simply extracting unwind info from
> >> DWARF data and placing it in a separate section (as it is probably implemented now),
> >> then what you say is totally true. But if the compiler folks agree to make SFrame reliable,
> >> then either they have to make DWARF reliable. Or, they have to implement SFrame as a
> >> separate feature and make it reliable. The former is tough to do as DWARF has a lot of complexity.
> >> The latter is a lot easier to do.
> > 
> > [ adding linux-toolchains ]
> > 
> > I don't think ensuring reliability is an easy task, regardless of the
> > complexity of the unwinding format.
> > 
> > Whether it's SFrame or DWARF/eh_frame, the question would be how to
> > ensure it's always reliable for a compiler "power user" like the kernel
> > which has many edge cases (including lots of inline asm which the
> > compiler has no visibility to) and which uses unwinding for more than
> > just debugging.
> > 
> > It would need some kind of black-box testing on a complex code base.
> > (hint: kind of like what objtool already does today)
> > 
> 
> I could use the ORC data I generate by using the decoder against the SFrame data.
> A function is reliable only if both data sources agree for the whole function.

This is somewhat similar to what I'm saying in another thread:

  https://lore.kernel.org/live-patching/20230415043949.7y4tvshe26zday3e@treble/

If objtool and DWARF/SFrame agree, all is well.

> Also, in my approach, the actual frame pointer is dynamically checked against the
> frame pointer computed from the unwind data. Any mismatch indicates an unreliable stack trace.
> 
> IMHO, this is sufficient to provide livepatch. Do you agree?

The dynamic reliable stacktrace checks for CONFIG_FRAME_POINTER on x86
are much simpler, as they don't require ORC or any other metadata.  They
just need to detect preemption and page faults on the stack, and to
identify the end of the stack.  Those simple dynamic checks, combined
with objtool's build-time frame pointer validation, worked very well
until we switched to ORC.

So I'm not sure I see the benefit of the additional complexity involved
in cross-checking frame pointers with ORC at runtime.  But I'm just a
bystander.  What really matters is what the arm64 folks think ;-)
Madhavan T. Venkataraman April 15, 2023, 4:15 p.m. UTC | #22
On 4/15/23 00:05, Josh Poimboeuf wrote:
> On Fri, Apr 14, 2023 at 11:27:44PM -0500, Madhavan T. Venkataraman wrote:
>>>> What I meant is - if SFrame is implemented by simply extracting unwind info from
>>>> DWARF data and placing it in a separate section (as it is probably implemented now),
>>>> then what you say is totally true. But if the compiler folks agree to make SFrame reliable,
>>>> then either they have to make DWARF reliable. Or, they have to implement SFrame as a
>>>> separate feature and make it reliable. The former is tough to do as DWARF has a lot of complexity.
>>>> The latter is a lot easier to do.
>>>
>>> [ adding linux-toolchains ]
>>>
>>> I don't think ensuring reliability is an easy task, regardless of the
>>> complexity of the unwinding format.
>>>
>>> Whether it's SFrame or DWARF/eh_frame, the question would be how to
>>> ensure it's always reliable for a compiler "power user" like the kernel
>>> which has many edge cases (including lots of inline asm which the
>>> compiler has no visibility to) and which uses unwinding for more than
>>> just debugging.
>>>
>>> It would need some kind of black-box testing on a complex code base.
>>> (hint: kind of like what objtool already does today)
>>>
>>
>> I could use the ORC data I generate by using the decoder against the SFrame data.
>> A function is reliable only if both data sources agree for the whole function.
> 
> This is somewhat similar to what I'm saying in another thread:
> 
>   https://lore.kernel.org/live-patching/20230415043949.7y4tvshe26zday3e@treble/
> 
> If objtool and DWARF/SFrame agree, all is well.
> 
>> Also, in my approach, the actual frame pointer is dynamically checked against the
>> frame pointer computed from the unwind data. Any mismatch indicates an unreliable stack trace.
>>
>> IMHO, this is sufficient to provide livepatch. Do you agree?
> 
> The dynamic reliable stacktrace checks for CONFIG_FRAME_POINTER on x86
> are much simpler, as they don't require ORC or any other metadata.  They
> just need to detect preemption and page faults on the stack, and to
> identify the end of the stack.  Those simple dynamic checks, combined
> with objtool's build-time frame pointer validation, worked very well
> until we switched to ORC.
> 
> So I'm not sure I see the benefit of the additional complexity involved
> in cross-checking frame pointers with ORC at runtime.  But I'm just a
> bystander.  What really matters is what the arm64 folks think ;-)
> 

The unwinder on arm64 is frame-pointer based. I don't want to deviate from that.
I just want to use the metadata to validate the frame pointer. This approach
also catches the rare cases of frame pointer corruption and any bugs in
SFrame that the metadata check did not catch.

Of course, this is all moot if the arm64 folks do not even want the reverse engineering.
I guess we wait until the microconference to discuss all this.

Madhavan
Indu Bhagat April 16, 2023, 8:21 a.m. UTC | #23
On 4/13/23 9:30 AM, Josh Poimboeuf wrote:
> On Thu, Apr 13, 2023 at 09:59:31AM -0500, Madhavan T. Venkataraman wrote:
>> On 4/12/23 10:52, Josh Poimboeuf wrote:
>>> On Wed, Apr 12, 2023 at 09:50:23AM -0500, Madhavan T. Venkataraman wrote:
>>>>>> I read through the SFrame spec file briefly. It looks like I can easily adapt my
>>>>>> version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
>>>>>> folks agree to properly support and maintain SFrame, then I could send the next version
>>>>>> of the patchset based on SFrame.
>>>>>>
>>>>>> But I kinda need a clear path forward before I implement anything. I request the arm64
>>>>>> folks to comment on the above approach. Would it be useful to initiate an email discussion
>>>>>> with the compiler folks on what they plan to do to support SFrame? Or, should this all
>>>>>> happen face to face in some forum like LPC?
>>>>>
>>>>> SFrame is basically a simplified version of DWARF unwind, using it as an
>>>>> input to objtool is going to have the same issues I mentioned below (and
>>>>> as was discussed with your v1).
>>>>>
>>>>
>>>> Yes. It is a much simplified version of DWARF. So, I am hoping that the compiler folks
>>>> can provide the feature with a reliability guarantee. DWARF is too complex.
>>>
>>> I don't see what the complexity (or lack thereof) of the unwinding data
>>> format has to do with it.  The unreliability comes from the underlying
>>> data source, not the formatting of the data.
>>>
>>
>> What I meant is - if SFrame is implemented by simply extracting unwind info from
>> DWARF data and placing it in a separate section (as it is probably implemented now),
>> then what you say is totally true. But if the compiler folks agree to make SFrame reliable,
>> then either they have to make DWARF reliable. Or, they have to implement SFrame as a
>> separate feature and make it reliable. The former is tough to do as DWARF has a lot of complexity.
>> The latter is a lot easier to do.

SFrame stack trace data is generated by the GNU assembler, by using the 
.cfi_* asm directives embedded by the compiler.  So, it is true that the 
source of EH_Frame info and SFrame stack trace data is the same.

That said, yes, if you see bugs/inconsistencies in SFrame/EH_Frame info, 
please file the issue(s).

> 
> [ adding linux-toolchains ]
> 
> I don't think ensuring reliability is an easy task, regardless of the
> complexity of the unwinding format.
> 
> Whether it's SFrame or DWARF/eh_frame, the question would be how to
> ensure it's always reliable for a compiler "power user" like the kernel
> which has many edge cases (including lots of inline asm which the
> compiler has no visibility to) and which uses unwinding for more than
> just debugging.
> 
> It would need some kind of black-box testing on a complex code base.
> (hint: kind of like what objtool already does today)
>
Madhavan T. Venkataraman Dec. 14, 2023, 8:49 p.m. UTC | #24
Hi Mark,

I attended your presentation in the LPC. You mentioned that you could use some help with some pre-requisites for the Livepatch feature.
I would like to lend a hand.

What would you like me to implement?

I would also like to implement Unwind Hints for the feature. If you want a specific format for the hints, let me know.

Looking forward to help out with the feature.

Madhavan
Mark Rutland Dec. 15, 2023, 1:04 p.m. UTC | #25
On Thu, Dec 14, 2023 at 02:49:29PM -0600, Madhavan T. Venkataraman wrote:
> Hi Mark,

Hi Madhavan,

> I attended your presentation in the LPC. You mentioned that you could use
> some help with some pre-requisites for the Livepatch feature.
> I would like to lend a hand.

Cool!

I've been meaning to send a mail round with a summary of the current state of
things, and what needs to be done going forward, but I haven't had the time
since LPC to put that together (as e.g. that requires learning some more about
SFrame).  I'll be disappearing for the holiday shortly, and I intend to pick
that up in the new year.

> What would you like me to implement?

I'm not currently sure exactly what we need/want to implement, and as above I
think that needs to wait until the new year.

However, one thing that you can do that would be very useful is to write up and
report the GCC DWARF issues that you mentioned in:

  https://lore.kernel.org/linux-arm-kernel/20230202074036.507249-1-madvenka@linux.microsoft.com/

... as (depending on exactly what those are) those could also affect SFrame
generation (and thus we'll need to get those fixed in GCC), and regardless it
would be useful information to know.

I understood that you planned to do that from:

  https://lore.kernel.org/linux-arm-kernel/054ce0d6-70f0-b834-d4e5-1049c8df7492@linux.microsoft.com/

... but I couldn't spot any relevant mails or issues in the GCC bugzilla, so
either I'm failing to search hard enough, or did that get forgotten about?

> I would also like to implement Unwind Hints for the feature. If you want a
> specific format for the hints, let me know.

I will get back to you on that in the new year; I think the specifics we want
are going to depend on other details above we need to analyse first.

Thanks,
Mark.
Madhavan T. Venkataraman Dec. 15, 2023, 3:15 p.m. UTC | #26
On 12/15/23 07:04, Mark Rutland wrote:
> On Thu, Dec 14, 2023 at 02:49:29PM -0600, Madhavan T. Venkataraman wrote:
>> Hi Mark,
> 
> Hi Madhavan,
> 
>> I attended your presentation in the LPC. You mentioned that you could use
>> some help with some pre-requisites for the Livepatch feature.
>> I would like to lend a hand.
> 
> Cool!
> 
> I've been meaning to send a mail round with a summary of the current state of
> things, and what needs to be done going forward, but I haven't had the time
> since LPC to put that together (as e.g. that requires learning some more about
> SFrame).  I'll be disappearing for the holiday shortly, and I intend to pick
> that up in the new year.
> 
>> What would you like me to implement?
> 
> I'm not currently sure exactly what we need/want to implement, and as above I
> think that needs to wait until the new year.
> 

OK.

> However, one thing that you can do that would be very useful is to write up and
> report the GCC DWARF issues that you mentioned in:
> 
>   https://lore.kernel.org/linux-arm-kernel/20230202074036.507249-1-madvenka@linux.microsoft.com/
> 
> ... as (depending on exactly what those are) those could also affect SFrame
> generation (and thus we'll need to get those fixed in GCC), and regardless it
> would be useful information to know.
> 
> I understood that you planned to do that from:
> 
>   https://lore.kernel.org/linux-arm-kernel/054ce0d6-70f0-b834-d4e5-1049c8df7492@linux.microsoft.com/
> 
> ... but I couldn't spot any relevant mails or issues in the GCC bugzilla, so
> either I'm failing to search hard enough, or did that get forgotten about?
> 

Yeah. I had notes on that. But I seem to have lost them. I need to reproduce the
problems and analyze them again which is not trivial. So, I have been procrastinating.

I am also disappearing for the rest of this year. I will try to look at it in the
new year.

>> I would also like to implement Unwind Hints for the feature. If you want a
>> specific format for the hints, let me know.
> 
> I will get back to you on that in the new year; I think the specifics we want
> are going to depend on other details above we need to analyse first.
> 

OK.

For now, I will implement something and send it out just for reference. We can revisit
this topic next year sometime.

Thanks.

Madhavan