[bpf-next,v2,0/7] exact states comparison for iterator convergence checks

Message ID 20231022010812.9201-1-eddyz87@gmail.com

Message

Eduard Zingerman Oct. 22, 2023, 1:08 a.m. UTC
Iterator convergence logic in is_state_visited() uses states_equal()
for states with a branches counter > 0 to check if an iterator-based
loop converges. This is not fully correct, because states_equal()
relies on the presence of read and precision marks on registers, and
these marks are not guaranteed to be finalized while a state still has
outstanding branches.
The commit message for patch #3 describes a program that exhibits such
behavior.

This patch-set aims to fix the iterator convergence logic by adding a
notion of exact states comparison. Exact comparison does not rely on
the presence of read or precision marks and is thus more strict.
As explained in the commit message for patch #3, exact comparison
requires the addition of speculative register bounds widening. The end
result for BPF verifier users can be summarized as follows:

(!) After this update the verifier would reject programs that conjure
    an imprecise value on the first loop iteration and use it as
    precise on the second (for iterator-based loops).

I urge people to at least skim over the commit message for patch #3.
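
To make the "(!)" point above a bit more concrete, below is a hedged
sketch of a program shape that would be rejected after this update. It
is my own illustration written in the inline-asm style used elsewhere
in this thread, not one of the actual tests from patch #4, and it
assumes the usual selftests asm harness (bpf_misc.h providing
__failure, __naked, __imm and __clobber_all). Here r7 is still
imprecise when the first iteration reaches the convergence check, but
it differs between iterations and is later used as a stack offset;
after widening r7 becomes an unknown scalar and the fp[r7] access
fails.

SEC("?raw_tp")
__failure
__naked int delayed_precision_sketch(void)
{
	/* Rough equivalent:
	 *
	 *   r7 = -16
	 *   bpf_iter_num_new(&fp[-8], 0, 10)
	 *   while (bpf_iter_num_next(&fp[-8]))
	 *     r7 = -32;
	 *   bpf_iter_num_destroy(&fp[-8])
	 *   read fp[r7]    // r7 is used as a precise stack offset here
	 *   return 0
	 */
	asm volatile (
		"r7 = -16;"
		"r1 = r10;"
		"r1 += -8;"
		"r2 = 0;"
		"r3 = 10;"
		"call %[bpf_iter_num_new];"
	"loop_%=:"
		"r1 = r10;"
		"r1 += -8;"
		"call %[bpf_iter_num_next];"
		"if r0 == 0 goto loop_end_%=;"
		"r7 = -32;"	/* r7 differs between iterations */
		"goto loop_%=;"
	"loop_end_%=:"
		"r1 = r10;"
		"r1 += -8;"
		"call %[bpf_iter_num_destroy];"
		"r0 = r10;"
		"r0 += r7;"	/* fp + widened (unknown) scalar -> rejected */
		"r0 = *(u64 *)(r0 + 0);"
		"r0 = 0;"
		"exit;"
		:
		: __imm(bpf_iter_num_new),
		  __imm(bpf_iter_num_next),
		  __imm(bpf_iter_num_destroy)
		: __clobber_all
	);
}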

Patches are organized as follows:
- patches #1,2: moving/extracting utility functions;
- patch #3: introduces exact mode for states comparison and adds a
  widening heuristic;
- patch #4: adds test-cases that demonstrate why the series is
  necessary;
- patch #5: extends patch #3 with a notion of state loop entries;
  these entries have to be tracked to correctly identify that
  different verifier states belong to the same states loop;
- patch #6: adds a test-case that demonstrates a program
  which requires loop entry tracking for correct verification;
- patch #7: just adds a few debug prints.

The following actions are planned as a follow-up for this patch-set:
- the implementation has to be adapted for the callbacks handling
  logic as a part of a fix for [1];
- it is necessary to explore ways to improve the widening heuristic to
  handle the iters_task_vma test w/o the need to insert barrier_var()
  calls;
- the explored states eviction logic on cache miss has to be extended
  to either:
  - allow eviction of checkpoint states, or
  - be sped up in case there are many active checkpoints associated
    with the same instruction.

The patch-set is a follow-up to the mailing list discussion [1].

Changelog:
V1 -> V2 [2], applied changes suggested by Alexei offlist:
- __explored_state() function removed;
- same_callsites() function is now used in clean_live_states();
- patches #1,2 are added as preparatory code movement;
- in process_iter_next_call() a safeguard is added to verify that
  cur_st->parent exists and has expected insn index / call sites.

[1] https://lore.kernel.org/bpf/97a90da09404c65c8e810cf83c94ac703705dc0e.camel@gmail.com/
[2] https://lore.kernel.org/bpf/20231021005939.1041-1-eddyz87@gmail.com/

Eduard Zingerman (7):
  bpf: move explored_state() closer to the beginning of verifier.c
  bpf: extract same_callsites() as utility function
  bpf: exact states comparison for iterator convergence checks
  selftests/bpf: tests with delayed read/precision marks in loop body
  bpf: correct loop detection for iterators convergence
  selftests/bpf: test if state loops are detected in a tricky case
  bpf: print full verifier states on infinite loop detection

 include/linux/bpf_verifier.h                  |  16 +
 kernel/bpf/verifier.c                         | 469 +++++++++++--
 tools/testing/selftests/bpf/progs/iters.c     | 620 ++++++++++++++++++
 .../selftests/bpf/progs/iters_task_vma.c      |   1 +
 4 files changed, 1053 insertions(+), 53 deletions(-)

Comments

Eduard Zingerman Oct. 23, 2023, 5:17 p.m. UTC | #1
On Sun, 2023-10-22 at 04:08 +0300, Eduard Zingerman wrote:
[...]
> Changelog:
> V1 -> V2 [2], applied changes suggested by Alexei offlist:
> - __explored_state() function removed;
> - same_callsites() function is now used in clean_live_states();
> - patches #1,2 are added as preparatory code movement;
> - in process_iter_next_call() a safeguard is added to verify that
>   cur_st->parent exists and has expected insn index / call sites.

I have V3 ready and passing CI.

However, I checked on Alexei's concerns regarding performance on
explored states cache misses, and the verifier does not behave well
with this patch-set. For example, the program at the end of this email
causes the verifier to "hang" (it loops inside is_state_visited()
without end).

There are several options to fix this:
(a) limit the total iteration depth, as in [1]; the limit would have
    to be at least 1000 to make iters/task_vma pass;
(b) limit the maximal number of checkpoint states associated with an
    instruction and evict those with the lowest dfs_depth;
(c) choose bigger constants in `sl->miss_cnt > sl->hit_cnt * 3 + 3`
    for checkpoint states (rough sketch below).
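
For reference, a minimal sketch of what option (c) could look like,
assuming only the existing miss_cnt/hit_cnt counters on struct
bpf_verifier_state_list; the helper name, the is_iter_checkpoint flag
and the constant 64 are illustrative assumptions, not code from this
series:

/* Sketch only: same shape as the existing eviction check in
 * is_state_visited(), but with a larger miss budget for states that
 * act as iterator-loop checkpoints.
 */
static bool should_evict_state(struct bpf_verifier_state_list *sl,
			       bool is_iter_checkpoint)
{
	int k = is_iter_checkpoint ? 64 : 3;

	return sl->miss_cnt > sl->hit_cnt * k + k;
}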

Given that the current failure mode is bad, should I submit V3 as-is,
or should I explore options (b) and (c) and add one of those to V3?

[1] https://github.com/eddyz87/bpf/tree/bpf-iter-next-exact-widening-max-iter-depth

---

SEC("?raw_tp")
__failure
__naked int max_iter_depth(void)
{
	/* This is equivalent to the C program below.
	 * The counter stored in r6 is used as precise after the loop,
	 * thus preventing widening. The verifier won't be able to conclude
	 * that such a program terminates, but it should gracefully exit.
	 *
	 * r6 = 0
	 * bpf_iter_num_new(&fp[-8], 0, 10)
	 * while (bpf_iter_num_next(&fp[-8])) {
	 *   r6 += 1;
	 * }
	 * bpf_iter_num_destroy(&fp[-8])
	 * ... force r6 precise ...
	 * return 0
	 */
	asm volatile (
		"r6 = 0;"
		"r1 = r10;"
		"r1 += -8;"
		"r2 = 0;"
		"r3 = 10;"
		"call %[bpf_iter_num_new];"
	"loop_%=:"
		"r1 = r10;"
		"r1 += -8;"
		"call %[bpf_iter_num_next];"
		"if r0 == 0 goto loop_end_%=;"
		"r6 += 1;"
		"goto loop_%=;"
	"loop_end_%=:"
		"r1 = r10;"
		"r1 += -8;"
		"call %[bpf_iter_num_destroy];"
		"r0 = r10;"
		"r0 += r6;" /* this forces r6 to be precise */
		"r0 = 0;"
		"exit;"
		:
		: __imm(bpf_iter_num_new),
		  __imm(bpf_iter_num_next),
		  __imm(bpf_iter_num_destroy)
		: __clobber_all
	);
}
Eduard Zingerman Oct. 23, 2023, 9:40 p.m. UTC | #2
On Mon, 2023-10-23 at 20:17 +0300, Eduard Zingerman wrote:
> On Sun, 2023-10-22 at 04:08 +0300, Eduard Zingerman wrote:
> [...]
> > Changelog:
> > V1 -> V2 [2], applied changes suggested by Alexei offlist:
> > - __explored_state() function removed;
> > - same_callsites() function is now used in clean_live_states();
> > - patches #1,2 are added as preparatory code movement;
> > - in process_iter_next_call() a safeguard is added to verify that
> >   cur_st->parent exists and has expected insn index / call sites.
> 
> I have V3 ready and passing CI.
> 
> However I checked on Alexei's concerns regarding performance on
> explored states cache miss and verifier does not behave well with this
> patch-set. For example, the program at the end of the email causes
> verifier to "hang" (loop inside is_state_visited() to no end).
> 
> There are several options to fix this:
> (a) limit total iteration depth, as in [1], the limit would have to be
>     at-least 1000 to make iters/task_vma pass;
> (b) limit maximal number of checkpoint states associated with
>     instruction and evict those with lowest dfs_depth;
> (c) choose bigger constants in `sl->miss_cnt > sl->hit_cnt * 3 + 3` for
>     checkpoint states.

I played a bit with the constants in the 'eviction on miss' formula using [1] (option (c)).
There are three relevant tests:
- iters/max_iter_depth: should report load failure within a reasonable time;
- iters/checkpoint_states_deletion: should pass;
- verif_scale_pyperf600_iter: should pass.

I think iters/checkpoint_states_deletion represents the worst-case
scenario because, depending on the number of variables N, it produces
2**N distinct states. The eviction formula that does not lose relevant
states is:

    sl->miss_cnt > sl->hit_cnt * 2**N + 2**N

(because states start to repeat after 2**N iterations).
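
To make the 2**N shape concrete, here is a hedged toy sketch (my own,
not the actual checkpoint_states_deletion selftest; it assumes the
usual selftests headers providing SEC(), bpf_for() and
bpf_get_prandom_u32()). At the loop head each variable is either still
the constant 0 or an unknown scalar, so up to 2**N distinct checkpoint
states can accumulate at the same instruction:

SEC("?raw_tp")
int states_explosion_sketch(void *ctx)
{
	int i, v1 = 0, v2 = 0, v3 = 0;	/* N = 3 here */

	bpf_for(i, 0, 10) {
		/* each iteration turns one variable into an unknown scalar */
		switch (bpf_get_prandom_u32() % 3) {
		case 0: v1 = bpf_get_prandom_u32(); break;
		case 1: v2 = bpf_get_prandom_u32(); break;
		case 2: v3 = bpf_get_prandom_u32(); break;
		}
	}
	return v1 + v2 + v3;
}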

W/o eviction for checkpoint states, the maximal number of variables
the verifier could handle in this test is 9, with 958,883 processed
insns reported. This corresponds to the formula
(sl->miss_cnt > sl->hit_cnt * 512 + 512).

Using these values I get the following execution times:

| test                       | time ./test_progs -a <test> |
|----------------------------+-----------------------------|
| verif_scale_pyperf600_iter |                        0.2s |
| checkpoint_states_deletion |                        5.8s |
| max_iter_depth             |                       23.9s |

Going one step lower, to 8 variables (and 256 as the constant),
checkpoint_states_deletion takes 248,133 insns to complete and the
timings table looks as follows:

| test                       | time ./test_progs -a <test> |
|----------------------------+-----------------------------|
| verif_scale_pyperf600_iter |                        0.2s |
| checkpoint_states_deletion |                        1.0s |
| max_iter_depth             |                       15.2s |

So, it is possible to get a speedup for the worst-case scenario by
leaving some instruction budget on the table.

IMO, using the formula (sl->miss_cnt > sl->hit_cnt * 512 + 512), or
256 in place of 512, to evict checkpoints kind-of sort-of makes sense,
but is very hacky.

I think that a better solution would be to go for option (b) from the
previous email:
- evict old checkpoints based on dfs_depth;
- also use a secondary hash table for checkpoints and hash not only
  insn_idx but also some fingerprint of the register states, thus
  avoiding long state-list walks (rough sketch below).
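
A hedged sketch of the fingerprint idea, assuming verifier internals
from include/linux/bpf_verifier.h; the function name and the choice of
what to fold into the hash are illustrative, not a design settled in
this series. The only hard requirement is that the fingerprint is
conservative: states that could compare equal must hash to the same
value.

#include <linux/bpf_verifier.h>
#include <linux/jhash.h>

/* Sketch only: a cheap, conservative fingerprint that could serve as a
 * secondary hash key next to insn_idx, so that a cache miss does not
 * have to walk a long list of unrelated checkpoint states.
 */
static u32 state_fingerprint(const struct bpf_verifier_state *st)
{
	const struct bpf_func_state *frame = st->frame[st->curframe];
	u32 hash = st->insn_idx;
	int i;

	/* fold in stable per-register properties, e.g. register type */
	for (i = 0; i < MAX_BPF_REG; i++)
		hash = jhash_1word(frame->regs[i].type, hash);

	return hash;
}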

But that would require some additional coding, and I know that Alexei
wants to land this thing sooner rather than later.

[1] https://github.com/eddyz87/bpf/tree/iter-exact-eviction-formula-experiments