diff mbox series

[3/4] bpf, docs: Generate nicer tables for instruction encodings

Message ID 20211223101906.977624-4-hch@lst.de (mailing list archive)
State Accepted
Delegated to: BPF
Series [1/4] bpf, docs: Fix verifier references

Checks

Context Check Description
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next success VM_Test
netdev/tree_selection success Not a local patch

Commit Message

Christoph Hellwig Dec. 23, 2021, 10:19 a.m. UTC
Use RST tables that read well both as plain ASCII and as HTML to render
the instruction encodings, and add a few subheadings to better structure
the text.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 Documentation/bpf/instruction-set.rst | 158 ++++++++++++++++----------
 1 file changed, 95 insertions(+), 63 deletions(-)

Comments

Alexei Starovoitov Dec. 31, 2021, 12:43 a.m. UTC | #1
On Thu, Dec 23, 2021 at 11:19:05AM +0100, Christoph Hellwig wrote:
>  
> +For class BPF_ALU or BPF_ALU64:
> +
> +  ========  =====  =========================
> +  code      value  description
> +  ========  =====  =========================
>    BPF_ADD   0x00
>    BPF_SUB   0x10
>    BPF_MUL   0x20
> @@ -68,26 +76,31 @@ If BPF_CLASS(code) == BPF_ALU or BPF_ALU64 BPF_OP(code) is one of::
>    BPF_NEG   0x80
>    BPF_MOD   0x90
>    BPF_XOR   0xa0
> -  BPF_MOV   0xb0  /* mov reg to reg */
> -  BPF_ARSH  0xc0  /* sign extending shift right */
> -  BPF_END   0xd0  /* endianness conversion */
> +  BPF_MOV   0xb0   mov reg to reg
> +  BPF_ARSH  0xc0   sign extending shift right
> +  BPF_END   0xd0   endianness conversion
> +  ========  =====  =========================
>  
> -If BPF_CLASS(code) == BPF_JMP or BPF_JMP32 BPF_OP(code) is one of::
> +For class BPF_JMP or BPF_JMP32:
>  
> -  BPF_JA    0x00  /* BPF_JMP only */
> +  ========  =====  =========================
> +  code      value  description
> +  ========  =====  =========================
> +  BPF_JA    0x00   BPF_JMP only
>    BPF_JEQ   0x10
>    BPF_JGT   0x20
>    BPF_JGE   0x30
>    BPF_JSET  0x40

Not your fault, but the new table looks odd with
only some opcodes documented.
Same issue with BPF_ALU table.
In the past the documented opcodes were the eBPF-only ones,
not the ones present in both classic BPF and eBPF, so it wasn't that bad.
At least there was a reason for the discrepancy.
Now it's just odd.
Maybe add a comment to all rows?

> -  BPF_JNE   0x50  /* jump != */
> -  BPF_JSGT  0x60  /* signed '>' */
> -  BPF_JSGE  0x70  /* signed '>=' */
> -  BPF_CALL  0x80  /* function call */
> -  BPF_EXIT  0x90  /*  function return */
> -  BPF_JLT   0xa0  /* unsigned '<' */
> -  BPF_JLE   0xb0  /* unsigned '<=' */
> -  BPF_JSLT  0xc0  /* signed '<' */
> -  BPF_JSLE  0xd0  /* signed '<=' */
> +  BPF_JNE   0x50   jump '!='
> +  BPF_JSGT  0x60   signed '>'
> +  BPF_JSGE  0x70   signed '>='
> +  BPF_CALL  0x80   function call
> +  BPF_EXIT  0x90   function return
> +  BPF_JLT   0xa0   unsigned '<'
> +  BPF_JLE   0xb0   unsigned '<='
> +  BPF_JSLT  0xc0   signed '<'
> +  BPF_JSLE  0xd0   signed '<='
> +  ========  =====  =========================
>  
>  So BPF_ADD | BPF_X | BPF_ALU means::
>  
> @@ -108,37 +121,58 @@ the return value into register R0 before doing a BPF_EXIT. Class 6 is used as
>  BPF_JMP32 to mean exactly the same operations as BPF_JMP, but with 32-bit wide
>  operands for the comparisons instead.
>  
> -For load and store instructions the 8-bit 'code' field is divided as::
>  
> -  +--------+--------+-------------------+
> -  | 3 bits | 2 bits |   3 bits          |
> -  |  mode  |  size  | instruction class |
> -  +--------+--------+-------------------+
> -  (MSB)                             (LSB)
> +Load and store instructions
> +===========================
> +
> +For load and store instructions (BPF_LD, BPF_LDX, BPF_ST and BPF_STX), the
> +8-bit 'opcode' field is divided as:
> +
> +  ============  ======  =================
> +  3 bits (MSB)  2 bits  3 bits (LSB)
> +  ============  ======  =================
> +  mode          size    instruction class
> +  ============  ======  =================
> +
> +The size modifier is one of:
>  
> -Size modifier is one of ...
> +  =============  =====  =====================
> +  size modifier  value  description
> +  =============  =====  =====================
> +  BPF_W          0x00   word        (4 bytes)
> +  BPF_H          0x08   half word   (2 bytes)
> +  BPF_B          0x10   byte
> +  BPF_DW         0x18   double word (8 bytes)
> +  =============  =====  =====================
>  
> -::
> +The mode modifier is one of:
>  
> -  BPF_W   0x00    /* word */
> -  BPF_H   0x08    /* half word */
> -  BPF_B   0x10    /* byte */
> -  BPF_DW  0x18    /* double word */
> +  =============  =====  =====================
> +  mode modifier  value  description
> +  =============  =====  =====================
> +  BPF_IMM        0x00   used for 64-bit mov
> +  BPF_ABS        0x20
> +  BPF_IND        0x40
> +  BPF_MEM        0x60

Maybe say here that ABS and IND are legacy, kept only for compatibility
with classic BPF, and that MEM is the most common modifier for load/store?

> +  BPF_ATOMIC     0xc0   atomic operations 

I removed the trailing space in the above line.

And applied all patches to bpf-next. Thanks!
Christoph Hellwig Jan. 3, 2022, 9:57 a.m. UTC | #2
On Thu, Dec 30, 2021 at 04:43:24PM -0800, Alexei Starovoitov wrote:
> > +  ========  =====  =========================
> > +  code      value  description
> > +  ========  =====  =========================
> > +  BPF_JA    0x00   BPF_JMP only
> >    BPF_JEQ   0x10
> >    BPF_JGT   0x20
> >    BPF_JGE   0x30
> >    BPF_JSET  0x40
> 
> Not your fault, but the new table looks odd with
> only some opcodes documented.
> Same issue with BPF_ALU table.
> In the past the documented opcodes were the eBPF-only ones,
> not the ones present in both classic BPF and eBPF, so it wasn't that bad.
> At least there was a reason for the discrepancy.
> Now it's just odd.
> Maybe add a comment to all rows?

Yes, having the description everywhere would be good.  But I'll have to
do research to actually figure out what should go in there for some.

> > +  =============  =====  =====================
> > +  mode modifier  value  description
> > +  =============  =====  =====================
> > +  BPF_IMM        0x00   used for 64-bit mov
> > +  BPF_ABS        0x20
> > +  BPF_IND        0x40
> > +  BPF_MEM        0x60
> 
> Maybe say here that ABS and IND are legacy, kept only for compatibility
> with classic BPF, and that MEM is the most common modifier for load/store?

Sure.
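
For readers following the tables discussed above, the opcode layout can be
sketched in C roughly as follows. This is only an illustration, not code from
the patch or the kernel: every EX_/ex_ name is hypothetical, the masks are
derived from the bit widths in the new tables, and the class values for
BPF_LDX, BPF_STX and BPF_ALU are taken from the kernel uapi headers because
the class table is elided in this diff. The kernel's own headers provide
macros such as BPF_CLASS() and BPF_OP() for the same purpose.

```c
#include <assert.h>
#include <stdint.h>

/* Field masks for the 8-bit opcode, derived from the bit widths in the tables. */
enum {
	EX_CLASS_MASK = 0x07, /* 3 LSB bits: instruction class (all classes) */
	EX_SRC_MASK   = 0x08, /* 1 bit: source operand (arithmetic/jump) */
	EX_OP_MASK    = 0xf0, /* 4 MSB bits: operation code (arithmetic/jump) */
	EX_SIZE_MASK  = 0x18, /* 2 bits: size modifier (load/store) */
	EX_MODE_MASK  = 0xe0, /* 3 MSB bits: mode modifier (load/store) */
};

/* A few encodings; hypothetical EX_ copies so the sketch needs no kernel headers. */
enum {
	EX_BPF_LDX = 0x01, EX_BPF_STX = 0x03,   /* assumed from uapi headers */
	EX_BPF_ALU = 0x04, EX_BPF_ALU64 = 0x07, /* BPF_ALU assumed likewise */
	EX_BPF_K = 0x00, EX_BPF_X = 0x08,
	EX_BPF_ADD = 0x00, EX_BPF_MOV = 0xb0,
	EX_BPF_W = 0x00, EX_BPF_DW = 0x18,
	EX_BPF_MEM = 0x60,
};

/* Field extractors: each one just masks the relevant bits of the opcode. */
static inline uint8_t ex_class(uint8_t opcode) { return opcode & EX_CLASS_MASK; }
static inline uint8_t ex_src(uint8_t opcode)   { return opcode & EX_SRC_MASK; }
static inline uint8_t ex_op(uint8_t opcode)    { return opcode & EX_OP_MASK; }
static inline uint8_t ex_size(uint8_t opcode)  { return opcode & EX_SIZE_MASK; }
static inline uint8_t ex_mode(uint8_t opcode)  { return opcode & EX_MODE_MASK; }

/* Build an arithmetic/jump opcode: operation code | source | class. */
static inline uint8_t ex_alu_opcode(uint8_t op, uint8_t src, uint8_t cls)
{
	/* Each argument must only use the bits of its own field. */
	assert((op & ~EX_OP_MASK) == 0);
	assert((src & ~EX_SRC_MASK) == 0);
	assert((cls & ~EX_CLASS_MASK) == 0);
	return (uint8_t)(op | src | cls);
}

/* Build a load/store opcode: mode | size | class. */
static inline uint8_t ex_mem_opcode(uint8_t mode, uint8_t size, uint8_t cls)
{
	assert((mode & ~EX_MODE_MASK) == 0);
	assert((size & ~EX_SIZE_MASK) == 0);
	assert((cls & ~EX_CLASS_MASK) == 0);
	return (uint8_t)(mode | size | cls);
}
```

With these helpers, ex_alu_opcode(EX_BPF_ADD, EX_BPF_X, EX_BPF_ALU)
corresponds to the "BPF_ADD | BPF_X | BPF_ALU" example in the document, and
ex_mem_opcode(EX_BPF_MEM, EX_BPF_DW, EX_BPF_STX) to the
"BPF_MEM | <size> | BPF_STX" form with an 8-byte size.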

Patch

diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
index 3967842e00234..4e3041cf04325 100644
--- a/Documentation/bpf/instruction-set.rst
+++ b/Documentation/bpf/instruction-set.rst
@@ -19,19 +19,10 @@  The eBPF calling convention is defined as:
 R0 - R5 are scratch registers and eBPF programs needs to spill/fill them if
 necessary across calls.
 
-eBPF opcode encoding
-====================
-
-For arithmetic and jump instructions the 8-bit 'opcode' field is divided into
-three parts::
-
-  +----------------+--------+--------------------+
-  |   4 bits       |  1 bit |   3 bits           |
-  | operation code | source | instruction class  |
-  +----------------+--------+--------------------+
-  (MSB)                                      (LSB)
+Instruction classes
+===================
 
-Three LSB bits store instruction class which is one of:
+The three LSB bits of the 'opcode' field store the instruction class:
 
   ========= =====
   class     value
@@ -46,17 +37,34 @@  Three LSB bits store instruction class which is one of:
   BPF_ALU64 0x07
   ========= =====
 
-When BPF_CLASS(code) == BPF_ALU or BPF_JMP, 4th bit encodes source operand ...
+Arithmetic and jump instructions
+================================
+
+For arithmetic and jump instructions (BPF_ALU, BPF_ALU64, BPF_JMP and
+BPF_JMP32), the 8-bit 'opcode' field is divided into three parts:
 
-::
+  ==============  ======  =================
+  4 bits (MSB)    1 bit   3 bits (LSB)
+  ==============  ======  =================
+  operation code  source  instruction class
+  ==============  ======  =================
 
-  BPF_K     0x00 /* use 32-bit immediate as source operand */
-  BPF_X     0x08 /* use 'src_reg' register as source operand */
+The 4th bit encodes the source operand:
 
-... and four MSB bits store operation code.
+  ======  =====  ========================================
+  source  value  description
+  ======  =====  ========================================
+  BPF_K   0x00   use 32-bit immediate as source operand
+  BPF_X   0x08   use 'src_reg' register as source operand
+  ======  =====  ========================================
 
-If BPF_CLASS(code) == BPF_ALU or BPF_ALU64 BPF_OP(code) is one of::
+The four MSB bits store the operation code.
 
+For class BPF_ALU or BPF_ALU64:
+
+  ========  =====  =========================
+  code      value  description
+  ========  =====  =========================
   BPF_ADD   0x00
   BPF_SUB   0x10
   BPF_MUL   0x20
@@ -68,26 +76,31 @@  If BPF_CLASS(code) == BPF_ALU or BPF_ALU64 BPF_OP(code) is one of::
   BPF_NEG   0x80
   BPF_MOD   0x90
   BPF_XOR   0xa0
-  BPF_MOV   0xb0  /* mov reg to reg */
-  BPF_ARSH  0xc0  /* sign extending shift right */
-  BPF_END   0xd0  /* endianness conversion */
+  BPF_MOV   0xb0   mov reg to reg
+  BPF_ARSH  0xc0   sign extending shift right
+  BPF_END   0xd0   endianness conversion
+  ========  =====  =========================
 
-If BPF_CLASS(code) == BPF_JMP or BPF_JMP32 BPF_OP(code) is one of::
+For class BPF_JMP or BPF_JMP32:
 
-  BPF_JA    0x00  /* BPF_JMP only */
+  ========  =====  =========================
+  code      value  description
+  ========  =====  =========================
+  BPF_JA    0x00   BPF_JMP only
   BPF_JEQ   0x10
   BPF_JGT   0x20
   BPF_JGE   0x30
   BPF_JSET  0x40
-  BPF_JNE   0x50  /* jump != */
-  BPF_JSGT  0x60  /* signed '>' */
-  BPF_JSGE  0x70  /* signed '>=' */
-  BPF_CALL  0x80  /* function call */
-  BPF_EXIT  0x90  /*  function return */
-  BPF_JLT   0xa0  /* unsigned '<' */
-  BPF_JLE   0xb0  /* unsigned '<=' */
-  BPF_JSLT  0xc0  /* signed '<' */
-  BPF_JSLE  0xd0  /* signed '<=' */
+  BPF_JNE   0x50   jump '!='
+  BPF_JSGT  0x60   signed '>'
+  BPF_JSGE  0x70   signed '>='
+  BPF_CALL  0x80   function call
+  BPF_EXIT  0x90   function return
+  BPF_JLT   0xa0   unsigned '<'
+  BPF_JLE   0xb0   unsigned '<='
+  BPF_JSLT  0xc0   signed '<'
+  BPF_JSLE  0xd0   signed '<='
+  ========  =====  =========================
 
 So BPF_ADD | BPF_X | BPF_ALU means::
 
@@ -108,37 +121,58 @@  the return value into register R0 before doing a BPF_EXIT. Class 6 is used as
 BPF_JMP32 to mean exactly the same operations as BPF_JMP, but with 32-bit wide
 operands for the comparisons instead.
 
-For load and store instructions the 8-bit 'code' field is divided as::
 
-  +--------+--------+-------------------+
-  | 3 bits | 2 bits |   3 bits          |
-  |  mode  |  size  | instruction class |
-  +--------+--------+-------------------+
-  (MSB)                             (LSB)
+Load and store instructions
+===========================
+
+For load and store instructions (BPF_LD, BPF_LDX, BPF_ST and BPF_STX), the
+8-bit 'opcode' field is divided as:
+
+  ============  ======  =================
+  3 bits (MSB)  2 bits  3 bits (LSB)
+  ============  ======  =================
+  mode          size    instruction class
+  ============  ======  =================
+
+The size modifier is one of:
 
-Size modifier is one of ...
+  =============  =====  =====================
+  size modifier  value  description
+  =============  =====  =====================
+  BPF_W          0x00   word        (4 bytes)
+  BPF_H          0x08   half word   (2 bytes)
+  BPF_B          0x10   byte
+  BPF_DW         0x18   double word (8 bytes)
+  =============  =====  =====================
 
-::
+The mode modifier is one of:
 
-  BPF_W   0x00    /* word */
-  BPF_H   0x08    /* half word */
-  BPF_B   0x10    /* byte */
-  BPF_DW  0x18    /* double word */
+  =============  =====  =====================
+  mode modifier  value  description
+  =============  =====  =====================
+  BPF_IMM        0x00   used for 64-bit mov
+  BPF_ABS        0x20
+  BPF_IND        0x40
+  BPF_MEM        0x60
+  BPF_ATOMIC     0xc0   atomic operations 
+  =============  =====  =====================
 
-... which encodes size of load/store operation::
+BPF_MEM | <size> | BPF_STX means::
 
- B  - 1 byte
- H  - 2 byte
- W  - 4 byte
- DW - 8 byte
+  *(size *) (dst_reg + off) = src_reg
 
-Mode modifier is one of::
+BPF_MEM | <size> | BPF_ST means::
 
-  BPF_IMM     0x00  /* used for 64-bit mov */
-  BPF_ABS     0x20
-  BPF_IND     0x40
-  BPF_MEM     0x60
-  BPF_ATOMIC  0xc0  /* atomic operations */
+  *(size *) (dst_reg + off) = imm32
+
+BPF_MEM | <size> | BPF_LDX means::
+
+  dst_reg = *(size *) (src_reg + off)
+
+Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW.
+
+Packet access instructions
+--------------------------
 
 eBPF has two non-generic instructions: (BPF_ABS | <size> | BPF_LD) and
 (BPF_IND | <size> | BPF_LD) which are used to access packet data.
@@ -165,15 +199,10 @@  For example::
     R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + src_reg + imm32))
     and R1 - R5 were scratched.
 
-eBPF has generic load/store operations::
+Atomic operations
+-----------------
 
-    BPF_MEM | <size> | BPF_STX:  *(size *) (dst_reg + off) = src_reg
-    BPF_MEM | <size> | BPF_ST:   *(size *) (dst_reg + off) = imm32
-    BPF_MEM | <size> | BPF_LDX:  dst_reg = *(size *) (src_reg + off)
-
-Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW.
-
-It also includes atomic operations, which use the immediate field for extra
+eBPF includes atomic operations, which use the immediate field for extra
 encoding::
 
    .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_W  | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg
@@ -217,6 +246,9 @@  You may encounter ``BPF_XADD`` - this is a legacy name for ``BPF_ATOMIC``,
 referring to the exclusive-add operation encoded when the immediate field is
 zero.
 
+16-byte instructions
+--------------------
+
 eBPF has one 16-byte instruction: ``BPF_LD | BPF_DW | BPF_IMM`` which consists
 of two consecutive ``struct bpf_insn`` 8-byte blocks and interpreted as single
 instruction that loads 64-bit immediate value into a dst_reg.