diff mbox series

[dwarves] dwarf_loader: Fix for BTF id drift caused by adding unspecified types

Message ID 20230228202357.2766051-1-eddyz87@gmail.com (mailing list archive)
State Not Applicable
Delegated to: BPF
Headers show
Series [dwarves] dwarf_loader: Fix for BTF id drift caused by adding unspecified types | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Eduard Zingerman Feb. 28, 2023, 8:23 p.m. UTC
Recent changes to handle unspecified types (see [1]) cause BTF ID drift.

Specifically, the intent of commits [2], [3] and [4] is to render
references to unspecified types as void type.
However, as a consequence:
- in `die__process_unit()` call to `cu__add_tag()` allocates `small_id`
  for unspecified type tags and adds these tags to `cu->types_table`;
- `btf_encoder__encode_tag()` skips generation of BTF entries for
  `DW_TAG_unspecified_type` tags.

Such logic causes ID drift if unspecified type is not the last type
processed for compilation unit. `small_id` of each type following
unspecified type in the `cu->types_table` would have its BTF id off by -1.
Thus, rendering references established on recode phase invalid.

This commit reverts `unspecified_type` id/tag tracking.
Instead, the following is done:
- `small_id` for unspecified type tags is set to 0, thus reference to
  unspecified type tag would render BTF id of a `void` on recode phase;
- unspecified type tags are not added to `cu->types_table`.

This change also happens to fix issue reported in [5], the gist of
that issue is that the field `encoder->unspecified_type` is set but
not reset by function `btf_encoder__encode_cu()`. Thus, the following
sequence of events might occur when BTF encoding is requested:
- CU with unspecified type is processed:
  - unspecified type id is 42
  - encoder->unspecified_type is set to 42
- CU without unspecified type is processed next using the same
  `encoder` object:
  - some `struct foo` has id 42 in this CU
  - the references to `struct foo` are set 0 by function
    `btf_encoder__tag_type()`.

[1] https://lore.kernel.org/all/Y0R7uu3s%2FimnvPzM@kernel.org/
[2] bcc648a10cbc ("btf_encoder: Encode DW_TAG_unspecified_type returning routines as void")
[3] cffe5e1f75e1 ("core: Record if a CU has a DW_TAG_unspecified_type")
[4] 75e0fe28bb02 ("core: Add DW_TAG_unspecified_type to tag__is_tag_type() set")
[5] https://lore.kernel.org/bpf/Y%2FP1yxAuV6Wj3A0K@google.com/

Fixes: bcc648a10cbc ("btf_encoder: Encode DW_TAG_unspecified_type returning routines as void")
Fixes: 52b25808e44a ("btf_encoder: Store type_id_off, unspecified type in encoder")
Tested-by: KP Singh <kpsingh@kernel.org>
Reported-by: Matt Bobrowski <mattbobrowski@google.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 btf_encoder.c  |  8 --------
 dwarf_loader.c | 25 +++++++++++++++++++------
 dwarves.h      |  8 --------
 3 files changed, 19 insertions(+), 22 deletions(-)

Comments

Andrii Nakryiko March 1, 2023, 6:58 p.m. UTC | #1
On Tue, Feb 28, 2023 at 12:24 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Recent changes to handle unspecified types (see [1]) cause BTF ID drift.
>
> Specifically, the intent of commits [2], [3] and [4] is to render
> references to unspecified types as void type.
> However, as a consequence:
> - in `die__process_unit()` call to `cu__add_tag()` allocates `small_id`
>   for unspecified type tags and adds these tags to `cu->types_table`;
> - `btf_encoder__encode_tag()` skips generation of BTF entries for
>   `DW_TAG_unspecified_type` tags.
>
> Such logic causes ID drift if unspecified type is not the last type
> processed for compilation unit. `small_id` of each type following
> unspecified type in the `cu->types_table` would have its BTF id off by -1.
> Thus, rendering references established on recode phase invalid.
>
> This commit reverts `unspecified_type` id/tag tracking.
> Instead, the following is done:
> - `small_id` for unspecified type tags is set to 0, thus reference to
>   unspecified type tag would render BTF id of a `void` on recode phase;
> - unspecified type tags are not added to `cu->types_table`.
>
> This change also happens to fix issue reported in [5], the gist of
> that issue is that the field `encoder->unspecified_type` is set but
> not reset by function `btf_encoder__encode_cu()`. Thus, the following
> sequence of events might occur when BTF encoding is requested:
> - CU with unspecified type is processed:
>   - unspecified type id is 42
>   - encoder->unspecified_type is set to 42
> - CU without unspecified type is processed next using the same
>   `encoder` object:
>   - some `struct foo` has id 42 in this CU
>   - the references to `struct foo` are set 0 by function
>     `btf_encoder__tag_type()`.
>
> [1] https://lore.kernel.org/all/Y0R7uu3s%2FimnvPzM@kernel.org/
> [2] bcc648a10cbc ("btf_encoder: Encode DW_TAG_unspecified_type returning routines as void")
> [3] cffe5e1f75e1 ("core: Record if a CU has a DW_TAG_unspecified_type")
> [4] 75e0fe28bb02 ("core: Add DW_TAG_unspecified_type to tag__is_tag_type() set")
> [5] https://lore.kernel.org/bpf/Y%2FP1yxAuV6Wj3A0K@google.com/
>
> Fixes: bcc648a10cbc ("btf_encoder: Encode DW_TAG_unspecified_type returning routines as void")
> Fixes: 52b25808e44a ("btf_encoder: Store type_id_off, unspecified type in encoder")
> Tested-by: KP Singh <kpsingh@kernel.org>
> Reported-by: Matt Bobrowski <mattbobrowski@google.com>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---

Looks like a sensible approach to me.

Acked-by: Andrii Nakryiko <andrii@kernel.org>

>  btf_encoder.c  |  8 --------
>  dwarf_loader.c | 25 +++++++++++++++++++------
>  dwarves.h      |  8 --------
>  3 files changed, 19 insertions(+), 22 deletions(-)
>

[...]
Jiri Olsa March 1, 2023, 9:19 p.m. UTC | #2
On Tue, Feb 28, 2023 at 10:23:57PM +0200, Eduard Zingerman wrote:
> Recent changes to handle unspecified types (see [1]) cause BTF ID drift.
> 
> Specifically, the intent of commits [2], [3] and [4] is to render
> references to unspecified types as void type.
> However, as a consequence:
> - in `die__process_unit()` call to `cu__add_tag()` allocates `small_id`
>   for unspecified type tags and adds these tags to `cu->types_table`;
> - `btf_encoder__encode_tag()` skips generation of BTF entries for
>   `DW_TAG_unspecified_type` tags.
> 
> Such logic causes ID drift if unspecified type is not the last type
> processed for compilation unit. `small_id` of each type following
> unspecified type in the `cu->types_table` would have its BTF id off by -1.
> Thus, rendering references established on recode phase invalid.
> 
> This commit reverts `unspecified_type` id/tag tracking.
> Instead, the following is done:
> - `small_id` for unspecified type tags is set to 0, thus reference to
>   unspecified type tag would render BTF id of a `void` on recode phase;
> - unspecified type tags are not added to `cu->types_table`.
> 
> This change also happens to fix issue reported in [5], the gist of
> that issue is that the field `encoder->unspecified_type` is set but
> not reset by function `btf_encoder__encode_cu()`. Thus, the following
> sequence of events might occur when BTF encoding is requested:
> - CU with unspecified type is processed:
>   - unspecified type id is 42
>   - encoder->unspecified_type is set to 42
> - CU without unspecified type is processed next using the same
>   `encoder` object:
>   - some `struct foo` has id 42 in this CU
>   - the references to `struct foo` are set 0 by function
>     `btf_encoder__tag_type()`.
> 
> [1] https://lore.kernel.org/all/Y0R7uu3s%2FimnvPzM@kernel.org/
> [2] bcc648a10cbc ("btf_encoder: Encode DW_TAG_unspecified_type returning routines as void")
> [3] cffe5e1f75e1 ("core: Record if a CU has a DW_TAG_unspecified_type")
> [4] 75e0fe28bb02 ("core: Add DW_TAG_unspecified_type to tag__is_tag_type() set")
> [5] https://lore.kernel.org/bpf/Y%2FP1yxAuV6Wj3A0K@google.com/
> 
> Fixes: bcc648a10cbc ("btf_encoder: Encode DW_TAG_unspecified_type returning routines as void")
> Fixes: 52b25808e44a ("btf_encoder: Store type_id_off, unspecified type in encoder")
> Tested-by: KP Singh <kpsingh@kernel.org>
> Reported-by: Matt Bobrowski <mattbobrowski@google.com>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

lgtm, tested on top of the pahole next branch with bpf selftests

Tested-by: Jiri Olsa <jolsa@kernel.org>

jirka

> ---
>  btf_encoder.c  |  8 --------
>  dwarf_loader.c | 25 +++++++++++++++++++------
>  dwarves.h      |  8 --------
>  3 files changed, 19 insertions(+), 22 deletions(-)
> 
> diff --git a/btf_encoder.c b/btf_encoder.c
> index da776f4..07a9dc5 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -69,7 +69,6 @@ struct btf_encoder {
>  	const char	  *filename;
>  	struct elf_symtab *symtab;
>  	uint32_t	  type_id_off;
> -	uint32_t	  unspecified_type;
>  	int		  saved_func_cnt;
>  	bool		  has_index_type,
>  			  need_index_type,
> @@ -635,11 +634,6 @@ static int32_t btf_encoder__tag_type(struct btf_encoder *encoder, uint32_t tag_t
>  	if (tag_type == 0)
>  		return 0;
>  
> -	if (encoder->unspecified_type && tag_type == encoder->unspecified_type) {
> -		// No provision for encoding this, turn it into void.
> -		return 0;
> -	}
> -
>  	return encoder->type_id_off + tag_type;
>  }
>  
> @@ -1746,8 +1740,6 @@ int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
>  
>  	encoder->cu = cu;
>  	encoder->type_id_off = btf__type_cnt(encoder->btf) - 1;
> -	if (encoder->cu->unspecified_type.tag)
> -		encoder->unspecified_type = encoder->cu->unspecified_type.type;
>  
>  	if (!encoder->has_index_type) {
>  		/* cu__find_base_type_by_name() takes "type_id_t *id" */
> diff --git a/dwarf_loader.c b/dwarf_loader.c
> index 014e130..c37bd7b 100644
> --- a/dwarf_loader.c
> +++ b/dwarf_loader.c
> @@ -2155,8 +2155,7 @@ static struct tag *__die__process_tag(Dwarf_Die *die, struct cu *cu,
>  	case DW_TAG_atomic_type:
>  		tag = die__create_new_tag(die, cu);		break;
>  	case DW_TAG_unspecified_type:
> -		cu->unspecified_type.tag =
> -			tag = die__create_new_tag(die, cu);     break;
> +		tag = die__create_new_tag(die, cu);		break;
>  	case DW_TAG_pointer_type:
>  		tag = die__create_new_pointer_tag(die, cu, conf);	break;
>  	case DW_TAG_ptr_to_member_type:
> @@ -2219,13 +2218,27 @@ static int die__process_unit(Dwarf_Die *die, struct cu *cu, struct conf_load *co
>  			continue;
>  		}
>  
> -		uint32_t id;
> -		cu__add_tag(cu, tag, &id);
> +		uint32_t id = 0;
> +		/* There is no BTF representation for unspecified types.
> +		 * Currently we want such types to be represented as `void`
> +		 * (and thus skip BTF encoding).
> +		 *
> +		 * As BTF encoding is skipped, such types must not be added to type table,
> +		 * otherwise an ID for a type would be allocated and we would be forced
> +		 * to put something in BTF at this ID.
> +		 * Thus avoid `cu__add_tag()` call for such types.
> +		 *
> +		 * On the other hand, there might be references to this type from other
> +		 * tags, so `dwarf_cu__find_tag_by_ref()` must return something.
> +		 * Thus call `cu__hash()` for such types.
> +		 *
> +		 * Note, that small_id of zero would be assigned to unspecified type entry.
> +		 */
> +		if (tag->tag != DW_TAG_unspecified_type)
> +			cu__add_tag(cu, tag, &id);
>  		cu__hash(cu, tag);
>  		struct dwarf_tag *dtag = tag->priv;
>  		dtag->small_id = id;
> -		if (tag->tag == DW_TAG_unspecified_type)
> -			cu->unspecified_type.type = id;
>  	} while (dwarf_siblingof(die, die) == 0);
>  
>  	return 0;
> diff --git a/dwarves.h b/dwarves.h
> index 5074cf8..e92b2fd 100644
> --- a/dwarves.h
> +++ b/dwarves.h
> @@ -236,10 +236,6 @@ struct debug_fmt_ops {
>  
>  #define ARCH_MAX_REGISTER_PARAMS	8
>  
> -/*
> - * unspecified_type: If this CU has a DW_TAG_unspecified_type, as BTF doesn't have a representation for this
> - * 		     and thus we need to check functions returning this to convert it to void.
> - */
>  struct cu {
>  	struct list_head node;
>  	struct list_head tags;
> @@ -248,10 +244,6 @@ struct cu {
>  	struct ptr_table functions_table;
>  	struct ptr_table tags_table;
>  	struct rb_root	 functions;
> -	struct {
> -		struct tag	 *tag;
> -		uint32_t	 type;
> -	} unspecified_type;
>  	char		 *name;
>  	char		 *filename;
>  	void 		 *priv;
> -- 
> 2.39.1
>
Arnaldo Carvalho de Melo March 2, 2023, 6 p.m. UTC | #3
Em Wed, Mar 01, 2023 at 10:19:56PM +0100, Jiri Olsa escreveu:
> On Tue, Feb 28, 2023 at 10:23:57PM +0200, Eduard Zingerman wrote:
> > Recent changes to handle unspecified types (see [1]) cause BTF ID drift.
> > 
> > Specifically, the intent of commits [2], [3] and [4] is to render
> > references to unspecified types as void type.
> > However, as a consequence:
> > - in `die__process_unit()` call to `cu__add_tag()` allocates `small_id`
> >   for unspecified type tags and adds these tags to `cu->types_table`;
> > - `btf_encoder__encode_tag()` skips generation of BTF entries for
> >   `DW_TAG_unspecified_type` tags.
> > 
> > Such logic causes ID drift if unspecified type is not the last type
> > processed for compilation unit. `small_id` of each type following
> > unspecified type in the `cu->types_table` would have its BTF id off by -1.
> > Thus, rendering references established on recode phase invalid.
> > 
> > This commit reverts `unspecified_type` id/tag tracking.
> > Instead, the following is done:
> > - `small_id` for unspecified type tags is set to 0, thus reference to
> >   unspecified type tag would render BTF id of a `void` on recode phase;
> > - unspecified type tags are not added to `cu->types_table`.
> > 
> > This change also happens to fix issue reported in [5], the gist of
> > that issue is that the field `encoder->unspecified_type` is set but
> > not reset by function `btf_encoder__encode_cu()`. Thus, the following
> > sequence of events might occur when BTF encoding is requested:
> > - CU with unspecified type is processed:
> >   - unspecified type id is 42
> >   - encoder->unspecified_type is set to 42
> > - CU without unspecified type is processed next using the same
> >   `encoder` object:
> >   - some `struct foo` has id 42 in this CU
> >   - the references to `struct foo` are set 0 by function
> >     `btf_encoder__tag_type()`.
> > 
> > [1] https://lore.kernel.org/all/Y0R7uu3s%2FimnvPzM@kernel.org/
> > [2] bcc648a10cbc ("btf_encoder: Encode DW_TAG_unspecified_type returning routines as void")
> > [3] cffe5e1f75e1 ("core: Record if a CU has a DW_TAG_unspecified_type")
> > [4] 75e0fe28bb02 ("core: Add DW_TAG_unspecified_type to tag__is_tag_type() set")
> > [5] https://lore.kernel.org/bpf/Y%2FP1yxAuV6Wj3A0K@google.com/
> > 
> > Fixes: bcc648a10cbc ("btf_encoder: Encode DW_TAG_unspecified_type returning routines as void")
> > Fixes: 52b25808e44a ("btf_encoder: Store type_id_off, unspecified type in encoder")
> > Tested-by: KP Singh <kpsingh@kernel.org>
> > Reported-by: Matt Bobrowski <mattbobrowski@google.com>
> > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> 
> lgtm, tested on top of the pahole next branch with bpf selftests
> 
> Tested-by: Jiri Olsa <jolsa@kernel.org>

Looks good to me as well, and way more elegant, thanks!

Applied.

- Arnaldo
diff mbox series

Patch

diff --git a/btf_encoder.c b/btf_encoder.c
index da776f4..07a9dc5 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -69,7 +69,6 @@  struct btf_encoder {
 	const char	  *filename;
 	struct elf_symtab *symtab;
 	uint32_t	  type_id_off;
-	uint32_t	  unspecified_type;
 	int		  saved_func_cnt;
 	bool		  has_index_type,
 			  need_index_type,
@@ -635,11 +634,6 @@  static int32_t btf_encoder__tag_type(struct btf_encoder *encoder, uint32_t tag_t
 	if (tag_type == 0)
 		return 0;
 
-	if (encoder->unspecified_type && tag_type == encoder->unspecified_type) {
-		// No provision for encoding this, turn it into void.
-		return 0;
-	}
-
 	return encoder->type_id_off + tag_type;
 }
 
@@ -1746,8 +1740,6 @@  int btf_encoder__encode_cu(struct btf_encoder *encoder, struct cu *cu, struct co
 
 	encoder->cu = cu;
 	encoder->type_id_off = btf__type_cnt(encoder->btf) - 1;
-	if (encoder->cu->unspecified_type.tag)
-		encoder->unspecified_type = encoder->cu->unspecified_type.type;
 
 	if (!encoder->has_index_type) {
 		/* cu__find_base_type_by_name() takes "type_id_t *id" */
diff --git a/dwarf_loader.c b/dwarf_loader.c
index 014e130..c37bd7b 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -2155,8 +2155,7 @@  static struct tag *__die__process_tag(Dwarf_Die *die, struct cu *cu,
 	case DW_TAG_atomic_type:
 		tag = die__create_new_tag(die, cu);		break;
 	case DW_TAG_unspecified_type:
-		cu->unspecified_type.tag =
-			tag = die__create_new_tag(die, cu);     break;
+		tag = die__create_new_tag(die, cu);		break;
 	case DW_TAG_pointer_type:
 		tag = die__create_new_pointer_tag(die, cu, conf);	break;
 	case DW_TAG_ptr_to_member_type:
@@ -2219,13 +2218,27 @@  static int die__process_unit(Dwarf_Die *die, struct cu *cu, struct conf_load *co
 			continue;
 		}
 
-		uint32_t id;
-		cu__add_tag(cu, tag, &id);
+		uint32_t id = 0;
+		/* There is no BTF representation for unspecified types.
+		 * Currently we want such types to be represented as `void`
+		 * (and thus skip BTF encoding).
+		 *
+		 * As BTF encoding is skipped, such types must not be added to type table,
+		 * otherwise an ID for a type would be allocated and we would be forced
+		 * to put something in BTF at this ID.
+		 * Thus avoid `cu__add_tag()` call for such types.
+		 *
+		 * On the other hand, there might be references to this type from other
+		 * tags, so `dwarf_cu__find_tag_by_ref()` must return something.
+		 * Thus call `cu__hash()` for such types.
+		 *
+		 * Note, that small_id of zero would be assigned to unspecified type entry.
+		 */
+		if (tag->tag != DW_TAG_unspecified_type)
+			cu__add_tag(cu, tag, &id);
 		cu__hash(cu, tag);
 		struct dwarf_tag *dtag = tag->priv;
 		dtag->small_id = id;
-		if (tag->tag == DW_TAG_unspecified_type)
-			cu->unspecified_type.type = id;
 	} while (dwarf_siblingof(die, die) == 0);
 
 	return 0;
diff --git a/dwarves.h b/dwarves.h
index 5074cf8..e92b2fd 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -236,10 +236,6 @@  struct debug_fmt_ops {
 
 #define ARCH_MAX_REGISTER_PARAMS	8
 
-/*
- * unspecified_type: If this CU has a DW_TAG_unspecified_type, as BTF doesn't have a representation for this
- * 		     and thus we need to check functions returning this to convert it to void.
- */
 struct cu {
 	struct list_head node;
 	struct list_head tags;
@@ -248,10 +244,6 @@  struct cu {
 	struct ptr_table functions_table;
 	struct ptr_table tags_table;
 	struct rb_root	 functions;
-	struct {
-		struct tag	 *tag;
-		uint32_t	 type;
-	} unspecified_type;
 	char		 *name;
 	char		 *filename;
 	void 		 *priv;