Message ID | 20241114155822.898466-3-alan.maguire@oracle.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Check DW_OP_[GNU_]entry_value for possible parameter matching | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
On Thu, 2024-11-14 at 15:58 +0000, Alan Maguire wrote: > Eduard noticed [1] intermittent segmentation faults triggered by caching > done internally in dwarf_getlocation(s). A binary tree of location > information is cached in the CU, and if multiple threads access it > concurrently we can get segmentation faults. It is possible to > compile elfutils with experimental thread-safe support, but safer for > now to add locking to pahole. > > No additional overhead in pahole encoding was observed > as a result of these changes. > > Reported-by: Eduard Zingerman <eddyz87@gmail.com> > Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org> > Suggested-by: Jiri Olsa <olsajiri@gmail.com> > Signed-off-by: Alan Maguire <alan.maguire@oracle.com> > > [1] https://lore.kernel.org/dwarves/080794545d8eb3df3d6eba90ac621111ab7171f5.camel@gmail.com/ > --- Looking at the libdw code, both dwarf_getlocations() and dwarf_getlocation() might end up calling __libdw_intern_expression(), where race condition occurs. So I think that this lock positioning should be safe. In theory, the locking could be pushed down, to only occur around __dwarf_getlocations / dwarf_getlocation, but that would complicate the code a bit. Acked-by: Eduard Zingerman <eddyz87@gmail.com> [...]
diff --git a/dwarf_loader.c b/dwarf_loader.c index bc862b5..9299c1f 100644 --- a/dwarf_loader.c +++ b/dwarf_loader.c @@ -450,7 +450,14 @@ static bool attr_type(Dwarf_Die *die, uint32_t attr_name, Dwarf_Off *offset) static int attr_location(Dwarf_Die *die, Dwarf_Op **expr, size_t *exprlen) { Dwarf_Attribute attr; + int ret = 1; + if (dwarf_attr(die, DW_AT_location, &attr) != NULL) { + /* use libdw__lock as dwarf_getlocation(s) has concurrency + * issues when libdw is not compiled with experimental + * --enable-thread-safety + */ + pthread_mutex_lock(&libdw__lock); if (dwarf_getlocation(&attr, expr, exprlen) == 0) { /* DW_OP_addrx needs additional lookup for real addr. */ if (*exprlen != 0 && expr[0]->atom == DW_OP_addrx) { @@ -462,11 +469,12 @@ static int attr_location(Dwarf_Die *die, Dwarf_Op **expr, size_t *exprlen) expr[0]->number = address; } - return 0; + ret = 0; } + pthread_mutex_unlock(&libdw__lock); } - return 1; + return ret; } /* The struct dwarf_tag has a fixed size while the 'struct tag' is just the base @@ -1193,6 +1201,10 @@ static int parameter__reg(Dwarf_Attribute *attr, int expected_reg) int loc_num = -1; int ret = -1; + /* use libdw__lock as dwarf_getlocation(s) has concurrency issues + * when libdw is not compiled with experimental --enable-thread-safety + */ + pthread_mutex_lock(&libdw__lock); while ((offset = __dwarf_getlocations(attr, offset, &base, &start, &end, &expr, &exprlen)) > 0) { loc_num++; @@ -1228,6 +1240,7 @@ static int parameter__reg(Dwarf_Attribute *attr, int expected_reg) } } out: + pthread_mutex_unlock(&libdw__lock); return ret; }
Eduard noticed [1] intermittent segmentation faults triggered by caching done internally in dwarf_getlocation(s). A binary tree of location information is cached in the CU, and if multiple threads access it concurrently we can get segmentation faults. It is possible to compile elfutils with experimental thread-safe support, but safer for now to add locking to pahole. No additional overhead in pahole encoding was observed as a result of these changes. Reported-by: Eduard Zingerman <eddyz87@gmail.com> Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Suggested-by: Jiri Olsa <olsajiri@gmail.com> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> [1] https://lore.kernel.org/dwarves/080794545d8eb3df3d6eba90ac621111ab7171f5.camel@gmail.com/ --- dwarf_loader.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-)