Using debuginfo for better backtraces #96

Open
thesamesam opened this issue Aug 28, 2024 · 6 comments

Comments

@thesamesam

This is maybe a better example of the kind of thing I was talking about in #84.

With splitdebug (built with -ggdb3, but with the debug info split out into /usr/lib/debug and the less binary itself stripped), ustack() output is not very friendly:

$ sudo dtrace -n 'syscall::fsync*:return,syscall::sync*:return { ustack(); }'
[...]
  6 119674                     fsync:return
              libc.so.6`fsync+0x10
              less`0x5ea421eafd4d
              0x5ea421eb85bd
              0x5ea421eafa30
              0x7796ac9e5407
              0x7ffc1f5cecc3
              0x2f65686361632f72

In this case, I genuinely didn't know that less would ever call fsync, so I was curious as to where from! But the backtrace isn't so helpful there.

I get better output if I disable stripping and use -fno-omit-frame-pointer:

$ sudo dtrace -n 'syscall::fsync*:return,syscall::sync*:return { ustack(); }'
 25 119674                     fsync:return
              libc.so.6`fsync+0x10
              less`quit+0x5d
              less`commands+0x83d
              0x59897f425a30
              0x78803df45407
              0x7ffd79035cc3
              0x2f65686361632f72

It's not perfect, but it's more than enough for me to pin down what's going on.

Could DTrace learn to read DWARF (elfutils should be able to do this, including understanding splitdebug and so on) for backtraces?
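To illustrate why the second trace is so much more readable, here is a minimal sketch of the address-to-symbol step, assuming a symbol table is available from somewhere (an unstripped binary, split debuginfo, or similar). The `Symbol` type, the `resolve` helper, and the offsets in `LESS_SYMS` are all hypothetical and chosen only to reproduce the shape of the output above; this is not DTrace's actual lookup code.

```python
import bisect
from typing import NamedTuple, Sequence


class Symbol(NamedTuple):
    start: int  # offset of the symbol within its module
    size: int   # size of the symbol in bytes
    name: str


def resolve(module: str, symbols: Sequence[Symbol], offset: int) -> str:
    """Format a module-relative offset in the module`symbol+0xoff style.

    `symbols` must be sorted by start offset. If no symbol covers the
    offset (for example because the binary was stripped), fall back to
    printing the raw offset.
    """
    starts = [s.start for s in symbols]
    # Index of the last symbol that starts at or before the offset.
    i = bisect.bisect_right(starts, offset) - 1
    if i >= 0:
        sym = symbols[i]
        if offset < sym.start + sym.size:
            return f"{module}`{sym.name}+{offset - sym.start:#x}"
    return f"{module}`{offset:#x}"


# Hypothetical symbol table: these offsets and sizes are made up.
LESS_SYMS = sorted([
    Symbol(0x4000, 0x900, "commands"),
    Symbol(0x9CF0, 0x80, "quit"),
])

print(resolve("less", LESS_SYMS, 0x9D4D))  # symbol table present
print(resolve("less", [], 0x9D4D))         # stripped: no symbols at all
```

With an empty table every frame degrades to a bare number, which is roughly what the splitdebug trace shows. Reading DWARF or split debuginfo would be one way of populating such a table.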

@kvanhees
Member

We can certainly look at this as optional support: if debuginfo is available, it would make sense to use it, provided it does not negatively impact trace processing. Anything that improves backtraces without adding to the runtime dependencies in general is good.

@nickalcock
Member

There are two distinct issues here: DTrace wants backtrace info for reliable stack traces (which has to be something the kernel can understand; hopefully, in the future, SFrame will do here), and DTrace's userspace wants a symbol table for symbol lookups. Even the latter is only going to work for longer-running traces where the process hasn't already died before userspace gets its hands on the trace, and even then it is troublesome for main programs, which are routinely stripped. Solaris implemented an .ldynsym section for just this, but the Linux approach seems to have been quite different: a section containing a compressed ELF executable (!!) which only has symbol table sections in it. We do not yet handle this crazy thing, and in my last trials relatively few binaries were built with it at all. We do need a symtab from somewhere.

I'd be happy to add some sort of symbol server support, but I don't think Linux has any such thing either...
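The first of the two issues (collecting the return addresses at all) can be sketched as a frame-pointer walk. This is a toy model, assuming the conventional x86-64 layout where the saved frame pointer sits at `[fp]` and the return address at `[fp + 8]`. The `walk_frame_pointers` helper and the `memory` dictionary are hypothetical stand-ins for reading the traced process's stack; this is not how DTrace or the kernel is actually implemented.

```python
def walk_frame_pointers(memory, pc, fp, max_frames=64):
    """Collect code addresses, innermost first, by following saved frame pointers.

    `memory` maps an address to the 8-byte word stored there (a stand-in
    for reading the target's stack). The walk stops when a read fails,
    when the chain stops moving toward higher addresses, or after
    `max_frames` entries.
    """
    frames = [pc]
    while len(frames) < max_frames:
        saved_fp = memory.get(fp)
        return_addr = memory.get(fp + 8)
        if saved_fp is None or return_addr is None:
            break  # unreadable memory: give up
        frames.append(return_addr)
        if saved_fp <= fp:
            break  # stack grows down, so a valid caller frame is higher up
        fp = saved_fp
    return frames


# Hypothetical three-deep call chain laid out on a fake stack.
memory = {
    0x7000: 0x7040, 0x7008: 0x1111,  # innermost frame
    0x7040: 0x7080, 0x7048: 0x2222,
    0x7080: 0x0,    0x7088: 0x3333,  # outermost frame, chain terminator
}
print([hex(a) for a in walk_frame_pointers(memory, pc=0x1000, fp=0x7000)])
```

A walker like this only produces raw addresses, and only if the code keeps a frame pointer, hence -fno-omit-frame-pointer. It also has no way to know when it has wandered into non-frame data: the last entry in both traces earlier in this thread, 0x2f65686361632f72, decodes as the little-endian ASCII bytes "r/cache/", which suggests the walk ran past the real frames into string data. Turning the addresses into names is the separate symbol table problem.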

@thesamesam
Author

thesamesam commented Aug 30, 2024

> a section containing a compressed ELF executable

I'm pretty sure this is MiniDebugInfo (.gnu_debugdata). It looks like only Fedora ships with it by default (?) but I'd be open to us doing it in Gentoo.

One question is whether we want to try to lead some standardisation of making it a proper compressed section or not. But that would delay things substantially.
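For reference, a minimal sketch of what handling MiniDebugInfo involves once the raw contents of .gnu_debugdata are in hand. It assumes the section payload is an xz-compressed ELF image, as described for MiniDebugInfo in the GDB manual. The `unpack_minidebuginfo` helper is hypothetical, and getting the section bytes out of the binary (for example with objcopy or an ELF library) is left out.

```python
import lzma

ELF_MAGIC = b"\x7fELF"


def unpack_minidebuginfo(section_bytes: bytes) -> bytes:
    """Decompress the contents of .gnu_debugdata and sanity-check the result.

    Returns the embedded ELF image, which would then be parsed for its
    .symtab like any other ELF file.
    """
    inner = lzma.decompress(section_bytes)  # format is auto-detected (xz here)
    if inner[:4] != ELF_MAGIC:
        raise ValueError("decompressed .gnu_debugdata is not an ELF image")
    return inner


# Stand-in payload: not a real symbol-only ELF file, just the magic plus padding.
fake_section = lzma.compress(ELF_MAGIC + bytes(60))
image = unpack_minidebuginfo(fake_section)
print(len(image), image[:4])
```

The decompression itself is small. The remaining work is parsing the inner ELF file's symbol table and merging it with whatever .dynsym the outer binary already provides.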

> I'd be happy to add some sort of symbol server support, but I don't think Linux has any such thing either...

Isn't that debuginfod? What am I missing?

@nickalcock
Member

It's debuginfod, but dtrace doesn't know how to request symbol info from there...
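As a rough illustration of what "requesting symbol info" from debuginfod would mean on the wire, here is a sketch that builds the lookup URLs for a given build ID. It assumes the debuginfod web API path `/buildid/<BUILDID>/debuginfo` (or `/executable`) and the space-separated `DEBUGINFOD_URLS` environment variable. The `debuginfod_urls` helper, the example server, and the example build ID are hypothetical, and no request is actually sent.

```python
import os
import string


def debuginfod_urls(build_id, kind="debuginfo", env=None):
    """Return the URLs a client would try for one build ID, in server order.

    `kind` is "debuginfo" or "executable". `env` defaults to the process
    environment and is only a parameter so the sketch can be exercised
    without touching real settings.
    """
    if kind not in ("debuginfo", "executable"):
        raise ValueError(f"unsupported artifact kind: {kind!r}")
    if not build_id or any(c not in string.hexdigits for c in build_id):
        raise ValueError("build ID must be a non-empty hex string")
    env = os.environ if env is None else env
    servers = env.get("DEBUGINFOD_URLS", "").split()
    return [f"{s.rstrip('/')}/buildid/{build_id.lower()}/{kind}" for s in servers]


# Hypothetical server and build ID, for illustration only.
example_env = {"DEBUGINFOD_URLS": "https://debuginfod.example.org/"}
for url in debuginfod_urls("0123456789abcdef0123456789abcdef01234567", env=example_env):
    print(url)
```

A real client would more likely go through libdebuginfod from elfutils than build URLs by hand, since that also handles caching and timeouts. The build ID would come from the binary's NT_GNU_BUILD_ID note.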

@thesamesam
Author

The conclusion at Cauldron, from the people I spoke to about standardising MiniDebugInfo, was basically "you can if you want, but I wouldn't worry that much over it", and that the only real thing to do there is to improve find-debuginfo.sh from debugedit so that Fedora and Gentoo are using the same tooling. I still need to decide if we want to investigate adopting it more on our side.

@nickalcock
Member

nickalcock commented Nov 6, 2024

While we're at it let's fix things so that find-debuginfo.sh doesn't strip out the CTF...
