-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Profiling #327
Profiling #327
Conversation
@seemk please rebase this pull against |
@seemk also the |
Is there a design doc or can you describe how it is working? Not knowing much about this the one thing that caught my eye when trying to review is the global batch log processor, but I assume that is the best solution? :) But do other profilers use their own log processor/exporters instead of using one that might already be created by the application for logs? |
We went over the workings during a call, but I'll write a summary into the PR description. As for the global batch processor, I just used it for convenience to get the export pipeline set up. Other distros (Node) didn't have the logs SDK available when profiling was implemented and had to have a custom exporter for the logs protos. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this Siim. A few questions/comments below.
Added a small description how the "profiler" operates, it turned out to be pretty small in Python. |
Added filtering of internal threads (enabled by default) and |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if we can modify how we access the Profiler
global variable, such that its global state is accessed once by a global function, but all other access is via function arguments.
CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅ |
I have read the CLA Document and I hereby sign the CLA |
Looks great, thanks for taking my feedback. |
@pmcollins Any outstanding issues you want addressed? |
LGTM. |
Description
Add support for CPU profiling.
Brings in new environment variables:
SPLUNK_PROFILER_ENABLED
- disabled by defaultSPLUNK_PROFILER_LOGS_ENDPOINT
- defaults tohttp://localhost:4317
SPLUNK_PROFILER_CALL_STACK_INTERVAL
- thread sampling (stacktrace taking) interval in milliseconds, defaults to1000
SPLUNK_PROFILER_INCLUDE_INTERNAL_STACKS
- whether to include stacktraces from internal profiler threads (sampler thread and logs processor/exporter), disabled by default.The profiler / sampling works from a different thread, where call stacks for all threads are periodically sampled. These are then serialized to protobuf (pprof format), gzipped and encoded as base64. The base64 content are sent to collector as OTel logs. For the exporting we can leverage the OTel logs SDK, which makes it quite convenient.
To correlate active span contexts with sampled stacktraces,
opentelemetry.context.{attach,detach}
are monkeypatched, to store a mapping ofthread ID -> span context
. This mapping is available for the sampler thread.Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Checklist: