Observability using OpenTelemetry #478
Comments
Interested |
// cc @chrmarti Wondering if you have any thoughts here! |
Yes but not now!
On Thu, 18 Jul 2024 at 2:42, Samruddhi Khandale wrote:
… // cc @chrmarti <https://github.com/chrmarti> Wondering if you have any
thoughts here!
|
@spuliaiev-sfdc Could you explain a bit more what the goal is here? You mentioned on the other issue that you need this ASAP, what is driving this? |
Right now, starting the container and installing all the features plus the postCreate scripts takes most of the time in our initialization of VMs with containers. We have no observability into where the time is spent or which operations fail. We therefore want to report spans for the operations, with their attributes and status, so we can analyze the flow, detect abnormalities, and set up alerting and reporting for the customer VM + container startup process. |
Are you using the Dev Containers CLI directly? There is a JSON log format that includes some start and stop events, would that contain enough information? E.g.:
|
We are using Honeycomb to fetch the runtime status of the initialization process. |
It would be great to add the OTEL logger which will report this data to the telemetry service, along with logging these events. |
On a related note, I was thinking about creating some kind of distributed cache for UP. If one uses P. S. P. P. S. Still thinking about the implementation, and it feels like one solution could be relying on remote Bazel builds to produce distroless-like images. I'm new to TypeScript and Bazel, but I'm thinking about implementing this feature as a pet project to learn TS and Bazel. P. P. P. S. My bad, this is the spec repo and not the CLI tool... on second thought, maybe the JSON should define something related to centralized building... Actually one can create a microservice to pass |
(Hi! I work with @spuliaiev-sfdc; I happen to be based in the greater Redmond area and would be happy to chat about what we're doing if it's of interest. We're big fans of devcontainers and o11y.) Yes, we're using the reference CLI. The goals we have with distributed tracing are a bit more complex than we can do with logs alone; for example, the logs don't expose parent-child relationships and children can't expose structured metadata, just text on stdout, so a lot of nuance gets lost and a lot of regexes get written and I'm honestly not entirely sure what to do with Thanks for the json log-format pointer; I'm probably going to draft and publish a collector that aggregates those log entries and emits ~equivalent otel spans to get us moving. Our immediate needs are figuring out the components of (@alexanderilyin you'd probably have more fun with nix flakes and devenv.sh if you're looking for things that can be composed and cached, and I know this because we've been going down a similar path :) |
On a related note... I'm playing with "Dev Container Features" and have to rebuild often, and now it seems I'm being throttled by GH:
At the same time my connection seems pretty fast: The point is that it would be nice to be able to drill down to the "RUN ..." instruction of the related docker build. That would require support in Docker itself (attaching the related outputs) and then passing the Trace/Span ID into it, and Docker probably can't do that just yet. P. S. Turns out Docker already provides traces (at least Docker for Desktop on Windows) |
@alexanderilyin ensuring you pass a personal access token with your requests used to help, possibly still does, not entirely sure in this case tho @chrmarti in your example, some of your start events, such as |
This is a long-running shell that we reuse to execute commands. The stop message is probably lost or skipped during process shutdown. |
Hi, I'm from the OpenTelemetry project! We'd love to see OpenTelemetry support added to DevContainers. Would be happy to answer any questions -- if you're looking for examples of how you might integrate tracing into the code, might be interesting to see how Moby does it: https://github.com/search?q=repo:moby/moby%20opentelemetry&type=code. They trace various requests around launching containers. |
Is there any documentation on the format of this log?
Kind of a lot of questions... |
One more thing... If I understand correctly from your example, the
{"type":"start","level":2,"timestamp":1721747265575,"text":"Run: docker buildx version"}
{"type":"stop","level":2,"timestamp":1721747265701,"text":"Run: docker buildx version","startTimestamp":1721747265575}
{"type":"text","level":2,"timestamp":1721747265701,"text":"github.com/docker/buildx v0.15.1-desktop.1 5a84cb97872a2e717a86a0dec58b20fd3f0bea46\r\n"}
{"type":"text","level":2,"timestamp":1721747265701,"text":"\u001b[1m\u001b[31m\u001b[39m\u001b[22m\r\n"}
And unfortunately we cannot even use the timestamp to link them together, because in our execution I see that
{"type":"start","level":2,"timestamp":1726254415859,"text":"Run: docker buildx version"}
{"type":"stop","level":2,"timestamp":1726254418844,"text":"Run: docker buildx version","startTimestamp":1726254415859}
{"type":"text","level":2,"timestamp":1726254418845,"text":"github.com/docker/buildx v0.12.1 30feaa1\r\n"}
{"type":"text","level":2,"timestamp":1726254418845,"text":"\u001b[1m\u001b[31m\u001b[39m\u001b[22m\r\n"}
|
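For anyone who wants to experiment with the JSON log lines quoted in this thread, here is a minimal sketch of turning "stop" events into span-like records. It relies only on the fields visible in the examples above (type, timestamp, text, startTimestamp). The `SpanRecord` type and `spansFromLog` function are hypothetical names made up for illustration; they are not part of the Dev Containers CLI or of OpenTelemetry.

```typescript
// Fields taken from the log examples in this thread; other fields may exist.
interface LogEvent {
  type: string;
  level: number;
  timestamp: number;
  text: string;
  startTimestamp?: number;
}

// Hypothetical span-like record, not an OpenTelemetry type.
interface SpanRecord {
  name: string;
  startMs: number;
  endMs: number;
  durationMs: number;
}

// A "stop" event carries both its own timestamp and the timestamp of its
// "start" event, so a flat span can be built from "stop" events alone.
function spansFromLog(lines: string[]): SpanRecord[] {
  const spans: SpanRecord[] = [];
  for (const line of lines) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    const event = JSON.parse(trimmed) as LogEvent;
    if (event.type !== "stop" || event.startTimestamp === undefined) continue;
    spans.push({
      name: event.text,
      startMs: event.startTimestamp,
      endMs: event.timestamp,
      durationMs: event.timestamp - event.startTimestamp,
    });
  }
  return spans;
}

// The start/stop pairs quoted above.
const sample: string[] = [
  '{"type":"start","level":2,"timestamp":1721747265575,"text":"Run: docker buildx version"}',
  '{"type":"stop","level":2,"timestamp":1721747265701,"text":"Run: docker buildx version","startTimestamp":1721747265575}',
  '{"type":"start","level":2,"timestamp":1726254415859,"text":"Run: docker buildx version"}',
  '{"type":"stop","level":2,"timestamp":1726254418844,"text":"Run: docker buildx version","startTimestamp":1726254415859}',
];

const spans = spansFromLog(sample);
for (const span of spans) {
  console.log(`${span.name}: ${span.durationMs} ms`);
}
```

This only recovers flat durations. It cannot reconstruct parent-child relationships, it cannot attach "text" events to the command that produced them, and a "start" event whose "stop" never arrives produces no span at all.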
In general, supporting OpenTelemetry tracing would require a tracing context to be passed around so that all actions/RPCs that were associated with each other could be linked. In Go, this is typically done with the standard library's context package. Other languages (such as JS/TS) use different mechanisms. In the example logs in this thread, timestamps alone would be insufficient as context identifiers. |
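To illustrate what "passing a tracing context around" can look like in Node.js, here is a rough sketch using `AsyncLocalStorage` from `node:async_hooks`, a mechanism that OpenTelemetry's Node.js context manager also builds on. Everything else here (`TraceContext`, `withSpan`, the `finished` array, the span names) is hypothetical and made up for this example. It is not the OpenTelemetry API and not how the CLI works today.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";
import { randomBytes } from "node:crypto";

// Hypothetical minimal context: just the IDs needed to link spans together.
interface TraceContext {
  traceId: string;
  spanId: string;
  parentSpanId: string | undefined;
}

interface FinishedSpan extends TraceContext {
  name: string;
}

const storage = new AsyncLocalStorage<TraceContext>();
const finished: FinishedSpan[] = [];

// Runs fn inside a new span. If a span is already active in this async call
// chain, the new span inherits its trace ID and records it as the parent.
async function withSpan<T>(name: string, fn: () => Promise<T>): Promise<T> {
  const parent = storage.getStore();
  const ctx: TraceContext = {
    traceId: parent?.traceId ?? randomBytes(16).toString("hex"),
    spanId: randomBytes(8).toString("hex"),
    parentSpanId: parent?.spanId,
  };
  try {
    return await storage.run(ctx, fn);
  } finally {
    finished.push({ name, ...ctx });
  }
}

// Nested operations are linked without passing any IDs explicitly.
async function main(): Promise<void> {
  await withSpan("up", async () => {
    await withSpan("docker buildx version", async () => {
      await new Promise<void>((resolve) => setTimeout(resolve, 1));
    });
  });
}

const done: Promise<void> = main().then(() => {
  for (const span of finished) {
    console.log(span.name, "parent:", span.parentSpanId ?? "(root)");
  }
});
```

The inner span finishes first, so it appears first in `finished`, and its `parentSpanId` matches the `spanId` of the outer span. That explicit link is exactly what the timestamp-only log lines lack.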
@chrmarti |
@spuliaiev-sfdc There is no documentation. You can see the few variations here: https://github.com/devcontainers/cli/blob/9ba1fdaa11dee087b142d33e4ac13c5788392e34/src/spec-utils/log.ts#L35 |
Our objective is to make the command execution process of the devcontainer CLI more visible. To achieve this, we plan to add tracing support using the OpenTelemetry library. The first milestone is instrumenting the 'up' command, so that we can see where time is spent and which operations fail.