Break submit trace into two spans and add tracing to wrapper #592

Draft
wants to merge 3 commits into
base: master

Contributor

@retzkek retzkek commented Dec 11, 2024

This breaks the initial trace into two spans: an overarching jobsub span and a jobsub_submit span that contains the submit steps. This way, subsequent spans are not shown as part of the submit, and the submit span can be collapsed without hiding them. There's still a bit of discontinuity, since the jobsub span ends with the submit and does not encapsulate downstream spans, but that would be tough to avoid and I don't think it causes any major issues. The trace otherwise appears OK in Jaeger.

Further, this adds a rough tracing span to the job wrapper (only simple.sh for now). I anticipate some iteration here before merging and including it in DAGs. Some key items:

  • Span ID generation
  • Environment variables for endpoints and options

The first (main()) does basic arg parsing, and creates an overarching tracing span for the submission.
The second (submit()) does the rest.

The primary motivation is to create a container tracing span for the submit steps,
so they do not clutter the trace graph and provide a clear separation between submit
and execution stages.
@retzkek retzkek marked this pull request as draft December 11, 2024 17:33
Collaborator

shreyb commented Dec 18, 2024

Discussion points:

Very good, but we need the following:

  1. Error handling in the wrapper script
  2. Standardization of ENV VARS that we want to pass to jobs

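export JSB_PARENT_SPAN_ID=$(echo "$TRACEPARENT" | cut -d'-' -f 3)
sample=$(echo "$TRACEPARENT" | cut -d'-' -f 4)
# 8-byte random span ID (16 hex characters). The explicit 1/1 unit formats one
# byte at a time; a bare "%02x" consumes 4-byte words and drops leading zeros.
export JSB_SPAN_ID=$(head -c8 /dev/urandom | hexdump -v -e '1/1 "%02x"')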
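
On the error-handling point, a minimal sketch of validating TRACEPARENT before cutting fields out of it could look like the following. It assumes the W3C Trace Context layout (version-traceid-parentid-flags, lowercase hex, four fields). The function name parse_traceparent and the TP_* variables are hypothetical and not part of the wrapper; only JSB_PARENT_SPAN_ID comes from the snippet above.

```shell
#!/bin/sh
# Hypothetical helper (not in the wrapper): validate and split a traceparent.
# On success sets TP_VERSION, TP_TRACE_ID, TP_PARENT_ID, TP_FLAGS and returns 0.
# Returns 1 if the value is missing or malformed.
parse_traceparent() {
    tp=$1
    # Expected shape: 2 hex - 32 hex - 16 hex - 2 hex, all lowercase.
    if ! printf '%s\n' "$tp" |
        grep -Eq '^[0-9a-f]{2}-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}$'; then
        return 1
    fi
    TP_VERSION=$(printf '%s\n' "$tp" | cut -d- -f1)
    TP_TRACE_ID=$(printf '%s\n' "$tp" | cut -d- -f2)
    TP_PARENT_ID=$(printf '%s\n' "$tp" | cut -d- -f3)
    TP_FLAGS=$(printf '%s\n' "$tp" | cut -d- -f4)
    # All-zero trace and parent IDs are invalid in W3C Trace Context.
    case $TP_TRACE_ID in
        00000000000000000000000000000000) return 1 ;;
    esac
    case $TP_PARENT_ID in
        0000000000000000) return 1 ;;
    esac
    return 0
}

# Usage: only export the parent span ID when the header is usable.
if parse_traceparent "${TRACEPARENT:-}"; then
    export JSB_PARENT_SPAN_ID=$TP_PARENT_ID
else
    echo "warning: TRACEPARENT missing or malformed, skipping tracing" >&2
fi
```

With something like this, the job itself keeps running when tracing input is bad, and the wrapper can simply skip span emission.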
Contributor Author

@retzkek retzkek Dec 18, 2024

Any better portable ideas to generate a span ID?

uuidgen was suggested, which appears to be available alongside hexdump, but the output would need to be munged a bit. We need 8 bytes/16 chars, so we could take the last two parts*, which should be unique even if uuidgen falls back to time-based (which is probably unlikely). Is that better? ¯\_(ツ)_/¯

Apptainer> for x in {1..5}; do uuidgen | tr -d '-' | cut -c '17-'; done
98ea0d9203e32de4
a670a63510d929e5
a362b70ee7c45a2c
adbdcecf9ff8fe00
a003fc48358eb47b
Apptainer> for x in {1..5}; do uuidgen -t | tr -d '-' | cut -c '17-'; done
9604001a4af11e5f
9d58001a4af11e5f
a312001a4af11e5f
a2ed001a4af11e5f
b76a001a4af11e5f
* The first 8 bytes would include the UUID version, but maybe that's not really a problem.
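
One way to combine the two options is sketched below: try /dev/urandom first and fall back to uuidgen only if that is unavailable. This is a rough sketch under assumptions, not a proposal for the final wrapper. gen_span_id is a hypothetical name, and it assumes head -c, od, tr, and cut exist on the worker node.

```shell
#!/bin/sh
# Hypothetical helper: print a 16-character lowercase hex span ID.
# Returns 1 if no source is available or the result is unusable.
gen_span_id() {
    if [ -r /dev/urandom ] && command -v od >/dev/null 2>&1; then
        # od -An -tx1 prints one zero-padded hex pair per byte.
        id=$(head -c 8 /dev/urandom | od -An -tx1 | tr -d ' \n')
    elif command -v uuidgen >/dev/null 2>&1; then
        # Last 16 hex chars of the UUID; some platforms print uppercase.
        id=$(uuidgen | tr -d '-' | tr 'A-F' 'a-f' | cut -c 17-)
    else
        return 1
    fi
    # Reject short output and the all-zero ID, which W3C Trace Context forbids.
    [ "${#id}" -eq 16 ] || return 1
    [ "$id" != "0000000000000000" ] || return 1
    printf '%s\n' "$id"
}
```

Usage would be along the lines of `JSB_SPAN_ID=$(gen_span_id) && export JSB_SPAN_ID`, so a failure leaves the variable unset rather than exporting a bad ID.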
