Adds blog on patching a Groq client instance to add tracing support + some general principles on tracing customization #135
base: main
Conversation
Preview for 7c58ae4
Overall, this is fantastic. Minor nits and comments.
[MLflow Tracing](https://mlflow.org/docs/latest/llms/tracing/index.html) is an observability tool in MLflow that captures detailed execution traces for GenAI applications and workflows. In addition to inputs, outputs, and metadata for individual calls, MLflow tracing can also capture intermediate steps such as tool calls, reasoning steps, retrieval steps, or other custom steps.
MLflow provides [built-in Tracing support](https://mlflow.org/docs/latest/llms/tracing/index.html#automatic-tracing) for many popular LLM providers and orchestration frameworks. If you are using one of these providers, you can enable tracing with a single line of code: `mlflow.<provider>.autolog()`. While MLflow's autologging capabilities cover many of the most widely-used LLM providers and orchestration frameworks, there may be times when you need to add tracing to an unsupported provider or customize tracing beyond what autologging provides. This post demonstrates how flexible and extensible MLflow Tracing can be by:
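For a supported provider, that one line looks like this (OpenAI shown as an example; the same pattern applies to other providers with autologging support):

```python
import mlflow

# Enables automatic tracing for all OpenAI calls in this session.
mlflow.openai.autolog()
```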
Might be worth mentioning that we do support adding span collection to autolog-enabled sessions (although this feature is only available in MLflow >= 2.19.0).
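For reference, a rough sketch of what that could look like, combining a manually traced function with autologging (the function name and body here are just placeholders):

```python
import mlflow

mlflow.openai.autolog()

@mlflow.trace(span_type="CHAIN")
def answer_question(question: str) -> str:
    # Autologged provider calls made inside this function are collected
    # as child spans of the manual span (MLflow >= 2.19.0, per the comment).
    ...
```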
Can you expand on this point? Not sure what this means in practice.
When adding tracing to a new provider, the main task is to map the provider's API methods to MLflow Tracing spans with appropriate span types.
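To make that concrete, here is a minimal sketch (not the post's final implementation) that maps a single chat-completion call to a chat-model span; `client` is assumed to be an instantiated Groq client:

```python
import mlflow
from mlflow.entities import SpanType

def traced_create(client, **kwargs):
    # Map the provider method to a span with an appropriate span type.
    with mlflow.start_span(
        name="chat.completions.create", span_type=SpanType.CHAT_MODEL
    ) as span:
        span.set_inputs(kwargs)
        response = client.chat.completions.create(**kwargs)
        span.set_outputs(response)
        return response
```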
3. **Structure and preserve key data:** For each operation we want to trace, we need to identify the key information we want to preserve and make sure it is captured and displayed in a useful way. For example, we may want to capture the input and configuration data that control the operation's behavior, the outputs and metadata that explain the results, errors that terminated the operation prematurely, etc. Looking at traces and tracing implementations for similar providers can provide a good starting point for how to structure and preserve these data, as sketched below.
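As a rough illustration of this point (with a hypothetical `retriever`, `query`, and `top_k`):

```python
import mlflow
from mlflow.entities import SpanType

with mlflow.start_span(name="retrieve_docs", span_type=SpanType.RETRIEVER) as span:
    span.set_inputs({"query": query, "top_k": top_k})   # inputs and configuration
    docs = retriever.search(query, top_k=top_k)         # hypothetical retriever
    span.set_attributes({"num_results": len(docs)})     # metadata about the results
    span.set_outputs(docs)                              # outputs
# An exception raised inside the block is recorded on the span,
# so operations that terminate prematurely are preserved on the trace too.
```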
Do we want to mention tagging here for things like capturing session information?
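For example, something like the following (assuming `mlflow.update_current_trace` is available, and with a hypothetical `session_id`):

```python
import mlflow

with mlflow.start_span(name="chat_turn") as span:
    # Tag the active trace so spans can be grouped by session later.
    mlflow.update_current_trace(tags={"session_id": session_id})
    # ... run the traced work here ...
```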
A few points to note:
- We are wrapping a method on a client *instance*, not a class. This is a fairly lightweight approach that does what we need it to do without requiring changes to the Groq SDK code.
Might want to use the term "patching" instead of "wrapping". We are still guerrilla patching here by overriding the existing implementation. Wrapping would be more like keeping the return value of `trace_groq_chat` as a separate method reference and calling that, rather than overriding the method on the instance.
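For illustration, a minimal sketch of the distinction (assuming `client` is a Groq client instance and `trace_groq_chat` is the decorator from the post):

```python
# Patching: override the method on the instance in place, so every
# existing call site picks up the traced version automatically.
client.chat.completions.create = trace_groq_chat(client.chat.completions.create)

# Wrapping: keep a separate traced callable that call sites use
# explicitly, leaving the client instance untouched.
traced_create = trace_groq_chat(client.chat.completions.create)
```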
### Step 3: Wrap the instance method and try it out
Now that we have a tracing decorator, we can wrap the `chat.completions.create` method on a Groq client instance and try it out.
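A minimal sketch of what trying it out might look like (the model name is just an example; `trace_groq_chat` is the decorator from the previous step):

```python
from groq import Groq

client = Groq()  # assumes GROQ_API_KEY is set in the environment

# Patch the bound method on this client instance with the traced version.
client.chat.completions.create = trace_groq_chat(client.chat.completions.create)

response = client.chat.completions.create(
    model="llama3-70b-8192",  # example model name
    messages=[{"role": "user", "content": "What is MLflow Tracing?"}],
)
```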
"apply our patch" instead of "wrap"
![Tool Calls](./5_tool_call.png)
## Orchestration: Building a tool calling
fragment?
```python
if func is None:
    raise ValueError(f"No implementation for tool: {tool_call.function.name}")

result = func(**tool_inputs)
```
We might want to warn users about executing a function locally that is non-deterministic (it's not safe to ask an LLM to generate code for execution and run that callable body within the local environment process). Deterministic functions are fine, though :)
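For instance, a sketch of limiting execution to a fixed registry of known, deterministic tools (`get_weather` and `convert_units` are hypothetical):

```python
# Dispatch only through a fixed registry of vetted, deterministic tools;
# never eval or exec code that the model itself generated.
TOOL_REGISTRY = {
    "get_weather": get_weather,      # hypothetical deterministic tool
    "convert_units": convert_units,  # hypothetical deterministic tool
}

func = TOOL_REGISTRY.get(tool_call.function.name)
if func is None:
    raise ValueError(f"No implementation for tool: {tool_call.function.name}")
result = func(**tool_inputs)
```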
cc @B-Step62 @daniellok-db for inputs on tracing messaging for this blog :D
Incorporating these changes means that we need to hold this until the 2.20 release (mid-January), but I feel we don't want to publish a blog that we know will become stale in a month.
Thanks @B-Step62! I'll hold off until January and then revise with the updated approach & a different provider.