feat(js): openai chat completion streaming support #87
Conversation
// This is a streaming response
// handle the chunks and add them to the span
// First split the stream via tee
const [leftStream, rightStream] = result.tee();
Does `tee` create a two-speed scenario? For example:
- If the user cancels the stream, do we go ahead and read the whole thing, incurring additional token costs?
- What if the stream never ends?
- If a network error happens mid-stream, how does that error bubble up to the user?
I'm not fully sure about either - both are valid concerns. Right now I'm trying to avoid an extremely convoluted instrumentation. Let me add some tests and get back to you.
So if `stream.controller.abort` is called on one branch, the SSE connection is killed, which kills both streams. As long as the stream's abort controller is used, both streams technically see the same data. The real case you highlight is when, in user-land, you break out of the stream without aborting and leave the other branch to consume the rest of it. I'll look into instrumenting the `finalizeChatCompletion` method.
What if the stream never ends?
There's no real way that OpenAI would keep the stream open forever. But yeah, if the stream were kept open forever, the stream span would never terminate and would never get exported.
If network error happens mid-stream, how does that error bubble up to the user?
Technically the stream has an error event, which we could capture - I'm going to leave that part off for now because I think there's probably other instrumentation better suited for capturing that info.
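For reference, here's a minimal sketch of how the instrumentation's branch of the teed stream could record a mid-stream error and still end the span, assuming an OpenTelemetry-style span and the SDK's `Stream` type; the helper and attribute names are illustrative, not this PR's actual code:

```ts
import { SpanStatusCode, type Span } from "@opentelemetry/api";
import type { Stream } from "openai/streaming";
import type { ChatCompletionChunk } from "openai/resources/chat/completions";

// Hypothetical sketch: read the instrumentation's branch of the teed stream in
// a try/catch/finally so a mid-stream error is recorded and the span always ends.
async function recordStreamSpan(
  instrumentationStream: Stream<ChatCompletionChunk>,
  span: Span,
): Promise<void> {
  try {
    for await (const chunk of instrumentationStream) {
      // chunk handling (content accumulation, token counts, ...) goes here
      void chunk;
    }
  } catch (err) {
    // A network error mid-stream rejects the async iteration on both branches
    // of the tee, since they share a single underlying SSE connection.
    span.recordException(err as Error);
    span.setStatus({ code: SpanStatusCode.ERROR, message: String(err) });
  } finally {
    // Without this, a stream that never terminates leaves the span open
    // forever and it never gets exported -- the open question above.
    span.end();
  }
}
```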
Sounds good, thanks!
let streamResponse = "";
for await (const chunk of stream) {
  if (chunk.choices.length > 0 && chunk.choices[0].delta.content) {
    streamResponse += chunk.choices[0].delta.content;
Will need a follow-up for `function_call` and `tool_calls`, in particular the `arguments` attribute.
p.s. in which case the output value mime type would have to be JSON (instead of TEXT)
Good callout - I haven't added tool call support at all yet, so here's the ticket: #90
There's also automated function calling that I probably need to consider (https://github.com/openai/openai-node?tab=readme-ov-file#automated-function-calls), though it looks to be in beta.
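For when that follow-up lands, a rough sketch of how streamed tool calls might be accumulated, assuming the v4 `ChatCompletionChunk` shape; the helper name and the way the result gets attached to the span are hypothetical:

```ts
import type { ChatCompletionChunk } from "openai/resources/chat/completions";

// Hypothetical sketch for the #90 follow-up: accumulate streamed tool calls by
// index. OpenAI emits each tool call's `arguments` as partial JSON string
// fragments across chunks, so they must be concatenated before parsing.
type AccumulatedToolCall = { id?: string; name?: string; arguments: string };

function accumulateToolCalls(
  acc: Map<number, AccumulatedToolCall>,
  chunk: ChatCompletionChunk,
): void {
  const delta = chunk.choices[0]?.delta;
  for (const toolCall of delta?.tool_calls ?? []) {
    const entry = acc.get(toolCall.index) ?? { arguments: "" };
    if (toolCall.id) entry.id = toolCall.id;
    if (toolCall.function?.name) entry.name = toolCall.function.name;
    if (toolCall.function?.arguments) entry.arguments += toolCall.function.arguments;
    acc.set(toolCall.index, entry);
  }
}
```

Once the stream ends, each accumulated `arguments` string should parse as complete JSON, which lines up with the point above about the output value mime type being JSON instead of TEXT.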
resolves #39

Adds the ability to get LLM spans from streaming responses. It does this by splitting the stream via `.tee()` and iterating over the chunks.
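A rough end-to-end sketch of that approach, for context; the model, attribute names, and span wiring here are illustrative, not the exact code in this PR:

```ts
import OpenAI from "openai";
import { trace } from "@opentelemetry/api";

// Rough shape of the approach: tee the stream returned by the SDK, hand one
// branch back to the caller untouched, and iterate the other branch to fill
// in the LLM span.
async function main() {
  const client = new OpenAI();
  const span = trace.getTracer("openai-instrumentation-sketch").startSpan("ChatCompletion");

  const result = await client.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [{ role: "user", content: "Hello!" }],
    stream: true,
  });

  // Split the stream: the caller iterates one branch, instrumentation the other.
  const [leftStream, rightStream] = result.tee();

  // Instrumentation branch, consumed in the background.
  const recordSpan = (async () => {
    let streamResponse = "";
    for await (const chunk of rightStream) {
      if (chunk.choices.length > 0 && chunk.choices[0].delta.content) {
        streamResponse += chunk.choices[0].delta.content;
      }
    }
    span.setAttribute("output.value", streamResponse);
    span.end();
  })();

  // "User-land" branch.
  for await (const chunk of leftStream) {
    process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
  }
  await recordSpan;
}

main().catch(console.error);
```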