LLM pipeline with stream support #3114
Conversation
core/orchestrator.go
Outdated
@@ -130,6 +130,11 @@ func (orch *orchestrator) AudioToText(ctx context.Context, req worker.AudioToTex
	return orch.node.AudioToText(ctx, req)
}

// Return type is LlmResponse, but a stream is available as well as chan(string)
remove this comment
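The dual return shape the diff comment describes (a final LlmResponse or a channel of string tokens when streaming) can be sketched as below. This is an illustrative sketch, not the PR's actual code: the function name, the LlmResponse fields, and the echo behavior are all assumptions; only the interface{}-that-is-either-struct-or-channel pattern comes from the comment.

```go
package main

import "fmt"

// LlmResponse is a hypothetical final (non-streaming) result type.
type LlmResponse struct {
	Response string
}

// llm returns either a complete *LlmResponse or, when stream is set,
// a chan string carrying incremental tokens. Callers type-switch on
// the result to pick the right handling path.
func llm(prompt string, stream bool) (interface{}, error) {
	if !stream {
		return &LlmResponse{Response: "echo: " + prompt}, nil
	}
	out := make(chan string, 2)
	go func() {
		defer close(out)
		out <- "echo: "
		out <- prompt
	}()
	return out, nil
}

func main() {
	res, _ := llm("hi", true)
	switch v := res.(type) {
	case *LlmResponse:
		fmt.Println(v.Response)
	case chan string:
		for tok := range v {
			fmt.Print(tok)
		}
		fmt.Println()
	}
}
```

A dedicated two-value return (resp *LlmResponse, stream <-chan string) would avoid the type switch, at the cost of one always-nil value per call.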
@rickstaa I have completed my review and confirmed the pipeline works with local testing after rebase. Most of the rebase updates come from codegen changes in the recent SDK release. There was also a small update to check if … @kyriediculous, do you have time to fix your branch for the changes in the draft PR? Also, do you have docs to add to livepeer/docs explaining how to use this pipeline? Notes with review:
Orchestrator logs after fix for returning container discussed in ai-worker pr
Yep, I noticed this with some models too: some don't treat the limit as a strict cut-off but as a guideline, and may need a few additional tokens to complete a sentence. We could enforce strict token counting on our end; improving pricing/token counting is the first thing I'll look at tomorrow. However, as you say, it's not a big issue as long as the amount of "overdraft" is always limited: the user just ends up with a negative credit balance and has to send more PM tickets with the next request.
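Strict enforcement on the relay side could look roughly like the sketch below. This is not the PR's implementation: the function name and the use of whitespace-split words as stand-in "tokens" are assumptions; real per-model tokenization differs. It only shows the idea of cutting the stream at the limit regardless of whether the model overdrafts.

```go
package main

import (
	"fmt"
	"strings"
)

// capTokens relays tokens from in to the returned channel but stops
// after maxTokens, dropping any "overdraft" tokens the model emits
// past its limit.
func capTokens(in <-chan string, maxTokens int) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		n := 0
		for tok := range in {
			if n >= maxTokens {
				for range in { // drain so the producer can finish
				}
				return
			}
			out <- tok
			n++
		}
	}()
	return out
}

func main() {
	in := make(chan string)
	go func() {
		defer close(in)
		for _, w := range strings.Fields("one two three four five") {
			in <- w
		}
	}()
	var got []string
	for tok := range capTokens(in, 3) {
		got = append(got, tok)
	}
	fmt.Println(strings.Join(got, " "))
}
```

Counting words rather than model tokens under-charges; billing would need the model's own token count (or the same tokenizer) to be exact.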
@leszko, @thomshutt, @ad-astra-video I've briefly reviewed this pull request, and it seems ready to be merged. Since I'm currently out of the office and unable to perform a full E2E test, could one of you please confirm that everything is working as expected so we can proceed with the merge? Thanks! 🙏🏻
core/capabilities.go
Outdated
@@ -115,6 +116,7 @@ var CapabilityNameLookup = map[Capability]string{
	Capability_ImageToVideo: "Image to video",
	Capability_Upscale:      "Upscale",
	Capability_AudioToText:  "Audio to text",
	Capability_LLM:          "Large Language Model",
This needs to be updated to "Large language model" to work with the PipelineToCapability function. I know this is brittle; I'll look at making it better going forward.
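The brittleness being discussed is string-based matching between display names and pipeline identifiers. A minimal sketch of that kind of lookup is below; the slug rule (lowercase, spaces to hyphens), the capability constants shown, and the function body are assumptions for illustration, not go-livepeer's actual PipelineToCapability code.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

type Capability int

const (
	Capability_AudioToText Capability = iota
	Capability_LLM
)

// CapabilityNameLookup holds human-readable display names, as in the
// diff above. Because the lookup below derives a pipeline slug from
// these names, their exact casing matters.
var CapabilityNameLookup = map[Capability]string{
	Capability_AudioToText: "Audio to text",
	Capability_LLM:         "Large language model",
}

// pipelineToCapability lowercases and hyphenates each display name and
// compares it against the pipeline identifier. Any drift between the
// display string and the pipeline name silently breaks the match,
// which is the brittleness the review comment calls out.
func pipelineToCapability(pipeline string) (Capability, error) {
	for c, name := range CapabilityNameLookup {
		slug := strings.ReplaceAll(strings.ToLower(name), " ", "-")
		if slug == pipeline {
			return c, nil
		}
	}
	return 0, errors.New("unknown pipeline: " + pipeline)
}

func main() {
	c, err := pipelineToCapability("large-language-model")
	fmt.Println(c == Capability_LLM, err)
}
```

A reverse map keyed by the canonical pipeline identifier, populated in one place, would remove the casing dependency entirely.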
There's no reason to have this mapping in the core package. It's only used for monitoring, so it should live in the monitoring package if we want it for readability of monitoring output. There's also inconsistent usage: sometimes through a helper, sometimes via a direct map lookup. I really don't see much utility in this mapping other than, as you say, being a brittle part of the codebase.
Thanks for applying the commit. We can remove it or improve it in the future 👍🏻. I made a backlog item for it.
Co-authored-by: Rick Staa <rick.staa@outlook.com>
Looks good now thanks 🚀!
What does this pull request do? Explain your changes. (required)
Adds support for an LLM pipeline (see livepeer/ai-worker#137).
The LLM pipeline returns either a stream or a final response. Both are handled over HTTP on the go-livepeer side using SSE (server sent events).
TODO: revert ai-worker version after merge
How did you test each of these updates (required)
Ran manual tests.
Checklist:
- make runs successfully
- All tests in ./test.sh pass