add realtime livekit multimodal worker #309

Merged on Oct 2, 2024 (6 commits)

@benxu3 (Contributor) commented Oct 2, 2024

Context:
OpenAI released its Realtime speech-to-speech (S2S) API at DevDay 2024. LiveKit supports the new API through its MultimodalAgent class.

This Change:
This PR builds on the quick start template provided by LiveKit to add a more emotive Multimodal S2S agent for the 01 Light mobile app. The agent uses the OpenAI API directly.

  • added MultimodalAgent
  • added a --multimodal boolean flag to the CLI to run the Multimodal agent instead of the standard STT/TTS agent
  • updated pyproject.toml with the latest versions of livekit dependencies
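
The `--multimodal` flag listed above can be pictured as a boolean switch that decides which worker to start. The following is a minimal sketch using `argparse`, not the code from this PR: the function names `run_standard_worker` and `run_multimodal_worker` are hypothetical placeholders, and the project's actual CLI may be wired differently.

```python
import argparse


def run_standard_worker() -> str:
    # Hypothetical placeholder for the existing STT/TTS worker.
    return "standard"


def run_multimodal_worker() -> str:
    # Hypothetical placeholder for the worker that starts the MultimodalAgent.
    return "multimodal"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Start a LiveKit worker (sketch).")
    # store_true makes this a boolean flag: absent means False, present means True.
    parser.add_argument(
        "--multimodal",
        action="store_true",
        help="Run the Multimodal agent instead of the standard STT/TTS agent.",
    )
    return parser


def main(argv=None) -> str:
    args = build_parser().parse_args(argv)
    worker = run_multimodal_worker if args.multimodal else run_standard_worker
    return worker()


if __name__ == "__main__":
    print(main())
```

Keeping the flag off by default means existing invocations continue to start the standard agent unchanged.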

note: this requires an OPENAI_API_KEY with access to the new Realtime API
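
One way to surface a missing key early is a small preflight check before the worker starts. This is an illustrative sketch only: the helper name `require_openai_key` is hypothetical and not part of this PR. It can only confirm that the variable is set, not that the key has Realtime API access, which would require an actual API call.

```python
import os


def require_openai_key(env=None) -> str:
    """Return the OPENAI_API_KEY value or raise if it is missing or blank."""
    # Accept an explicit mapping so the check can be exercised without
    # touching the real process environment.
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; the Multimodal agent needs a key "
            "with access to the Realtime API."
        )
    return key
```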

Test:
Tested the Multimodal agent on v0.32 (the current live production version) of the 01 Light app on Android 14.

@KillianLucas KillianLucas merged commit 207ec08 into OpenInterpreter:main Oct 2, 2024
0 of 3 checks passed