The Live API equips developers with the essential tools to build applications and intelligent agents capable of processing streaming audio, video, and text with very low latency. This speed is paramount for creating truly interactive experiences, opening doors for customer support solutions, educational platforms, and real-time monitoring services.
Recently we announced the preview launch of the Live API for Gemini models, a significant step forward in enabling developers to build robust and scalable real-time applications. Try the latest features now using the Gemini API in Google AI Studio and in Vertex AI.
What’s new in the Live API
Since our experimental launch in December, we have been listening closely to your feedback and have incorporated new features and capabilities to make the Live API production ready. Find full details in the Live API documentation:
Enhanced session management & reliability
- Longer sessions via context compression: Enable extended interactions beyond previous time limits. Configure context window compression with a sliding window mechanism to automatically manage context length, preventing abrupt terminations due to context limits.
- Session resumption: Keep sessions alive across temporary network disruptions. The Live API now supports server-side session state storage (for up to 24 hours) and provides handles (`session_resumption`) to reconnect and resume where you left off.
- Graceful disconnect notification: Receive a `GoAway` server message indicating when a connection is about to close, allowing for graceful handling before termination.
- Configurable turn coverage: Choose whether the Live API processes all audio and video input continuously or only captures it when the end-user is detected speaking.
- Configurable media resolution: Optimize for quality or token usage by selecting the resolution for input media.
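As a rough illustration of how a client might combine session resumption with the disconnect notice, here is a minimal Python sketch that works on already-decoded server messages as plain dictionaries. The key names (`sessionResumptionUpdate`, `newHandle`, `goAway`, `contextWindowCompression`, `slidingWindow`, `sessionResumption`) and the `SessionState` helper are assumptions made for this example, not confirmed API; check the Live API documentation for the actual message schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionState:
    """Client-side bookkeeping for one session (hypothetical helper)."""
    resume_handle: Optional[str] = None  # latest resumption handle seen so far
    should_reconnect: bool = False       # set once a disconnect notice arrives


def handle_server_message(state: SessionState, message: dict) -> SessionState:
    """Update the state from one decoded server message.

    All key names here are assumed for illustration.
    """
    update = message.get("sessionResumptionUpdate")
    if update and update.get("newHandle"):
        # Keep the newest handle so a later connection can resume the session.
        state.resume_handle = update["newHandle"]
    if "goAway" in message:
        # The server announced that this connection is about to close.
        state.should_reconnect = True
    return state


def build_setup(state: SessionState) -> dict:
    """Build a hypothetical setup payload for the next connection."""
    setup = {"contextWindowCompression": {"slidingWindow": {}}}
    if state.resume_handle:
        # Only ask to resume when the server has issued a handle.
        setup["sessionResumption"] = {"handle": state.resume_handle}
    return {"setup": setup}
```

A client loop would call `handle_server_message` for every incoming message and, once `should_reconnect` is set, open a new connection using `build_setup(state)`.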
More control over interaction dynamics
- Configurable voice activity detection (VAD): Choose sensitivity levels or disable automatic VAD entirely and use new client events (`activityStart`, `activityEnd`) for manual turn control.
- Configurable interruption handling: Decide whether user input should interrupt the model’s response.
- Flexible session settings: Modify system instruction and other setup configurations at any time during the session.
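To make manual turn control more concrete, the sketch below shows one way a client could bracket a user's audio with the two client events when automatic VAD is disabled. Only the event names `activityStart` and `activityEnd` come from the announcement; the `realtimeInput` wrapper, the `audio` payload shape, the MIME type, and the function name are assumptions for illustration.

```python
import base64
from typing import Iterable, Iterator


def manual_turn_messages(audio_chunks: Iterable[bytes]) -> Iterator[dict]:
    """Yield client messages for one user turn under manual turn control.

    The message layout is assumed; consult the Live API docs for the real one.
    """
    # Signal that the user has started speaking.
    yield {"realtimeInput": {"activityStart": {}}}
    for chunk in audio_chunks:
        # Raw audio bytes are base64-encoded so they fit in a JSON message.
        yield {
            "realtimeInput": {
                "audio": {
                    "data": base64.b64encode(chunk).decode("ascii"),
                    "mimeType": "audio/pcm;rate=16000",  # assumed format
                }
            }
        }
    # Signal that the user has finished speaking.
    yield {"realtimeInput": {"activityEnd": {}}}
```

Each yielded dictionary would be serialized (for example with `json.dumps`) and sent over the open connection in order, so the model only treats the audio between the two markers as a single turn.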
Richer output & features
- Expanded voice & language options: Choose from two new voices and 30 new languages for audio output. The output language is now configurable within `speechConfig`.
- Text streaming: Receive text responses incrementally as they’re generated, enabling faster display to the user.
- Token usage reporting: Gain insights into usage with detailed token counts provided in the `usageMetadata` field of server messages, broken down by modality and prompt/response phases.
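As an example of what could be done with the token usage reports, the following Python sketch sums token counts per modality for the prompt and response phases across a batch of decoded server messages. The nested key names (`promptTokensDetails`, `responseTokensDetails`, `modality`, `tokenCount`) are assumptions for this illustration, and the sketch assumes each message reports counts for its own turn rather than a running total.

```python
from collections import Counter
from typing import Iterable


def tally_usage(messages: Iterable[dict]) -> dict:
    """Sum token counts by phase and modality across server messages.

    Key names inside usageMetadata are assumed for illustration.
    """
    totals = {"prompt": Counter(), "response": Counter()}
    phase_keys = (
        ("prompt", "promptTokensDetails"),
        ("response", "responseTokensDetails"),
    )
    for message in messages:
        usage = message.get("usageMetadata")
        if not usage:
            continue  # not every server message carries usage data
        for phase, key in phase_keys:
            for detail in usage.get(key, []):
                totals[phase][detail["modality"]] += detail["tokenCount"]
    # Convert Counters to plain dicts for easy logging or serialization.
    return {phase: dict(counts) for phase, counts in totals.items()}
```

Feeding the result into a dashboard or log line gives a quick view of which modality (for instance audio versus text) dominates usage in a session.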
See the Live API in action: real-world applications
To inspire your next project, we’re showcasing developers who are already leveraging the power of the Live API in their applications:
Daily.co
Daily integrates Live API support into the Pipecat Open Source SDKs for Web, Android, iOS and C++.
By using the power of the Live API, Pipecat Daily has created a voice-based word guessing game, Word Wrangler. Test your description skills in this AI-powered twist on classic word games and see how you can build one for yourself!
LiveKit
LiveKit integrates Live API support into LiveKit Agents. This framework for building voice AI agents provides a fully open-source platform for creating server-side agentic applications.
“Until the Live API, no other LLM offered a developer interface that could directly ingest streaming video.”
– Russell d’Sa, CEO
Check out their demo, where they built an AI copilot that can browse the web alongside you while sharing thoughts about what it can see in real time.
Bubba.ai
Hey Bubba is an agentic, voice-first AI application specifically developed for truck drivers. Using the Live API, it enables seamless, multi-language voice communication, allowing drivers to operate hands-free. Key functionalities include:
- Searching for freight loads and providing details.
- Initiating calls to brokers/shippers.
- Negotiating freight rates based on market data.
- Booking loads and verifying rate confirmations.
- Finding and booking truck parking, including calling hotels to verify availability.
- Scheduling appointments with shippers and receivers.
The Live API powers both driver interaction (leveraging function calling and context caching for queries like future pickups) and Bubba’s ability to interact during phone calls for negotiation and booking. This makes Hey Bubba a comprehensive AI tool for the largest and most diverse job sector in the USA.
Start building today
The Live API is ready to power your next real-time voice application. To get started:
Happy building!