New Gemini 2.5 capabilities
Native audio output and improvements to Live API
Today, the Live API is introducing a preview version of audio-visual input and native audio out dialogue, so you can directly build conversational experiences with a more natural and expressive Gemini.
It also lets the user steer its tone, accent and style of speaking. For example, you can tell the model to use a dramatic voice when telling a story. And it supports tool use, so it can search on your behalf.
You can experiment with a set of early features, including:
- Affective Dialogue, in which the model detects emotion in the user’s voice and responds appropriately.
- Proactive Audio, in which the model will ignore background conversations and know when to respond.
- Thinking in the Live API, in which the model leverages Gemini’s thinking capabilities to support more complex tasks.
We’re also releasing new previews for text-to-speech in 2.5 Pro and 2.5 Flash. These have first-of-its-kind support for multiple speakers, enabling text-to-speech with two voices via native audio out.
Like Native Audio dialogue, text-to-speech is expressive, and can capture really subtle nuances, such as whispers. It works in over 24 languages and seamlessly switches between them.
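To make the two-voice idea concrete, here is a minimal sketch of how you might prepare a script for multi-speaker text-to-speech before sending it to a model. It makes no API call. The `Turn` class, the `build_tts_script` helper, and the "Speaker (direction): text" line format are all hypothetical names and conventions chosen for illustration, not part of any Gemini SDK. The only things taken from the announcement are the two-voice limit and the idea that delivery, such as a whisper, can be steered with natural language.

```python
from dataclasses import dataclass

# Assumption for this sketch: the preview supports at most two voices,
# as described in the announcement above.
MAX_SPEAKERS = 2


@dataclass(frozen=True)
class Turn:
    """One line of dialogue (hypothetical helper type)."""
    speaker: str
    text: str
    direction: str = ""  # optional style hint, e.g. "whispering"


def build_tts_script(turns: list[Turn]) -> str:
    """Render turns as a plain-text script, one 'Speaker: text' line per turn.

    The line format is an illustrative convention, not a documented one.
    Raises ValueError if the script uses more than MAX_SPEAKERS speakers.
    """
    speakers: list[str] = []
    for turn in turns:
        if turn.speaker not in speakers:
            speakers.append(turn.speaker)
    if len(speakers) > MAX_SPEAKERS:
        raise ValueError(
            f"script uses {len(speakers)} speakers; at most {MAX_SPEAKERS} are supported"
        )

    lines = []
    for turn in turns:
        label = turn.speaker
        if turn.direction:
            label += f" ({turn.direction})"
        lines.append(f"{label}: {turn.text}")
    return "\n".join(lines)


if __name__ == "__main__":
    script = build_tts_script([
        Turn("Ana", "Did you hear that?", direction="whispering"),
        Turn("Ben", "Hear what?"),
    ])
    print(script)
```

Checking the speaker count up front keeps an unsupported script from reaching the model at all. The resulting text would then be passed as the prompt to whichever text-to-speech client you use, with each speaker name mapped to a voice in that client's configuration.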