Gemini 3.5 Live Translate: real-time, real voices


Google has introduced Gemini 3.5 Live Translate, a new AI-powered speech-to-speech translation model designed to enable near real-time conversations between people speaking different languages. The technology marks a significant advancement in live translation, offering more natural and fluid communication while preserving key elements of a speaker’s voice, including tone, pacing, and pitch.

The launch represents the latest milestone in Google’s decades-long effort to improve language translation through artificial intelligence. According to the company, Gemini 3.5 Live Translate can automatically detect more than 70 languages and generate translated speech just seconds behind the original speaker, creating a smoother experience than traditional turn-based translation systems.

Unlike conventional translation tools that wait for a speaker to finish a sentence before generating a response, Gemini 3.5 Live Translate processes speech continuously as it is spoken. This approach allows conversations to flow more naturally, reducing awkward pauses and improving synchronization between speakers.
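The difference between the two approaches can be illustrated with a toy latency model. This is only a sketch and does not reflect how Gemini 3.5 Live Translate works internally: the function names, the fixed per-chunk translation cost, and the word timings are all hypothetical values chosen for illustration.

```python
from typing import List


def turn_based_delays(word_times: List[float], translate_cost: float) -> List[float]:
    """Delay per word when translation starts only after the last word is spoken."""
    ready = word_times[-1] + translate_cost  # output is available once the sentence is done
    return [ready - t for t in word_times]


def streaming_delays(word_times: List[float], chunk_size: int,
                     translate_cost: float) -> List[float]:
    """Delay per word when speech is translated in small chunks as it arrives."""
    delays: List[float] = []
    for start in range(0, len(word_times), chunk_size):
        chunk = word_times[start:start + chunk_size]
        ready = chunk[-1] + translate_cost  # each chunk is emitted on its own
        delays.extend(ready - t for t in chunk)
    return delays


# Hypothetical input: 12 words, one spoken every 0.5 s, 1 s of processing per unit.
word_times = [0.5 * i for i in range(1, 13)]
turn_based = turn_based_delays(word_times, translate_cost=1.0)
streaming = streaming_delays(word_times, chunk_size=3, translate_cost=1.0)

print(f"turn-based worst-case delay: {max(turn_based):.1f} s")
print(f"streaming worst-case delay:  {max(streaming):.1f} s")
```

Under these assumed numbers, the listener of the turn-based system waits several seconds for the first word of a sentence, while the chunked version keeps the delay bounded by the chunk length plus processing time. A real system also has to trade chunk size against the context needed for an accurate translation, which is the balance Google describes.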

Google says the model balances translation speed with contextual understanding, helping maintain accuracy while keeping pace with live conversations. The system is also designed to perform reliably in noisy environments by filtering out background sounds and handling multilingual inputs without requiring manual configuration.

The new translation model is being rolled out across several Google products and services. Developers can begin experimenting with Gemini 3.5 Live Translate through a public preview available in the Gemini Live API and Google AI Studio. The company says the technology can be used to build applications for multilingual meetings, live broadcasts, online lessons, customer support, and real-time interpretation services.

Google has also partnered with several developer platforms, including Agora, Fishjam, LiveKit, Pipecat, and Vision Agents, to simplify the deployment of voice translation applications.

One early use case comes from ride-hailing giant Grab, which is testing the technology to facilitate communication between drivers and travelers. The company handles more than 10 million voice calls each month through its platform and hopes the new model will help bridge language barriers during pickups and customer interactions.

Enterprise users will soon see Gemini 3.5 Live Translate integrated into Google Meet. The company plans to expand Meet's translation support from five languages to more than 70, enabling over 2,000 language combinations within a single meeting.
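The "over 2,000 combinations" figure is consistent with simple pair counting. The sketch below assumes a combination means a pair of two distinct languages; the article does not say how Google counts them, and the helper names are made up for illustration.

```python
from math import comb


def unordered_pairs(n_languages: int) -> int:
    """Number of distinct language pairs, ignoring translation direction."""
    return comb(n_languages, 2)


def directed_pairs(n_languages: int) -> int:
    """Number of (source, target) pairs, counting each direction separately."""
    return n_languages * (n_languages - 1)


print(unordered_pairs(70))  # 2415
print(directed_pairs(70))   # 4830
print(unordered_pairs(5))   # 10
```

With exactly 70 languages there are 2,415 unordered pairs, which matches "over 2,000"; since the announcement says "more than 70", the true count would be somewhat higher. Under the same assumption, the previous five-language support covered only 10 pairs.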

Google is also redesigning the Meet interface to provide quicker access to live translation features. The updated experience is entering private preview for select Google Workspace business customers this month, with a broader rollout expected later this year.

Consumers will also benefit from the new technology through the Google Translate app on Android and iOS. Users can access live voice translation using virtually any pair of headphones, eliminating the need for specialized hardware such as Pixel Buds.

For Android users, Google is introducing a new “listening mode” that allows translated audio to be played directly through the phone’s earpiece. By holding the device to the ear like a regular phone call, users can listen to translations privately without headphones.

As AI-generated speech becomes increasingly realistic, Google is incorporating safeguards into the technology. Every audio stream generated by Gemini 3.5 Live Translate includes SynthID watermarking, an imperceptible marker embedded directly into the audio waveform.

The watermark allows AI-generated content to be identified while remaining inaudible to listeners. Google says the measure is intended to help address concerns around misinformation and ensure greater transparency as synthetic audio becomes more widespread.

With support for dozens of languages, low-latency voice translation, and integration across Google’s products, Gemini 3.5 Live Translate could bring the company closer to a long-standing goal: enabling seamless conversations between people regardless of the language they speak.


