Particle News: Meta Unveils AI Model for Near-Instant Speech-to-Speech Translation

Overview

Meta's SeamlessM4T AI model enables direct speech-to-speech translation into 36 languages and supports speech-to-text and text-to-speech translation in over 100 languages.
The system bypasses traditional multi-step translation processes, reducing errors and improving efficiency in real-time communication.
SeamlessM4T achieves up to 23% higher accuracy than existing models in speech-to-speech tasks and is about 50% more resilient to background noise.
Data mining techniques that align audio with corresponding text from web sources allowed the model to train on 4.5 million hours of multilingual audio, including low-resource languages.
Meta has released the system as open source to encourage further research and development, though challenges remain with certain languages, accents, and expressivity in translations.
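
To see why bypassing a multi-step pipeline can reduce errors, consider a rough sketch of how mistakes compound across stages. The per-stage accuracies below are hypothetical values chosen only for illustration, not measurements of SeamlessM4T or any other system, and the calculation assumes that errors at each stage are independent, which is a simplification.

```python
from math import prod


def cascade_accuracy(stage_accuracies):
    """Chance that every stage of a pipeline is correct,
    assuming stage errors are independent (a simplifying assumption)."""
    return prod(stage_accuracies)


# Hypothetical per-stage accuracies, for illustration only:
# speech recognition -> text translation -> speech synthesis.
stages = [0.95, 0.90, 0.97]
cascaded = cascade_accuracy(stages)

# A hypothetical direct model is a single stage, so nothing compounds.
direct = cascade_accuracy([0.90])

print(f"cascaded pipeline: {cascaded:.3f}")
print(f"direct model:      {direct:.3f}")
```

Under these assumed numbers, the cascaded pipeline ends up less accurate than its weakest stage, because every stage inherits the mistakes of the one before it. A direct model avoids that compounding, though whether it comes out ahead in practice depends on how accurate the single model is.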