Mimi: The Neural Audio Codec Behind Speech LLMs

Mimi is a low-bitrate neural audio codec designed to tokenize speech for large language models, enabling real-time speech generation and the next wave of voice AI systems such as Moshi.