AI audio

77 sites in total

Voxtral TTS

Mistral AI's open-source, low-latency text-to-speech model. It supports cross-lingual voice cloning, reports latency as low as 70 ms, and can be deployed on edge devices.

AI speech generation Open Source Project # Open Source # Text-to-speech

Lyria 3 Pro

Google's AI music generation tool. It generates 3-minute tracks, accepts multimodal input, and emphasizes copyright compliance and high audio quality across a wide range of music-creation scenarios.

AI music composition # Music Generation

PrismAudio

Alibaba's video-to-audio generation framework. It combines chain-of-thought reasoning with reinforcement learning to keep generated audio tightly synchronized with the video, and efficiently produces ambient sound effects for film and television, games, short videos, and other content.

AI speech generation # Audio Generation

AudioPod AI

An AI audio creation tool that offers one-click voice cloning, noise reduction, and translation. It generates professional content in 3 minutes and supports 21 languages for reaching a global audience.

AI speech generation AI Audio Processing # Digital Avatar

BeFreed

A smart learning tool that uses AI to deliver personalized audio learning content. It supports real-time Q&A and smart recommendations, helping users learn in short spare moments.

AI speech generation # Podcasting Tools # Audio Summary

SAM Audio

Meta's unified multimodal audio separation model, described as the world's first. It uses text, visual, and temporal prompts to isolate target sounds from complex audio and video.

AI Sound Separation Open Source Project # Audio Separation

Voquill

An open-source voice input tool that supports multiple languages and intelligent text cleanup, and claims to make input several times faster. It balances local privacy with cloud convenience for productivity-focused professionals.

AI Audio Processing Open Source Project # Voice Input

SongBloom

An open-source song generation model developed jointly by Tencent AI Lab and partners. It turns 10 seconds of reference audio plus lyrics into 2 minutes 30 seconds of high-quality music, at a level described as comparable to commercial systems.

AI music composition Open Source Project # Song Generation

UntitledPen

A full-featured AI creation platform that integrates intelligent writing, multilingual speech generation, and audio editing, giving users a one-stop solution for text creation, voice customization, and post-production.

AI speech generation # AI Creation # Writing Assistance # Speech Generation

Meloflow

An AI-driven music generation platform that generates, extends, covers, and adds tracks to music from text, lyrics, or audio, for quickly creating original music that can be used commercially.

AI music composition # Music Composition

Qwen3-ASR-Flash

Alibaba's multilingual, high-accuracy speech recognition model. It handles complex acoustic scenes, dialects, and song transcription, and its recognition can be customized with user-supplied context.

AI speech generation # Speech Recognition

Roark AI

Quality assurance and observability tools built specifically for voice AI systems. They provide automated testing, real-time monitoring, and intelligent feedback to keep voice AI output high quality and operation stable.

AI Audio Processing # Voice quality

MAI-Voice-1

Microsoft's efficient speech generation model. It generates natural, fluent, high-fidelity audio in seconds and is already used for news broadcasts, podcasts, and Copilot voice interaction.

AI speech generation # Speech Generation

TemPolor

An AI music generation platform that accepts text, audio, image, or video input and generates copyright-free original music in multiple styles with one click, suitable for commercial use. It lets creators and developers create and integrate music efficiently.

AI music composition # Music Generation

KittenTTS

An open-source, lightweight text-to-speech model under 25 MB that runs in real time on ordinary CPUs, offers a variety of natural-sounding voices, and works offline.

AI speech generation Open Source Project # TTS # Video Generation

AnyVoice

An AI tool that generates high-fidelity, multilingual voice clones from just 3 seconds of audio, letting content creators quickly produce voiceovers and personalize their delivery.

AI voice cloning # Voice Cloning

Mozart AI

An app that uses AI to turn images, text, or speech into personalized music, making music creation easy.

AI music composition # Music Composition

VoiSpark

An AI speech generation tool that supports text-to-speech, voice cloning, and voice changing for creating high-quality voice content.

AI speech generation # Speech Generation

Vocloner

An AI voice cloning tool that quickly generates realistic, personalized voices for content creation, language learning, and entertainment, helping users produce high-quality voice content efficiently.

AI voice cloning # Voice Cloning

AiMakeSong

An AI music creation tool that requires no prior experience: type in text or lyrics to generate high-quality original songs that can be used commercially.

AI music composition # Music Composition

Lyria 2

Google DeepMind's AI music engine, which aims to reshape professional music creation with multimodal generation and real-time interaction.

AI music composition # Music Composition

Image to Music

An AI music creation tool that uses image recognition to turn user-uploaded images into music in multiple styles.

AI music composition # Image to Music

Transkriptor

An AI tool for automatic transcription and translation that supports more than 100 languages and suits meetings, courses, and similar scenarios. Its simple interface and fast processing help teams collaborate more efficiently.

AI Sound Separation AI efficiency tools # Audio and video transcription

Mureka O1

KunlunWanwei's large model for music reasoning, described as the world's first to use chain-of-thought technology. It supports multi-style and emotion-aware music generation, reference-song input, and timbre cloning with low latency and high quality, and offers an API for enterprises and developers to integrate.

AI music composition Large Model # Musical Reasoning

Narakeet

An AI text-to-speech and video dubbing tool with multilingual, multi-voice support for video narration, spoken PowerPoint presentations, and subtitle generation. It is easy to use and produces natural-sounding voices.

AI speech generation # Video Dubbing # Speech Generation

Noiz AI

A text-to-speech and video dubbing tool whose in-house voice models produce high-quality, emotionally expressive speech for a wide range of content creation.

AI speech generation # Video Dubbing # Speech Generation

MiniMax Audio

MiniMax's AI speech synthesis tool, built on its T2A-01 speech model. It supports multiple languages, a choice of voices, and advanced parameter control.

AI speech generation # Speech Synthesis

YueLu

An AI voice tool that supports recording-to-text transcription, speaker diarization, smart summaries, and more, for study, work, and everyday use.

AI speech generation # AI Voice

MakeBestMusic

An AI music generation platform that turns users' ideas into high-quality music in multiple styles.

AI music composition # Music Composition

TurboScribe

An AI tool for fast, accurate transcription of audio and video to text, with support for multiple languages and output formats.

AI Audio Processing # Transcription Tools