Text-to-Speech Engine — The core TTS workflow: paste text, select a voice, and generate studio-quality audio in seconds
250,000 characters per input on Free — handle long-form content like full chapters or scripts without splitting manually
Unlimited characters per generation on Pro — no restrictions on long-form documents, full audiobooks, or extended narration sessions
603+ Azure neural voices — access the full Microsoft Azure voice library including standard Neural and premium HD variants
80+ supported languages — including Hindi, Mandarin Chinese, Japanese, Arabic, German, French, Spanish, Portuguese, and all major Latin-script languages
Per-voice style selection — apply emotional registers like cheerful, serious, customerservice, narration, newscast, and more, depending on the selected voice
Prosody control — adjust speech rate, pitch, and volume with presets or fully custom values for precise delivery tuning
Full SSML editor — visual editor with one-click inserts for voice, style, prosody, language overrides, and role tags; raw SSML paste also supported with pre-submission validation
File import (Pro) — import directly from TXT, PDF, DOC, and DOCX formats; handles standard document formatting automatically
Export formats — generate output as MP3 or WAV files ready for immediate use in any downstream production tool
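To make the style, prosody, and validation features above concrete, here is a minimal Python sketch of the kind of SSML the editor might produce. The helper names (build_ssml, is_well_formed) are hypothetical and not part of the app; the element names follow Azure's documented SSML (speak, voice, mstts:express-as, prosody), and the voice name is only an example. The check shown is plain XML well-formedness, which is an assumption about what "pre-submission validation" covers at minimum.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"
MSTTS_NS = "https://www.w3.org/2001/mstts"


def build_ssml(text, voice, style=None, rate=None, pitch=None,
               volume=None, lang="en-US"):
    """Hypothetical helper: wrap text in voice, style, and prosody tags."""
    # Only emit the prosody attributes the caller actually set.
    prosody = {"rate": rate, "pitch": pitch, "volume": volume}
    attrs = "".join(
        f" {name}={quoteattr(value)}"
        for name, value in prosody.items()
        if value is not None
    )
    body = escape(text)  # escape &, <, > so the text cannot break the XML
    if attrs:
        body = f"<prosody{attrs}>{body}</prosody>"
    if style:
        # Styles are per-voice; an unsupported style is not caught here.
        body = f"<mstts:express-as style={quoteattr(style)}>{body}</mstts:express-as>"
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" '
        f'xmlns:mstts="{MSTTS_NS}" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice)}>{body}</voice></speak>"
    )


def is_well_formed(ssml):
    """Return True if the SSML parses as XML (a well-formedness check only)."""
    try:
        ET.fromstring(ssml)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    ssml = build_ssml("Welcome back & thanks for listening.",
                      "en-US-JennyNeural", style="cheerful", rate="+10%")
    print(ssml)
    print(is_well_formed(ssml))
```

Raw SSML pasted into the editor could be passed through the same well-formedness check before submission; whether a given style is available still depends on the selected voice.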
Voice Picker — A searchable, filterable grid of every available Azure voice with rich metadata and live audio previews
Multi-dimensional filters — filter voices by gender, language, country, speaker name, or personality characteristics (bright, calm, friendly, authoritative, etc.)
Per-voice style samples — hear each voice read sample phrases in different emotional styles before committing to a generation
Premium HD voice access — select Azure’s highest-quality HD voice variants for maximum realism
MultiTalker voices — use paired voice models (such as Ava & Andrew) for multi-speaker dialogue scenes within a single generation
Usage tracking — the app logs which voices you use most, helping you build a consistent brand voice over time
Speech Transcription Tool — Convert spoken audio to accurate text using Azure Speech-to-Text in real time or from file
Live microphone transcription — start, pause, and stop a live recording session with real-time text preview on screen
File upload transcription — upload existing audio in WAV, MP3, OGG, or FLAC formats for batch transcription
Automatic language detection — Azure automatically identifies the spoken language in most cases; manual language selection is also available
High-quality recording presets — save your microphone input as a 44.1 kHz MP3 or WAV file while the live transcript runs
Export options — copy the completed transcript directly to clipboard or save it as a plain text file
5 hours per month on Pro (BYOK Azure) — generous monthly allowance for regular transcription work within the Pro plan
AI Video Dubbing (Pro) — A full automated dubbing pipeline that replaces audio tracks with synthesized speech in any target language
End-to-end pipeline — the workflow runs as Upload → Translate → Generate → Download with visible progress indicators at each stage
Multi-language output — select any source language and any target language supported by Azure Video Translation
Optional subtitle track — generate and embed a subtitle file in the target language alongside the dubbed audio
Non-destructive processing — the original video file is always preserved; output is saved as a new file
Azure Blob Storage integration — requires Azure Speech Service plus Azure Blob Storage (guided 11-step setup included in-app)
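The four-stage dubbing flow described above can be pictured as a simple staged pipeline that reports progress before each step. The Python sketch below is illustrative only: the Stage enum, run_pipeline function, and the stub handlers are hypothetical names, and no real Azure calls are made.

```python
from enum import Enum
from typing import Callable, Dict, Optional


class Stage(Enum):
    # Stage names taken from the workflow: Upload -> Translate -> Generate -> Download
    UPLOAD = "Upload"
    TRANSLATE = "Translate"
    GENERATE = "Generate"
    DOWNLOAD = "Download"


Handler = Callable[[Optional[str]], str]
ProgressFn = Callable[[Stage, int, int], None]


def run_pipeline(handlers: Dict[Stage, Handler], on_progress: ProgressFn) -> str:
    """Run each stage in order, passing the previous stage's result along."""
    artifact: Optional[str] = None
    total = len(Stage)
    for index, stage in enumerate(Stage, start=1):
        on_progress(stage, index, total)  # e.g. update a progress indicator
        artifact = handlers[stage](artifact)
    assert artifact is not None
    return artifact


if __name__ == "__main__":
    # Stub handlers standing in for the real upload/translate/generate/download work.
    stubs: Dict[Stage, Handler] = {
        Stage.UPLOAD: lambda _: "blob://input.mp4",
        Stage.TRANSLATE: lambda src: f"{src}#translated",
        Stage.GENERATE: lambda src: f"{src}#dubbed",
        Stage.DOWNLOAD: lambda src: "output_dubbed.mp4",  # new file; original untouched
    }
    result = run_pipeline(
        stubs, lambda stage, i, n: print(f"[{i}/{n}] {stage.value}")
    )
    print(result)
```

Keeping the stages in a single ordered enum means the progress display and the execution order cannot drift apart.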
Video & Audio Downloader — Paste any supported video URL and download the file locally in your chosen quality
YouTube and common platform support — works with YouTube and other widely-used video hosting platforms
Up to 1440p video quality — download full high-resolution video for use in dubbing, editing, or archival
Audio-only MP3 extraction — extract the audio track from any supported video directly to MP3 without downloading the full video file
Free tier: 2 downloads/month — a visible download counter in the UI tracks your usage; unlimited downloads on Pro
History & Cost Tracking Dashboard — A comprehensive transaction log that keeps every generation accountable and your Azure spending transparent
Full activity log — every TTS generation, transcription session, dubbing job, and AI Improve Text call is recorded with full metadata
Per-row cost in USD — see the exact Azure cost of every individual action so there are no billing surprises at month-end
Sort and filter controls — sort by cost, voice, language, or timestamp; search through your full generation history
One-click re-run — replay any past generation instantly with the same settings without re-entering inputs
Log management — clear, filter, or delete individual entries to keep your history organized
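As a rough illustration of how per-row costs and sorting could work, the Python sketch below models a log entry and estimates a TTS cost from a character count. Everything here is an assumption for illustration: the LogEntry fields, the function names, and especially the price per million characters, which is a placeholder rather than a quoted Azure rate.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

# Placeholder rate for illustration only; check current Azure pricing.
ASSUMED_USD_PER_MILLION_CHARS = 16.0


@dataclass
class LogEntry:
    # Hypothetical row shape: one entry per logged action.
    timestamp: datetime
    action: str      # e.g. "tts", "transcription", "dubbing"
    voice: str
    language: str
    cost_usd: float


def estimate_tts_cost(characters: int,
                      usd_per_million: float = ASSUMED_USD_PER_MILLION_CHARS) -> float:
    """Estimate cost as characters / 1,000,000 * rate, rounded to 4 decimals."""
    return round(characters / 1_000_000 * usd_per_million, 4)


def sort_by_cost(entries: List[LogEntry], descending: bool = True) -> List[LogEntry]:
    """Return a new list ordered by per-row cost."""
    return sorted(entries, key=lambda e: e.cost_usd, reverse=descending)


def total_cost(entries: List[LogEntry]) -> float:
    """Sum the per-row costs for a running total."""
    return round(sum(e.cost_usd for e in entries), 4)


if __name__ == "__main__":
    log = [
        LogEntry(datetime(2024, 5, 1, 9, 0), "tts", "en-US-JennyNeural", "en-US",
                 estimate_tts_cost(12_500)),
        LogEntry(datetime(2024, 5, 1, 9, 30), "tts", "en-US-JennyNeural", "en-US",
                 estimate_tts_cost(250_000)),
    ]
    for entry in sort_by_cost(log):
        print(entry.timestamp.isoformat(), entry.voice, f"${entry.cost_usd:.4f}")
    print("total:", total_cost(log))
```

Sorting by voice, language, or timestamp would follow the same pattern with a different key function.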