flexCast turns a multi-speaker script into a finished podcast on your iPhone or iPad. Assign an AI voice to every character, generate speech and sound effects, regenerate the line that didn't land, and export a single M4A. No studio, no actors, no desktop software.
What flexCast is for
Most text-to-speech (TTS) tools start with one big text field and one voice. flexCast starts with a structured script and keeps every speaker, sound effect, and music cue tagged through the entire pipeline. The output sounds like a conversation because the input always knew it was one.
Each character maps to its own ElevenLabs voice with independent stability, similarity, style, and speed. The app handles the orchestration, batches dialogue calls for natural conversational flow, and never forces you to manage takes by hand.
Demo mode generates silent placeholder audio with realistic durations, so you can paste a script, parse it, assign voices, generate, edit, mix, and export — all without an ElevenLabs account. The only thing missing is the voices.
The workflow
A four-step progress bar walks you from raw text to a mixed M4A. Each tab unlocks as the project earns it, so you always know what to do next without reading a manual.
On the Script Import screen, paste any dialogue script. Expand the format guide if you want a quick reference. Tap `Review Script` and the parser runs.
Turns needing review are highlighted with confidence indicators. Tap one to confirm or reassign. Batch-assign unreviewed turns or merge duplicate speakers from the toolbar.
Open the voice library, search and preview, then map a voice to each speaker. Expand per-speaker settings to dial in stability, similarity, style, and speed. Quick Preview streams a sample of your cast before you commit.
A progress ring shows percentage, current turn, and estimated time remaining. Cancel any time without losing what's already generated. If something fails, `Resume` picks up from where it stopped.
Play each turn individually. Swipe to regenerate the ones that need work, compare variants side by side, adjust per-turn pauses, and exclude any segments that shouldn't ship.
All active, non-excluded turns mix into a single M4A. Listen to the full episode with the waveform scrubber, then share via the iOS share sheet — Files, AirDrop, email, anywhere audio goes.
Who it's for
Solo podcasters who want their show to sound like a conversation. Writers who want to hear a script before pitching it. flexCast was built for both — and for the e-learning designer building case-study dialogues at the same desk.
For solo podcasters
You write the dialogue. flexCast gives each speaker their own voice, switches voices whenever the speaker changes, and lets you fix one bad line without redoing the episode.
For audio dramatists & screenwriters
Hear your dialogue performed by distinct voices before you pitch it. Swap a character's voice in seconds, re-read a scene with different intent, and share a rough draft with collaborators.
Educators producing dialogue-based lessons sit somewhere in the middle. flexCast handles all three jobs with the same four-step workflow.
Features
Each one earns its place by removing a step the desktop workflow used to demand. The full list is on the features page; here are the ones people notice first.
The parser detects speakers, SFX, and music cues across colon (`HOST:`), bracket (`[Host]`), parenthesis (`(Host)`), and standalone-name formats, and assigns confidence scores to each attribution.
It reads the format you already write in, not the other way around.
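For the curious, here is a minimal Python sketch of how speaker detection across those four formats might work. The regular expressions, the confidence values, and the names `Attribution` and `parse_speakers` are illustrative assumptions, not flexCast's actual parser.

```python
import re
from typing import NamedTuple, Optional

class Attribution(NamedTuple):
    speaker: str
    text: str
    confidence: float  # illustrative score, not flexCast's real scale

# Cue tags such as [SFX: ...] and [Music: ...] are not dialogue; skip them here.
CUE = re.compile(r"^\[(SFX|Music):", re.IGNORECASE)

# Hypothetical patterns, one per delimited format, each with an assumed score.
PATTERNS = [
    (re.compile(r"^\[([^\]]+)\]\s*(.+)$"), 0.9),                 # [Host] text
    (re.compile(r"^\(([^)]+)\)\s*(.+)$"), 0.8),                  # (Host) text
    (re.compile(r"^([A-Za-z][\w .'-]{0,30}):\s*(.+)$"), 0.95),   # HOST: text
]
# A name alone on a line, written in capitals; the next line is its dialogue.
STANDALONE_NAME = re.compile(r"^[A-Z][A-Z .'-]{1,30}$")

def parse_speakers(script: str) -> list[Attribution]:
    turns: list[Attribution] = []
    pending: Optional[str] = None  # speaker seen on a standalone-name line
    for raw in script.splitlines():
        line = raw.strip()
        if not line or CUE.match(line):
            continue
        for pattern, confidence in PATTERNS:
            m = pattern.match(line)
            if m:
                turns.append(Attribution(m.group(1).strip(), m.group(2).strip(), confidence))
                pending = None
                break
        else:
            if STANDALONE_NAME.match(line):
                pending = line
            elif pending is not None:
                # Looser rule, so it gets a lower assumed confidence.
                turns.append(Attribution(pending, line, 0.6))
                pending = None
    return turns
```

In this sketch, lower-confidence attributions such as the standalone-name case are the ones a review step would want to surface first.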
Browse the ElevenLabs catalog with search, category and type filters, and paginated results. Inline play / stop preview on every row.
Audition voices the way you'd audition actors — by listening.
Write `[SFX: door slam (2s)]` directly in your script. flexCast generates the effect with a duration clamped between 0.5 and 30 seconds.
No separate sound library. No drag-and-drop timeline.
Write `[Music: upbeat jazz intro (10s)]` and the app generates it. Duration clamps between 3 seconds and 10 minutes; an instrumental-only flag is available.
An intro cue without a stock-music tab.
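As a rough illustration of the two tag formats and their limits, the Python sketch below parses a cue line and clamps its duration to the ranges stated above (0.5 to 30 seconds for SFX, 3 seconds to 10 minutes for music). The regular expression and the names `Cue`, `LIMITS`, and `parse_cue` are assumptions made for this example; what flexCast does when no duration is written is not documented here, so the sketch leaves it as `None`.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Duration limits in seconds, taken from the feature descriptions above.
LIMITS = {"sfx": (0.5, 30.0), "music": (3.0, 600.0)}

# Matches [SFX: prompt (2s)] or [Music: prompt (10s)]; the duration is optional.
TAG = re.compile(
    r"^\[(SFX|Music):\s*(.+?)(?:\s*\((\d+(?:\.\d+)?)s\))?\]$",
    re.IGNORECASE,
)

@dataclass
class Cue:
    kind: str                 # "sfx" or "music"
    prompt: str               # the text description of the sound
    seconds: Optional[float]  # clamped duration, or None if not written

def parse_cue(line: str) -> Optional[Cue]:
    m = TAG.match(line.strip())
    if m is None:
        return None  # not a cue tag
    kind = m.group(1).lower()
    seconds = None
    if m.group(3) is not None:
        low, high = LIMITS[kind]
        seconds = min(max(float(m.group(3)), low), high)
    return Cue(kind, m.group(2).strip(), seconds)
```

Under these assumptions, `[SFX: long rain (45s)]` would come back with a 30-second duration and `[Music: sting (1s)]` with 3 seconds.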
With no API key configured, a mock TTS service returns silent WAV audio with realistic durations. Every screen — import, review, voices, generate, edit, mix, export — works end to end.
Learn the entire workflow before spending a dollar.
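A mock service like this can be approximated in a few lines. The Python sketch below builds a silent WAV whose length is estimated from the word count. The sample rate, the 150-words-per-minute pace, and the function names are assumptions for illustration; the source does not say how flexCast estimates durations.

```python
import io
import wave

SAMPLE_RATE = 22_050      # assumed; any PCM rate works for silence
WORDS_PER_MINUTE = 150    # assumed speaking pace

def estimate_seconds(text: str) -> float:
    """Estimate spoken duration from word count, with a half-second floor."""
    words = len(text.split())
    return max(0.5, words * 60.0 / WORDS_PER_MINUTE)

def silent_wav(text: str) -> bytes:
    """Return 16-bit mono WAV bytes of silence sized to the estimated duration."""
    frames = int(estimate_seconds(text) * SAMPLE_RATE)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)          # 2 bytes per sample = 16-bit PCM
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(b"\x00\x00" * frames)
    return buffer.getvalue()
```

Because the placeholder has a plausible length, downstream steps such as pause adjustment and mixing can be exercised without any real speech.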
Swipe a turn or use its context menu to regenerate it. The new take is saved as an additional variant — your previous take is never overwritten.
Fix one line. Leave everything else exactly where it was.
Each turn can have multiple takes. Browse them, play them back to back, mark one as active, and delete the rest. `Keep Only Active` cleans up across the whole project.
Pick the read you want. The rest go quietly.
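One way to picture the variant model is a turn that holds a list of takes plus an index for the active one. The Python sketch below is a hypothetical data structure, not flexCast's internal model; in particular, making a new take active as soon as it is added is an assumption.

```python
from dataclasses import dataclass, field

@dataclass
class Turn:
    text: str
    variants: list[str] = field(default_factory=list)  # e.g. audio file names
    active: int = -1                                   # index into variants

    def add_take(self, audio: str) -> None:
        """Regenerating appends a take; earlier takes are never overwritten."""
        self.variants.append(audio)
        self.active = len(self.variants) - 1  # assumption: newest take becomes active

    def set_active(self, index: int) -> None:
        if not 0 <= index < len(self.variants):
            raise IndexError("no such variant")
        self.active = index

    def keep_only_active(self) -> None:
        """Drop every take except the active one."""
        if self.variants:
            self.variants = [self.variants[self.active]]
            self.active = 0

def keep_only_active(turns: list[Turn]) -> None:
    """Project-wide cleanup: apply the per-turn cleanup to every turn."""
    for turn in turns:
        turn.keep_only_active()
```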
Stream the first ~2,000 characters of dialogue with assigned voices via the ElevenLabs dialogue API. A segment timeline shows colored bars per speaker.
Catch a miscast voice before you spend a full generation run.
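Selecting the text for such a preview could look like the Python sketch below, which takes whole turns until a roughly 2,000-character budget is used up. Cutting at turn boundaries, truncating an oversized first turn, and the name `preview_turns` are assumptions; the source only says the first ~2,000 characters are streamed.

```python
PREVIEW_BUDGET = 2000  # approximate character budget from the description above

def preview_turns(
    turns: list[tuple[str, str]], budget: int = PREVIEW_BUDGET
) -> list[tuple[str, str]]:
    """Pick (speaker, text) pairs from the start of the script within the budget."""
    selected: list[tuple[str, str]] = []
    used = 0
    for speaker, text in turns:
        remaining = budget - used
        if remaining <= 0:
            break
        if len(text) > remaining:
            # Keep at least something to preview if the first turn is too long.
            if not selected:
                selected.append((speaker, text[:remaining]))
            break
        selected.append((speaker, text))
        used += len(text)
    return selected
```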
Where flexCast fits
flexCast does one thing well: turn a multi-speaker script into produced audio through a guided workflow. If you need waveform editing, beat matching, or multi-track mixing, a desktop DAW is the right tool. If you need a single voice reading a single block of text, almost anything will do.
| What it does | flexCast | Generic TTS apps | Desktop DAWs |
|---|---|---|---|
| Multi-speaker in one project | Yes — each character maps to its own voice | Usually single-voice | No built-in TTS |
| Script parsing | Automatic, with confidence scoring | Manual text entry | N/A |
| SFX & music generation | Inline tags generate audio in sequence | Not available | Manual import |
| Post-production editing | Per-turn regen, variants, pause control | Not available | Full waveform editing (complex) |
| Platform | Native iOS — iPhone & iPad | Mixed (web, desktop, mobile) | Desktop or iPad |
| Try before paying | Demo mode runs the full workflow offline | Usually requires login | Free / one-time purchase |
flexCast depends on ElevenLabs for voice generation. Audio quality and available voices are determined by that service. The app adds value through script intelligence, workflow structure, post-production controls, and a native mobile experience — not by training its own voice models.
Questions we get a lot
flexCast is an iOS app that turns multi-speaker scripts into produced podcast audio. Paste a dialogue script, assign AI voices to each character, generate speech and sound effects, then mix and export the result, all on iPhone or iPad.
You need an ElevenLabs account only for real audio generation: flexCast uses your own ElevenLabs API key (a free tier is available at elevenlabs.io). Demo mode works with no account and generates silent placeholder audio, so you can explore every feature first.
Script import, parsing, and review work offline. Audio generation requires an internet connection to reach the ElevenLabs API. Demo mode works fully offline.
You can regenerate a single line without redoing the episode. Swipe a turn in post-production or use its context menu. The new take is saved as a variant, and your previous take isn't overwritten.
The final mix is exported as an M4A (AAC) file via the iOS share sheet.
Your ElevenLabs API key is stored in the iOS Keychain — the same secure storage iOS uses for passwords. It is never written to a plain file or sent anywhere other than the ElevenLabs API.
Send a note — tell us what you're trying to make, what stopped you last time, and whether you're a podcaster, a writer, or something we haven't named. We'll add you to the list and reply to anything that's not "thanks."
No waitlist form. No "thanks for subscribing" auto-reply. Just an inbox with humans in it.