73 features shipped · pre-release

Every feature,
in the order you'll meet it.

All documented in the current pre-release build, all grouped by what they help you do — script handling, voices, generation, post-production, export, shows, AI writing, settings, and Studio.

Script

How text gets into the app. Paste any of four common script formats, let the parser detect speakers and audio cues, and review only the lines it wasn't sure about.

Script import

Paste or type a script. The import screen ships with a collapsible format guide and a built-in sample script for first-time users.

Start with text. End with audio. Nothing in between is your problem.

Automatic script parsing

The parser detects speakers, SFX, music cues, scene markers, and chapter markers. It understands four speaker formats: colon (`HOST:`), bracket (`[Host]`), parenthesis (`(Host)`), and standalone-name. It also reads `[SCENE:]`, `[CHAPTER:]`, `[ACT:]`, and `[PART:]` structure tags.

It reads the format you already write in, not the other way around.
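
For a sense of what that detection could look like, here is a minimal Python sketch. It is an illustration under stated assumptions, not flexVox's actual parser: the regular expressions, the `classify_line` name, and the rule that a standalone name is a short all-caps line are all hypothetical.

```python
import re

# Hypothetical patterns; flexVox's real parsing rules are not documented here.
STRUCTURE = re.compile(r"^\[(SCENE|CHAPTER|ACT|PART):\s*(.*?)\]$", re.IGNORECASE)
CUE = re.compile(r"^\[(SFX|Music):\s*(.*?)\]$", re.IGNORECASE)
COLON = re.compile(r"^([A-Za-z][\w .'-]{0,30}):\s+(.+)$")
BRACKET = re.compile(r"^\[([^\]:]+)\]\s+(.+)$")
PAREN = re.compile(r"^\(([^)]+)\)\s+(.+)$")
STANDALONE = re.compile(r"^[A-Z][A-Z .'-]{0,30}$")


def classify_line(line):
    """Return a (kind, label, text) tuple for one script line."""
    line = line.strip()
    m = STRUCTURE.match(line)
    if m:
        return ("structure", m.group(1).upper(), m.group(2))
    m = CUE.match(line)
    if m:
        return ("cue", m.group(1).upper(), m.group(2))
    # Speaker formats are tried in a fixed order: colon, bracket, parenthesis.
    for pattern in (COLON, BRACKET, PAREN):
        m = pattern.match(line)
        if m:
            return ("speech", m.group(1).strip(), m.group(2))
    if STANDALONE.match(line):
        # Assumed: a short all-caps line names the speaker of the next line.
        return ("speaker", line, "")
    return ("text", "", line)
```

Structure and cue tags are checked before the bracket speaker format so that `[SCENE: Title]` is never mistaken for a speaker named "SCENE: Title".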

Confidence-scored review

Every detected turn shows its speaker with a green / orange / red confidence indicator. Lines that need attention are highlighted; the rest you scroll past.

Review only the lines the parser wasn't sure about.

Speaker management

Add new speakers, merge duplicates, batch-assign unreviewed turns, and give each character a unique color badge for visual distinction.

Two slightly different spellings of `HOST` collapse into one in a single tap.

Scene and chapter support

Use `[SCENE: Title]`, `[CHAPTER: Title]`, `[ACT: Title]`, or `[PART: Title]` tags to structure a script. Scenes appear as collapsible sections, can be added manually, and turns can move between scenes.

Your audio drama can keep its shape instead of becoming one long list.

Turn insertion and deletion

Insert dialogue, SFX, or music turns at any point from review. Assign the new turn to a speaker and scene, or delete turns from the context menu.

A missing beat does not send you back to a blank import screen.

Script editing with re-parse

Edit the raw script after import. An audio-tag reference panel makes inserting SFX and music cues quick. Re-parsing replaces existing speaker assignments and generated audio (with a confirmation warning).

Tweak the dialogue without leaving the project.

Voices

How characters get voices. Browse the ElevenLabs catalog, audition voices inline, map one to each speaker, and dial in stability, similarity, style, and speed per character.

Voice library

Browse the ElevenLabs catalog with search, category and type filters, and paginated results. Inline play / stop preview on every row.

Audition voices the way you'd audition actors — by listening.

Voice mapping

Map each speaker to a TTS voice. `Cast All Speakers` opens a casting session with speaker progress, visible assignments, auto-advance, and inline AI casting suggestions.

You can cast a whole script without losing your place.

Auto-Cast

Assign voices to every unvoiced speaker in one tap. flexVox analyzes each speaker's role, generates a search query from an AI archetype suggestion, and selects the best-match voice while preserving manual assignments.

Start with a plausible cast, then change only what needs changing.

Per-speaker voice settings

Independent sliders for stability, similarity boost, style exaggeration, and speed, plus a speaker boost toggle. V3 models show a simplified panel.

Two characters using the same voice can still sound like different people.

Voice detail & similar voices

Per-voice metadata, verified languages, category, and a `Find Similar` feature that analyzes the voice's preview audio and returns related options from the catalog.

Almost the right voice? Find the cousin.

Pronunciation dictionary

Per-project rules for tricky names and technical terms. Alias replacements or phoneme overrides in IPA or CMU Arpabet. Synced to ElevenLabs before generation.

Your guest's name pronounced the same way every time.

Generation

How the script becomes audio. Speech, sound effects, and music all generate in one run — with progress tracking, resume-on-failure, and an offline demo mode for the whole pipeline.

Per-turn generation

Each speech turn is generated with the assigned voice profile and contextual parameters (previous / next text, request-ID history) for voice continuity. SFX and music turns hit dedicated endpoints.

The voice doesn't forget who it just was.

Dialogue generation

Batch multiple speech turns into a single multi-speaker call for natural conversational flow. The app respects a 2,000-character and 10-unique-voice limit per batch and splits larger scripts automatically.

Conversations sound like conversations, not stitched-together monologue.
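
The splitting rule can be sketched in a few lines of Python. This is an illustration only, not the app's implementation: the `split_into_batches` name, the `(voice_id, text)` turn shape, counting characters on the text alone, and the handling of an oversized single turn are all assumptions.

```python
MAX_CHARS = 2000   # per-batch character limit stated above
MAX_VOICES = 10    # per-batch unique-voice limit stated above


def split_into_batches(turns):
    """Group (voice_id, text) turns into batches that respect both limits.

    Assumption: a single turn longer than MAX_CHARS is placed in a batch of
    its own rather than being split mid-sentence.
    """
    batches = []
    current, chars, voices = [], 0, set()
    for voice_id, text in turns:
        too_long = chars + len(text) > MAX_CHARS
        too_many = len(voices | {voice_id}) > MAX_VOICES
        # Close the running batch before a turn that would break a limit.
        if current and (too_long or too_many):
            batches.append(current)
            current, chars, voices = [], 0, set()
        current.append((voice_id, text))
        chars += len(text)
        voices.add(voice_id)
    if current:
        batches.append(current)
    return batches
```

Turns stay in script order, so a batch boundary only ever falls between two turns.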

SFX generation

Write `[SFX: door slam (2s)]` directly in your script. flexVox generates the effect with a duration clamped between 0.5 and 30 seconds, with optional seamless-loop requests for ambient beds.

No separate sound library just to make the door slam happen.

Music generation

Write `[Music: upbeat jazz intro (10s)]` and the app generates it. Duration clamps between 3 seconds and 10 minutes; an instrumental-only flag is available.

An intro cue without a stock-music tab.
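
As a rough illustration of the two cue tags and their clamps, here is a Python sketch. The clamp ranges come from the descriptions above; the regular expression, the `parse_cue` name, and returning `None` for a missing duration are assumptions, not documented flexVox behavior.

```python
import re

# Clamp ranges in seconds, taken from the feature descriptions above.
LIMITS = {"SFX": (0.5, 30.0), "MUSIC": (3.0, 600.0)}

# Hypothetical pattern: "[Kind: description (Ns)]" with an optional duration.
CUE = re.compile(
    r"^\[(SFX|Music):\s*(.+?)\s*(?:\((\d+(?:\.\d+)?)s\))?\]$", re.IGNORECASE
)


def parse_cue(tag):
    """Return (kind, description, duration) or None if the tag does not match.

    duration is None when the tag carries no "(Ns)" suffix.
    """
    m = CUE.match(tag.strip())
    if not m:
        return None
    kind = m.group(1).upper()
    duration = None
    if m.group(3) is not None:
        low, high = LIMITS[kind]
        # Clamp the requested duration into the allowed range for this kind.
        duration = min(max(float(m.group(3)), low), high)
    return (kind, m.group(2), duration)
```

Under these assumptions, `[SFX: thunder (45s)]` would be requested at 30 seconds and `[Music: sting (1s)]` at 3 seconds.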

Background music generation

Add a background music layer that spans the whole episode. Describe the mood and style; flexVox generates a track sized to the total dialogue duration and gives the project its own volume control.

A soundtrack that fits the episode instead of the other way around.

Real-time progress

An animated circular progress ring shows percentage, current turn, estimated time remaining, and elapsed time. Cancel at any point — already-generated audio is preserved.

You can see how far along it is. You can also walk away.

Resume after failure

If generation is interrupted or cancelled, the app detects which turns already have audio and offers a one-tap resume that skips completed turns.

A dropped connection costs you a tap, not the whole run.

Audio quality validation

Every generated file is checked for corruption, silence, and minimum duration. Files with detected issues are flagged with a warning badge in post-production.

The app notices the broken take before you do.

Demo mode

With no API key configured, a mock TTS service returns silent WAV audio with realistic durations. Every screen — import, review, voices, generate, edit, mix, export — works end to end.

Learn the entire workflow before spending a dollar.

Post-production

How a single bad take stops being a problem. Regenerate one line at a time, compare variants side by side, dial in pauses, exclude segments, and edit dialogue inline.

Turn-by-turn review

Play each segment individually. Simple and advanced modes are available, with an interactive mini timeline, current playback time, and `Play from Here` for continuous sequential playback.

Find the line that's off without scrubbing a 30-minute mix.

Single-turn regeneration

Swipe a turn or use its context menu to regenerate it. The new take is saved as an additional variant — your previous take is never overwritten.

Fix one line. Leave everything else exactly where it was.

Variant management

Each turn can have multiple takes. Browse them, play them back to back, mark one as active, and delete the rest. `Keep Only Active` cleans up across the whole project.

Pick the read you want. The rest go quietly.

Pause adjustment

Two levels of control: a project-wide default and per-turn custom pauses dialed in with sliders from 0 to 10 seconds.

Comic timing without a waveform editor.

Turn exclusion

Exclude individual turns from the final mix without deleting them. Excluded turns appear with strikethrough text and are skipped during mixing — toggle them back any time.

Cut a beat without losing the option to put it back.

Text editing with regeneration

Edit a turn's dialogue text directly from post-production. After saving, optionally regenerate the audio immediately to match.

A typo in the script doesn't mean a trip back to import.

Underlay mode

Mark music or SFX turns to play underneath dialogue instead of sequentially. Underlay turns live on a dedicated audio track with per-turn volume control.

Rain, crowds, and engine hum can stay present while people talk.

Auto-ducking

Underlay and background music automatically lower when dialogue plays, then rise during pauses. Duck depth, attack, and release controls shape the curve.

Music gets out of the way without you drawing automation.
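
To show how depth, attack, and release could shape that curve, here is a simplified Python sketch of a ducking envelope. It uses a linear ramp as a stand-in technique; the `duck_envelope` name, the default values, and the fixed time step are all hypothetical and say nothing about how flexVox computes its gain.

```python
def duck_envelope(dialogue_active, depth_db=-12.0, attack=0.1, release=0.5, step=0.05):
    """Return one linear gain value per time step for the music track.

    dialogue_active: list of booleans, one per `step` seconds.
    depth_db: how far to lower the music while dialogue plays.
    attack / release: seconds to ramp fully down / fully back up.
    """
    ducked = 10 ** (depth_db / 20.0)          # convert dB to linear gain
    down = (1.0 - ducked) * step / attack     # gain change per step when ducking
    up = (1.0 - ducked) * step / release      # gain change per step when recovering
    gain, out = 1.0, []
    for active in dialogue_active:
        if active:
            gain = max(ducked, gain - down)
        else:
            gain = min(1.0, gain + up)
        out.append(gain)
    return out
```

A short attack pulls the music down quickly when someone starts talking; a longer release lets it rise gently in the pause that follows.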

Playback & export

How the episode leaves the app. Mix the finished audio, follow along with the script, export for specific platforms, and include transcript or caption files.

Audio mixing

Mix dialogue, crossfade overlay, ducked underlay, and background music into a single M4A / AAC file with configurable pauses and peak normalization.

One file. Ready to share.

Follow-along playback

The playback screen shows the script with timestamps, speaker badges, active-turn highlighting, auto-scroll, tap-to-seek, and word-level highlighting when alignment data is available.

Find a mispronunciation by reading along, not scrubbing blindly.

Mini player bar

When a project has completed audio, the Script and Production tabs show a compact bar with project name, play / pause, progress, and current time. Tapping it opens Export.

The mix stays one tap away while you keep editing.

Quick preview

Stream the first ~2,000 characters of dialogue with assigned voices via the ElevenLabs dialogue API. A segment timeline shows colored bars per speaker.

Catch a miscast voice before you spend a full generation run.

Audio export

Export from the playback screen or with Cmd+E. Choose Spotify, Apple Podcasts, YouTube, Broadcast, or Custom loudness presets, then share audio plus SRT, VTT, JSON, or plain-text transcripts.

One export sheet for the file and the words that go with it.
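
For readers who have not met SRT before, here is a small Python sketch of how timed turns map onto that format. The `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line is standard SRT; the `to_srt` name, the tuple layout, and prefixing each caption with the speaker name are assumptions, not a description of flexVox's export.

```python
def srt_timestamp(seconds):
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(turns):
    """Build SRT text from (start, end, speaker, text) tuples, times in seconds."""
    blocks = []
    for index, (start, end, speaker, text) in enumerate(turns, start=1):
        # Each cue: sequence number, timing line, caption text, blank separator.
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{speaker}: {text}\n"
        )
    return "\n".join(blocks)
```

VTT uses a similar layout with a `WEBVTT` header and a period instead of a comma before the milliseconds.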

Project & settings

How multiple projects coexist. Create, rename, duplicate, template, and search projects while the Script / Production / Export workflow shows what is ready.

Multi-project support

Create, rename, duplicate, and delete projects. The list is searchable and sorted by most recently updated.

Three episodes in flight, one app, no folder chaos.

Guided three-step workflow

Script. Production. Export. A three-step progress bar shows readiness while keeping every tab accessible. Generation progress follows you with a tap-to-view banner.

Guidance without locking you out of the work.

Project duplication

Deep-copy a project including script, turns, speakers, voice profiles, and pronunciation rules. Audio is intentionally not copied — the duplicate is a fresh starting point.

Spin up a new variant of an episode without rebuilding the cast.

Sample project

A pre-populated sample script is one tap away from the empty-state screen.

Onboard yourself in a minute, not a manual.

Project templates

Start from Interview, Audio Drama, True Crime, Newscast, or Narration templates. The first-launch empty state presents template cards before a blank page.

The first project can start with a format, not a cursor.

API key management

Enter and store an ElevenLabs API key in the iOS Keychain. Remove it any time. The app clearly displays whether it's running in API or demo mode.

Your key sits in the same vault as your bank passwords.

Connection testing & model picker

Test the saved key by fetching available voices. Pick the default TTS model from Multilingual v2, Turbo v2.5, Turbo v2, English v1, or v3.

You know the key works before you start a 40-minute run.

Shows & series

How ongoing productions keep their identity. Define a cast bible, show style, audio identity, and episode structure once, then reuse them across episodes.

Shows and series

Create ongoing productions with persistent cast, format, tone, narrator mode, episode numbering, and reusable audio identity.

Define the show once. Start each episode with the bones already in place.

Cast bible

Recurring and guest cast members can carry role, age, biography, personality, speaking style, and optional voice assignment into every new episode.

Characters stay consistent because their notes travel with them.

Episode templates

Define reusable show segments such as Intro, Main Topic, Listener Q&A, and Outro. Segments are reorderable and can include descriptions.

A recurring show can keep its rhythm without copy-paste setup.

Show style and audio identity

Set a production's default format, tone, narrator mode, intro music, outro music, and transition sounds.

A show can sound like itself before the next script is pasted.

Promote to series

Convert any standalone project into a show from the project list. Speakers, voice assignments, and character profiles become the production cast.

When one episode becomes a series, the app moves with you.

Episode creation

Create new episodes from a production with the show's cast, voice assignments, character profiles, narrator mode, and season numbering already applied.

Episode three starts where episode two left the setup.

AI script writing

How a premise becomes production text. Generate scripts with OpenAI or Claude, use character profiles and reference material, or copy the generated prompt to another tool.

AI script generation

Generate scripts in-app with OpenAI or Claude. Configure format, tone, speaker count, scenes, chapters, expression tags, SFX, music, and provider before generation.

Go from premise to production-ready script without leaving the project.

Character profiles for AI

Speaker profiles feed role, age, biography, personality, and speaking style into AI script generation for more distinct characters.

The generated dialogue has more to work from than a name.

Reference materials

Upload text reference materials such as research, outlines, and articles. flexVox sends them as context without asking the model to copy them verbatim.

Give the writer context, not a blank prompt.

Narrator modes

Choose Full Cast, Single Narrator, or Narrator + Cast. The mode affects both AI script generation and voice mapping.

A documentary, monologue, and drama do not need the same cast logic.

Sound library

How reusable audio assets stay reusable. Import, tag, search, and assign sound effects or music without losing the original library item.

Sound library

Import reusable M4A, MP3, WAV, and AIFF sound effects or music into a global library that persists across projects.

Your best sounds become assets, not one-off imports.

Sound tagging and search

Tag sounds with comma-separated keywords, search by name or tag, and filter by category.

Find the rain bed before the scene dries out.

Sound assignment

Assign library sounds to SFX and music turns in post-production. The file is copied into the project's audio assets while the library source remains reusable.

Reuse the cue without tying projects together.

Help & guidance

How the app teaches without taking over. Searchable help lives in Settings, while contextual tips appear where they are useful and stay dismissed once closed.

In-app help guide

A searchable guide in Settings covers scripts, voices, generation, post-production, shows and series, and advanced topics.

The manual lives where the questions happen.

Contextual tips

Dismissible tips appear at key workflow points, then stay dismissed once closed.

Guidance shows up once, then gets out of the way.

Settings

How service credentials and defaults are managed. Store API keys in Keychain, test every provider, choose TTS defaults, and manage subscription state from a split-pane settings sheet.

Split-pane settings

Settings uses a spacious NavigationSplitView with categories for Subscription, API Keys, Text-to-Speech, AI Writing, Display, Data & Sync, and About.

Serious controls do not have to feel buried.

Centralized API keys

ElevenLabs, OpenAI, and Claude keys live in one secure section with save, test, status, and remove controls for each provider.

Every credential has one obvious home.

Connection testing

Test ElevenLabs by fetching available voices; test OpenAI and Claude with a round-trip API call. Status appears inline.

You know a key works before your episode depends on it.

TTS model selection

Choose the default ElevenLabs model and text normalization settings from a dedicated Text-to-Speech section.

Voice defaults are separate from key management.

Interface

The parts you feel before you read them. Haptics, toast feedback, keyboard shortcuts, command palette, empty states, skeleton loading, animation, and database recovery.

Haptic feedback

Light taps for selections, medium impacts for state changes, ticks for slider adjustments, and notification haptics for generation milestones and errors.

The phone confirms what just happened without asking your eyes.

Toast notifications

Ephemeral feedback at the top of the screen for assignments, duplications, regeneration completion, and errors. Four styles, matching icons, VoiceOver-accessible.

Confirmation when you need it. Silence when you don't.

Database recovery

If the local SwiftData store is corrupted on launch, the app attempts recovery in three tiers: normal open, delete-and-retry, and in-memory fallback. Users are notified of any data reset.

A bad row doesn't take the app down with it.
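
The three tiers read naturally as a fallback chain. The Python sketch below shows the control flow only; the app itself is built on SwiftData, and the `open_with_recovery` name and its three callables are hypothetical stand-ins rather than real API.

```python
def open_with_recovery(open_store, delete_store, open_in_memory):
    """Try the three tiers in order and report whether data was reset.

    Returns (store, data_was_reset). The three callables are hypothetical
    stand-ins for whatever the persistence layer provides.
    """
    try:
        return open_store(), False            # tier 1: normal open
    except Exception:
        pass
    try:
        delete_store()                        # tier 2: delete and retry
        return open_store(), True
    except Exception:
        pass
    return open_in_memory(), True             # tier 3: in-memory fallback
```

The second value is what would drive the notification: it is true whenever the user's stored data did not survive the launch.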

Command palette

Press Cmd+K to search actions, projects, and shows. Results are grouped and keyboard navigable.

Power users can jump instead of tapping around.

Keyboard shortcuts

Use Cmd+1/2/3 for workflow tabs, Cmd+G for Production, Cmd+E for Export, Space for playback, arrows for skipping, Cmd+R for regeneration, and M to mute.

The iPad keyboard gets treated like a first-class input device.

Consistent empty states

No projects, no shows, no sounds, failed searches, and unready tabs share a unified branded empty-state component.

Blank spaces explain themselves without turning into clutter.

Skeleton loading placeholders

List views use shimmer rows while content is being prepared.

Waiting states feel intentional, not broken.

Staggered animations

List rows animate in with a staggered fade-and-slide entrance across the app.

Motion gives the interface a little clarity without slowing it down.

Studio

How the free app grows. The full workflow starts free; Studio unlocks scale, show management, AI writing, advanced mixing, export presets, and unlimited projects.

Free tier

Up to three projects, the complete script workflow, voice browsing and assignment, per-turn generation with your own API key, post-production basics, follow-along playback, Podcast export, and demo mode.

The main workflow is real before anyone pays.

flexVox Studio

Studio unlocks unlimited projects, Shows and Series, Auto-Cast, dialogue generation mode, background music, underlay and auto-ducking, export presets, AI writing, Sound Library, pronunciation dictionary, templates, and pacing reports.

The upgrade is for scale and polish, not for making the app usable.

Subscription management

Settings shows subscription status, purchase, restore, and transaction updates through StoreKit 2.

The subscription state is visible and recoverable.

Feature gating

Studio-only features display a small badge and open the upgrade sheet with the relevant feature highlighted.

Locked features explain what they are before asking for money.

What we don't do

Features only count if the app isn't quietly cashing in on you.

Native iOS

SwiftUI, SwiftData, Keychain. No web view. No cross-platform compromise.

No telemetry

No analytics SDKs. No tracking. Scripts and audio stay on device.

BYO API key

ElevenLabs key lives in the iOS Keychain. The app never proxies it.

Demo mode

Walk every screen end to end with no account, no network, no commitment.

Free tier

Three projects and the full script-to-audio workflow before Studio.

Want to see how it actually flows?

The walkthrough takes you screen by screen — from pasting a script to exporting an M4A — without leaving this site.

See the walkthrough →

Ask about the beta →

