1) Transcribe Locally
- Run a local transcription tool to produce text and timestamps (e.g., .srt).
- Export raw text for editing; keep timestamps for caption use.
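As a rough illustration of keeping timestamps while exporting raw text, the sketch below parses .srt content into cues and joins the cue text for editing. The names `Cue`, `parse_srt`, and `raw_text` are hypothetical helpers for this example, not part of any transcription tool, and the parser assumes well-formed SubRip timing lines.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds from the beginning of the recording
    end: float    # seconds from the beginning of the recording
    text: str


# SubRip timestamps look like 00:01:02,345 (some tools write a dot instead).
_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _seconds(stamp: str) -> float:
    match = _TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"unrecognized timestamp: {stamp!r}")
    h, m, s, ms = map(int, match.groups())
    return h * 3600 + m * 60 + s + ms / 1000


def parse_srt(srt: str) -> list[Cue]:
    """Parse SubRip text into cues; blocks without a timing line are skipped."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", srt.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        idx = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if idx is None:
            continue
        start, end = lines[idx].split("-->")
        # Anything after the end timestamp (position hints) is ignored.
        text = " ".join(l.strip() for l in lines[idx + 1:] if l.strip())
        cues.append(Cue(_seconds(start), _seconds(end.split()[0]), text))
    return cues


def raw_text(cues: list[Cue]) -> str:
    """One cue per line, ready for editing; timestamps stay in the cue list."""
    return "\n".join(c.text for c in cues)
```

Keeping the cue list alongside the exported text means the original .srt timing remains available for captions after the text has been edited.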
2) Edit and Segment
Remove filler words, correct names, and split the text into sense units of no more than 20 seconds of speech each. Keep punctuation and stage directions to guide prosody.
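One way to get a first pass at the 20-second limit is to merge consecutive caption cues greedily until the next cue would push the span past the limit. This is a minimal sketch under two assumptions: cues arrive as `(start, end, text)` tuples in seconds, and the source timestamps are a fair proxy for narration length. `Segment` and `group_cues` are hypothetical names, and the result still needs an editor's eye for sense boundaries.

```python
from typing import Iterable, NamedTuple


class Segment(NamedTuple):
    start: float  # seconds
    end: float    # seconds
    text: str


def group_cues(cues: Iterable[tuple[float, float, str]],
               max_seconds: float = 20.0) -> list[Segment]:
    """Greedily merge consecutive cues into segments spanning <= max_seconds.

    A single cue longer than max_seconds is kept as its own segment
    rather than being cut mid-sentence.
    """
    segments: list[Segment] = []
    current: Segment | None = None
    for start, end, text in cues:
        if current is not None and end - current.start <= max_seconds:
            # The cue still fits: extend the open segment.
            current = Segment(current.start, end, f"{current.text} {text}")
        else:
            # The cue does not fit (or nothing is open): start a new segment.
            if current is not None:
                segments.append(current)
            current = Segment(start, end, text)
    if current is not None:
        segments.append(current)
    return segments
```

For example, four cues covering 0–8 s, 8–15 s, 15–22 s, and 22–30 s come out as two segments, 0–15 s and 15–30 s.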
3) Generate with Kokoro
- Preview each segment, adjust pacing and emphasis, export WAV.
- Name files sequentially to simplify assembly (001.wav, 002.wav, ...).
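The sequential naming scheme can be sketched with a small helper. Zero-padding matters because it makes plain alphabetical sorting match playback order. `segment_filename` is a hypothetical name, and the three-digit width simply mirrors the 001.wav pattern.

```python
def segment_filename(index: int, width: int = 3, ext: str = "wav") -> str:
    """Return a zero-padded file name such as 001.wav for segment `index`."""
    if index < 0:
        raise ValueError("index must be non-negative")
    return f"{index:0{width}d}.{ext}"


# Names for the first twelve segments, starting at 001.
names = [segment_filename(i) for i in range(1, 13)]
```

With padding, `010.wav` sorts after `009.wav`; without it, `10.wav` would sort before `2.wav`. Projects with more than 999 segments would need a larger `width`.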
4) Assemble and Master
- Concatenate segments with a DAW or ffmpeg; normalize to −16 LUFS.
- Deliver MP3 at 192–256 kbps for web; keep the WAV masters.
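For the ffmpeg route, the sketch below builds a concat-demuxer list file and two command lines: one to join and loudness-normalize the segments into a WAV master, one to encode the MP3. The file names, the `TP` and `LRA` values, and the 48 kHz output rate are assumptions for illustration; only the −16 LUFS target and the 192 kbps bitrate come from the steps above. Single-pass `loudnorm` is approximate, so check the measured loudness of the result.

```python
def concat_list(names: list[str]) -> str:
    """Text for an ffmpeg concat-demuxer list file, one segment per line.

    Assumes simple names like 001.wav; names containing quotes need escaping.
    """
    return "".join(f"file '{name}'\n" for name in sorted(names))


def build_commands(list_file: str = "segments.txt",
                   master: str = "master.wav",
                   mp3: str = "episode.mp3",
                   bitrate: str = "192k") -> tuple[list[str], list[str]]:
    """Return (assemble, encode) argument lists for subprocess.run."""
    assemble = [
        "ffmpeg", "-f", "concat", "-safe", "0", "-i", list_file,
        # Integrated loudness target of -16 LUFS; TP and LRA are assumed values.
        "-af", "loudnorm=I=-16:TP=-1.5:LRA=11",
        # loudnorm may upsample internally, so pin the output rate (assumed 48 kHz).
        "-ar", "48000",
        master,
    ]
    encode = ["ffmpeg", "-i", master, "-c:a", "libmp3lame", "-b:a", bitrate, mp3]
    return assemble, encode


if __name__ == "__main__":
    # Prints the list file and commands for review instead of running ffmpeg.
    print(concat_list(["001.wav", "002.wav", "003.wav"]), end="")
    for cmd in build_commands():
        print(" ".join(cmd))
```

Returning argument lists rather than shell strings lets the commands be passed to `subprocess.run(cmd, check=True)` without quoting problems, once the list file has been written next to the segments.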
Benefits
- Privacy by default; no uploads.
- Consistent narration quality and repeatable edits.
- Faster iteration on phrasing and timing.
Author: Kokoro Web Team • Last updated 2025‑01‑15