Kokoro TTS CLI

A command-line interface for Kokoro TTS with streaming and voice mixing capabilities.

Prerequisites

Install espeak-ng:

# macOS
brew install espeak-ng

# Ubuntu/Debian
sudo apt-get install espeak-ng

# Windows
# Download installer from https://github.com/espeak-ng/espeak-ng/releases

Download the Kokoro model and voices:

# Set up Git LFS for your user account (the git-lfs package itself must already be installed)
git lfs install

# Clone Kokoro repository
git clone https://huggingface.co/hexgrad/Kokoro-82M

Installation

pip install git+https://github.com/cheuerde/kokoro-tts-cli.git

Usage Examples

Quick Start

# Simple text (use single quotes for text with exclamation marks)
echo 'Hello! How are you today?' | kokoro-tts

# Longer text
echo 'Once upon a time, in a digital realm far beyond our screens, there lived a unique artificial voice. This voice was not just any voice - it could sing, whisper, and tell stories with remarkable clarity!' | kokoro-tts

Voice Selection

# American female voice (Bella)
echo 'The quick brown fox jumps over the lazy dog. This sentence contains all letters of the alphabet!' | kokoro-tts --voice af_bella

# British female voice (Emma)
echo 'Would you like a cup of tea? British voices have their own unique charm.' | kokoro-tts --voice bf_emma

# American male voice (Adam)
echo 'Deep in the mountains, a lone traveler found an ancient manuscript.' | kokoro-tts --voice am_adam

Voice Mixing

# Mix American female voices (70% Bella, 30% Sarah)
echo 'Voice mixing creates interesting new voice characteristics!' | kokoro-tts --voice "af_bella:0.7,af_sarah:0.3"

# Mix female and male voices
echo 'This is a balanced mix of different voice types.' | kokoro-tts --voice "bf_emma:0.4,am_adam:0.3,af_bella:0.3"

# Blend two voices (the weights apply to the whole utterance, not to separate parts of it)
echo 'This sentence is spoken in a single blended voice.' | kokoro-tts --voice "af_bella:0.6,bf_emma:0.4"
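
The mix strings above follow a `name:weight,name:weight` pattern, and every example here uses weights that add up to 1.0. Whether the CLI requires that or normalizes the weights itself is not stated in this README, so the sum check below is an assumption. `check_mix` is a hypothetical helper, not part of kokoro-tts; it is a minimal sketch for catching typos before piping text:

```shell
# Hypothetical helper: check that a voice mix such as "af_bella:0.7,af_sarah:0.3"
# is well-formed and that its weights sum to 1.0 (assumed from the examples above).
check_mix() {
    printf '%s\n' "$1" | awk -F',' '
    {
        # A single plain voice name (no weight) needs no further checks.
        if (NF == 1 && $1 != "" && $1 !~ /:/) { print "ok"; exit 0 }
        total = 0
        for (i = 1; i <= NF; i++) {
            n = split($i, part, ":")
            # Each entry must look like name:weight with a numeric weight.
            if (n != 2 || part[1] == "" || part[2] !~ /^[0-9]*\.?[0-9]+$/) {
                print "malformed entry: " $i
                exit 1
            }
            total += part[2]
        }
        diff = total - 1
        if (diff < 0) diff = -diff
        if (diff > 0.001) {
            printf "weights sum to %.2f, expected 1.00\n", total
            exit 1
        }
        print "ok"
    }'
}

# Usage: check_mix "af_bella:0.7,af_sarah:0.3" && echo 'Hello!' | kokoro-tts --voice "af_bella:0.7,af_sarah:0.3"
```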

Speed Control

# Faster speech
echo 'This will be spoken quickly, perfect for speed reading!' | kokoro-tts --speed 1.5

# Slower speech
echo 'This will be spoken slowly and clearly, good for learning pronunciation.' | kokoro-tts --speed 0.8

File Processing

# Process a text file
cat story.txt | kokoro-tts --verbose

# Save to audio file
cat article.txt | kokoro-tts --save output.wav

# Process and save without playback
cat script.txt | kokoro-tts --no-play --save output.wav

Interactive Mode

# Process file with interactive controls
kokoro-tts -i < story.txt

# Process text from clipboard (macOS)
pbpaste | kokoro-tts -i

Interactive Controls:

Space: Pause/Resume
Left/Right arrows: Adjust speed (0.5x - 2.0x)
Esc: Exit

Example Text Files

The repository includes example texts in examples/:

# Story example
cat examples/story.txt | kokoro-tts --voice af_bella

# Technical text
cat examples/technical.txt | kokoro-tts --voice "af_bella:0.6,am_adam:0.4"

# Mixed content with voice mixing
cat examples/mixed.txt | kokoro-tts --voice "bf_emma:0.5,af_sarah:0.5"

Server Mode

For faster repeated processing, run Kokoro TTS in server mode. The server keeps the model loaded in memory, so later requests skip the model-loading step.

Start the server in one terminal:

# Start with default settings (localhost:5000)
kokoro-tts-server

# Custom host and port (0.0.0.0 listens on all network interfaces, so other machines can connect)
kokoro-tts-server --host 0.0.0.0 --port 5001

Use the client in another terminal:

# Basic usage
echo 'Hello!' | kokoro-tts-client

# Regular options such as --voice and --speed work with the client (interactive mode does not)
echo 'Mixed voice test' | kokoro-tts-client --voice "af_bella:0.7,bf_emma:0.3" --speed 1.2

# Process files
cat story.txt | kokoro-tts-client --voice af_bella

# Save to audio file
cat script.txt | kokoro-tts-client --save output.wav

Server options:

--host: Server host (default: localhost)
--port: Server port (default: 5000)
--kokoro-path: Path to Kokoro-82M directory

Client options:

All options available in regular mode (voice, speed, save, etc.), except interactive mode (-i)
--host: Server host (default: localhost)
--port: Server port (default: 5000)

The server mode is particularly useful when:

Processing multiple texts in succession
Running a TTS service on a powerful machine
Reducing startup time for frequent TTS operations
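
For the "multiple texts in succession" case, a loop over files can send each one to a server that is already running. The sketch below uses only `--save` and `--no-play` from this README; `--no-play` is documented for kokoro-tts and is assumed to work with the client as well. `convert_all` and the `TTS_CMD` variable are hypothetical names introduced for illustration:

```shell
# Hypothetical helper: convert each .txt file given as an argument to a .wav
# file next to it, using a server that is already running.
# TTS_CMD can be overridden, for example to use kokoro-tts instead of the client.
TTS_CMD="${TTS_CMD:-kokoro-tts-client}"

convert_all() {
    for txt in "$@"; do
        wav="${txt%.txt}.wav"
        # --no-play is documented for kokoro-tts; assumed to work with the client too.
        if "$TTS_CMD" --no-play --save "$wav" < "$txt"; then
            echo "wrote $wav"
        else
            echo "failed: $txt" >&2
        fi
    done
}

# Usage: convert_all chapter1.txt chapter2.txt
```

Because the model stays loaded in the server, only the first request pays the startup cost; the loop itself adds nothing beyond one client call per file.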

Processing Modes

kokoro-tts and kokoro-tts-client both support streaming and batch modes; interactive mode is available only in kokoro-tts:

Streaming Mode (default)

# Starts playing immediately as text is processed
cat long_text.txt | kokoro-tts
cat long_text.txt | kokoro-tts-client

Batch Mode

# Process the entire text at once (faster for WAV generation)
cat text.txt | kokoro-tts --batch --save output.wav
cat text.txt | kokoro-tts-client --batch --save output.wav

# Batch processing with progress info
cat text.txt | kokoro-tts --batch --verbose --save output.wav

Interactive Mode (kokoro-tts only)

# Full playback control
cat story.txt | kokoro-tts -i

Integration Examples

Kokoro TTS CLI reads text from standard input, so it combines with other command-line tools through pipes:

PDF to Speech

# Read PDF file
pdftotext document.pdf - | kokoro-tts

# With server mode
pdftotext document.pdf - | kokoro-tts-client

LLM Processing Pipeline

# Pick a quote from the text with an LLM and speak it
pdftotext bartleby.pdf - | llm prompt -m phi4 "find the most relatable quote about refusing to do work tasks" | kokoro-tts

# Using server mode for faster processing
pdftotext novel.pdf - | llm prompt -m phi4 "extract the most dramatic scene" | kokoro-tts-client

# Generate and narrate summaries
pdftotext technical_doc.pdf - | \
    llm prompt -m phi4 -s "Summarize this technical document in simple terms" | \
    kokoro-tts-client --voice "bf_emma"

# Mix voices for dialogue extraction
pdftotext play.pdf - | \
    llm prompt -m phi4 "extract a dialogue between two characters" | \
    kokoro-tts-client --voice "af_bella:0.6,am_adam:0.4"

These pipelines chain text extraction, LLM processing, and speech synthesis. When you run several of them in a row, server mode avoids reloading the model for each request.

Available Voices

American English (en-us):

af_bella - Bella (female)
af_sarah - Sarah (female)
am_adam - Adam (male)
am_michael - Michael (male)
af_nicole - Nicole (female)
af_sky - Sky (female)

British English (en-gb):

bf_emma - Emma (female)
bf_isabella - Isabella (female)
bm_george - George (male)
bm_lewis - Lewis (male)

Environment Variables

KOKORO_PATH: Path to Kokoro-82M directory
```
export KOKORO_PATH=/path/to/Kokoro-82M
```
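
A wrapper script can check the variable before starting synthesis, so that an unset or mistyped path fails early with a clear message. This is a minimal sketch: `check_kokoro_path` is a hypothetical helper, and it only tests that the directory exists, not what it contains.

```shell
# Hypothetical pre-flight check: succeed only if KOKORO_PATH names an existing directory.
check_kokoro_path() {
    if [ -z "${KOKORO_PATH:-}" ]; then
        echo "KOKORO_PATH is not set" >&2
        return 1
    fi
    if [ ! -d "$KOKORO_PATH" ]; then
        echo "KOKORO_PATH is not a directory: $KOKORO_PATH" >&2
        return 1
    fi
    echo "using Kokoro model at $KOKORO_PATH"
}

# Usage: check_kokoro_path && echo 'Hello!' | kokoro-tts
```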

Tips

Use single quotes for text with exclamation marks (inside double quotes, interactive bash and zsh treat ! as history expansion):

echo 'Wow! This is amazing!' | kokoro-tts
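
Single quotes cannot contain an apostrophe, so text such as "It's amazing!" needs another approach. One option is a here-document with a quoted delimiter, which the shell passes through without any expansion. `speak_text` is a hypothetical function name used only for this sketch:

```shell
# A quoted here-document delimiter ('EOF') turns off all expansion,
# so !, $, and both kinds of quotes pass through unchanged.
speak_text() {
    cat <<'EOF'
Wow! It's "amazing" and costs $5!
EOF
}

# Usage: speak_text | kokoro-tts
```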

For long texts, use files:

echo 'Long text...' > input.txt
kokoro-tts < input.txt

Mix voices for unique characteristics:

# Warm, friendly voice
kokoro-tts --voice "af_bella:0.6,bf_emma:0.4"

# Authoritative voice
kokoro-tts --voice "am_adam:0.7,bm_george:0.3"

Acknowledgements

This is a command-line interface for the Kokoro TTS model. All credit for the model goes to:

Original StyleTTS 2 architecture by Li et al.
Kokoro training by @rzvzn

Claude Sonnet wrote 100% of the code and had most of the ideas for the features!

License

Apache License 2.0 (matching Kokoro's license)
