izwi align

Forced alignment: align text to audio at the word level.

Synopsis

izwi align <FILE> <TEXT> [OPTIONS]

Description

Aligns reference text to audio, producing word-level timestamps. Useful for:

Subtitle generation
Karaoke timing
Audio editing
Pronunciation analysis

Arguments

Argument	Description
`<FILE>`	Audio file to align
`<TEXT>`	Reference text to align

Options

Option	Description	Default
`-m, --model <MODEL>`	Alignment model	`qwen3-forcedaligner-0.6b`
`-f, --format <FORMAT>`	Output format: `text`, `json`, `verbose_json`	`json`
`-o, --output <PATH>`	Output file (default: stdout)	—

Examples

Basic alignment

izwi align audio.wav "Hello world, this is a test."

Save to file

izwi align audio.wav "Hello world" --output alignment.json

Text output

izwi align audio.wav "Hello world" --format text
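The invocations above can also be driven from a script. The following is a minimal sketch, not part of izwi itself: `build_align_command` and `run_align` are hypothetical helper names, and the sketch assumes that `izwi` is on the `PATH` and that the alignment is written to stdout when `--output` is omitted, as the options table states. Only the flags documented on this page are used.

```python
import subprocess


def build_align_command(audio_path, text, fmt="json", output=None):
    """Build the argv list for `izwi align` using only documented flags."""
    cmd = ["izwi", "align", audio_path, text, "--format", fmt]
    if output is not None:
        cmd += ["--output", output]
    return cmd


def run_align(audio_path, text, fmt="json", output=None):
    """Run the command and return whatever it printed to stdout.

    Assumes the `izwi` binary is installed; raises CalledProcessError
    if the command exits with a non-zero status.
    """
    result = subprocess.run(
        build_align_command(audio_path, text, fmt, output),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Show the command that would be executed, without running it.
print(build_align_command("audio.wav", "Hello world", output="alignment.json"))
```

Passing the arguments as a list avoids shell quoting problems when the reference text contains quotes or other special characters.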

Output Formats

JSON (default)

{
  "alignments": [
    {"word": "Hello", "start": 0.0, "end": 0.45},
    {"word": "world", "start": 0.50, "end": 0.95},
    {"word": "this", "start": 1.10, "end": 1.30},
    {"word": "is", "start": 1.35, "end": 1.45},
    {"word": "a", "start": 1.50, "end": 1.55},
    {"word": "test", "start": 1.60, "end": 2.00}
  ],
  "duration": 2.0
}
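As an illustration of how this structure might be consumed, the sketch below parses a shortened copy of the sample and computes how long each word lasts. It assumes only the fields shown in the example (`alignments`, `word`, `start`, `end`, `duration`); `word_durations` is a hypothetical helper name.

```python
import json

# Shortened copy of the sample output; in practice this string would come
# from the file written with --output or from the command's stdout.
SAMPLE = """
{
  "alignments": [
    {"word": "Hello", "start": 0.0, "end": 0.45},
    {"word": "world", "start": 0.50, "end": 0.95}
  ],
  "duration": 2.0
}
"""


def word_durations(payload):
    """Return (word, seconds) pairs, with seconds rounded to milliseconds."""
    data = json.loads(payload)
    return [
        (entry["word"], round(entry["end"] - entry["start"], 3))
        for entry in data["alignments"]
    ]


for word, seconds in word_durations(SAMPLE):
    print(f"{word}: {seconds:.2f}s")
```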

Text

Hello     0.00 - 0.45
world     0.50 - 0.95
this      1.10 - 1.30
is        1.35 - 1.45
a         1.50 - 1.55
test      1.60 - 2.00

Use Cases

Subtitle Generation

Generate precise timestamps for subtitles:

izwi align video_audio.wav "$(cat script.txt)" --output subtitles.json
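The JSON written by this command is not itself a subtitle file. One possible way to turn it into SubRip (SRT) cues is sketched below. This is an assumption-laden example rather than an izwi feature: `srt_timestamp` and `alignments_to_srt` are hypothetical helpers, the input is assumed to follow the `alignments` schema shown under Output Formats, and grouping a fixed number of words per cue is an arbitrary simplification.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def alignments_to_srt(alignments, words_per_cue=4):
    """Group consecutive words into numbered SRT cues."""
    cues = []
    for i in range(0, len(alignments), words_per_cue):
        chunk = alignments[i:i + words_per_cue]
        start = srt_timestamp(chunk[0]["start"])
        end = srt_timestamp(chunk[-1]["end"])
        text = " ".join(entry["word"] for entry in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


# Word timings copied from the JSON example on this page.
sample = [
    {"word": "Hello", "start": 0.0, "end": 0.45},
    {"word": "world", "start": 0.50, "end": 0.95},
    {"word": "this", "start": 1.10, "end": 1.30},
    {"word": "is", "start": 1.35, "end": 1.45},
    {"word": "a", "start": 1.50, "end": 1.55},
    {"word": "test", "start": 1.60, "end": 2.00},
]
print(alignments_to_srt(sample, words_per_cue=3))
```

With real output, the `alignments` list would be loaded from `subtitles.json` using `json.load`. A production converter would more likely break cues at sentence boundaries or long pauses than at a fixed word count.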

Audio Editing

Find exact word boundaries for editing:

izwi align podcast.wav "um actually" --format json
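Once the word timings are available, the cut points for a phrase can be looked up programmatically. The sketch below assumes the `alignments` schema shown under Output Formats and that each `word` value matches a whitespace-separated token of the reference text without punctuation, as in the sample output. `find_phrase` is a hypothetical helper, not part of izwi.

```python
def find_phrase(alignments, phrase):
    """Return (start, end) in seconds for the first occurrence of phrase.

    Matching is case-insensitive and word-by-word; returns None when the
    phrase is empty or does not occur.
    """
    target = [word.lower() for word in phrase.split()]
    if not target:
        return None
    words = [entry["word"].lower() for entry in alignments]
    n = len(target)
    for i in range(len(words) - n + 1):
        if words[i:i + n] == target:
            return alignments[i]["start"], alignments[i + n - 1]["end"]
    return None


# Word timings copied from the JSON example on this page.
sample = [
    {"word": "Hello", "start": 0.0, "end": 0.45},
    {"word": "world", "start": 0.50, "end": 0.95},
    {"word": "this", "start": 1.10, "end": 1.30},
    {"word": "is", "start": 1.35, "end": 1.45},
    {"word": "a", "start": 1.50, "end": 1.55},
    {"word": "test", "start": 1.60, "end": 2.00},
]
print(find_phrase(sample, "this is"))
```

The returned pair can be passed to an audio editor or a trimming tool as the region to cut or keep.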

Pronunciation Analysis

Analyze timing of spoken words:

izwi align recording.wav "The quick brown fox" --format verbose_json
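The fields of `verbose_json` are not described on this page, so the sketch below works only from the `alignments` and `duration` fields shown in the default JSON example and assumes they are present. It derives two simple timing measures: pauses between consecutive words and an overall speaking rate. `find_pauses` and `speaking_rate` are hypothetical helpers.

```python
def find_pauses(alignments, min_gap=0.1):
    """Return (previous_word, next_word, gap_seconds) for gaps >= min_gap."""
    pauses = []
    for prev, nxt in zip(alignments, alignments[1:]):
        gap = round(nxt["start"] - prev["end"], 3)
        if gap >= min_gap:
            pauses.append((prev["word"], nxt["word"], gap))
    return pauses


def speaking_rate(alignments, duration):
    """Words per second over the whole recording."""
    return len(alignments) / duration


# Word timings and duration copied from the JSON example on this page.
sample = [
    {"word": "Hello", "start": 0.0, "end": 0.45},
    {"word": "world", "start": 0.50, "end": 0.95},
    {"word": "this", "start": 1.10, "end": 1.30},
    {"word": "is", "start": 1.35, "end": 1.45},
    {"word": "a", "start": 1.50, "end": 1.55},
    {"word": "test", "start": 1.60, "end": 2.00},
]
print(find_pauses(sample))
print(speaking_rate(sample, 2.0))
```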

Available Models

Model	Description
`qwen3-forcedaligner-0.6b`	Qwen3-based forced aligner
