izwi transcribe

Convert audio to text.


Synopsis

izwi transcribe <FILE> [OPTIONS]

Description

Transcribes audio files to text using automatic speech recognition (ASR). Supports multiple audio formats and output options.


Arguments

Argument    Description
<FILE>      Audio file to transcribe

Options

Option                   Description                                Default
-m, --model <MODEL>      ASR model to use                           qwen3-asr-0.6b
-l, --language <LANG>    Language hint (e.g., en, es)               Auto-detect
-f, --format <FORMAT>    Output format: text, json, verbose_json    text
-o, --output <PATH>      Output file (default: stdout)
--word-timestamps        Include word-level timestamps

Examples

Basic transcription

izwi transcribe audio.wav

Save to file

izwi transcribe audio.wav --output transcript.txt

JSON output

izwi transcribe audio.wav --format json

With timestamps

izwi transcribe audio.wav --format verbose_json --word-timestamps

Specify language

izwi transcribe audio.wav --language en
izwi transcribe audio.wav --language es

Use larger model

izwi transcribe audio.wav --model qwen3-asr-1.7b
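
Batch transcription

The single-file commands above can be wrapped in a loop to process a whole directory. This is a minimal sketch, assuming a bash-compatible shell. The function name transcribe_dir and the IZWI_BIN variable are hypothetical names introduced for this example and are not part of izwi. It relies only on the transcribe command and the --output option documented on this page, and it matches the lowercase extensions listed under Supported Audio Formats.

```shell
# Hypothetical helper: transcribe every supported audio file in a directory.
# IZWI_BIN is an assumption of this sketch (not an izwi setting); it lets the
# command be swapped out, for example for a dry run.
transcribe_dir() {
  local dir="${1:-.}"
  local izwi="${IZWI_BIN:-izwi}"
  local file
  local ext
  local count=0

  for file in "$dir"/*; do
    # Skip directories and the unexpanded glob of an empty directory.
    [ -f "$file" ] || continue
    ext="${file##*.}"
    case "$ext" in
      wav|mp3|m4a|flac|ogg|webm)
        # Write the transcript next to the audio file, e.g. talk.wav.txt,
        # so files that share a base name do not overwrite each other.
        "$izwi" transcribe "$file" --output "$file.txt" || return 1
        count=$((count + 1))
        ;;
    esac
  done

  echo "transcribed $count file(s)"
}
```

Calling transcribe_dir ./recordings would then transcribe each matching file in ./recordings and stop at the first failure. Files with uppercase extensions (for example .WAV) are skipped by this sketch.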

Output Formats

Text

Plain text transcript:

Hello, this is a transcription test.

JSON

{
  "text": "Hello, this is a transcription test."
}

Verbose JSON

{
  "text": "Hello, this is a transcription test.",
  "language": "en",
  "duration": 3.5,
  "words": [
    {"word": "Hello", "start": 0.0, "end": 0.5},
    {"word": "this", "start": 0.6, "end": 0.8}
  ]
}
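
The words array in the sample above can be flattened into one line per word for further processing. This is a rough sketch, assuming the third-party tool jq is installed and that a saved transcript follows the sample shape shown here. The function name word_timings is hypothetical.

```shell
# Hypothetical helper: print "start<TAB>end<TAB>word" for each entry in the
# "words" array of a verbose_json transcript file.
# Assumes jq is available and the file has the shape of the sample above.
word_timings() {
  jq -r '.words[] | [.start, .end, .word] | @tsv' "$1"
}
```

For example, after saving a transcript with --format verbose_json --word-timestamps --output transcript.json, running word_timings transcript.json would list each word with its start and end time in seconds.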

Supported Audio Formats

  • WAV (.wav)
  • MP3 (.mp3)
  • M4A (.m4a)
  • FLAC (.flac)
  • OGG (.ogg)
  • WebM (.webm)

Models

Model             Size      Speed     Accuracy
qwen3-asr-0.6b    1.2 GB    Fast      Good
qwen3-asr-1.7b    3.4 GB    Medium    Better

See Also

  • Transcription Guide
  • Diarization Guide