Izwi

Features

Izwi provides a comprehensive suite of audio AI capabilities. Each feature is accessible via the web UI, desktop app, and command line.

Core Features

| Feature | Description | Guide |
| --- | --- | --- |
| Voice | Real-time voice conversations with AI | Voice Guide |
| Chat | Text-based AI conversations | Chat Guide |
| Text-to-Speech | Generate natural speech from text | TTS Guide |
| Transcription | Convert audio to text | Transcription Guide |
| Diarization | Identify multiple speakers | Diarization Guide |
| Voice Cloning | Clone voices from audio samples | Voice Cloning Guide |
| Voice Design | Create voices from descriptions | Voice Design Guide |

Feature Comparison

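| Feature | Web UI | Desktop | CLI | API |
| --- | --- | --- | --- | --- |
| Voice | | | | |
| Chat | | | | |
| Text-to-Speech | | | | |
| Transcription | | | | |
| Diarization | | | | |
| Voice Cloning | | | | |
| Voice Design | | | | |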

Getting Started

Start the server:

izwi serve

Open the web UI:

http://localhost:8080
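
Before opening the browser, it can help to confirm that something is actually answering on that address. The following is a minimal sketch using only the Python standard library. It assumes the server speaks plain HTTP at the URL shown above. The helper name `server_is_up` is hypothetical and not part of Izwi.

```python
import http.client
import urllib.error
import urllib.request


def server_is_up(url: str = "http://localhost:8080", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at `url`, False otherwise.

    Hypothetical helper for illustration; it only checks reachability,
    not that the responding process is Izwi.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, or a non-HTTP reply.
        return False
```

Calling `server_is_up()` right after `izwi serve` starts may return `False` for a moment while the server is still binding its port, so retrying a few times with a short sleep is a reasonable pattern.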

Download required models:

izwi pull qwen3-tts-0.6b-base
izwi pull qwen3-asr-0.6b
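
When several models are needed, the pull step can be scripted. The sketch below runs `izwi pull` once per model name, using the two model names from the commands above. The helpers `pull_command` and `pull_all` are hypothetical, and the script assumes that `izwi` is on `PATH` and exits with a non-zero status when a pull fails, which the source does not confirm.

```python
import shutil
import subprocess

# Model names taken from the pull commands in this section.
MODELS = ["qwen3-tts-0.6b-base", "qwen3-asr-0.6b"]


def pull_command(model: str) -> list[str]:
    """Build the argument list for `izwi pull <model>`."""
    return ["izwi", "pull", model]


def pull_all(models: list[str]) -> None:
    """Run `izwi pull` once per model, stopping at the first failure.

    Assumption: a failed pull exits non-zero, so check=True raises
    subprocess.CalledProcessError and halts the loop.
    """
    if shutil.which("izwi") is None:
        raise FileNotFoundError("izwi executable not found on PATH")
    for model in models:
        subprocess.run(pull_command(model), check=True)
```

Running `pull_all(MODELS)` is then equivalent to typing the two commands by hand, one after the other.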

Model Requirements

Different features require different models:

| Feature | Required Models |
| --- | --- |
| Voice | TTS + ASR + Chat model |
| Chat | Chat model (Qwen3 or Gemma) |
| Text-to-Speech | TTS model |
| Transcription | ASR model (Qwen3 or Parakeet) |
| Diarization | Diarization model (Sortformer) |
| Forced Alignment | Forced aligner model |
| Voice Cloning | TTS CustomVoice model |
| Voice Design | TTS VoiceDesign model |
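
To work out which kinds of model a given combination of features needs, the table above can be written as a lookup. This is an illustrative sketch only: the dictionary mirrors the table, the kind labels are shorthand chosen here, and `models_for` is a hypothetical helper rather than anything Izwi provides.

```python
# Feature -> required model kinds, transcribed from the table above.
REQUIRED_MODELS = {
    "Voice": ["TTS", "ASR", "Chat"],
    "Chat": ["Chat"],
    "Text-to-Speech": ["TTS"],
    "Transcription": ["ASR"],
    "Diarization": ["Diarization"],
    "Forced Alignment": ["Forced aligner"],
    "Voice Cloning": ["TTS CustomVoice"],
    "Voice Design": ["TTS VoiceDesign"],
}


def models_for(features: list[str]) -> list[str]:
    """Return the model kinds needed for `features`, without duplicates.

    Kinds are listed in the order they are first encountered.
    Raises KeyError for a feature that is not in the table.
    """
    needed: list[str] = []
    for feature in features:
        for kind in REQUIRED_MODELS[feature]:
            if kind not in needed:
                needed.append(kind)
    return needed
```

For example, `models_for(["Voice", "Transcription"])` yields the three kinds that Voice already requires, since Transcription reuses the ASR model.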

Next Steps

To learn more, choose a feature and open its guide from the Core Features table.