# Transcribe Tool

Audio transcription CLI using OpenAI Whisper, with optional speaker diarization.

## Quick Reference

```bash
# Basic transcription (SRT output)
./transcribe audio.mp3 -o output.srt

# With speaker diarization
./transcribe audio.mp3 --diarize -o output.srt

# Specify model and speakers
./transcribe audio.mp3 --model small --diarize -s 2 -o output.srt

# Print to stdout
./transcribe audio.mp3 --no-write
```

## Flags

| Flag | Short | Description | Default |
|------|-------|-------------|---------|
| `--output` | `-o` | Output file path | **required** unless `--no-write` is set |
| `--format` | `-f` | Output format: `srt`, `text`, `json` | `srt` |
| `--model` | `-m` | Whisper model: `tiny`, `base`, `small`, `medium`, `large`, `turbo` | `tiny` |
| `--diarize` | | Enable speaker detection | off |
| `--speakers` | `-s` | Number of speakers (0 = auto-detect) | `0` |
| `--no-write` | | Print to stdout instead of writing a file | off |

## Common Tasks

**Transcribe a meeting recording:**
```bash
./transcribe meeting.wav --model small -o meeting.srt
```

**Transcribe an interview with 2 speakers:**
```bash
./transcribe interview.mp3 --model small --diarize -s 2 -o interview.srt
```

**Get JSON output for processing:**
```bash
./transcribe audio.mp3 --format json -o output.json
```

**Quick preview (stdout):**
```bash
./transcribe audio.mp3 --no-write
```

## Output Formats

**SRT (default):** Subtitle format with timestamps
```
1
00:00:00,000 --> 00:00:05,200
[Speaker 1] Hello, how are you?
```

**Text:** Plain text with timestamps
```
[00:00.0 - 00:05.2] [Speaker 1] Hello, how are you?
```

**JSON:** Full metadata, including segments, words, and duration

## Models

- `tiny` - Fastest, use for quick drafts (default)
- `base` - Slightly slower and more accurate than `tiny`
- `small` - Good balance of speed/accuracy
- `medium` - Better accuracy, slower
- `large` - Best accuracy, slowest
- `turbo` - Speed-optimized variant of `large`, slightly less accurate

## Supported Formats

MP3, WAV, FLAC, M4A, OGG, OPUS

## Build

```bash
cd /home/yeho/Documents/tools/transcribe
go build -o transcribe
```

## Dependencies

```bash
pip install openai-whisper                      # Required
pip install resemblyzer scikit-learn librosa    # For diarization
```
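
**Batch-transcribe a directory (sketch):**

The single-file commands under Common Tasks can be wrapped in a loop to process a whole folder. The following is a minimal sketch, not part of the tool: it assumes the binary sits at `./transcribe` (overridable through a hypothetical `TRANSCRIBE_BIN` variable), and the helper names `srt_path_for`, `is_supported_audio`, and `transcribe_dir` are invented for illustration. It uses only the `--model` and `-o` flags documented above and filters by the extensions listed under Supported Formats.

```bash
#!/usr/bin/env bash
# Sketch: batch-transcribe every supported audio file in a directory.
# Assumption: the binary path and all helper names below are hypothetical.

TRANSCRIBE_BIN="${TRANSCRIBE_BIN:-./transcribe}"

# Map an input path to its .srt output path: talks/a.mp3 -> talks/a.srt
srt_path_for() {
  printf '%s.srt\n' "${1%.*}"
}

# Succeed only if the file extension is one of the supported formats.
is_supported_audio() {
  local ext="${1##*.}"
  # Lowercase the extension so that clip.FLAC is accepted as well.
  ext="$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')"
  case "$ext" in
    mp3|wav|flac|m4a|ogg|opus) return 0 ;;
    *) return 1 ;;
  esac
}

# Transcribe each supported file in the given directory (default: current).
transcribe_dir() {
  local dir="${1:-.}" file
  for file in "$dir"/*; do
    [ -f "$file" ] || continue
    is_supported_audio "$file" || continue
    # Report a failure and keep going, so one bad file does not stop the batch.
    "$TRANSCRIBE_BIN" "$file" --model small -o "$(srt_path_for "$file")" \
      || echo "failed: $file" >&2
  done
}
```

After sourcing the script, `transcribe_dir ./recordings` would write one `.srt` next to each audio file. Continuing past failures is a design choice for long batches; replace the `echo` with `return 1` to stop at the first error instead.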