transcribe/CLAUDE.md
2026-01-17 19:18:58 -06:00

Transcribe Tool

Audio transcription CLI built on OpenAI Whisper, with optional speaker diarization.

Quick Reference

# Basic transcription (SRT output)
./transcribe audio.mp3 -o output.srt

# With speaker diarization
./transcribe audio.mp3 --diarize -o output.srt

# Specify model and speakers
./transcribe audio.mp3 --model small --diarize -s 2 -o output.srt

# Print to stdout
./transcribe audio.mp3 --no-write

Flags

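| Flag | Short | Description | Default |
|------|-------|-------------|---------|
| --output | -o | Output file path | required (unless --no-write) |
| --format | -f | Output format: srt, text, json | srt |
| --model | -m | Whisper model: tiny, base, small, medium, large, turbo | tiny |
| --diarize | | Enable speaker detection | off |
| --speakers | -s | Number of speakers (0=auto) | 0 |
| --no-write | | Print to stdout instead of file | off |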
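
When the tool is driven from a script, the flags above map onto an argument list. The following is a minimal Python sketch, not part of the tool itself: `build_command` is a hypothetical helper, the binary is assumed to live at `./transcribe`, and `--no-write` is assumed to stand in for `--output` (as the stdout examples suggest).

```python
from typing import List, Optional


def build_command(
    audio: str,
    output: Optional[str] = None,
    fmt: str = "srt",
    model: str = "tiny",
    diarize: bool = False,
    speakers: int = 0,
    binary: str = "./transcribe",  # assumed location of the built binary
) -> List[str]:
    """Assemble an argv list using only the flags documented above."""
    cmd = [binary, audio, "--format", fmt, "--model", model]
    if diarize:
        cmd.append("--diarize")
        if speakers > 0:  # 0 means auto-detect, so the flag is left out
            cmd += ["--speakers", str(speakers)]
    if output is None:
        # Assumption: stdout mode does not need --output.
        cmd.append("--no-write")
    else:
        cmd += ["--output", output]
    return cmd


if __name__ == "__main__":
    print(build_command("interview.mp3", "interview.srt",
                        model="small", diarize=True, speakers=2))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; building a list rather than a shell string avoids quoting problems with file names that contain spaces.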

Common Tasks

Transcribe a meeting recording:

./transcribe meeting.wav --model small -o meeting.srt

Transcribe interview with 2 speakers:

./transcribe interview.mp3 --model small --diarize -s 2 -o interview.srt

Get JSON output for processing:

./transcribe audio.mp3 --format json -o output.json

Quick preview (stdout):

./transcribe audio.mp3 --no-write

Output Formats

SRT (default): Subtitle format with timestamps

1
00:00:00,000 --> 00:00:05,200
[Speaker 1] Hello, how are you?
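
For downstream processing, an SRT cue like the one above can be split into timing, speaker, and text. This is a rough Python sketch that assumes the speaker label always has the form `[Speaker N]` at the start of the cue text, as in the sample; `parse_srt` and `to_seconds` are illustrative names, not part of the tool.

```python
import re
from typing import Dict, List

CUE_TIME = re.compile(r"(\d+):(\d{2}):(\d{2}),(\d{3})")
# Assumed label shape, taken from the sample cue: "[Speaker 1] ..."
SPEAKER = re.compile(r"^\[(Speaker \d+)\]\s*(.*)$", re.DOTALL)


def to_seconds(stamp: str) -> float:
    """Convert an SRT timestamp such as 00:00:05,200 to seconds."""
    match = CUE_TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, secs, millis = (int(part) for part in match.groups())
    return hours * 3600 + minutes * 60 + secs + millis / 1000


def parse_srt(content: str) -> List[Dict[str, object]]:
    """Parse SRT text into a list of cue dictionaries."""
    cues: List[Dict[str, object]] = []
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.splitlines()
        if len(lines) < 3 or "-->" not in lines[1]:
            continue  # skip anything that is not a complete cue
        start, end = lines[1].split("-->")
        body = "\n".join(lines[2:])
        labelled = SPEAKER.match(body)
        cues.append({
            "index": int(lines[0]),
            "start": to_seconds(start),
            "end": to_seconds(end),
            "speaker": labelled.group(1) if labelled else None,
            "text": labelled.group(2) if labelled else body,
        })
    return cues
```

Cues without a label (for example, when `--diarize` is off) come back with `speaker` set to `None`.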

Text: Plain text with timestamps

[00:00.0 - 00:05.2] [Speaker 1] Hello, how are you?
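
The two timestamp styles differ only in layout. As an illustration, the Python sketch below renders the same number of seconds in both; it assumes the text format is `MM:SS.t` with a single decimal digit, inferred from the sample line, and the function names are hypothetical.

```python
def srt_timestamp(seconds: float) -> str:
    """Render seconds as an SRT timestamp: HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def text_timestamp(seconds: float) -> str:
    """Render seconds as MM:SS.t (assumed layout of the text format)."""
    tenths = round(seconds * 10)
    minutes, tenths = divmod(tenths, 600)
    secs, tenth = divmod(tenths, 10)
    return f"{minutes:02d}:{secs:02d}.{tenth}"


def text_line(start: float, end: float, speaker: str, text: str) -> str:
    """Build one line in the plain-text layout shown above."""
    return f"[{text_timestamp(start)} - {text_timestamp(end)}] [{speaker}] {text}"


if __name__ == "__main__":
    print(srt_timestamp(5.2))
    print(text_line(0.0, 5.2, "Speaker 1", "Hello, how are you?"))
```

Working in integer milliseconds (or tenths) before splitting into fields avoids floating-point drift in the rendered digits.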

JSON: Full metadata including segments, words, duration
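
The exact JSON field names are not listed here, so it helps to inspect a real output file before writing code against it. A small schema-agnostic Python sketch follows; `summarize` is a hypothetical helper, and the sample keys in the usage example are placeholders, not the tool's confirmed schema.

```python
import json
from typing import Any, Dict


def describe(value: Any) -> str:
    """Give a one-line description of a decoded JSON value."""
    if isinstance(value, dict):
        return "object with keys: " + ", ".join(sorted(value))
    if isinstance(value, list):
        return f"list of {len(value)} item(s)"
    return type(value).__name__


def summarize(document: str) -> Dict[str, str]:
    """Map each top-level key of a JSON document to a short description."""
    data = json.loads(document)
    if not isinstance(data, dict):
        return {"<root>": describe(data)}
    return {key: describe(val) for key, val in data.items()}


if __name__ == "__main__":
    # Placeholder document; the real keys may differ.
    sample = '{"duration": 5.2, "segments": [{"text": "Hello"}]}'
    for key, summary in summarize(sample).items():
        print(f"{key}: {summary}")
```

Running this against `output.json` shows which top-level fields are actually present before any of them are relied on.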

Models

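  • tiny - Fastest, use for quick drafts
  • base - Slightly slower and more accurate than tiny
  • small - Good balance of speed/accuracy
  • medium - Better accuracy, slower
  • large - Best accuracy, slowest
  • turbo - Speed-optimized variant of large; much faster, slightly less accurate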

Supported Formats

MP3, WAV, FLAC, M4A, OGG, OPUS
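
When batch-processing a folder, it can be useful to filter inputs before invoking the tool. The Python sketch below assumes that the file extension is a good enough indicator and that matching is case-insensitive; how the tool itself detects formats is not stated here, and `supported_files` is a hypothetical helper.

```python
from pathlib import Path
from typing import Iterable, List

# Extensions taken from the list above.
SUPPORTED = {".mp3", ".wav", ".flac", ".m4a", ".ogg", ".opus"}


def supported_files(paths: Iterable[str]) -> List[str]:
    """Keep only paths whose extension is in the supported list."""
    return [p for p in paths if Path(p).suffix.lower() in SUPPORTED]


if __name__ == "__main__":
    print(supported_files(["meeting.wav", "notes.txt", "call.OPUS"]))
```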

Build

cd /home/yeho/Documents/tools/transcribe
go build -o transcribe

Dependencies

pip install openai-whisper                      # Required
pip install resemblyzer scikit-learn librosa    # For diarization
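
To check whether these packages are importable in the current environment, something like the following Python sketch can help. The import names differ from the pip names in two cases (`openai-whisper` imports as `whisper`, `scikit-learn` as `sklearn`); `missing_modules` is a hypothetical helper, and which interpreter the tool actually calls is an assumption worth verifying.

```python
import importlib.util
from typing import Iterable, List

REQUIRED = ["whisper"]  # import name of the openai-whisper package
DIARIZATION = ["resemblyzer", "sklearn", "librosa"]  # sklearn = scikit-learn


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("missing (required):", missing_modules(REQUIRED))
    print("missing (diarization):", missing_modules(DIARIZATION))
```

An empty list means every module in that group was found; anything listed needs the corresponding `pip install` from above.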