# Transcribe Tool

Audio transcription CLI using OpenAI Whisper with speaker diarization.

## Quick Reference

```bash
# Basic transcription (SRT output)
./transcribe audio.mp3 -o output.srt

# With speaker diarization
./transcribe audio.mp3 --diarize -o output.srt

# Specify model and speakers
./transcribe audio.mp3 --model small --diarize -s 2 -o output.srt

# Print to stdout
./transcribe audio.mp3 --no-write
```

## Flags

| Flag | Short | Description | Default |
|------|-------|-------------|---------|
| `--output` | `-o` | Output file path | **required** unless `--no-write` is set |
| `--format` | `-f` | Output format: `srt`, `text`, `json` | `srt` |
| `--model` | `-m` | Whisper model: `tiny`, `base`, `small`, `medium`, `large`, `turbo` | `tiny` |
| `--diarize` | | Enable speaker detection | off |
| `--speakers` | `-s` | Number of speakers (0=auto) | `0` |
| `--no-write` | | Print to stdout instead of file | off |

## Common Tasks

**Transcribe a meeting recording:**
```bash
./transcribe meeting.wav --model small -o meeting.srt
```

**Transcribe interview with 2 speakers:**
```bash
./transcribe interview.mp3 --model small --diarize -s 2 -o interview.srt
```

**Get JSON output for processing:**
```bash
./transcribe audio.mp3 --format json -o output.json
```

**Quick preview (stdout):**
```bash
./transcribe audio.mp3 --no-write
```
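**Transcribe a folder of recordings:**
The single-file commands above can be wrapped in a shell loop to handle several recordings at once. This is a minimal sketch, not part of the tool: `transcribe_dir` and the `TRANSCRIBE_BIN` variable are hypothetical names, and it assumes the inputs are `.mp3` files in a single directory.

```bash
# Batch-transcribe every MP3 in a directory, writing one SRT per input.
# TRANSCRIBE_BIN is a hypothetical override; it defaults to ./transcribe.
transcribe_dir() {
  local dir="$1"
  local bin="${TRANSCRIBE_BIN:-./transcribe}"
  local f
  for f in "$dir"/*.mp3; do
    [ -e "$f" ] || continue   # skip when the glob matches nothing
    # Report a failed file on stderr and keep going with the rest.
    "$bin" "$f" --model small -o "${f%.mp3}.srt" || echo "failed: $f" >&2
  done
}
```

Calling `transcribe_dir ./recordings` would write `name.srt` next to each `name.mp3`.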

## Output Formats

**SRT (default):** Subtitle format with timestamps
```
1
00:00:00,000 --> 00:00:05,200
[Speaker 1] Hello, how are you?
```
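In the sample above, the cue text starts with a `[Speaker N]` label. Counting those labels gives a quick view of how many cues each speaker has. The sketch below assumes every diarized cue carries a label in exactly that form; `count_speaker_cues` is an illustrative name, not something the tool provides.

```bash
# Count SRT cues per speaker label. Reads SRT text from stdin.
count_speaker_cues() {
  grep -oE '^\[Speaker [0-9]+\]' | sort | uniq -c
}

# Usage: count_speaker_cues < interview.srt
```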

**Text:** Plain text with timestamps
```
[00:00.0 - 00:05.2] [Speaker 1] Hello, how are you?
```
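If only the dialogue is needed, the leading `[start - end]` timestamp can be dropped with `sed`. This sketch assumes every line follows the layout shown above; `strip_timestamps` is an illustrative helper name.

```bash
# Remove the leading "[start - end] " timestamp from text-format lines,
# keeping the speaker label and the spoken text. Reads from stdin.
strip_timestamps() {
  sed -E 's/^\[[0-9:.]+ - [0-9:.]+\] //'
}

# Usage: ./transcribe audio.mp3 --format text --no-write | strip_timestamps
```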

**JSON:** Full metadata including segments, words, duration

## Models

- `tiny` - Fastest, use for quick drafts (default)
- `base` - Slightly slower than `tiny`, somewhat more accurate
- `small` - Good balance of speed/accuracy
- `medium` - Better accuracy, slower
- `large` - Best accuracy, slowest
- `turbo` - Speed-optimized variant of `large`, with a small accuracy cost

## Supported Formats

MP3, WAV, FLAC, M4A, OGG, OPUS

## Build

```bash
cd /home/yeho/Documents/tools/transcribe
go build -o transcribe
```

## Dependencies

```bash
pip install openai-whisper # Required
pip install resemblyzer scikit-learn librosa # For diarization
```
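To confirm the packages are importable after installing, a small check such as the one below may help. It is a sketch under one assumption: that the tool uses the `python3` found on `PATH`. `check_python_deps` is a hypothetical helper name. The import names differ from two of the package names: `openai-whisper` is imported as `whisper` and `scikit-learn` as `sklearn`.

```bash
# Report which Python modules the python3 on PATH can import.
# Import names: openai-whisper -> whisper, scikit-learn -> sklearn.
check_python_deps() {
  local mod
  for mod in whisper resemblyzer sklearn librosa; do
    if python3 -c "import $mod" >/dev/null 2>&1; then
      echo "ok: $mod"
    else
      echo "missing: $mod"
    fi
  done
}

# Usage: check_python_deps
```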