ClipGen is a command-line tool for turning a long local video into:
- timestamped transcript cache
- candidate highlight clips
- chapter summaries
- a compact overall markdown summary

Requirements:
- Python 3.10+
- `ffmpeg` installed and available on `PATH`
- `OPENAI_API_KEY` set in the environment
Optional:
- `OPENAI_MODEL` to override the default model
Recommended default model:
gpt-4o-mini
You can put these in either:
- workspace root `.env`
- `ClipGen/.env`
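
As an illustration of how the two `.env` locations and the real environment could be combined, here is a minimal sketch. The helper names `parse_env_file` and `load_settings` are hypothetical, and the precedence shown (project file over workspace file, real environment variables over both) is an assumption, not documented ClipGen behavior.

```python
import os
from pathlib import Path


def parse_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(workspace_root: Path) -> dict[str, str]:
    """Merge both candidate .env files, then let real environment variables win."""
    # Recommended default model taken from this README.
    settings = {"OPENAI_MODEL": "gpt-4o-mini"}
    # Assumed precedence: ClipGen/.env overrides the workspace root .env.
    for candidate in (workspace_root / ".env", workspace_root / "ClipGen" / ".env"):
        if candidate.is_file():
            settings.update(parse_env_file(candidate))
    # Assumed precedence: variables already set in the environment override files.
    for key in ("OPENAI_API_KEY", "OPENAI_MODEL"):
        if key in os.environ:
            settings[key] = os.environ[key]
    return settings
```

Calling `load_settings(Path.cwd())` from the workspace root would return a dictionary containing `OPENAI_MODEL` and, if it was configured anywhere, `OPENAI_API_KEY`.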
cd ClipGen
python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt

Create a `.env` file from `.env.example` and set your API key.
Install `ffmpeg` separately and confirm:
ffmpeg -version

Run an analysis:
.\.venv\Scripts\python.exe -m ClipGen.main analyze "E:\videos\stream.mp4"

Common flags:
- `--language auto|en|ja`
- `--out-dir out\my_run`
- `--highlight-count 15`
- `--min-clip-seconds 20`
- `--max-clip-seconds 90`
- `--model gpt-4o-mini`
- `--resume`
- `--whisper-model small`

By default, outputs are written to ClipGen\out\latest_run.
Relative --out-dir values are resolved from the ClipGen project root.
Output files:
- `transcript.jsonl`: raw transcript segments with `start`, `end`, `text`
- `chunks.json`: normalized analysis chunks
- `chapters.json`: chapter summaries with timestamps
- `highlights.json`: structured highlight candidates
- `highlights.csv`: spreadsheet-friendly highlight list
- `highlights.md`: human-readable review file
- `analysis.json`: combined structured output
- `summary.md`: overall markdown summary

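
For downstream scripts, the transcript cache can be read one JSON object per line. The sketch below assumes `start` and `end` are numeric offsets in seconds, which this README does not state. The helper names `read_transcript` and `format_timestamp` are hypothetical.

```python
import json
from pathlib import Path


def read_transcript(path: Path) -> list[dict]:
    """Load a JSON Lines file: one transcript segment object per non-empty line."""
    segments = []
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                segments.append(json.loads(line))
    return segments


def format_timestamp(seconds: float) -> str:
    """Render a second offset as HH:MM:SS (fractional seconds are truncated)."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def print_transcript(path: Path) -> None:
    """Print each segment as '[start - end] text'."""
    for seg in read_transcript(path):
        # Assumption: start/end are seconds; adjust if the cache uses another unit.
        print(f"[{format_timestamp(seg['start'])} - {format_timestamp(seg['end'])}] {seg['text']}")
```

With the default output location, `print_transcript(Path("out/latest_run/transcript.jsonl"))` run from the project root would list every cached segment.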
- `--resume` reuses `transcript.jsonl` when it already exists.
- Missing `ffmpeg` or `OPENAI_API_KEY` will fail fast with a clear error.
- Whisper model cache is stored under `ClipGen\.cache\` instead of the user profile.
- Transcription defaults to CPU mode for compatibility on machines without CUDA.
- The first version focuses on highlight discovery and summarization only.
- Translation, EDL/XML export, and editor integration are intentionally left for later.