An intelligent video analysis tool that automatically detects scene cuts, generates AI-powered descriptions based on visuals (no audio) and user-defined themes, and exports SRT caption files for video editing workflows.
- Automatic Scene Detection: Detects scene boundaries using visual analysis (FFmpeg-based histogram comparison) - no audio analysis
- AI-Powered Descriptions: Multiple AI options:
  - Cloud AI: GPT-4o (OpenAI), Claude 3 (Anthropic), Gemini (Google)
  - Local AI: LLaVA (Transformers), Ollama models (llava, bakllava, etc.) - 100% offline & free
- Theme Integration: Tailor descriptions to specific project themes (DIY, cooking, gaming, tutorials, etc.)
- SRT Export: Exports to standard SubRip subtitle format compatible with DaVinci Resolve, Premiere Pro, etc.
- Interactive Review: Edit and refine AI-generated descriptions before export
- Video Preview: Built-in player with scene markers and timeline navigation
- Modern Web Interface: Clean, responsive React frontend with Material-UI
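To illustrate the idea behind histogram-based scene detection, here is a minimal, self-contained sketch (not the tool's actual FFmpeg-backed implementation): each frame is reduced to a brightness histogram, and a cut is flagged when consecutive histograms differ by more than a sensitivity threshold.

```python
# Illustrative sketch of histogram-comparison scene-cut detection.
# Frames are modeled as flat lists of 0-255 pixel values.

def histogram(frame, bins=16):
    """Bucket 0-255 pixel values into a normalized histogram."""
    counts = [0] * bins
    for px in frame:
        counts[min(px * bins // 256, bins - 1)] += 1
    total = len(frame) or 1
    return [c / total for c in counts]

def detect_cuts(frames, threshold=0.5):
    """Return indices of frames whose histogram distance from the
    previous frame exceeds the sensitivity threshold."""
    cuts = []
    prev = None
    for i, frame in enumerate(frames):
        hist = histogram(frame)
        if prev is not None:
            # L1 distance between consecutive histograms, range [0, 2]
            distance = sum(abs(a - b) for a, b in zip(prev, hist))
            if distance > threshold:
                cuts.append(i)
        prev = hist
    return cuts

dark = [10] * 64     # a uniformly dark "frame"
bright = [240] * 64  # a uniformly bright "frame"
print(detect_cuts([dark, dark, bright, bright, dark]))  # → [2, 4]
```

Lowering the threshold makes detection more sensitive (more cuts); the real tool exposes this as the detection sensitivity setting.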
- Python 3.9+ (tested with 3.12)
- Node.js 18+
- FFmpeg (for video processing) - REQUIRED
```bash
# Using Makefile (most comprehensive)
make setup   # Full setup
make start   # Start both services

# OR using setup script
chmod +x setup.sh
./setup.sh

# Then start with:
./start.sh   # Or use: make start
```
- Backend Setup:
```bash
cd backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements-minimal.txt
cp .env.example .env  # Add API keys if using cloud AI
mkdir -p uploads exports
```
- Frontend Setup:
```bash
cd frontend
npm install
```
- Start Services:
```bash
# Backend (terminal 1)
cd backend && source venv/bin/activate
uvicorn src.main:app --reload --port 8000

# Frontend (terminal 2)
cd frontend
npm run dev
```
The application will be available at:
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Documentation: http://localhost:8000/docs
- Upload Video: Drag and drop or select a video file (MP4, MOV, MKV, AVI, WebM)
- Configure Analysis: Set theme, detection sensitivity, AI model, and description length
- Process Video: AI detects scenes and generates descriptions (progress tracked in real-time)
- Review & Edit: View scenes, edit descriptions, adjust timing if needed
- Export SRT: Download SRT file for use in video editing software
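As a small sketch of the format check behind the upload step, the helper below accepts exactly the container formats listed above; the function name is illustrative, not the application's actual code.

```python
from pathlib import Path

# Supported upload formats, per the usage steps above.
SUPPORTED_EXTENSIONS = {".mp4", ".mov", ".mkv", ".avi", ".webm"}

def is_supported_video(filename: str) -> bool:
    """True if the filename has one of the supported container extensions
    (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS

print(is_supported_video("clip.MP4"))   # → True
print(is_supported_video("notes.txt"))  # → False
```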
| Upload Screen | Configuration | Processing | Review & Edit | Export |
|---|---|---|---|---|
| ![]() | ![]() | ![]() | ![]() | ![]() |
- API Keys Configuration - Setting up cloud AI providers (OpenAI, Claude, Gemini)
- Local AI Setup - Using LLaVA and Ollama models offline
- Technical Specification - Complete technical requirements
- Development Guidelines - Code style, testing, and contribution guidelines
- LM Studio Setup - Alternative local AI setup
- VRAM Optimization - Performance tuning for GPU users
- Scene Detection: Visual-based analysis using frame histogram comparison (no audio processing), with multiple sensitivity levels and minimum scene duration control
- AI Integration: Support for multiple providers with fallback options
- Theme Awareness: Dynamic prompt engineering based on user themes
- SRT Compliance: Strict adherence to SubRip specification
- Progress Tracking: Real-time updates with estimated time remaining
- Error Handling: Comprehensive error handling with user-friendly messages
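The SubRip format that the exporter targets is simple enough to sketch in a few lines: each block is a 1-based index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range (note the comma before milliseconds), and the description text, with blocks separated by blank lines. This is a minimal illustration, not the tool's `srt_exporter.py` itself.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(scenes):
    """Build an SRT document from (start_sec, end_sec, description) tuples."""
    blocks = []
    for i, (start, end, text) in enumerate(scenes, 1):
        blocks.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}")
    return "\n\n".join(blocks) + "\n"

print(to_srt([(0.0, 3.5, "Host introduces the project")]))
# 1
# 00:00:00,000 --> 00:00:03,500
# Host introduces the project
```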
```
video-scene-tool/
├── backend/                  # Python FastAPI backend
│   ├── src/
│   │   ├── scene_detector.py # Scene detection using FFmpeg
│   │   ├── ai_describer.py   # AI description generation
│   │   ├── srt_exporter.py   # SRT format export
│   │   ├── models.py         # Data models
│   │   └── main.py           # FastAPI application
│   ├── tests/                # Unit tests
│   └── requirements.txt      # Python dependencies
├── frontend/                 # React TypeScript frontend
│   ├── src/
│   │   ├── components/       # React components
│   │   ├── store/            # Zustand state management
│   │   ├── utils/            # Utilities and API client
│   │   └── types.ts          # TypeScript definitions
│   └── package.json          # Node.js dependencies
└── docs/                     # Documentation and screenshots
```
Cloud AI providers require API keys. See API_KEYS.md for detailed setup.
Without API keys, the application still works:
- ✅ Scene detection (FFmpeg)
- ✅ SRT export
- ✅ Local AI descriptions (LLaVA/Ollama) - 100% offline & free
- ✅ Mock AI descriptions (if no local AI available) - editable
- ✅ Full workflow
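For a sense of how a local Ollama model could be queried for a theme-aware description, here is a sketch that builds a request for Ollama's `/api/generate` endpoint. The endpoint and payload fields (`model`, `prompt`, `images`, `stream`) are Ollama's real API; the prompt wording and function name are illustrative, not the tool's actual code.

```python
import base64
import json

def build_ollama_payload(frame_bytes: bytes, theme: str, model: str = "llava"):
    """Build a /api/generate request for a vision model: the frame is
    base64-encoded and the user's theme is woven into the prompt."""
    return {
        "model": model,
        "prompt": f"Describe this video frame for a {theme} project in one sentence.",
        "images": [base64.b64encode(frame_bytes).decode("ascii")],
        "stream": False,
    }

# b"..." stands in for real image bytes extracted from a scene.
payload = build_ollama_payload(b"\x89PNG...", theme="cooking")
print(payload["prompt"])

# To query a running Ollama server (default port 11434):
# import urllib.request
# req = urllib.request.Request("http://localhost:11434/api/generate",
#                              data=json.dumps(payload).encode(),
#                              headers={"Content-Type": "application/json"})
# description = json.loads(urllib.request.urlopen(req).read())["response"]
```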
- Fork the repository
- Create a feature branch
- Make your changes
- Run tests and linting
- Submit a pull request
See AGENTS.md for detailed development guidelines.
MIT License - see LICENSE file for details.
- FFmpeg for video processing
- OpenAI, Anthropic, Google AI for AI models
- LLaVA and Ollama for local AI options
- FastAPI for backend framework
- React and Material-UI for frontend
For issues and feature requests, please use the GitHub Issues page.
SceneScriber AI - Making video editing smarter, one scene at a time. 🎬
After installation, try uploading a short video (under 100MB) to test the workflow. The application will:
- Upload your video
- Detect scenes using FFmpeg
- Generate AI descriptions (if AI is configured)
- Export SRT file for video editing
Note: Cloud AI features require API keys. Without them, you can use local AI models (LLaVA/Ollama) for free offline processing.