Performant and advanced Python library + CLI for video frame extraction and analysis. Provides adaptive key-frame analysis based on motion detection and multi-frame image differentials. Supports multiple Video input and Image output formats, and implements multi-threading and memory management to optimise resource utilisation. Designed to be particularly well suited to medical imagery or generic videos with incremental (but not necessarily smooth or consistent) movement - be they screen captures, panorama shots, presentations or any multitude of other things.
- Highly configurable CLI tool ready to interact with the library
- "Key-frame" detection (to identify the most significant high quality frames).
- Motion and content analysis (from opencv library) over multiple previous frames compared to existing key-frame set.
- Frame quality detection via contrast and sharpness detection, and additionally including Structural SIMilarity (SSIM) library from scikit-learn (sklearn) for scoring for comparing subsequent candidate frames with existing (sub)set of identified key-frames as needed.
- Inference pre-computation mode to estimate the library parameters needed to generate a set of N key frames from the video
- Concurrent frame processing with configurable thread pools, and a shared thread-safe frame cache and memory-efficient video frame buffering
- Motion detection using optical flow analysis
- Frame Quality measurement signals:
- Sharpness using Laplacian variance, really means looking for blurry low quality frame
- Noise using denoising difference
- Contrast using simple intensity distribution
- Exposure using histogram analysis
- Scene change detection
- Temporal pattern recognition Almost all one of these are overkill in most scenarios; but they're fun.
- Configurable thread pool for parallel processing of frames
- Memory-optimized thread-safe frame buffer and recent frame cache
- Minimal frame copying via efficient np view operations
- Retry mechanims for failed frame extraction via exponential backoff with jitter (just because ;))
- Python 3.12+
- OpenCV Python (opencv-python)
- NumPy
- scikit-learn (for SSIM)
The video processor can be run from the command line with minimal configuration:
python cli-script.py input.mp4 output_dir/input_video: Path to input video fileoutput_dir: Directory for output frames
Details
Control the format and quality of extracted frames:
# Extract as JPEG with 85% quality
python cli-script.py input.mp4 output_dir/ \
--format jpeg \
--quality 85
# Extract as PNG with maximum compression
python cli-script.py input.mp4 output_dir/ \
--format png \
--quality 9Configure key-frame selection:
# Enable key frame detection with custom similarity threshold
python cli-script.py input.mp4 output_dir/ \
--enable-keyframes \
--similarity 0.90Optimize processing speed and resource usage:
# Configure threading and memory usage
python cli-script.py input.mp4 output_dir/ \
--threads 4 \
--buffer-size 60 \
--max-memory 1024 \
--disable-cacheAdjust retry behavior and timeouts:
# Configure robust error handling
python cli-script.py input.mp4 output_dir/ \
--retries 5 \
--retry-delay 1.0 \
--frame-timeout 10.0 \
--video-timeout 60.0Control logging output and verbosity:
# Enable detailed logging to file
python cli-script.py input.mp4 output_dir/ \
--log-level DEBUG \
--log-file processing.logpython cli-script.py input.mp4 output_dir/ \
--format jpeg \
--quality 85 \
--enable-keyframes \
--similarity 0.95 \
--threads 4 \
--buffer-size 60 \
--max-memory 1024 \
--retries 3 \
--log-level INFOpython ./cli-script.py
--format png
--inference-tolerance 0.00001
--target-frames 8 input.mp4 output/Details
This shows the processing flow, covering the three main scenarios and the flow through the cli tool to the library - - full-frame extraction - extraction of key-frames based on a fixed similarity threshold you provide - extraction of approximately _N_ key frames based on inferential estimation of the sensitivity threshold needed to produce the specific number of framesflowchart LR
A[Input Video File] --> B[Init,Metadata]
B --> E[Inference]
E --> F[Key Frame Threshold]
F --> G[KeyFrames]
B --> G[KeyFrames]
B --> K[ALLFrames]
K --> H[Save]
G --> H[Save]
style A font-size:10px
style B font-size:10px
style E font-size:10px
style F font-size:10px
style G font-size:10px
style H font-size:10px
style K font-size:10px
2024-10-27 13:44:13,490 - common - INFO - Logging configured at level 20
2024-10-27 13:44:13,490 - common - INFO - Video processing system initialized
2024-10-27 13:44:13,495 - __main__ - INFO - Created configuration: {'output_format': <OutputFormat.PNG: ('png', [16], False)>, 'compression_quality': 9, 'detect_key frames': True, 'similarity_threshold': 0.999620166015625, 'thread_count': 1, 'buffer_size': 60, 'cache_size': 60, 'enable_cache': True, 'max_memory_usage': None, 'retry_attempts': 3, 'retry_delay': 0.5, 'frame_timeout': 5.0, 'video_timeout': 30.0}
2024-10-27 13:44:13,495 - __main__ - INFO - Running inference mode to target 8 frames
Progress: 22%2024-10-27 13:44:15,657 - nframes - INFO - Found acceptable threshold 0.99829 producing 8 frames (target: 8)
2024-10-27 13:44:15,657 - __main__ - INFO - Inference complete: threshold=0.998, estimated frames=8
Inference Results:
Optimal similarity threshold: 0.998
Estimated frame count: 8
Search iterations: 12
Processing video with inferred threshold...
Progress: 100%
2024-10-27 13:44:15,973 - processor - INFO - Processed 180 frames, kept 35 key frames
2024-10-27 13:44:15,973 - __main__ - INFO - Processing complete. Extracted 35 frames.
Successfully extracted 35 frames to output
2024-10-27 13:44:15,974 - common - INFO - System cleanup completedNote there is a discrepency between estimated and actual frames generated because the estimator uses a simplified method to estimate key frame thresholds, whilst the full extraction compares not just a frame with its preceding 2 frames but a configurable number typically much higher. If this is problematic you can adjust the target frames accordingly.
Details
The module interaction diagram shows how the main components communicate during the processing of a video file.sequenceDiagram
participant CLI as Command Line (cli-script)
participant Config as Configuration (config.py)
participant Model as FrameExtractionModel (model.py)
participant Buffer as FrameBuffer (buffer.py)
participant Analyser as FrameAnalyser
participant Output as Output Directory
CLI->>Config: Load parameters
CLI->>Model: Initialize FrameExtractionModel
Model->>Buffer: Allocate FrameBuffer
Model->>Analyser: Initialize Analyser
CLI->>Model: Process video
Model->>Buffer: Add frame to buffer
Buffer-->>Analyser: Pass frame for analysis
Analyser->>Model: Detect key frames
Model->>Output: Save key frame to output directory
--format: Output format for framespng: Lossless compression (default)jpeg: Lossy compression, smaller fileswebp: Modern format with good compression
--quality: Quality/compression level- PNG: 0-9 (9 = max compression)
- JPEG/WebP: 0-100 (100 = best quality)
--enable-keyframes: Enable key frame detection--similarity: Similarity threshold (0.0-1.0, default: 0.95)--target-frames: When specified enables automatic estimation of--similarityvalue so n frames are produced--threads: Number of processing threads (default: CPU cores - 1)
--buffer-size: Frame buffer size (default: 30)--cache-size: Frame cache size (default: 30)--max-memory: Maximum memory usage in MB (unbound if not speciifed)--disable-cache: Disable frame caching
--retries: Number of retry attempts (default: 3)--retry-delay: Initial delay between retries in seconds (default: 0.5)--frame-timeout: Frame operation timeout in seconds (default: 5.0)--video-timeout: Video operation timeout in seconds (default: 30.0)
--log-level: Logging verbosityDEBUG: Detailed debugging informationINFO: General operation informationWARNING: Warning messagesERROR: Error messagesCRITICAL: Critical issues
--log-file: Path to log file (default: console output)
- Frame buffer size directly impacts memory usage
- Monitor memory usage through built-in tracking
- Thread count affects CPU usage and processing speed
- Default thread count is (CPU cores - 1)
- Buffer size affects disk I/O patterns
- Larger buffers reduce I/O frequency but increase memory usage
- Output format affects storage requirements:
- PNG: Lossless, larger files (but supports compression)
- JPEG: Lossy, smaller files (depending on quality setting)
- WebP: Modern format, good mix between compression and quality
- Configurable automatic retry of frame extraction operations with exponential-backoff
- Comprehensive error logging
- Transaction-like operations with cleanup handlers
- As in any complex system of this nature, multi-thread concurrency, caching and frame buffering are likely to be the first thing you should try disabling if you do see errors.
- Available log levels: DEBUG, INFO, WARNING, ERROR, CRITICAL
- Performance metrics tracked via logs including:
- Frame processing times
- Memory usage statistics
- Thread pool utilization
- I/O operations monitoring
- Error tracking includes:
- Detailed error messages and stack traces
- Operation context
- Cleanup operation status
- Resource management events