YouTube archive system using git-annex for efficient storage and incremental updates.
Live Demo - Try the web interface with sample videos from the @AnnexTubeTesting channel (auto-generated by GitHub Actions)
- Complete channel archival: Videos, metadata, comments, and captions
- Incremental updates: Efficient detection of new content
- Offline browsing: Client-side web interface (no server required)
- Flexible filtering: By date, license, playlist, metadata attributes
- CI/CD automation: GitHub Actions and Codeberg Actions support
- Git-annex integration: Efficient storage with special remotes (S3, WebDAV, etc.)
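Because updates are incremental, a scheduled job only needs to re-run the backup and regenerate the web interface. The following is a minimal sketch of such a job, not something shipped with annextube. It assumes that re-running annextube backup is how new content is picked up and that an initialized archive contains a .annextube directory. The function name refresh_archive and the ANNEXTUBE override variable are hypothetical.

```shell
# Hypothetical helper: refresh one archive in place.
# Assumes an initialized archive contains a .annextube/ directory.
# ANNEXTUBE may be overridden (e.g. ANNEXTUBE=echo) for a dry run.
refresh_archive() {
    archive_dir=$1
    annextube_cmd=${ANNEXTUBE:-annextube}
    if [ ! -d "$archive_dir/.annextube" ]; then
        echo "error: $archive_dir does not look like an annextube archive" >&2
        return 1
    fi
    (
        # Work inside the archive; the subshell leaves the caller's cwd alone.
        cd "$archive_dir" || exit 1
        # Only regenerate the web interface if the backup succeeded.
        "$annextube_cmd" backup && "$annextube_cmd" generate-web
    )
}
```

Calling refresh_archive ~/my-archive from cron or a CI step runs both commands; with ANNEXTUBE=echo it only prints the subcommands it would run.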
# Install (core)
pip install annextube
# Or install with all optional features (DataLad, search index, audio alignment)
pip install 'annextube[full]'
# Create archive
mkdir my-archive && cd my-archive
annextube init
# Configure (edit .annextube/config.toml)
# Add YouTube Data API key and channel URLs
# Backup
annextube backup
# Browse offline
annextube generate-web
open web/index.html

Pre-built container with all dependencies (git-annex, yt-dlp, ffmpeg, deno):
# Build container
podman build -t annextube:latest -f Containerfile .
# Run
podman run -it --rm -v "$PWD":/archive -e YOUTUBE_API_KEY="key" annextube:latest backup
# For Singularity/Apptainer (HPC clusters)
apptainer build annextube.sif Containerfile
apptainer run --bind "$PWD":/archive annextube.sif backup

See docs/how-to/container-deployment.md for the full guide.
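To avoid retyping the podman invocation, and to keep the API key off the command line, the run command can be wrapped in a small shell function. This is a sketch under assumptions: the function name annextube_in_container and the CONTAINER_ENGINE override are hypothetical, and the image tag annextube:latest is the one built above.

```shell
# Hypothetical wrapper: run any annextube subcommand in the container.
# CONTAINER_ENGINE is an assumed override (defaults to podman).
annextube_in_container() {
    if [ -z "${YOUTUBE_API_KEY:-}" ]; then
        echo "error: set YOUTUBE_API_KEY before running" >&2
        return 1
    fi
    # "-e YOUTUBE_API_KEY" with no value forwards the variable from the
    # engine's own environment, so the key is not visible in the process list.
    # The assignment prefix makes sure the variable is exported to the engine.
    YOUTUBE_API_KEY=$YOUTUBE_API_KEY "${CONTAINER_ENGINE:-podman}" run -it --rm \
        -v "$PWD":/archive \
        -e YOUTUBE_API_KEY \
        annextube:latest "$@"
}
```

With the key set in the shell, annextube_in_container backup is then equivalent to the podman run line above, run against the current directory.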
- Installation: See docs/tutorial/01-installation.md
- Container Deployment: See docs/how-to/container-deployment.md
- Quick Start: See specs/001-youtube-backup/quickstart.md
- Troubleshooting: See docs/how-to/troubleshooting.md (challenge solver errors, quota, interrupted backups)
- API Reference: See docs/reference/
- Python 3.10+: Runtime for annextube
- git: Version control
- git-annex 8.0+: Large file management
- yt-dlp (command-line): MUST be in PATH for git-annex --no-raw
sudo pip install yt-dlp # Or download binary to /usr/local/bin
- ffmpeg: Video processing and best quality downloads
sudo apt-get install ffmpeg # Debian/Ubuntu
brew install ffmpeg # macOS
- YouTube Data API v3 key: For API-based metadata (free from Google Cloud Console)
- deno or node: JavaScript runtime for modern YouTube features
- Optional extras (install individually or all at once with pip install 'annextube[full]'):

| Extra | Install | Description |
|---|---|---|
| datalad | pip install 'annextube[datalad]' | DataLad dataset management |
| search | pip install 'annextube[search]' | Full-text caption search index (pagefind) |
| audio-align | pip install 'annextube[audio-align]' | Audio-based caption alignment (Whisper) |
| full | pip install 'annextube[full]' | All of the above |
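Before the first backup, it can help to confirm that the command-line requirements listed above are actually in PATH. The sketch below is one way to do that with plain POSIX shell. The function names are hypothetical and not part of annextube, python3 is assumed to be the name of the Python interpreter, and the script checks only for presence, not for the minimum versions (Python 3.10+, git-annex 8.0+).

```shell
# Print each named command that cannot be found in PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Succeed if at least one of the named commands is in PATH.
have_any() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 && return 0
    done
    return 1
}

# Hypothetical helper: report every missing requirement, then return
# non-zero if anything was missing. Versions are not checked.
check_prereqs() {
    status=0
    for tool in $(missing_tools python3 git git-annex yt-dlp ffmpeg); do
        echo "missing: $tool" >&2
        status=1
    done
    # Either JavaScript runtime is enough.
    if ! have_any deno node; then
        echo "missing: deno or node" >&2
        status=1
    fi
    return "$status"
}
```

Running check_prereqs prints one line per missing command and returns a non-zero status if anything is absent, so it can gate a backup in a script or CI job.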
# Clone repository
git clone https://github.com/con/annextube.git
cd annextube
# Install with development dependencies
uv pip install -e ".[devel]"
# Run tests
pytest
# Run linter
ruff check annextube/ tests/
# Run type checker
mypy annextube/

MIT License - see LICENSE file
See CONTRIBUTING.md (TBD)
🚧 Early Development - This project is under active development. API may change.
Current phase: Implementing MVP (User Story 1 + 2)