A powerful tool to transcribe YouTube videos and Instagram posts/reels into Obsidian-compatible markdown notes using OpenAI's Whisper AI.
- YouTube Support: Download and transcribe any YouTube video
- Instagram Support: Download and transcribe Instagram posts and reels
- AI Transcription: Uses OpenAI Whisper for high-quality transcription
- Multiple Languages: Auto-detects the language, or lets you specify it manually
- Obsidian Integration: Creates formatted markdown notes ready for your vault
- Timestamps: Optional timestamp support for detailed notes
- Batch Processing: Process multiple URLs in one command
- Flexible Models: Choose from 5 Whisper models based on speed vs accuracy needs
- Python 3.8 or higher
- ffmpeg (for audio/video processing)
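Both prerequisites can be checked from Python before installing anything else. The following is a minimal sketch, not part of the project: `check_prerequisites` and `MIN_PYTHON` are hypothetical names, and the check only confirms that an `ffmpeg` executable is on `PATH`, not that it works.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the prerequisites above


def check_prerequisites(which=shutil.which, version=None):
    """Return a list of human-readable problems; an empty list means all good.

    `which` and `version` are injectable so the check can be tested
    without changing the real environment.
    """
    if version is None:
        version = sys.version_info[:2]
    problems = []
    if tuple(version) < MIN_PYTHON:
        problems.append(
            "Python %d.%d found, but 3.8 or higher is required" % tuple(version)
        )
    if which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing prerequisite:", problem)
```

If the script prints nothing, both requirements appear to be met.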
macOS:
brew install ffmpeg
Ubuntu/Debian:
sudo apt update
sudo apt install ffmpeg
Windows: Download from ffmpeg.org or use:
winget install ffmpeg
- Clone the repository:
git clone <repository-url>
cd RT-vision-core
- Create a virtual environment (recommended):
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Configure (optional):
cp .env.example .env
# Edit .env to configure Whisper model, language, and Obsidian vault path
Transcribe a YouTube video:
python main.py https://www.youtube.com/watch?v=dQw4w9WgXcQ
Transcribe an Instagram post/reel:
python main.py https://www.instagram.com/p/ABC123/
Process multiple URLs:
python main.py https://www.youtube.com/watch?v=video1 https://www.instagram.com/reel/xyz/
Include timestamps in transcription:
python main.py --timestamps https://www.youtube.com/watch?v=dQw4w9WgXcQ
Use a different Whisper model:
python main.py --model medium https://www.youtube.com/watch?v=dQw4w9WgXcQ
Specify language (skip auto-detection):
python main.py --language es https://www.youtube.com/watch?v=dQw4w9WgXcQ
- urls: One or more YouTube or Instagram URLs (required)
- --timestamps: Include timestamps in the transcription
- --model {tiny,base,small,medium,large}: Whisper model to use (default: base)
- --language LANG: Language code for transcription (e.g., en, es, fr)
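The options listed above map naturally onto Python's `argparse`. The sketch below shows one way such an interface could be declared. It is an illustration based only on the documented options, not the contents of the project's main.py, and `build_parser` and `WHISPER_MODELS` are hypothetical names.

```python
import argparse

# Model names taken from the documented --model choices.
WHISPER_MODELS = ["tiny", "base", "small", "medium", "large"]


def build_parser():
    """Build a parser that mirrors the documented command-line options."""
    parser = argparse.ArgumentParser(
        description="Transcribe YouTube and Instagram URLs into markdown notes."
    )
    # One or more positional URLs; argparse rejects an empty list.
    parser.add_argument(
        "urls", nargs="+", help="One or more YouTube or Instagram URLs"
    )
    parser.add_argument(
        "--timestamps",
        action="store_true",
        help="Include timestamps in the transcription",
    )
    parser.add_argument(
        "--model",
        choices=WHISPER_MODELS,
        default="base",
        help="Whisper model to use (default: base)",
    )
    # None here stands for "auto-detect the language".
    parser.add_argument(
        "--language",
        metavar="LANG",
        default=None,
        help="Language code for transcription (e.g., en, es, fr)",
    )
    return parser
```

With this declaration, an unknown model name such as `--model huge` is rejected by `argparse` itself before any download starts.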
Choose based on your needs:
| Model | Speed | Accuracy | Memory | Best For |
|---|---|---|---|---|
| tiny | ⚡⚡⚡⚡⚡ | ⭐⭐ | ~1 GB | Quick drafts |
| base | ⚡⚡⚡⚡ | ⭐⭐⭐ | ~1 GB | Balanced (default) |
| small | ⚡⚡⚡ | ⭐⭐⭐⭐ | ~2 GB | Good quality |
| medium | ⚡⚡ | ⭐⭐⭐⭐⭐ | ~5 GB | High accuracy |
| large | ⚡ | ⭐⭐⭐⭐⭐ | ~10 GB | Best quality |
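The memory column can also drive the choice programmatically. The sketch below picks the largest model whose approximate footprint fits a given memory budget. The figures are copied from the table and are rough estimates, so leaving some headroom is advisable; `MODEL_MEMORY_GB` and `largest_model_that_fits` are hypothetical names, not part of the project.

```python
# Approximate memory needs copied from the table above, smallest first.
MODEL_MEMORY_GB = [
    ("tiny", 1),
    ("base", 1),
    ("small", 2),
    ("medium", 5),
    ("large", 10),
]


def largest_model_that_fits(available_gb):
    """Return the most accurate model that fits in `available_gb`, or None."""
    best = None
    for name, needed_gb in MODEL_MEMORY_GB:
        # The list is ordered by accuracy, so the last match wins.
        if needed_gb <= available_gb:
            best = name
    return best
```

For example, a 4 GB budget selects `small`, since `medium` needs about 5 GB.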
Transcriptions are saved as markdown files in the transcripts/ directory with the following format:
---
title: Video Title
source: https://www.youtube.com/watch?v=...
date: 2025-11-10
language: en
channel: Channel Name
platform: YouTube
tags:
- transcription
- en
---
# Video Title
## Metadata
- **Source:** https://www.youtube.com/watch?v=...
- **Channel:** Channel Name
- **Date:** 2025-11-10
- **Duration:** 15 minutes
- **Language:** en
## Description
Original video description...
## Transcription
Transcribed text goes here...
If you set OBSIDIAN_VAULT_PATH in your .env file, notes will be automatically copied to your Obsidian vault:
OBSIDIAN_VAULT_PATH=/path/to/your/obsidian/vault
RT-vision-core/
├── main.py # CLI entry point
├── config.py # Configuration settings
├── youtube_downloader.py # YouTube video/audio download
├── instagram_downloader.py # Instagram post/reel download
├── transcriber.py # Whisper transcription
├── obsidian_formatter.py # Markdown formatting
├── requirements.txt # Python dependencies
├── .env.example # Example environment config
├── downloads/ # Temporary download directory
└── transcripts/ # Output markdown files
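The separate YouTube and Instagram downloader modules suggest that each URL is routed by platform. How the project actually does this is not shown here. The sketch below is one plausible approach using only the standard library; `detect_platform` and the host sets are hypothetical, and the `"Instagram"` label is assumed by analogy with the `platform: YouTube` field in the note format.

```python
from urllib.parse import urlparse

# Host names are assumptions chosen for illustration; extend as needed.
YOUTUBE_HOSTS = {"youtube.com", "m.youtube.com", "youtu.be"}
INSTAGRAM_HOSTS = {"instagram.com"}


def detect_platform(url):
    """Return "YouTube" or "Instagram" for a supported URL, else raise ValueError."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]  # treat www.youtube.com like youtube.com
    if host in YOUTUBE_HOSTS:
        return "YouTube"
    if host in INSTAGRAM_HOSTS:
        return "Instagram"
    raise ValueError("Unsupported URL: %s" % url)
```

Matching on the parsed host name rather than a substring avoids accepting URLs such as `https://example.com/?next=youtube.com`.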
python main.py --timestamps --model small https://www.youtube.com/watch?v=dQw4w9WgXcQ
This will create a note with timestamped segments:
## Transcription
**[00:00]** Welcome to this video about...
**[00:15]** In this tutorial, we'll cover...
**[00:45]** First, let's talk about...
python main.py --language es https://www.instagram.com/reel/ABC123/
python main.py \
https://www.youtube.com/watch?v=video1 \
https://www.youtube.com/watch?v=video2 \
https://www.instagram.com/p/post1/ \
https://www.instagram.com/reel/reel1/
- ffmpeg not found: Install ffmpeg using the instructions above
- Out of memory: Use a smaller Whisper model (tiny or base)
- Instagram download fails: Some private or restricted posts may not be accessible
- Slow transcription: Use a smaller model or ensure you have GPU support
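When transcription is slow, it helps to first confirm whether PyTorch can see a GPU at all. The sketch below is a small diagnostic, not part of the project. `pick_device` is a hypothetical helper that falls back to the CPU when PyTorch is missing or no CUDA device is available.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch installed nothing can run on the GPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Transcription would run on:", pick_device())
```

If this reports `cpu` on a machine with an NVIDIA GPU, the installed PyTorch build probably lacks CUDA support.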
For faster transcription, install PyTorch with CUDA support:
# For NVIDIA GPUs
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
Contributions are welcome! Please feel free to submit a Pull Request.
This project is open source and available under the MIT License.
- OpenAI Whisper for transcription
- yt-dlp for YouTube downloads
- Instaloader for Instagram downloads
For issues and questions, please open an issue on GitHub.