Version 2 — refreshed architecture with multi-vendor model discovery, local Whisper transcription, and improved multimodal workflows.
Author: Arpit J Soni
LinkedIn: linkedin.com/in/arpit-j-soni
- Download the Windows installer
- Product Hunt
- Node.js 18+ and npm 9+
- Python 3.8+ (for local Whisper transcription)
- macOS 13+ or Windows 10/11
- OpenAI API key (for ChatGPT, GPT-4, etc.): set as `OPENAI_API_KEY` or `REACT_APP_OPENAI_API_KEY`
- Google Gemini API key (for Gemini models)
- Anthropic API key (for Claude models)
- Perplexity API key (for Perplexity models)
Note: The app works with local Whisper transcription without any API keys, but API keys are required for AI responses and meeting summaries/questions.
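Before adding keys, you can sanity-check the Python-side prerequisites; a minimal sketch (the `check_prereqs` helper is illustrative, not part of the app):

```python
import importlib.util
import sys

def check_prereqs(min_version=(3, 8), packages=("faster_whisper", "numpy", "torch")):
    """Report whether Python and the transcription packages are available."""
    report = {"python_ok": sys.version_info[:2] >= min_version}
    for pkg in packages:
        # find_spec returns None when a top-level package is not importable
        report[pkg] = importlib.util.find_spec(pkg) is not None
    return report

print(check_prereqs())
```

Any `False` entry means the corresponding `pip install` step is still needed.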
```bash
# From project root
npm ci
```

The app uses a local Whisper server for real-time speech-to-text transcription. Install Python dependencies:
```bash
# Install Python packages directly
pip install faster-whisper websockets numpy torch

# Or if you need to use pip3:
pip3 install faster-whisper websockets numpy torch

# Or install with specific versions (recommended)
pip install "faster-whisper>=1.0.0" "websockets>=12.0" "numpy>=1.24.0" "torch>=2.0.0"
```

Required Python packages:

- `faster-whisper>=1.0.0` - Optimized Whisper implementation for real-time transcription
- `websockets>=12.0` - WebSocket support for real-time streaming (optional, for future features)
- `numpy>=1.24.0` - Numerical computing library
- `torch>=2.0.0` - PyTorch for model inference
Note: On first run, the Whisper `small.en` model (~500-600MB) will be automatically downloaded to the `whisper_models/` directory.
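Because the download happens lazily, a script can check whether the model is already cached before going offline; a standard-library sketch (the size heuristic is an assumption, not the app's own check):

```python
from pathlib import Path

def model_cached(models_dir="whisper_models", min_bytes=100 * 1024 * 1024):
    """Heuristic: treat the model as cached if the directory holds >= min_bytes of files."""
    root = Path(models_dir)
    if not root.is_dir():
        return False
    total = sum(f.stat().st_size for f in root.rglob("*") if f.is_file())
    return total >= min_bytes
```

A `False` result simply means the first transcription run will need network access.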
```bash
# Start the React dev server and Electron app
npm run electron-dev
```

This command:

- Starts the React development server on `http://localhost:3000`
- Automatically starts the local Whisper server on port `8765`
- Launches the Electron app once the React server is ready
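Tooling that depends on the transcription backend can poll the health endpoint until the server is ready; a minimal sketch using only the standard library:

```python
import time
import urllib.error
import urllib.request

def wait_for_server(url="http://localhost:8765/health", timeout=10.0, interval=0.5):
    """Poll the Whisper server's health endpoint until it responds or we time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            time.sleep(interval)
    return False
```

It returns `False` instead of raising, so a caller can decide how to surface the failure.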
- Open the app and click ⚙ Settings
- Select your preferred vendor (OpenAI, Gemini, Anthropic, or Perplexity)
- Enter your API key in the settings panel
- Save the configuration
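For OpenAI, the key can also come from the environment variables listed in the prerequisites; a hypothetical resolution helper (`resolve_openai_key` is not part of the app):

```python
import os

def resolve_openai_key(env=os.environ):
    """Check both environment variable names the prerequisites mention."""
    for name in ("OPENAI_API_KEY", "REACT_APP_OPENAI_API_KEY"):
        value = env.get(name)
        if value:
            return value
    return None

print(resolve_openai_key({"REACT_APP_OPENAI_API_KEY": "sk-test"}))  # sk-test
```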
- Choose your favorite LLM for every request—swap between OpenAI (ChatGPT), Google Gemini, Perplexity, or Claude (Anthropic) right from the app.
- Local Whisper Transcription - Real-time speech-to-text using local Whisper model (no API calls needed for transcription).
- Unified multimodal support across all providers: mix text plus multiple screenshots in one prompt and get a single streaming answer.
- Smarter fallbacks so retired or unavailable models are replaced automatically without you touching settings.
- Cleaner setup that keeps your provider configuration in sync and ready for future upgrades (voice, tools, etc.).
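The automatic fallback behavior can be pictured as a simple preference walk; a hypothetical sketch (the model names and fallback table are illustrative, not the app's actual routing):

```python
def pick_model(preferred, available, fallbacks):
    """Return the preferred model if still offered, else the first live fallback."""
    if preferred in available:
        return preferred
    for candidate in fallbacks.get(preferred, []):
        if candidate in available:
            return candidate
    # Last resort: any model the vendor still serves
    return next(iter(available), None)

available = {"gpt-4o", "gpt-4o-mini"}
fallbacks = {"gpt-4": ["gpt-4o", "gpt-4o-mini"]}
print(pick_model("gpt-4", available, fallbacks))  # falls back to gpt-4o
```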
- Screenshot capture with automatic window-hide during capture (restores immediately).
- Multimodal querying (text + screenshots) with streaming AI responses.
- Audio recording with automatic transcription (supports vendors with audio capabilities).
- Conversation history with sidebar panel for reviewing past interactions.
- Response window with markdown rendering and syntax highlighting.
- Real-time local transcription using Whisper (no API required for transcription).
- Dual audio capture:
- Microphone capture - Records your voice
- System audio capture - Records system/application audio
- Independent controls:
- Start/Stop Microphone independently
- Start/Stop System Audio independently
- Live transcription display with separate streams for microphone and system audio.
- AI-powered meeting summary - Auto-generates summaries every 5 seconds (requires API key).
- Question extraction - Automatically identifies and extracts questions from transcript every 10 seconds (requires API key).
- Export functionality - Export transcriptions, summaries, and questions to text file.
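Question extraction can be approximated with a punctuation heuristic before any API call; a sketch, not the app's actual implementation (which uses the configured LLM):

```python
import re

def extract_questions(transcript):
    """Pull out sentences that end in a question mark."""
    # Split on sentence-ending punctuation, keeping the delimiter
    sentences = re.findall(r"[^.?!]*[.?!]", transcript)
    return [s.strip() for s in sentences if s.strip().endswith("?")]

text = "Let's review the roadmap. When is the beta due? Budget looks fine. Who owns QA?"
print(extract_questions(text))
```

This simple splitter drops trailing text without terminal punctuation, which is one reason an LLM pass is better suited to raw speech transcripts.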
- Screen-share privacy: Hide app from screen sharing while keeping it visible for you.
- Draggable windows: Drag the window by clicking and dragging the header area.
- Window opacity control: Adjust app transparency for overlay use.
- Always-on-top: Window stays on top of other applications.
- Multi-vendor support: Switch between OpenAI, Gemini, Anthropic, and Perplexity.
- Theme selection: Light, Dark, or Auto (system preference).
- Font size control: Small, Medium, or Large.
- Customizable keyboard shortcuts for all actions.
- Participant names: Set custom names for microphone and system audio in meetings.
- Screenshot limits: Configure maximum number of screenshots per session.
- Aurora-inspired palette with CSS variables applied app-wide.
- Smooth transitions and modern UI components.
- Accessibility support: High contrast mode and reduced motion preferences.
The app includes a built-in local Whisper server for real-time speech-to-text transcription:
- Automatic startup: Whisper server starts automatically when the app launches.
- No API required: Transcription works completely offline (no API calls needed).
- Low latency: Optimized for real-time transcription with 5-second audio chunks.
- Model: Uses `faster-whisper` with the `small.en` model for English transcription.
- Server endpoint: Runs on `http://localhost:8765`
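The 5-second chunking can be sketched over raw PCM bytes (the 16 kHz mono, 16-bit framing is an assumption; the server's actual audio format may differ):

```python
def chunk_pcm(data, seconds=5, sample_rate=16000, sample_width=2):
    """Split a mono PCM byte stream into fixed-duration chunks."""
    chunk_bytes = seconds * sample_rate * sample_width
    return [data[i:i + chunk_bytes] for i in range(0, len(data), chunk_bytes)]

# 12 seconds of silence -> two full 5s chunks plus a 2s remainder
audio = bytes(12 * 16000 * 2)
print([len(c) for c in chunk_pcm(audio)])  # [160000, 160000, 64000]
```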
If you need to start/stop the Whisper server manually:
```bash
# Start server manually
python3 whisper_server.py

# Or with custom port
WHISPER_PORT=8765 python3 whisper_server.py
```

The server provides:

- `GET /health` - Health check endpoint
- `POST /transcribe` - Audio transcription endpoint
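The `/health` contract can be exercised against a tiny stand-in server; a standard-library sketch (the JSON body is an assumed response shape, not the server's documented one):

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        pass  # keep output quiet

server = HTTPServer(("127.0.0.1", 0), HealthHandler)  # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/health"
with urllib.request.urlopen(url) as resp:
    payload = json.load(resp)
server.shutdown()
print(payload)  # {'status': 'ok'}
```

Swapping the stand-in's port for `8765` turns the same client code into a check against the real server.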
Requirements:
- Xcode Command Line Tools
- Apple Developer account for signing/notarization (recommended for distribution)
Build steps:
```bash
npm ci
npm run electron-pack
```

Artifacts:

- `dist/Visual Studio Code Desktop-<version>-arm64.dmg` (and `.zip`)
- Unpacked `.app` under `dist/mac-arm64/`
Optional signing & notarization:
```bash
export CSC_IDENTITY_AUTO_DISCOVERY=true
export APPLE_ID="your-apple-id@example.com"
export APPLE_APP_SPECIFIC_PASSWORD="xxxx-xxxx-xxxx-xxxx"
npm run electron-pack
```

Requirements:
- Windows 10/11
- Optional code-signing certificate for best install UX
Build steps:
```bash
npm ci
npm run electron-pack
```

Artifacts:

- NSIS installer: `dist/Visual Studio Code Desktop-<version>.exe`
Optional Windows signing (via electron-builder):
- Set `CSC_LINK` (path or base64 of the PFX) and `CSC_KEY_PASSWORD`, or use store-based options.
- `CmdOrCtrl+H`: Take screenshot (also registered globally)
- `CmdOrCtrl+R`: Reset the current session
- `CmdOrCtrl+S`: Save the current response
- `Shift+Enter` (while holding Cmd/Ctrl): Submit the current query
- `CmdOrCtrl+Shift+V`: Toggle "visible to screen share" (privacy mode on/off; window stays visible to you)
- `CmdOrCtrl+Up/Down/Left/Right`: Move window by 50px
- `CmdOrCtrl+Alt+.` / `CmdOrCtrl+Alt+,`: Increase / decrease app opacity
Note: All keyboard shortcuts can be customized in Settings.
Server not starting:
- Verify Python 3.8+ is installed: `python3 --version`
- Check dependencies: `pip3 list | grep faster-whisper`
- Check if port 8765 is available: `lsof -i :8765` (macOS) or `netstat -ano | findstr :8765` (Windows)
- Review server logs in the console or `/tmp/whisper_server.log`
Transcription not working:
- Ensure the Whisper server is running: check `http://localhost:8765/health`
- Verify microphone permissions are granted
- Check browser console for connection errors
- Restart the Electron app to restart the Whisper server
Model download issues:
- First-time model download requires internet connection
- The model is saved to the `whisper_models/` directory (~500-600MB)
- If the download fails, delete `whisper_models/` and restart the app
Python not found:
- macOS/Linux: Install Python 3 using your package manager or from python.org
- Windows: Download from python.org, ensure "Add Python to PATH" is checked during installation
Dependencies installation fails:
- Try upgrading pip: `pip install --upgrade pip` or `pip3 install --upgrade pip`
- Use a virtual environment: `python3 -m venv venv && source venv/bin/activate` (macOS/Linux) or `python -m venv venv && venv\Scripts\activate` (Windows)
Buttons not working after returning to Meetings page:
- The app automatically re-initializes event listeners when returning to the page
- If issues persist, restart the Electron app
Window not draggable:
- Drag the window by clicking and holding on the header area (not on buttons)
- Ensure you're not clicking on interactive elements (buttons, inputs, etc.)
This repo exposes a helper command that forwards flags to electron-icon-maker.
Source image: A square PNG with sufficient resolution (at least 1024×1024 recommended).
Command:
```bash
npm run make-icons -- --input /absolute/path/to/source.png --output /absolute/path/to/output-dir
```

Example using this repo's asset layout:

```bash
npm run make-icons -- \
  --input Desktop/CHEAT/CHEATS/public/assets/vscode.png \
  --output Desktop/CHEAT/CHEATS/public/assets
```

The generated `.icns` (mac) and `.ico` (windows) can be referenced by the build config and at runtime (tray/window icons).
Update in `package.json`:

- `name`: npm package name (optional for distribution but keep consistent)
- `build.productName`: The user-facing application name in installers and the `.app`/`.exe`
- `build.appId`: Reverse-DNS identifier (change to your domain, e.g., `com.yourco.product`)
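For example, the relevant `package.json` fields might look like this (all values are placeholders):

```json
{
  "name": "your-app",
  "build": {
    "productName": "Your Product",
    "appId": "com.yourco.product"
  }
}
```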
Optional UI title: If you want a custom window title beyond the product name, set it via the renderer or in BrowserWindow options.
Build-time (installers):
- macOS: `build.mac.icon` → `public/assets/your.icns`
- Windows: `build.win.icon` → `public/assets/your.ico`
Runtime (tray/window):
- File: `public/electron.js` picks icons per platform from `public/assets` (`vscode.icns` on mac, `vscode.ico` on win, `vscode.png` as a fallback for others). Replace those file paths with your own.
After replacing icons, rebuild:
```bash
npm run electron-pack
```

- Personal site: arpitjsoni.com
- Twitter/X: x.com/arpitsoni1893
- YouTube: youtube.com/@ArpitJSoni_1