A Heads-Up Display powered by the OpenAI Realtime API (gpt-realtime-1.5).
It analyzes your screen captures and audio streams in real time, surfacing supplementary information, insights, and confirmations directly in your HUD.
- 🖥 Screen capture – select any monitor or window to analyze (single or multi-monitor)
- 🎙 Microphone audio – include your voice for full context
- 🔊 System/computer audio – capture what's playing on your screen
- 💡 AI HUD display – real-time insights streamed as they are generated
- ⌨ Steerable – collapsible text input to direct the AI when needed
- 🔒 Privacy first – clear recording indicator, instant stop, no data storage
- ⤢ Pop-out display – open the HUD in a second window for multi-monitor setups
- 🔄 Provider-agnostic – OpenAI now; designed for future offline/private models
- Your API key never leaves your machine (kept in server .env, never sent to the browser)
- Screen and audio data are proxied directly to OpenAI – nothing is stored on the server
- A prominent Recording badge indicates when the session is active
- You can stop the session at any time
- Node.js 18 or later
- An OpenAI API key with Realtime API access
git clone https://github.com/hack-r/realtime-hud.git
cd realtime-hud
npm install
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
npm run dev

Open http://localhost:5173 in your browser.
npm run build
npm start

- Click Select Screen to Capture and choose a monitor or window
- Optionally enable Microphone and/or System Audio
- Click Start AI Session
- AI insights appear in the right panel as your screen is analyzed
- Use ⤢ Pop Out to move the HUD display to a second monitor
- Click ▼ Steer AI to open the text input for directing the AI
- Click Stop Session to end recording
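
The Steer AI step above can be sketched as a pair of Realtime client events: one that adds the user's text to the conversation and one that asks the model for a fresh response. This is a minimal sketch, not the repository's code; `buildSteerEvents` and the `RealtimeClientEvent` type are hypothetical names, and the app's actual wire format between browser and proxy may differ.

```typescript
// Hypothetical helper for illustration; not taken from the repository.
// The event shapes follow the Realtime API's "conversation.item.create"
// and "response.create" client events.
type RealtimeClientEvent =
  | {
      type: "conversation.item.create";
      item: {
        type: "message";
        role: "user";
        content: { type: "input_text"; text: string }[];
      };
    }
  | { type: "response.create" };

// Turn a steering instruction into the events to send over the WebSocket.
// Blank input yields no events, so nothing is sent.
function buildSteerEvents(instruction: string): RealtimeClientEvent[] {
  const text = instruction.trim();
  if (text.length === 0) {
    return [];
  }
  return [
    {
      type: "conversation.item.create",
      item: {
        type: "message",
        role: "user",
        content: [{ type: "input_text", text }],
      },
    },
    { type: "response.create" },
  ];
}

// Each event would be serialized to JSON before being written to the socket.
const exampleFrames: string[] = buildSteerEvents(
  "Focus on the error in the terminal"
).map((event) => JSON.stringify(event));
```

Keeping the builder free of any socket code makes it easy to unit-test the steering path without a live session.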
Browser (React + Vite)
  └─ WebSocket ──▶ Node.js / Express proxy
       └─ WebSocket ──▶ OpenAI Realtime API
                        (gpt-realtime-1.5)
The API key is stored server-side only. The browser connects to a local WebSocket proxy that forwards traffic to OpenAI.
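
To illustrate why the key can stay server-side, here is a rough sketch of how a proxy might assemble its upstream connection parameters. It assumes the public endpoint `wss://api.openai.com/v1/realtime` and, as a further assumption, that the beta interface adds an `OpenAI-Beta: realtime=v1` header while the GA interface does not. `UpstreamConfig`, `UpstreamRequest`, and `buildUpstreamRequest` are hypothetical names, not the repository's API.

```typescript
// Hypothetical shapes for illustration; the repository's own types may differ.
interface UpstreamConfig {
  apiKey: string;
  model: string;
  realtimeInterface: "ga" | "beta";
}

interface UpstreamRequest {
  url: string;
  headers: Record<string, string>;
}

// Build the URL and headers the proxy would use when opening its own
// WebSocket to OpenAI. The browser never sees these values; it only
// talks to the local proxy.
function buildUpstreamRequest(config: UpstreamConfig): UpstreamRequest {
  const url =
    "wss://api.openai.com/v1/realtime?model=" +
    encodeURIComponent(config.model);

  const headers: Record<string, string> = {
    Authorization: "Bearer " + config.apiKey,
  };

  // Assumption: only the beta interface needs the opt-in header.
  if (config.realtimeInterface === "beta") {
    headers["OpenAI-Beta"] = "realtime=v1";
  }

  return { url, headers };
}
```

A WebSocket client on the server could then be opened with `url` and `headers`, and every frame from the browser forwarded unchanged.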
The AIProvider interface (src/types/index.ts) is designed for swappability:
| Status | Provider |
|---|---|
| ✅ | OpenAI Realtime (gpt-realtime-1.5) |
| 🗓 | Other cloud providers (Gemini Live, etc.) |
| 🗓 | Fully offline/private local model |
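
As a rough idea of what swappability could look like, the sketch below defines a small provider contract and a stand-in implementation. The real `AIProvider` in src/types/index.ts is not shown here, so every method name (`connect`, `sendText`, `disconnect`) and the `EchoProvider` class are assumptions made purely for illustration.

```typescript
// Hypothetical sketch: the real AIProvider in src/types/index.ts may differ.
type InsightHandler = (text: string) => void;

interface AIProvider {
  readonly name: string;
  // Register the callback that receives insights for display in the HUD.
  connect(onInsight: InsightHandler): void;
  // Forward a steering instruction typed by the user.
  sendText(text: string): void;
  // Stop delivering insights and release resources.
  disconnect(): void;
}

// A stand-in provider that echoes steering text back as an "insight".
// Something like this could drive the UI without any network access.
class EchoProvider implements AIProvider {
  readonly name = "echo";
  private onInsight: InsightHandler | null = null;

  connect(onInsight: InsightHandler): void {
    this.onInsight = onInsight;
  }

  sendText(text: string): void {
    if (this.onInsight) {
      this.onInsight("echo: " + text);
    }
  }

  disconnect(): void {
    this.onInsight = null;
  }
}
```

Because the HUD would depend only on the interface, replacing `EchoProvider` with a cloud or local-model implementation would not require changes to the display code.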
| Variable | Default | Description |
|---|---|---|
| OPENAI_API_KEY | required | Your OpenAI API key |
| PORT | 3001 | Server port |
| OPENAI_MODEL | gpt-realtime-1.5 | Realtime model to use |
| OPENAI_REALTIME_INTERFACE | ga | Realtime protocol interface (ga or beta) |
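
The variables above could be read and validated along these lines. This is a sketch under stated assumptions, not the server's actual startup code: `ServerConfig` and `loadConfig` are hypothetical names, and the environment is passed in as a plain object (on the server this would typically be `process.env` after the .env file has been loaded).

```typescript
// Hypothetical loader for illustration; not the repository's implementation.
type RealtimeInterface = "ga" | "beta";

interface ServerConfig {
  apiKey: string;
  port: number;
  model: string;
  realtimeInterface: RealtimeInterface;
}

function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  // OPENAI_API_KEY has no default, so fail fast when it is missing.
  const apiKey = env.OPENAI_API_KEY;
  if (!apiKey) {
    throw new Error("OPENAI_API_KEY is required");
  }

  // PORT defaults to 3001 and must be a valid TCP port number.
  const port = Number(env.PORT || "3001");
  if (isNaN(port) || Math.floor(port) !== port || port < 1 || port > 65535) {
    throw new Error("PORT must be an integer between 1 and 65535");
  }

  // OPENAI_REALTIME_INTERFACE defaults to "ga"; only "ga" and "beta" are accepted.
  const rawInterface = env.OPENAI_REALTIME_INTERFACE || "ga";
  if (rawInterface !== "ga" && rawInterface !== "beta") {
    throw new Error('OPENAI_REALTIME_INTERFACE must be "ga" or "beta"');
  }
  const realtimeInterface: RealtimeInterface =
    rawInterface === "beta" ? "beta" : "ga";

  return {
    apiKey,
    port,
    model: env.OPENAI_MODEL || "gpt-realtime-1.5",
    realtimeInterface,
  };
}
```

Validating at startup surfaces a missing key or a mistyped interface name immediately, rather than at the first session.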