This project creates a ChatGPT Voice–like application:
project-root/
│
├── backend/
│   ├── main.go                  # WebSocket proxy server to OpenAI
│   └── go.mod
│
├── consoleApp/                  # console apps that transcribe audio and send it to OpenAI
│   ├── withReadFile/
│   │   ├── audios/              # audio files to be transcribed
│   │   └── main.go              # entry point
│   └── withRecordingFunction/
│       └── main.go              # entry point
│
├── frontend/
│   ├── src/
│   │   ├── App.tsx              # Main React component
│   │   ├── recorder-worklet.js  # AudioWorklet
│   │   └── ...
│   └── package.json
│
├── config.env                   # you must create this yourself; contains OPENAI_API_KEY and SERVER_PORT
└── README.md
Create a config.env file in the project root with the following contents:
OPENAI_API_KEY=sk-proj-....
SERVER_PORT=8080
This console application reads an audio file and responds based on the transcription of that file.
- Place the audio file under /consoleApp/withReadFile/audios
- Change this line of code in the voiceRequest function:
  audioFile := "./audios/audio_name"
- Run main.go
This console application listens to your voice and gives a response.
- Install the PortAudio library:
  macOS:
    brew install pkg-config
    brew install portaudio
  Linux:
    apt-get install portaudio19-dev
- Run main.go
- Type start to talk
- Type stop to get the response
- Type exit to exit the console app
- Frontend: React + TypeScript (TSX), using AudioWorklet to capture microphone input, waveform visualization (mic & output), and playback of audio responses from OpenAI.
- Backend: Golang, as a WebSocket proxy between the client and OpenAI Realtime API.
- OpenAI Realtime API: provides two-way conversation (voice-in + voice-out).
- Capture microphone audio, downsample to 16kHz PCM16.
- Send audio chunks to the backend via WebSocket.
- Backend proxies to OpenAI Realtime (commit + response.create).
- Playback audio responses from OpenAI with circular waveform visualization (mic: yellow, output: green).
- Display chat logs (text transcripts).
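The first step in this list (downsample, then convert to PCM16) comes down to a little arithmetic. In this project it happens in the browser, so the Go sketch below only illustrates the calculation. It assumes a 48 kHz capture rate and uses simple sample picking with no low-pass filter; both are assumptions, and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// downsample keeps one source sample for each output sample.
// No anti-aliasing filter is applied; this is the simplest possible approach.
func downsample(in []float32, inRate, outRate int) []float32 {
	if outRate >= inRate || len(in) == 0 {
		return in
	}
	n := len(in) * outRate / inRate
	out := make([]float32, n)
	for i := range out {
		out[i] = in[i*inRate/outRate]
	}
	return out
}

// toPCM16 clamps each sample to [-1, 1] and scales it to a signed 16-bit integer.
func toPCM16(in []float32) []int16 {
	out := make([]int16, len(in))
	for i, s := range in {
		v := math.Max(-1, math.Min(1, float64(s)))
		out[i] = int16(math.Round(v * 32767))
	}
	return out
}

func main() {
	// 48 samples at the assumed 48 kHz capture rate is 1 ms of audio.
	src := make([]float32, 48)
	for i := range src {
		src[i] = float32(math.Sin(2 * math.Pi * float64(i) / 48))
	}
	pcm := toPCM16(downsample(src, 48000, 16000))
	fmt.Println(len(pcm), "samples") // prints: 16 samples
}
```

Clamping before scaling matters because microphone samples can slightly exceed the nominal [-1, 1] range, and an unclamped value would overflow the 16-bit integer.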
Make sure you have Go 1.22+ installed and that OPENAI_API_KEY is set in config.env.
cd backend
go mod tidy
go run main.go
This starts a WebSocket server on the port set by SERVER_PORT in config.env:
ws://localhost:SERVER_PORT/ws
cd frontend
npm install
npm run dev
Then open http://localhost:5173 in your browser.
- Connect → open WebSocket connection to backend and begin capturing mic and sending audio.
- Mute Mic → stop the mic, commit the audio buffer, and trigger a response.
- Unmute Mic → resume capturing the mic.
- Show logs → show the logs at the bottom of the page.
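For orientation, the "commit buffer and trigger response" step corresponds to JSON client events in the OpenAI Realtime API: `input_audio_buffer.append` carries base64-encoded audio, and `input_audio_buffer.commit` followed by `response.create` ends the turn. The Go sketch below only builds those messages. Whether this repository builds them in the backend or forwards them from the frontend is not stated here, so the `clientEvent` type and helper names are hypothetical.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// clientEvent (hypothetical type) covers the fields of the three events used here.
type clientEvent struct {
	Type  string `json:"type"`
	Audio string `json:"audio,omitempty"` // base64-encoded audio bytes
}

// appendEvent wraps one chunk of raw audio bytes in an append event.
func appendEvent(pcm []byte) ([]byte, error) {
	return json.Marshal(clientEvent{
		Type:  "input_audio_buffer.append",
		Audio: base64.StdEncoding.EncodeToString(pcm),
	})
}

// endOfTurnEvents returns the two messages sent when the mic is muted:
// commit the buffered audio, then ask the model for a response.
func endOfTurnEvents() ([][]byte, error) {
	var msgs [][]byte
	for _, t := range []string{"input_audio_buffer.commit", "response.create"} {
		b, err := json.Marshal(clientEvent{Type: t})
		if err != nil {
			return nil, err
		}
		msgs = append(msgs, b)
	}
	return msgs, nil
}

func main() {
	chunk, _ := appendEvent([]byte{0x00, 0x01})
	fmt.Println(string(chunk))
	msgs, _ := endOfTurnEvents()
	for _, m := range msgs {
		fmt.Println(string(m))
	}
}
```

Each message would be written to the WebSocket as a text frame, in this order.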
- Browser must support AudioWorklet (Chrome, Edge, and other recent browsers).
- Playback speed can be tuned by changing:
  src.playbackRate.value = 1.4;
- Visualizer:
  - Yellow Circle → Mic input
  - Green Circle → AI output
MIT