aivoice (React + Golang + OpenAI)

This project builds a ChatGPT Voice-like application.

Project Structure


project-root/
│
├── backend/
│   ├── main.go                  # WebSocket proxy server to OpenAI
│   └── go.mod
│
├── consoleApp/                  # console apps that transcribe audio and send it to OpenAI
│   ├── withReadFile/
│   │   ├── audios/              # audio files to be transcribed by OpenAI
│   │   └── main.go              # entry point
│   └── withRecordingFunction/
│       └── main.go              # entry point
│
├── frontend/
│   ├── src/
│   │   ├── App.tsx              # main React component
│   │   ├── recorder-worklet.js  # AudioWorklet
│   │   └── ...
│   └── package.json
│
├── config.env                   # create this yourself; contains OPENAI_API_KEY and SERVER_PORT
└── README.md

First Thing To Do

Create a config.env file in the project root folder with the following contents:

OPENAI_API_KEY=sk-proj-....
SERVER_PORT=8080
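
As an illustration of how a Go program might read these two values, here is a minimal sketch using only the standard library. The helper names parseEnv and requireKeys are hypothetical and are not taken from the project's main.go, which may load the file differently (for example with a dotenv library).

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseEnv reads KEY=VALUE lines, skipping blank lines and # comments.
// Hypothetical helper for illustration only.
func parseEnv(content string) (map[string]string, error) {
	vars := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			return nil, fmt.Errorf("malformed line %q", line)
		}
		vars[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return vars, scanner.Err()
}

// requireKeys returns an error for the first key that is missing or empty.
func requireKeys(vars map[string]string, keys ...string) error {
	for _, k := range keys {
		if vars[k] == "" {
			return fmt.Errorf("config.env: %s is missing or empty", k)
		}
	}
	return nil
}

func main() {
	// A real program would pass the result of os.ReadFile("config.env") here.
	sample := "OPENAI_API_KEY=sk-proj-example\nSERVER_PORT=8080\n"
	vars, err := parseEnv(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if err := requireKeys(vars, "OPENAI_API_KEY", "SERVER_PORT"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("server port:", vars["SERVER_PORT"])
}
```

Failing early when a key is missing gives a clearer error than a rejected request to OpenAI later on.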

Console Apps

withReadFile

This console application reads an audio file, transcribes it, and responds based on the transcription.

Place the audio file under /consoleApp/withReadFile/audios.
In the voiceRequest function, point audioFile at your file: audioFile := "./audios/audio_name"
Run main.go.

withRecordingFunction

This console application listens to your voice and gives a response.

Install the PortAudio library first.

macOS:

brew install pkg-config
brew install portaudio

Linux:

apt-get install portaudio19-dev

Run main.go
Type start to talk
Type stop to get the response
Type exit to exit the console app
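
The start / stop / exit flow above can be pictured as a small state machine. The sketch below is an assumption about how such a loop might be organised, not the project's actual code: the session type and its handle method are hypothetical, and the PortAudio capture and the OpenAI request are replaced by messages.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// session tracks whether the (hypothetical) recorder is currently running.
type session struct {
	recording bool
}

// handle applies one typed command and reports what happened.
// quit is true only for "exit".
func (s *session) handle(cmd string) (msg string, quit bool) {
	switch strings.ToLower(strings.TrimSpace(cmd)) {
	case "start":
		if s.recording {
			return "already recording", false
		}
		s.recording = true
		return "recording started", false
	case "stop":
		if !s.recording {
			return "not recording", false
		}
		s.recording = false
		return "recording stopped, requesting response", false
	case "exit":
		return "bye", true
	default:
		return "unknown command", false
	}
}

func main() {
	// A real console app would scan os.Stdin; a fixed script keeps the sketch deterministic.
	input := strings.NewReader("start\nstop\nexit\n")
	s := &session{}
	scanner := bufio.NewScanner(input)
	for scanner.Scan() {
		msg, quit := s.handle(scanner.Text())
		fmt.Println(msg)
		if quit {
			break
		}
	}
}
```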

Web Based

Frontend: React + TypeScript (TSX), using AudioWorklet to capture microphone input, waveform visualization (mic & output), and playback of audio responses from OpenAI.
Backend: Golang, as a WebSocket proxy between the client and OpenAI Realtime API.
OpenAI Realtime API: provides two-way conversation (voice-in + voice-out).

Features

Capture microphone audio and downsample it to 16 kHz PCM16.
Send audio chunks to the backend via WebSocket.
The backend proxies them to OpenAI Realtime (commit + response.create).
Play back audio responses from OpenAI with circular waveform visualization (mic: yellow, output: green).
Display chat logs (text transcripts).
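
To make the downsample-to-PCM16 step concrete, here is a minimal sketch of the arithmetic. In this project the conversion happens in the browser (recorder-worklet.js); the Go version below is only an illustration, and both the block-averaging method and the function names downsample and toPCM16 are assumptions rather than the project's implementation.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// downsample lowers the sample rate by averaging each group of input
// samples that maps onto one output sample (simple block averaging).
func downsample(in []float32, inRate, outRate int) []float32 {
	if outRate >= inRate || len(in) == 0 {
		return in
	}
	ratio := float64(inRate) / float64(outRate)
	out := make([]float32, int(float64(len(in))/ratio))
	for i := range out {
		start := int(float64(i) * ratio)
		end := int(float64(i+1) * ratio)
		if end > len(in) {
			end = len(in)
		}
		var sum float32
		for _, s := range in[start:end] {
			sum += s
		}
		out[i] = sum / float32(end-start)
	}
	return out
}

// toPCM16 clamps samples to [-1, 1] and encodes them as
// little-endian signed 16-bit integers.
func toPCM16(in []float32) []byte {
	out := make([]byte, 2*len(in))
	for i, s := range in {
		if s > 1 {
			s = 1
		} else if s < -1 {
			s = -1
		}
		binary.LittleEndian.PutUint16(out[2*i:], uint16(int16(s*32767)))
	}
	return out
}

func main() {
	// 48 kHz is a common microphone rate; 16 kHz is the target named above.
	mic := make([]float32, 480) // 10 ms of silence at 48 kHz
	pcm := toPCM16(downsample(mic, 48000, 16000))
	fmt.Println("bytes per 10 ms chunk:", len(pcm))
}
```

Averaging acts as a crude low-pass filter; a production resampler would normally use a proper anti-aliasing filter.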

1. Run Backend (Golang)

Make sure you have Go 1.22+ installed and that OPENAI_API_KEY is set in the config.env file.

cd backend
go mod tidy
go run main.go

This starts a WebSocket server on the port set by SERVER_PORT in config.env:

ws://localhost:SERVER_PORT/ws

2. Run Frontend (React + Vite)

cd frontend
npm install
npm run dev

Then open http://localhost:5173 in your browser.

Hotkeys

Connect → open a WebSocket connection to the backend, start capturing the mic, and begin sending audio.
Mute Mic → stop the mic, commit the buffer, and trigger a response.
Unmute Mic → start capturing the mic again.
Show logs → show the logs at the bottom of the page.
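
The "commit the buffer and trigger a response" step corresponds to JSON events of the OpenAI Realtime API: audio is streamed with input_audio_buffer.append (base64-encoded), then input_audio_buffer.commit and response.create are sent. The sketch below only builds those messages; the event struct and helper names are hypothetical, and how backend/main.go actually constructs and forwards them is an assumption.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// event is a minimal shape covering the three client events used here.
type event struct {
	Type  string `json:"type"`
	Audio string `json:"audio,omitempty"`
}

// encode serialises an event to the JSON text sent over the WebSocket.
func encode(e event) (string, error) {
	b, err := json.Marshal(e)
	return string(b), err
}

// appendAudio wraps one PCM16 chunk in an input_audio_buffer.append event.
func appendAudio(pcm []byte) (string, error) {
	return encode(event{
		Type:  "input_audio_buffer.append",
		Audio: base64.StdEncoding.EncodeToString(pcm),
	})
}

// commitAndRespond returns the two messages sent when the mic is muted.
func commitAndRespond() ([]string, error) {
	var msgs []string
	for _, t := range []string{"input_audio_buffer.commit", "response.create"} {
		m, err := encode(event{Type: t})
		if err != nil {
			return nil, err
		}
		msgs = append(msgs, m)
	}
	return msgs, nil
}

func main() {
	chunk, err := appendAudio([]byte{1, 2})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(chunk)
	msgs, err := commitAndRespond()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, m := range msgs {
		fmt.Println(m)
	}
}
```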

Notes

The browser must support AudioWorklet (current versions of Chrome, Edge, and other modern browsers).
Playback speed can be tuned by changing:
```
src.playbackRate.value = 1.4;
```
Visualizer:
- Yellow Circle → Mic input
- Green Circle → AI output

License

MIT
