Skip to content

sobuhasy/AIWaifuBot

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 

Repository files navigation

AIWaifuBot

Voice-in, AI-out, voice-back. This small Python app lets you speak into your mic, sends the transcribed text to an OpenAI model for a reply, and then speaks the reply aloud.

It uses:

  • speech_recognition for microphone capture and Google Web Speech for transcription (internet required)
  • OpenAI Python SDK (Responses API) for the AI reply
  • pyttsx3 for offline text-to-speech output on Windows (SAPI5)
  • python-dotenv to load your OpenAI API key from a local .env file

Prerequisites

  • Windows 10/11
  • Python 3.9–3.12 (recommended 3.10+)
  • A working microphone and speakers/headphones
  • Internet access (for speech recognition and OpenAI API)
  • An OpenAI API key

Keep your API key private. Don’t commit it to source control.

Quick start (PowerShell)

The commands below are for Windows PowerShell.

  1. Clone or open this folder in VS Code.

  2. Create and activate a virtual environment:

python -m venv .venv
.\.venv\Scripts\Activate
  1. Install dependencies:
pip install -r requirements.txt

If you run into issues installing PyAudio on Windows, see Troubleshooting below.

  1. Create a .env file in the project root with your key:
OPENAI_API_KEY="your_api_key_here"
  1. Run the app:
python aiwaifu.py

The app will:

  • calibrate for ambient noise,
  • record up to 15 seconds of speech,
  • send your words to an OpenAI model,
  • and speak the model’s reply.

Configuration tips

You can tweak a few things in aiwaifu.py.

  • Speech language: in speech_to_text, change language="en-US" to your locale, e.g. "en-GB", "fr-FR", "ja-JP".

  • Record duration: in recognizer.listen(...), change timeout and phrase_time_limit (both are 15 seconds by default).

  • Voice selection (TTS): in speak_text, the code uses voices[2]. If you get an index error or don’t like the voice, change the index. To list installed voices:

     import pyttsx3
     e = pyttsx3.init()
     for i, v in enumerate(e.getProperty('voices')):
     		print(i, v.id)

Troubleshooting

  • PyAudio installation fails on Windows

    • Try installing via pipwin:
       pip install pipwin
       pipwin install pyaudio
    • Or install a prebuilt wheel compatible with your Python version from a trusted source, then pip install <wheel_file.whl>.
  • No default input device / microphone not found

    • In Windows Sound Settings, set a default input device and ensure the mic is allowed for desktop apps.
  • speech_recognition.WaitTimeoutError

    • You were silent past the timeout. Speak sooner or increase timeout.
  • Google speech recognition RequestError

    • Check your internet connection. The default recognizer relies on Google’s web API.
  • OpenAI InvalidRequestError: model not found or AuthenticationError

    • Update the model name in aiwaifu.py to a valid one (e.g., gpt-4o, gpt-4o-mini).
    • Ensure .env contains a valid OPENAI_API_KEY and it’s being loaded.
  • TTS voice issues or index errors

    • List voices (see snippet above) and pick an available index.
    • Ensure your output device is working and not muted.

Project structure

AIWaifuBot/
├─ aiwaifu.py          # Main script: STT → OpenAI → TTS
├─ .env                # Your OpenAI API key (not committed)
├─ requirements.txt    # Python dependencies
└─ README.md           # This file

Notes on privacy and cost

  • Speech transcription uses Google’s web API via speech_recognition, which sends audio snippets over the internet.
  • OpenAI API usage is billed per token. Consider using gpt-5-mini for lower cost.

License

No license specified yet.

Next steps (ideas)

  • Add a system prompt/persona so the bot keeps a specific “waifu” style.
  • Use a local or different STT provider (e.g., Whisper) for better privacy.
  • Add a wake word or continuous listening loop.
  • Persist conversation history for multi-turn context.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages