Developed by Fayna Digital — Author: Volodymyr Shevchenko
Interactive voice assistant for Karkandaki Armenian restaurant — a production-grade offline kiosk built by Fayna Digital.
Deployed at: ul. Kolejowa 41, Ostrów Wielkopolski, Poland
Contact: 530 324 239
An autonomous voice-driven information kiosk designed for use at trade fairs, restaurants, and events. The system listens for Polish speech, understands natural questions about the menu, prices, and promotions, and responds with a neural female voice — all 100% offline, without cloud APIs or ongoing subscription costs.
| Feature | Implementation |
|---|---|
| Offline Speech Recognition | Vosk with Polish language model |
| Offline Neural Text-to-Speech | Piper TTS — VITS model pl_PL-gosia-medium, no internet required |
| NLP Engine | Deterministic rule-based keyword matcher (zero hallucinations) |
| UI | Tkinter fullscreen kiosk mode (no browser dependency) |
| Business Rules | Hardcoded allergen data, age validation, forbidden topics — AI cannot invent answers |
| Promo Loop | Background thread plays random promotions between interactions |
| Continuous Dialog | Multi-turn conversation — asks multiple questions in one session |
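The "Business Rules" row can be illustrated with a minimal sketch. The constants are copied from the `settings.py` excerpt in this README; the helper functions `allergens_for` and `may_join_challenge` are hypothetical names chosen for illustration, and treating `MIN_AGE_RECORD` as an inclusive lower bound is an assumption.

```python
# Values copied from the settings excerpt in this README.
ALLERGENS_DB = {
    "karkandak_slodki_nutella": ["orzechy", "mleko", "soja", "gluten"],
    "karkandak_wytrawny_mieso": ["gluten", "jaja"],
}
MIN_AGE_RECORD = 16
UNKNOWN_RESPONSE = "Nie znam odpowiedzi, zapytaj operatora."


def allergens_for(product_id: str) -> str:
    """Hypothetical helper: return hardcoded allergens or the fallback answer."""
    allergens = ALLERGENS_DB.get(product_id)
    if allergens is None:
        # Never guess: an unknown product gets the operator fallback.
        return UNKNOWN_RESPONSE
    return ", ".join(allergens)


def may_join_challenge(age: int) -> bool:
    """Hypothetical helper: assumes the age limit is an inclusive minimum."""
    return age >= MIN_AGE_RECORD
```

Because every answer comes from a lookup, a product missing from the table yields the fixed fallback rather than a generated guess.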
┌─────────────────────────────────────────────────┐
│ KarkandakiKiosk │
│ ┌──────────┐ ┌────────────┐ ┌─────────────┐ │
│ │ STTEngine│ │NLPProcessor│ │ TTSEngine │ │
│ │ (Vosk) │→ │ (keywords) │→ │ (Piper) │ │
│ └──────────┘ └────────────┘ └─────────────┘ │
│ │
│ Modes: PROMO ←→ DIALOG │
│ PROMO: background promo loop (TTS every 15s) │
│ DIALOG: listen → match → speak → loop │
└─────────────────────────────────────────────────┘
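The PROMO mode in the diagram could be implemented along these lines. This is a sketch under assumptions, not the project's actual code: the class name `PromoLoop`, its constructor parameters, and the injected `speak` callable are hypothetical; only the 15-second interval comes from the diagram.

```python
import random
import threading


class PromoLoop:
    """Hypothetical background promo player: speaks a random promo every interval."""

    def __init__(self, promos, speak, interval_s=15.0):
        self._promos = list(promos)
        self._speak = speak          # callable that plays text, e.g. a TTS wrapper
        self._interval_s = interval_s
        self._stop = threading.Event()
        self._thread = None

    def start(self):
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self):
        # Called when the kiosk switches from PROMO to DIALOG mode.
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
            self._thread = None

    def _run(self):
        # Event.wait returns True once stop() is called, False on timeout.
        while not self._stop.wait(self._interval_s):
            self._speak(random.choice(self._promos))
```

Waiting on a `threading.Event` instead of calling `time.sleep` lets `stop()` interrupt the pause immediately, so the switch to DIALOG mode does not have to wait for the interval to elapse.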
Dialog flow:
- User presses START button
- Kiosk says: "Słucham, w czym mogę pomóc?"
- STT listens → Vosk transcribes Polish speech
- NLP matches keywords → deterministic response (no LLM)
- TTS speaks response via Piper → afplay (macOS) / aplay (Linux)
- Loop continues until: goodbye phrase detected, 15s inactivity, or STOP pressed
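The dialog flow above can be sketched as a small loop with its three exit conditions. The function `run_dialog` and its parameters are hypothetical; `listen`, `respond`, and `speak` stand in for the STT, NLP, and TTS components, and the goodbye phrases are taken from the intent table in this README.

```python
import time

GREETING = "Słucham, w czym mogę pomóc?"
GOODBYE_PHRASES = ("dziękuję", "do widzenia")
INACTIVITY_TIMEOUT_S = 15.0


def run_dialog(listen, respond, speak, stop_requested=lambda: False,
               timeout_s=INACTIVITY_TIMEOUT_S, clock=time.monotonic):
    """Run one session; listen() returns recognised text, or "" if nothing was heard."""
    speak(GREETING)
    last_activity = clock()
    while not stop_requested():          # STOP button pressed
        text = listen().strip().lower()
        if not text:
            if clock() - last_activity >= timeout_s:
                return "timeout"         # 15s inactivity
            continue
        last_activity = clock()
        speak(respond(text))             # deterministic answer, no LLM
        if any(phrase in text for phrase in GOODBYE_PHRASES):
            return "goodbye"             # farewell was spoken, end session
    return "stopped"
```

Passing the components in as callables keeps the loop testable without a microphone or speaker.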
ai-kiosk/
├── src/
│ ├── main.py # Entry point — KarkandakiKiosk Tkinter app
│ ├── config/
│ │ ├── settings.py # Business rules, allergens, audio config
│ │ └── knowledge.py # Menu data, QA knowledge base, system prompt
│ ├── nlp/
│ │ └── processor.py # Keyword-based NLP — 15+ intent categories
│ ├── stt/
│ │ └── engine.py # Vosk offline STT engine (Polish)
│ ├── tts/
│ │ └── engine.py # Piper TTS neural voice engine
│ ├── kiosk/
│ │ └── kiosk_mode.py # Chromium kiosk mode manager (Linux deploy)
│ └── assets/
│ ├── images/ # UI images
│ ├── models/vosk/ # Vosk STT model (not in git — see Setup)
│ └── models/piper/ # Piper TTS model .onnx (not in git — see Setup)
├── tests/
│ ├── test_stt.py # STT integration test (10s mic recording)
│ └── test_tts.py # TTS smoke test
├── scripts/
│ └── install-kiosk.sh # systemd service installer (Linux/Raspberry Pi)
├── docs/
│ └── 01_client_constraints_adr.md # Architecture Decision Record
├── data/
│ ├── menu.json # Menu data (reference)
│ └── qa.json # Q&A pairs (reference)
├── index.html # Web menu page (static, no server needed)
└── requirements.txt
- Python 3.11 (required: piper-tts depends on onnxruntime, which has no Python 3.13+ wheels yet)
- macOS (uses afplay) or Linux (uses aplay from alsa-utils)
- Microphone
git clone https://github.com/VladSh77/ai-kiosk.git
cd ai-kiosk
# macOS: install Python 3.11 if needed
brew install python@3.11 espeak-ng
python3.11 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# Vosk Polish STT model (create only the parent directory so that mv renames the unpacked folder)
mkdir -p src/assets/models
curl -L https://alphacephei.com/vosk/models/vosk-model-small-pl-0.22.zip -o vosk.zip
unzip vosk.zip -d src/assets/models/
mv src/assets/models/vosk-model-small-pl-0.22 src/assets/models/vosk-model-pl
rm vosk.zip
# Piper Polish TTS voice
mkdir -p src/assets/models/piper
curl -L -o src/assets/models/piper/pl_PL-gosia-medium.onnx \
"https://huggingface.co/rhasspy/piper-voices/resolve/main/pl/pl_PL/gosia/medium/pl_PL-gosia-medium.onnx"
curl -L -o src/assets/models/piper/pl_PL-gosia-medium.onnx.json \
"https://huggingface.co/rhasspy/piper-voices/resolve/main/pl/pl_PL/gosia/medium/pl_PL-gosia-medium.onnx.json"
Model size: ~60 MB. Alternative voice (male): replace gosia with mc_speech.
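Once the voice files are in place, synthesis and playback could look roughly like this. It is a sketch that assumes the `piper` command-line tool installed by `piper-tts` is on the PATH; the project may call Piper through its Python API instead. The helper names `piper_command`, `player_command`, and `speak`, and the `reply.wav` output path, are hypothetical.

```python
import platform
import subprocess
from pathlib import Path

# Path used by the download commands in this README.
MODEL_PATH = Path("src/assets/models/piper/pl_PL-gosia-medium.onnx")


def piper_command(wav_path, model_path=MODEL_PATH):
    """Build the piper CLI call; the text to speak is read from stdin."""
    return ["piper", "--model", str(model_path), "--output_file", str(wav_path)]


def player_command(wav_path, system=None):
    """Pick afplay on macOS and aplay (alsa-utils) elsewhere."""
    system = system or platform.system()
    player = "afplay" if system == "Darwin" else "aplay"
    return [player, str(wav_path)]


def speak(text, wav_path="reply.wav"):
    """Synthesize text to a WAV file with Piper, then play it."""
    subprocess.run(piper_command(wav_path), input=text.encode("utf-8"), check=True)
    subprocess.run(player_command(wav_path), check=True)
```

Calling `speak("Słucham, w czym mogę pomóc?")` would write `reply.wav` and play it; both steps raise `CalledProcessError` if the external tool fails.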
source venv/bin/activate
python3 src/main.py
All business-critical settings are in src/config/settings.py:
FULLSCREEN_MODE = True # Lock to fullscreen (kiosk mode)
HIDE_CURSOR = True # Hide mouse cursor
NOISE_GATE_THRESHOLD = 500 # Adjust for venue ambient noise
ENABLE_BARGE_IN = True # User can interrupt TTS playback
MIN_AGE_RECORD = 16 # Karkandakowy Rekord age restriction (hardcoded)
UNKNOWN_RESPONSE = "Nie znam odpowiedzi, zapytaj operatora."
Allergen data is hardcoded (not AI-generated) to prevent hallucinations:
ALLERGENS_DB = {
"karkandak_slodki_nutella": ["orzechy", "mleko", "soja", "gluten"],
"karkandak_wytrawny_mieso": ["gluten", "jaja"],
...
}
The keyword engine covers 15+ intent categories:
| Intent | Example query | Response |
|---|---|---|
| Greeting | cześć, hej | Welcome message |
| Menu list | co macie, jakie smaki | Full menu with price |
| Price | ile kosztuje, cena | "8 zł per piece" |
| What is karkandak | co to jest | Product description |
| Recommendation | co polecasz, co wziąć | Taste-based suggestion |
| Spicy/mild | ostre, pikantne | Spice level guide |
| Children | dla dzieci, dziecko | Safe recommendation |
| Allergens | alergen, gluten, orzechy | Hardcoded safe answer |
| Opening hours | godziny, otwarte | 8:00–22:00 daily |
| Address | gdzie, adres, ulica | ul. Kolejowa 41 |
| Delivery | dowóz, dostawa | 10 zł delivery info |
| Challenge | rekord, wyzwanie | Rules + age warning |
| Goodbye | dziękuję, do widzenia | Farewell + session end |
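A deterministic keyword matcher for the table above can be sketched in a few lines. The keywords are copied from the table; the `INTENTS` structure, the intent names, and `match_intent` are hypothetical and not necessarily how `src/nlp/processor.py` is organised. Only a subset of intents is shown.

```python
# Ordered list: the first intent with a matching keyword wins.
INTENTS = [
    ("goodbye", ("dziękuję", "do widzenia")),
    ("price", ("ile kosztuje", "cena")),
    ("allergens", ("alergen", "gluten", "orzechy")),
    ("opening_hours", ("godziny", "otwarte")),
    ("greeting", ("cześć", "hej")),
]


def match_intent(text):
    """Return the first intent whose keyword occurs in the text, else None."""
    normalized = text.lower()
    for intent, keywords in INTENTS:
        if any(keyword in normalized for keyword in keywords):
            return intent
    return None
```

Plain substring matching is crude (a keyword can match inside a longer, unrelated word), but the same input always produces the same intent, and an unmatched query returns `None` so the caller can fall back to `UNKNOWN_RESPONSE`.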
sudo bash scripts/install-kiosk.sh
sudo systemctl status ai-kiosk
journalctl -u ai-kiosk -f
The service auto-restarts on failure and starts at boot.
See docs/01_client_constraints_adr.md for the full ADR covering:
- Why offline STT (Vosk) over cloud (Google/Whisper)
- Why deterministic NLP over LLM
- Why Piper TTS over edge-tts (offline VITS vs cloud-dependent Neural)
- Noise gate strategy for trade fair environments
- GDPR-compliant lead collection design
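As an illustration of the noise gate item, the `NOISE_GATE_THRESHOLD = 500` setting might be applied to incoming audio as follows. This is only a sketch: it assumes the threshold is compared against the RMS amplitude of 16-bit mono PCM chunks, which the README does not confirm, and the helpers `rms` and `passes_gate` are hypothetical.

```python
import array
import math

NOISE_GATE_THRESHOLD = 500  # value from settings.py; interpretation as RMS is assumed


def rms(pcm_bytes):
    """Root-mean-square amplitude of native-endian signed 16-bit PCM samples."""
    samples = array.array("h")
    samples.frombytes(pcm_bytes)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def passes_gate(pcm_bytes, threshold=NOISE_GATE_THRESHOLD):
    """Forward a chunk to the recognizer only if it is louder than the gate."""
    return rms(pcm_bytes) >= threshold
```

Chunks below the threshold would be dropped before reaching the recognizer, so constant venue hum is not transcribed; a louder venue calls for a higher threshold.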
Client: Karkandaki restaurant — Armenian snack bar at trade fairs in Poland
Use case: Hands-free customer service at busy market stands
Problem solved: Staff cannot simultaneously serve customers and answer repetitive questions
ROI: Handles 100% of FAQ traffic autonomously, frees staff for upselling
Fayna Digital — Systems architecture & AI automation agency
fayna.agency · github.com/VladSh77
Core Tech: Python 3.11 · Vosk · Piper TTS · Tkinter · 100% offline architecture