A voice-driven natural language interface for querying a tourism database. Ask a question in English or Italian — by voice or text — and get back a SQL query, a results table, and a chart. No SQL knowledge required.
Built for the Data Management course at the University of Naples Federico II (2026).
- Demo
- Architecture
- Project Structure
- Database
- Technology Stack
- How to Run
- Notebooks
- Results
- Languages Supported
- Documentation
- Limitations
- Authors & Acknowledgments
- License
Voice query — ▶ Watch the demo
Text query — ▶ Watch the demo
Both demos follow the same pipeline: the question (spoken or typed) is transcribed
if needed, corrected for ASR errors, translated into SQL by WrenAI, run against
campania_tourism, and rendered as a table + chart.
Voice / Text
|
v
Whisper (ASR)
|
v
ASR Error Correction
|
v
WrenAI (Text-to-SQL) — GPT-4o-mini
|
v
PostgreSQL (campania_tourism)
|
v
Results + Chart
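The flow in the diagram can be sketched as a chain of stage functions. This is only an illustration: `run_pipeline`, `PipelineResult`, and the four callables are hypothetical stand-ins for Whisper, the ASR corrector, WrenAI, and PostgreSQL, not the project's actual code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PipelineResult:
    question: str  # natural-language question after ASR correction
    sql: str       # SQL produced by the text-to-SQL stage
    rows: list     # rows returned by the database


def run_pipeline(
    text: Optional[str],
    audio_path: Optional[str],
    transcribe: Callable[[str], str],
    correct: Callable[[str], str],
    to_sql: Callable[[str], str],
    execute: Callable[[str], list],
) -> PipelineResult:
    """Chain the stages in the order shown in the diagram.

    Every callable is a hypothetical placeholder supplied by the caller.
    """
    if text is None:
        if audio_path is None:
            raise ValueError("provide either text or audio_path")
        text = transcribe(audio_path)  # voice input goes through ASR first
    question = correct(text)           # fix domain-specific ASR errors
    sql = to_sql(question)             # natural language -> SQL
    rows = execute(sql)                # run against the database
    return PipelineResult(question=question, sql=sql, rows=rows)
```

Passing the stages in as arguments keeps the sketch independent of any particular ASR or text-to-SQL backend, so each stage can be swapped or stubbed out on its own.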
voice2query/
├── asr/ — Task 1: Speech-to-Text (Whisper)
│ ├── task1_speech_to_text.ipynb
│ └── README.md
├── databases/ — PostgreSQL schema and data
│ └── database_postgresql/
│ ├── 01_schema1.sql
│ ├── 02_seed_data1.sql
│ ├── 03_queries1.sql
│ ├── 04_init_db1.py
│ └── README.md
├── text2sql/ — Task 2: Text-to-SQL (WrenAI)
│ ├── docker/
│ │ ├── docker-compose.yaml
│ │ ├── config.yaml
│ │ ├── .env.example
│ │ └── README.md
│ └── task2_text_to_sql.ipynb
├── pipeline/ — Full end-to-end pipeline
│ ├── voice2query_pipeline.ipynb
│ └── README.md
├── dashboard/ — Streamlit web interface
│ ├── app.py
│ └── README.md
├── docs/ — Architecture and design documentation
├── start.bat — One-click startup script (Windows)
└── README.md
The campania_tourism PostgreSQL database covers the main tourist destinations
of the Campania region, including cities, attractions, hotels, restaurants,
events, users, bookings and reviews.
| Table | Rows |
|---|---|
| cities | 10 |
| attractions | 20 |
| hotels | 19 |
| restaurants | 15 |
| events | 12 |
| users | 10 |
| bookings | 13 |
| reviews | 15 |
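One way to confirm that the seed data loaded as expected is to compare live row counts against the table above. The sketch below is an assumption-laden example, not part of the repository: `count_mismatches` and `make_demo_db` are hypothetical helpers, and the in-memory SQLite database only stands in for PostgreSQL so the example runs anywhere.

```python
import sqlite3

# Row counts copied from the table above.
EXPECTED_ROWS = {
    "cities": 10, "attractions": 20, "hotels": 19, "restaurants": 15,
    "events": 12, "users": 10, "bookings": 13, "reviews": 15,
}


def count_mismatches(conn, expected=EXPECTED_ROWS):
    """Return {table: (expected, actual)} for every table whose count differs.

    `conn` is any DB-API 2.0 connection (sqlite3, psycopg2, ...). Table names
    come only from the fixed dict above, so interpolating them is safe here.
    """
    mismatches = {}
    cur = conn.cursor()
    for table, want in expected.items():
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        got = cur.fetchone()[0]
        if got != want:
            mismatches[table] = (want, got)
    cur.close()
    return mismatches


def make_demo_db(counts):
    """Build an in-memory SQLite stand-in with counts[table] rows per table."""
    conn = sqlite3.connect(":memory:")
    for table, n in counts.items():
        conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY)")
        conn.executemany(
            f"INSERT INTO {table} (id) VALUES (?)", [(i,) for i in range(n)]
        )
    return conn
```

Against the real database, the same function should work with a `psycopg2.connect(...)` connection using the local credentials from the setup steps; the database name to connect to is assumed to be `campania_tourism`.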
| Component | Technology |
|---|---|
| Speech-to-Text | OpenAI Whisper (turbo) |
| ASR Correction | Custom edit-distance module |
| Text-to-SQL | WrenAI 0.29 |
| LLM | GPT-4o-mini (OpenAI API) |
| Embeddings | text-embedding-3-small |
| Vector store | Qdrant |
| Database | PostgreSQL 16 |
| Web interface | Streamlit |
| Visualisation | Plotly |
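To give a feel for what an edit-distance ASR corrector does, here is a minimal sketch. It is not the project's module: the vocabulary, the `max_dist` threshold, and the function names are all assumptions chosen for illustration, and it ignores punctuation and number words.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Hypothetical domain vocabulary; the real keyword list may differ.
VOCAB = ["Naples", "Sorrento", "Amalfi", "Pompeii", "hotels", "restaurants"]


def correct_token(token: str, vocab=VOCAB, max_dist: int = 2) -> str:
    """Replace a token with the closest vocabulary word within max_dist edits."""
    if len(token) < 4:  # leave short function words ("in", "me") alone
        return token
    best, best_dist = token, max_dist + 1
    for word in vocab:
        dist = levenshtein(token.lower(), word.lower())
        if dist < best_dist:
            best, best_dist = word, dist
    return best


def correct_text(text: str) -> str:
    """Correct each whitespace-separated token independently."""
    return " ".join(correct_token(tok) for tok in text.split())
```

With this vocabulary, `correct_text("show me hotels in Sorento")` maps the misheard place name back to `Sorrento` and leaves the other words untouched.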
- Docker Desktop 4.17+
- Python 3.11+
- An OpenAI API key (platform.openai.com) — each user needs their own key
Note: `start.bat` is Windows-only. On macOS/Linux, run the Docker and database steps manually using the commands below; there is no `start.sh` equivalent yet (contributions welcome).
1. Create the PostgreSQL container (only once per machine):
docker run --name campania-pg -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:16
⚠️ `postgres/postgres` is a local development default only; don't reuse these credentials for anything beyond running this project on your own machine.
2. Configure your OpenAI API key:
cp text2sql/docker/.env.example text2sql/docker/.env

Then open text2sql/docker/.env and fill in the key:
OPENAI_API_KEY=sk-...
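A small check can catch a `.env` file that still contains the placeholder. The helpers below are hypothetical and assume the file uses plain `KEY=VALUE` lines; they take the file's text rather than a path so the logic is easy to test.

```python
def read_env_value(env_text: str, name: str):
    """Return the value assigned to `name` in KEY=VALUE text, or None."""
    for raw in env_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        if key.strip() == name:
            return value.strip().strip('"').strip("'")
    return None


def key_looks_configured(env_text: str, name: str = "OPENAI_API_KEY") -> bool:
    """True if the key is present, non-empty, and not the "sk-..." placeholder."""
    value = read_env_value(env_text, name)
    return bool(value) and not value.endswith("...")
```

In practice the text would come from something like `Path("text2sql/docker/.env").read_text()`. The check only looks at the shape of the value; it does not verify that the key is valid with OpenAI.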
3. Install Python dependencies:
pip install openai-whisper streamlit pandas plotly sqlalchemy psycopg2-binary requests sounddevice scipy

On Windows, start everything with the one-click script:

.\start.bat

This will:
- Check that Docker is running
- Start all WrenAI Docker containers
- Start the PostgreSQL container
- Initialise the database
- Launch the dashboard at http://localhost:8501
WrenAI interface: http://localhost:3000
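Once the stack is up, a quick TCP probe can confirm that each service is accepting connections. This is a sketch with hypothetical helper names; the ports are the ones mentioned in this README (5432 for PostgreSQL, 3000 for WrenAI, 8501 for the dashboard), and an open port does not guarantee the service behind it is healthy.

```python
import socket

# Ports named in this README.
SERVICES = {"PostgreSQL": 5432, "WrenAI": 3000, "Dashboard": 8501}


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, ...
        return False


def check_services(services=SERVICES, host: str = "127.0.0.1") -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(port, host) for name, port in services.items()}
```

Calling `check_services()` returns a dict such as `{"PostgreSQL": True, ...}`, which makes it easy to see which container failed to start.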
To start the services manually instead (for example on macOS/Linux):

docker compose -f text2sql/docker/docker-compose.yaml up -d
docker start campania-pg
python databases/database_postgresql/04_init_db1.py
streamlit run dashboard/app.py

| Notebook | Description |
|---|---|
| asr/task1_speech_to_text.ipynb | Whisper transcription and ASR correction |
| text2sql/task2_text_to_sql.ipynb | Text-to-SQL with 7 query tests |
| pipeline/voice2query_pipeline.ipynb | Full end-to-end pipeline |
A quick snapshot — full details in docs/documentation.md:
- Text-to-SQL: 7/7 test queries succeeded across JOIN, GROUP BY, HAVING, subqueries, CASE, and multi-table LEFT JOIN, in both English and Italian
- ASR: Whisper `turbo` transcribed clean-audio queries correctly; the custom edit-distance corrector fixed 4/4 simulated domain-specific error types
- Average end-to-end response time: ~20–25 seconds per query (WrenAI + GPT-4o-mini)
English and Italian — these are the two languages the pipeline has been built and tested for, including the ASR correction module's number-word mapping and the domain query test set.
Whisper's underlying `turbo` model supports 99+ languages out of the box, so the pipeline could in principle be extended to other languages. That would require updating the ASR correction module and re-validating accuracy for each new language; it hasn't been tested here.
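A transcription call restricted to the two tested languages might look like the sketch below. The `transcribe` wrapper and `SUPPORTED_LANGUAGES` set are hypothetical; `whisper.load_model` and `model.transcribe` come from the `openai-whisper` package, which is imported lazily so the language check can run even where Whisper is not installed.

```python
SUPPORTED_LANGUAGES = {"en", "it"}  # the two languages tested in this project


def transcribe(audio_path: str, language: str, model_name: str = "turbo") -> str:
    """Transcribe an audio file, refusing languages the pipeline was not tested on."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(
            f"language {language!r} is untested; expected one of "
            f"{sorted(SUPPORTED_LANGUAGES)}"
        )
    import whisper  # lazy import; provided by the openai-whisper package

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path, language=language)
    return result["text"].strip()
```

Extending the pipeline to another language would mean adding its code to the set only after the ASR corrector and the test queries have been validated for it.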
The docs/ folder contains:
- `documentation.md`: full system architecture, methodology, LLM/DBMS selection process, results, and conclusions
- `related-works.md`: analysis of the 8 papers that informed the project's design decisions
- `WrenAI.md`: in-depth look at WrenAI's architecture, MDL, and known limitations
- Requires an active OpenAI API key and incurs a small per-query cost (~$0.002)
- Average response time of 20–25 seconds is fine for a demo, not for production-grade interactivity
- The ASR corrector relies on a fixed keyword list and doesn't generalise beyond the Campania tourism vocabulary
- `start.bat` only supports Windows out of the box
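For a rough sense of scale, the per-query figures above can be multiplied out. The helper below is a back-of-the-envelope sketch that assumes queries run one after another and that the approximate cost (~$0.002) and latency (20–25 s) hold for every query.

```python
COST_PER_QUERY_USD = 0.002   # approximate figure quoted above
LATENCY_RANGE_S = (20, 25)   # approximate seconds per query quoted above


def estimate(n_queries: int):
    """Return (total_cost_usd, min_seconds, max_seconds) for sequential queries."""
    cost = n_queries * COST_PER_QUERY_USD
    low = n_queries * LATENCY_RANGE_S[0]
    high = n_queries * LATENCY_RANGE_S[1]
    return cost, low, high
```

For example, `estimate(500)` works out to about $1 in API cost and roughly 2.8 to 3.5 hours of total waiting time.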
Built by Isabella Di Lorenzi and Maria Pasconcino for the Data Management final project, University of Naples Federico II (2026).
Thanks to the authors of the papers referenced in docs/related-works.md,
whose work on cascaded and end-to-end Speech-to-SQL pipelines, ASR error
correction, and multilingual NLIDBs shaped some design decisions in this project.
This project is released under the MIT License — feel free to use, adapt, or build on it for your own work.