Voice2Query

A voice-driven natural language interface for querying a tourism database. Ask a question in English or Italian — by voice or text — and get back a SQL query, a results table, and a chart. No SQL knowledge required.

Built for the Data Management course at the University of Naples Federico II (2026).

Demo

Voice query — ▶ Watch the demo

Text query — ▶ Watch the demo

Both demos follow the same pipeline: the question (spoken or typed) is transcribed if needed, corrected for ASR errors, translated into SQL by WrenAI, run against campania_tourism, and rendered as a table and a chart.

Architecture

Voice / Text
    |
    v
Whisper (ASR)
    |
    v
ASR Error Correction
    |
    v
WrenAI (Text-to-SQL) — GPT-4o-mini
    |
    v
PostgreSQL (campania_tourism)
    |
    v
Results + Chart
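
The diagram can also be read as a chain of functions. The sketch below is only an illustration of that flow, not the project's code: `run_pipeline`, `PipelineResult`, and the four stage callables are hypothetical names, and the stages are passed in as plain functions so that Whisper, the corrector, WrenAI, and PostgreSQL can each be stubbed out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class PipelineResult:
    """Outputs collected along the pipeline (hypothetical container)."""
    question: str
    sql: str
    rows: List[tuple] = field(default_factory=list)


def run_pipeline(
    text: Optional[str],
    audio_path: Optional[str],
    transcribe: Callable[[str], str],
    correct: Callable[[str], str],
    to_sql: Callable[[str], str],
    execute: Callable[[str], List[tuple]],
) -> PipelineResult:
    """Run the stages in the order shown in the diagram."""
    if text is None:
        if audio_path is None:
            raise ValueError("provide either text or audio_path")
        text = transcribe(audio_path)   # ASR stage, only needed for voice input
    question = correct(text)            # ASR error correction
    sql = to_sql(question)              # Text-to-SQL stage
    rows = execute(sql)                 # database stage
    return PipelineResult(question=question, sql=sql, rows=rows)


# Stubbed run: every stage is a placeholder, no model or database is called.
demo = run_pipeline(
    text="  how many hotels are there  ",
    audio_path=None,
    transcribe=lambda path: "",
    correct=str.strip,
    to_sql=lambda q: "SELECT COUNT(*) FROM hotels",
    execute=lambda sql: [(19,)],
)
```

Passing the stages as arguments keeps the control flow testable without a running WrenAI or PostgreSQL instance; in the real pipeline each callable would wrap the corresponding component.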

Project Structure

voice2query/
├── asr/                          — Task 1: Speech-to-Text (Whisper)
│   ├── task1_speech_to_text.ipynb
│   └── README.md
├── databases/                    — PostgreSQL schema and data
│   └── database_postgresql/
│       ├── 01_schema1.sql
│       ├── 02_seed_data1.sql
│       ├── 03_queries1.sql
│       ├── 04_init_db1.py
│       └── README.md
├── text2sql/                     — Task 2: Text-to-SQL (WrenAI)
│   ├── docker/
│   │   ├── docker-compose.yaml
│   │   ├── config.yaml
│   │   ├── .env.example
│   │   └── README.md
│   └── task2_text_to_sql.ipynb
├── pipeline/                     — Full end-to-end pipeline
│   ├── voice2query_pipeline.ipynb
│   └── README.md
├── dashboard/                    — Streamlit web interface
│   ├── app.py
│   └── README.md
├── docs/                         — Architecture and design documentation
├── start.bat                     — One-click startup script (Windows)
└── README.md

Database

The campania_tourism PostgreSQL database covers the main tourist destinations of the Campania region, including cities, attractions, hotels, restaurants, events, users, bookings and reviews.

Table	Rows
cities	10
attractions	20
hotels	19
restaurants	15
events	12
users	10
bookings	13
reviews	15
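
After initialising the database, the row counts in the table above can be used as a quick sanity check. The following is a minimal sketch, assuming a DB-API connection (for example from `psycopg2`); `EXPECTED_ROWS`, `fetch_counts`, and `find_mismatches` are hypothetical helper names and are not part of the repository.

```python
# Row counts copied from the table above.
EXPECTED_ROWS = {
    "cities": 10,
    "attractions": 20,
    "hotels": 19,
    "restaurants": 15,
    "events": 12,
    "users": 10,
    "bookings": 13,
    "reviews": 15,
}


def fetch_counts(conn):
    """Count the rows of each expected table through a DB-API connection."""
    counts = {}
    cur = conn.cursor()
    for table in EXPECTED_ROWS:
        # Table names come from the fixed dict above, never from user input.
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        counts[table] = cur.fetchone()[0]
    cur.close()
    return counts


def find_mismatches(actual):
    """Return {table: (expected, actual)} for every table whose count differs."""
    return {
        table: (expected, actual.get(table))
        for table, expected in EXPECTED_ROWS.items()
        if actual.get(table) != expected
    }
```

With the container from the setup steps running, a connection could be opened with `psycopg2.connect(host="localhost", port=5432, user="postgres", password="postgres", dbname="campania_tourism")`, assuming the init script creates the database under that name. An empty result from `find_mismatches(fetch_counts(conn))` means the seed data loaded as documented.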

Technology Stack

Component	Technology
Speech-to-Text	OpenAI Whisper (turbo)
ASR Correction	Custom edit-distance module
Text-to-SQL	WrenAI 0.29
LLM	GPT-4o-mini (OpenAI API)
Embeddings	text-embedding-3-small
Vector store	Qdrant
Database	PostgreSQL 16
Web interface	Streamlit
Visualisation	Plotly

How to Run

Prerequisites

Docker Desktop 4.17+
Python 3.11+
An OpenAI API key (platform.openai.com) — each user needs their own key

Note: start.bat is Windows-only. On macOS/Linux, run the Docker and database steps manually using the commands below — there is no start.sh equivalent yet (contributions welcome).

First-time setup

1. Create the PostgreSQL container (only once per machine):

docker run --name campania-pg -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:16

⚠️ postgres/postgres is a local development default only — don't reuse it for anything beyond running this project on your own machine.

2. Configure your OpenAI API key:

cp text2sql/docker/.env.example text2sql/docker/.env

Then open text2sql/docker/.env and fill in the key: OPENAI_API_KEY=sk-...

3. Install Python dependencies:

pip install openai-whisper streamlit pandas plotly sqlalchemy psycopg2-binary requests sounddevice scipy

Start everything (Windows)

.\start.bat

This will:

1. Check that Docker is running
2. Start all WrenAI Docker containers
3. Start the PostgreSQL container
4. Initialise the database
5. Launch the dashboard at http://localhost:8501

WrenAI interface: http://localhost:3000

Start everything (macOS/Linux)

docker compose -f text2sql/docker/docker-compose.yaml up -d
docker start campania-pg
python databases/database_postgresql/04_init_db1.py
streamlit run dashboard/app.py
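
Since there is no start.sh yet, the four commands above could also be wrapped in a small cross-platform Python launcher. This is a sketch under assumptions, not a file in the repository: `STARTUP_COMMANDS` and `start_all` are hypothetical names, the script is assumed to run from the repository root, and `python -m streamlit` is used in place of the `streamlit` executable so the current interpreter is reused.

```python
import subprocess
import sys

# Same four steps as the manual commands above, in the same order.
STARTUP_COMMANDS = [
    ["docker", "compose", "-f", "text2sql/docker/docker-compose.yaml", "up", "-d"],
    ["docker", "start", "campania-pg"],
    [sys.executable, "databases/database_postgresql/04_init_db1.py"],
    [sys.executable, "-m", "streamlit", "run", "dashboard/app.py"],
]


def start_all(commands=STARTUP_COMMANDS, runner=subprocess.run):
    """Run each step in order; check=True aborts on the first failing step."""
    for command in commands:
        runner(command, check=True)
```

Calling `start_all()` runs the steps sequentially and raises `subprocess.CalledProcessError` if one of them exits with a non-zero status. The sketch does not wait for PostgreSQL to accept connections before the init script runs, so a short retry or delay may be needed on slower machines.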

Notebooks

Notebook	Description
`asr/task1_speech_to_text.ipynb`	Whisper transcription and ASR correction
`text2sql/task2_text_to_sql.ipynb`	Text-to-SQL with 7 query tests
`pipeline/voice2query_pipeline.ipynb`	Full end-to-end pipeline

Results

A quick snapshot — full details in docs/documentation.md:

Text-to-SQL: 7/7 test queries succeeded across JOIN, GROUP BY, HAVING, subqueries, CASE, and multi-table LEFT JOIN, in both English and Italian
ASR: Whisper turbo transcribed clean-audio queries correctly; the custom edit-distance corrector fixed 4/4 simulated domain-specific error types
Average end-to-end response time: ~20–25 seconds per query (WrenAI + GPT-4o-mini)
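
To illustrate the kind of edit-distance correction mentioned above, here is a minimal sketch that snaps each transcribed word to the closest entry of a keyword list. It is not the project's module: the vocabulary, the distance threshold, the minimum word length, and the function names are all assumptions chosen for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Hypothetical domain keywords; the real list lives in the project's corrector.
VOCABULARY = ["napoli", "pompei", "sorrento", "amalfi", "hotels", "restaurants"]


def correct_token(token, vocabulary=VOCABULARY, max_distance=2):
    """Replace a token with the nearest keyword if it is close enough."""
    lowered = token.lower()
    if len(lowered) < 4:  # short words are too ambiguous to correct safely
        return token
    best = min(vocabulary, key=lambda word: edit_distance(lowered, word))
    return best if edit_distance(lowered, best) <= max_distance else token


def correct_transcript(text):
    """Apply token-level correction to a whitespace-separated transcript."""
    return " ".join(correct_token(token) for token in text.split())
```

For example, `correct_transcript("show hotels in sorento")` returns `"show hotels in sorrento"`: only the misspelled place name is within the threshold of a keyword, so the other words pass through unchanged.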

Languages Supported

English and Italian — these are the two languages the pipeline has been built and tested for, including the ASR correction module's number-word mapping and the domain query test set.

Whisper's turbo model is multilingual and covers roughly 99 languages, so the pipeline could in principle be extended beyond English and Italian. Doing so would require updating the ASR correction module and re-validating accuracy for each new language; this has not been tested here.

Documentation

The docs/ folder contains:

documentation.md — full system architecture, methodology, LLM/DBMS selection process, results, and conclusions
related-works.md — analysis of the 8 papers that informed the project's design decisions
WrenAI.md — in-depth look at WrenAI's architecture, MDL, and known limitations

Limitations

Requires an active OpenAI API key and incurs a small per-query cost (~$0.002)
Average response time of 20–25 seconds is acceptable for a demo but too slow for interactive production use
The ASR corrector relies on a fixed keyword list and doesn't generalise beyond the Campania tourism vocabulary
start.bat only supports Windows out of the box
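
To put the cost and latency figures above in perspective, the arithmetic for a batch of queries can be sketched as follows. The per-query cost and the latency range are the approximate values quoted in this README, queries are assumed to run one after another, and `estimate_session` is a hypothetical helper.

```python
COST_PER_QUERY_USD = 0.002   # approximate figure quoted above
LATENCY_RANGE_S = (20, 25)   # approximate end-to-end seconds per query


def estimate_session(n_queries):
    """Estimate total API cost and waiting time for sequential queries."""
    low, high = LATENCY_RANGE_S
    return {
        "cost_usd": round(n_queries * COST_PER_QUERY_USD, 4),
        "min_seconds": n_queries * low,
        "max_seconds": n_queries * high,
    }
```

Under these assumptions, 50 queries would cost about $0.10 and take between 1000 and 1250 seconds in total, which is why latency rather than price is the main constraint.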

Authors & Acknowledgments

Built by Isabella Di Lorenzi and Maria Pasconcino for the Data Management final project, University of Naples Federico II (2026).

Thanks to the authors of the papers referenced in docs/related-works.md, whose work on cascaded and end-to-end Speech-to-SQL pipelines, ASR error correction, and multilingual NLIDBs shaped some design decisions in this project.

License

This project is released under the MIT License — feel free to use, adapt, or build on it for your own work.
