VoiceIQ is an industry-grade, production-ready Speech Emotion Recognition (SER) platform. It leverages Deep Learning (1D CNNs, LSTMs, and Hybrid architectures) and advanced Audio Signal Processing to accurately detect human emotions from raw speech audio.
Designed with a futuristic, glassmorphism-themed Streamlit frontend, VoiceIQ provides a premium SaaS-like experience.
- Real-Time Inference: Predict emotions live using microphone input or by uploading audio files.
- Deep Learning Architectures: Includes CNN, LSTM, and Hybrid CNN-LSTM models.
- Explainable AI (XAI): Visualizes feature importance and acoustic contributions.
- Advanced Audio Visualizations: Interactive Plotly-based Waveforms, Mel Spectrograms, and Radar Charts.
- Automated PDF Reports: Generates downloadable, professional analysis summaries.
- Premium UI/UX: Responsive, dark-mode, glassmorphism dashboard built with custom CSS.
Audio Input (Mic/File) → Preprocessing (Noise Reduction, Silence Trimming)
→ Feature Extraction (MFCC, Chroma, Mel, Tonnetz) → Scaler
→ Deep Learning Model (CNN/LSTM) → Output Probabilities
→ UI Rendering & PDF Report Generation
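The first and last stages of the pipeline above can be illustrated with a small, dependency-free sketch. This is not the project's actual code: the amplitude threshold, the function names, and the three-label set are assumptions chosen for illustration, and the real app presumably relies on an audio library such as librosa for preprocessing.

```python
import math

def trim_silence(samples, threshold=0.02):
    """Drop leading and trailing samples whose absolute amplitude is below threshold.

    Quiet samples between loud ones are kept. The threshold value is an
    assumption for illustration, not a setting taken from VoiceIQ.
    """
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return []
    return samples[loud[0]:loud[-1] + 1]

def softmax(logits):
    """Convert raw model scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def rank_emotions(logits, labels):
    """Pair each label with its probability, most likely first."""
    probs = softmax(logits)
    return sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)

if __name__ == "__main__":
    # Hypothetical label set and scores, for demonstration only.
    for label, prob in rank_emotions([0.1, 2.0, -1.0], ["neutral", "happy", "sad"]):
        print(f"{label}: {prob:.2f}")
```

Trimming only the edges keeps pauses inside the utterance, which may themselves carry emotional cues.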
- Clone the repository:
    git clone https://github.com/yourusername/AI_Speech_Emotion_Recognition.git
    cd AI_Speech_Emotion_Recognition
- Create a virtual environment:
    python -m venv venv
    source venv/bin/activate  # On Windows use: venv\Scripts\activate
- Install dependencies:
    pip install -r requirements.txt
- Prepare Datasets (Optional for training):
  - Download RAVDESS, TESS, and EMO-DB datasets.
  - Extract them into the respective folders inside datasets/.
- Run the Application:
    streamlit run app.py
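If the app fails to start, a quick import check can help tell a missing dependency apart from an application error. The sketch below is only an illustration: the package names are assumptions based on the tools this README mentions (Streamlit, Plotly, librosa), and requirements.txt remains the authoritative list.

```python
from importlib.util import find_spec

# Assumed top-level package names, based on tools named in this README.
# The deep learning framework is not named here, so it is not checked.
EXPECTED_PACKAGES = ["streamlit", "plotly", "librosa"]

def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_packages(EXPECTED_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All expected packages were found.")
```

Run it inside the activated virtual environment so that it inspects the same interpreter that Streamlit will use.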
The platform is built to work with multiple speech emotion datasets:
- RAVDESS: Ryerson Audio-Visual Database of Emotional Speech and Song
- TESS: Toronto Emotional Speech Set
- EMO-DB: Berlin Database of Emotional Speech
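Each of these datasets encodes the emotion label in the file name, using a different convention. The sketch below shows one way to map all three onto a shared label set. The function name, the dataset keys, and the unified label names are assumptions for illustration, and the project's own loader may work differently.

```python
from pathlib import Path

# RAVDESS: seven dash-separated fields; the third field is the emotion code.
RAVDESS_EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fear", "07": "disgust", "08": "surprise",
}
# TESS: speaker_word_emotion; "ps" stands for pleasant surprise.
TESS_EMOTIONS = {
    "angry": "angry", "disgust": "disgust", "fear": "fear", "happy": "happy",
    "neutral": "neutral", "ps": "surprise", "sad": "sad",
}
# EMO-DB: the sixth character is a German emotion initial.
EMODB_EMOTIONS = {
    "W": "angry", "L": "boredom", "E": "disgust", "A": "fear",
    "F": "happy", "T": "sad", "N": "neutral",
}

def label_from_filename(filename, dataset):
    """Return a unified emotion label for a file from the given dataset."""
    stem = Path(filename).stem
    if dataset == "ravdess":
        return RAVDESS_EMOTIONS[stem.split("-")[2]]
    if dataset == "tess":
        return TESS_EMOTIONS[stem.split("_")[-1].lower()]
    if dataset == "emodb":
        return EMODB_EMOTIONS[stem[5]]
    raise ValueError(f"Unknown dataset: {dataset}")

if __name__ == "__main__":
    print(label_from_filename("03-01-06-01-02-01-12.wav", "ravdess"))
    print(label_from_filename("OAF_back_angry.wav", "tess"))
    print(label_from_filename("03a01Fa.wav", "emodb"))
```

The label sets only partly overlap ("calm" appears only in RAVDESS and "boredom" only in EMO-DB), so a combined training set needs a decision on whether to drop or merge those classes.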
- Push this repository to GitHub.
- Log into Streamlit Cloud or HuggingFace Spaces.
- Connect your repository and select app.py as the entry point.
- Add packages.txt if system dependencies (like libsndfile1) are required for librosa in Linux environments.
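To judge whether the packages.txt step is needed, one rough check is to ask the Python runtime whether a system-wide libsndfile is visible. This is a sketch under assumptions: the helper name is hypothetical, and a negative result does not prove audio loading will fail, since some Python wheels bundle their own copy of the library.

```python
from ctypes.util import find_library

def system_library_available(name):
    """Return True if the platform's library lookup can locate the named shared library."""
    return find_library(name) is not None

if __name__ == "__main__":
    if system_library_available("sndfile"):
        print("libsndfile found on this system.")
    else:
        print("libsndfile not found; consider listing libsndfile1 in packages.txt.")
```

On Streamlit Cloud, packages.txt lists one apt package name per line, so in this case it would contain a single line reading libsndfile1.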
Developed by an elite AI Research Engineer for the CodeAlpha Internship Portfolio.