Nightcoder-26/CodeAlpha_Emotionprediction

VoiceIQ - AI Speech Emotion Recognition Platform

📖 Project Overview

VoiceIQ is a Speech Emotion Recognition (SER) platform. It combines deep learning models (1D CNNs, LSTMs, and hybrid CNN-LSTM architectures) with audio signal processing to classify human emotions from raw speech audio.

The frontend is built with Streamlit and uses a glassmorphism theme, aiming for a polished, SaaS-like experience.

✨ Key Features

  • Real-Time Inference: Predict emotions live using microphone input or by uploading audio files.
  • Deep Learning Architectures: Includes CNN, LSTM, and Hybrid CNN-LSTM models.
  • Explainable AI (XAI): Visualizes feature importance and acoustic contributions.
  • Advanced Audio Visualizations: Interactive Plotly-based Waveforms, Mel Spectrograms, and Radar Charts.
  • Automated PDF Reports: Generates downloadable, professional analysis summaries.
  • Premium UI/UX: Responsive, dark-mode, glassmorphism dashboard built with custom CSS.

🧠 System Architecture

Audio Input (Mic/File) → Preprocessing (Noise Reduction, Silence Trimming) 
→ Feature Extraction (MFCC, Chroma, Mel, Tonnetz) → Scaler 
→ Deep Learning Model (CNN/LSTM) → Output Probabilities 
→ UI Rendering & PDF Report Generation
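
The stages around the model can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's implementation: the function names, the `EMOTIONS` label list, and the silence threshold are all hypothetical, and feature extraction and the model itself are left out because they depend on the project's own code and trained weights.

```python
import math

# Hypothetical label set; the project's actual classes are not listed in this README.
EMOTIONS = ["angry", "happy", "neutral", "sad"]


def trim_silence(samples, threshold=0.02):
    """Drop leading and trailing samples whose absolute amplitude is below threshold."""
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return []
    return samples[loud[0]:loud[-1] + 1]


def scale(features, means, stds):
    """Standardize each feature with statistics computed on the training set."""
    # `s or 1.0` avoids dividing by zero for a constant feature.
    return [(x - m) / (s or 1.0) for x, m, s in zip(features, means, stds)]


def softmax(logits):
    """Convert raw model scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_emotion(logits, labels=EMOTIONS):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]
```

In this sketch, `top_emotion` would receive the raw scores produced by whichever model (CNN, LSTM, or hybrid) is selected, and the full probability list from `softmax` is what a UI chart or report would display.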

🛠️ Installation & Setup

  1. Clone the repository:

    git clone https://github.com/Nightcoder-26/CodeAlpha_Emotionprediction.git
    cd CodeAlpha_Emotionprediction
  2. Create a virtual environment:

    python -m venv venv
    source venv/bin/activate  # On Windows use: venv\Scripts\activate
  3. Install dependencies:

    pip install -r requirements.txt
  4. Prepare Datasets (Optional for training):

    • Download RAVDESS, TESS, and EMO-DB datasets.
    • Extract them into the respective folders inside datasets/.
  5. Run the Application:

    streamlit run app.py

📊 Datasets Supported

The platform is built to handle multiple audio datasets seamlessly:

  • RAVDESS: Ryerson Audio-Visual Database of Emotional Speech and Song
  • TESS: Toronto Emotional Speech Set
  • EMO-DB: Berlin Database of Emotional Speech
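
Each of these datasets encodes the emotion label in the audio file name, so a loader has to parse three different naming schemes. The sketch below shows one way to do that. The function name `label_from_filename`, the dataset keys, and the unified label vocabulary are assumptions made for illustration; the project's own loader may differ.

```python
from pathlib import Path

# RAVDESS: the third dash-separated field of the file name is the emotion code.
RAVDESS_EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}

# EMO-DB: the sixth character is the initial of the German emotion word.
EMODB_EMOTIONS = {
    "W": "angry", "L": "boredom", "E": "disgust", "A": "fearful",
    "F": "happy", "T": "sad", "N": "neutral",
}

# TESS: the last underscore-separated token names the emotion.
# These aliases map TESS spellings onto the (assumed) shared vocabulary.
TESS_ALIASES = {"ps": "surprised", "fear": "fearful"}


def label_from_filename(path, dataset):
    """Return an emotion label parsed from a dataset-specific file name."""
    stem = Path(path).stem
    if dataset == "ravdess":
        return RAVDESS_EMOTIONS[stem.split("-")[2]]
    if dataset == "tess":
        token = stem.split("_")[-1].lower()
        return TESS_ALIASES.get(token, token)
    if dataset == "emodb":
        return EMODB_EMOTIONS[stem[5]]
    raise ValueError(f"unknown dataset: {dataset}")
```

For example, `label_from_filename("03-01-06-01-02-01-12.wav", "ravdess")` yields `"fearful"`, and `label_from_filename("03a01Fa.wav", "emodb")` yields `"happy"`.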

🚀 Deployment (Streamlit Cloud / HuggingFace Spaces)

  1. Push this repository to GitHub.
  2. Log into Streamlit Cloud or HuggingFace Spaces.
  3. Connect your repository and select app.py as the entry point.
  4. Add packages.txt if system dependencies (like libsndfile1) are required for librosa in Linux environments.
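
To decide whether step 4 is needed, a quick check can be run in the target environment. The sketch below uses only the standard library; the helper name `libsndfile_available` is hypothetical. Treat the result as a hint rather than proof, since some builds of the `soundfile` package ship their own copy of the library and work even when the system loader cannot find one.

```python
import ctypes.util


def libsndfile_available():
    """Return True if the system's dynamic loader can locate libsndfile."""
    return ctypes.util.find_library("sndfile") is not None


if __name__ == "__main__":
    if libsndfile_available():
        print("libsndfile found on this system")
    else:
        print("libsndfile not found; consider listing libsndfile1 in packages.txt")
```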

👨‍💻 Author

Developed by an elite AI Research Engineer for the CodeAlpha Internship Portfolio.
