VisionMATE is a web application that assists visually impaired students through object detection, text-to-speech, speech-to-text, and voice-controlled task management.
- Problem Statement
- Solution
- Features
- System Architecture
- Technology Stack
- Prerequisites
- Installation
- Usage
- Contributors
Problem Statement:
Visually impaired students face significant challenges in accessing educational materials, managing tasks, and interacting with their environment. Traditional learning materials and tools often lack accessibility features, creating barriers to education and daily activities.
Solution:
VisionMATE provides an integrated platform with multiple assistive features:
- Real-time object detection for environmental awareness
- Text-to-Speech conversion for reading documents
- Speech-to-Text for note-taking and communication
- Voice-controlled navigation
- Task management system with voice commands
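As a rough illustration of the text-to-speech item above, the sketch below splits a document into sentence-sized chunks and queues each one for speech. The function names `splitIntoSentences` and `readAloud` are hypothetical and not taken from the VisionMATE source. In a browser, `synth` would be `window.speechSynthesis` and `Utterance` would be `window.SpeechSynthesisUtterance` from the Web Speech API; they are passed in as parameters so the splitting logic can also run outside a browser.

```javascript
// Split text into sentence-sized chunks so each can be spoken separately.
// Note: this simple pattern also splits on decimal points such as "3.14".
function splitIntoSentences(text) {
  // Match runs ending in ., ! or ?, plus any trailing fragment without punctuation.
  const matches = text.match(/[^.!?]+[.!?]+|[^.!?]+$/g) || [];
  return matches.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Queue every sentence on a SpeechSynthesis-like object.
// Assumption: `synth.speak(utterance)` behaves like window.speechSynthesis.speak.
function readAloud(text, synth, Utterance) {
  const sentences = splitIntoSentences(text);
  for (const sentence of sentences) {
    synth.speak(new Utterance(sentence));
  }
  return sentences.length; // number of utterances queued
}
```

In a browser this might be called as `readAloud(documentText, window.speechSynthesis, window.SpeechSynthesisUtterance)`.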
Features:
- Object Detection using TensorFlow.js and the COCO-SSD model
- Text-to-Speech functionality for document reading
- Speech-to-Text for live captioning
- Voice-controlled navigation throughout the application
- Interactive To-Do list with voice commands
- User authentication system
- Responsive and accessible interface
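To make the object detection feature more concrete, here is a minimal sketch of how COCO-SSD predictions could be turned into a sentence suitable for speech output. It assumes predictions have the shape returned by the COCO-SSD model's `detect` method (`class`, `score`, `bbox`). The function names and the 0.6 score threshold are illustrative choices, not values from the VisionMATE source.

```javascript
// Turn COCO-SSD style predictions into a short sentence.
// Each prediction is assumed to look like:
//   { class: "person", score: 0.91, bbox: [x, y, width, height] }
function describeDetections(predictions, minScore = 0.6) {
  const counts = new Map();
  for (const p of predictions) {
    if (p.score < minScore) continue; // skip low-confidence detections
    counts.set(p.class, (counts.get(p.class) || 0) + 1);
  }
  if (counts.size === 0) return "No objects detected.";
  // No pluralization is attempted; labels are reported as the model names them.
  const parts = [...counts.entries()].map(([name, n]) => `${n} ${name}`);
  return `Detected: ${parts.join(", ")}.`;
}

// Assumption: `model` came from cocoSsd.load() (@tensorflow-models/coco-ssd)
// and `source` is a <video>, <img>, or <canvas> element with the webcam feed.
async function detectAndDescribe(model, source, minScore = 0.6) {
  const predictions = await model.detect(source);
  return describeDetections(predictions, minScore);
}
```

The returned string could then be handed to a text-to-speech call so the user hears what is in front of the camera.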
System Architecture:
- Frontend Layer:
  - React.js based user interface
  - TensorFlow.js for object detection
  - Web Speech API integration
- Backend Layer:
  - Node.js/Express server
  - MongoDB database
  - RESTful API architecture
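The split between the two layers can be sketched as a small client that the React frontend might use to call the Express REST API. Everything specific here is an assumption for illustration: the `/api/todos` path, the `createApiClient` name, and the use of an `Authorization: Bearer` header are not confirmed by the VisionMATE source.

```javascript
// Build a small client for the backend REST API.
// Assumptions (hypothetical): to-do items live at /api/todos and the server
// expects the JWT in an "Authorization: Bearer <token>" header.
function createApiClient(baseUrl, getToken, fetchImpl = globalThis.fetch) {
  async function request(method, path, body) {
    const headers = { "Content-Type": "application/json" };
    const token = getToken();
    if (token) headers.Authorization = `Bearer ${token}`;
    const response = await fetchImpl(`${baseUrl}${path}`, {
      method,
      headers,
      body: body === undefined ? undefined : JSON.stringify(body),
    });
    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }
    return response.json();
  }

  return {
    listTodos: () => request("GET", "/api/todos"),
    addTodo: (title) => request("POST", "/api/todos", { title }),
  };
}
```

Passing `fetchImpl` as a parameter is a design choice that lets the client be exercised with a stub instead of a live server.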
Technology Stack:
Frontend:
- React.js
- TensorFlow.js
- Web Speech API
- Styled Components
- Tailwind CSS
- Framer Motion
Backend:
- Node.js
- Express.js
- MongoDB
- JWT Authentication
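As an illustration of how JWT authentication typically fits into an Express backend, the sketch below shows a middleware factory that rejects requests lacking a valid bearer token. The `requireAuth` name, the error messages, and the injected `verifyToken` function are hypothetical. In a real server, `verifyToken` could wrap `jwt.verify(token, secret)` from the `jsonwebtoken` package, which returns the decoded payload or throws.

```javascript
// Express-style middleware factory that rejects requests without a valid token.
// Assumption: `verifyToken(token)` returns the decoded payload or throws.
function requireAuth(verifyToken) {
  return function (req, res, next) {
    const header = req.headers.authorization || "";
    const [scheme, token] = header.split(" ");
    if (scheme !== "Bearer" || !token) {
      return res.status(401).json({ error: "Missing bearer token" });
    }
    let payload;
    try {
      payload = verifyToken(token);
    } catch (err) {
      return res.status(401).json({ error: "Invalid or expired token" });
    }
    req.user = payload; // make the decoded payload available to later handlers
    return next();
  };
}
```

A protected route might then be registered as `app.get("/api/todos", requireAuth(verify), handler)`, where the path and handler are again placeholders.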
Prerequisites:
- Node.js (v14 or higher)
- MongoDB
- Modern web browser
- Webcam
- Microphone
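The browser, webcam, and microphone requirements above can be checked at runtime. The following is a minimal sketch, assuming the features rely on the standard Web Speech API globals and on `navigator.mediaDevices.getUserMedia` for camera and microphone access. The function names are hypothetical, and `win` stands in for the browser's `window` object.

```javascript
// Report which browser capabilities the assistive features would need.
// `win` is expected to be the browser `window`; it is a parameter so the
// check can be exercised with a stub object.
function checkBrowserSupport(win) {
  const nav = win.navigator || {};
  return {
    // Speech-to-text: some browsers expose only the webkit-prefixed constructor.
    speechRecognition: Boolean(win.SpeechRecognition || win.webkitSpeechRecognition),
    // Text-to-speech.
    speechSynthesis: "speechSynthesis" in win,
    // Webcam and microphone capture both go through getUserMedia.
    mediaDevices: Boolean(
      nav.mediaDevices && typeof nav.mediaDevices.getUserMedia === "function"
    ),
  };
}

// List the names of unsupported capabilities, e.g. to warn the user at startup.
function missingFeatures(win) {
  const support = checkBrowserSupport(win);
  return Object.keys(support).filter((name) => !support[name]);
}
```

Calling `missingFeatures(window)` on page load would give a list that can be announced to the user before any feature is started.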
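Installation:
- Clone the repository and move into it:
git clone https://github.com/akshitjain16/VisionMATE.git
cd VisionMATE
- Install server dependencies (from the repository root):
cd server
npm install
- Install client dependencies (from the repository root):
cd client
npm install
- Start the server (from the repository root):
cd server
npm start
- Start the client in a second terminal (from the repository root):
cd client
npm start
Usage:
- Sign up or log in to access the application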
- Use voice commands for navigation:
- "Go to home"
- "Go to object detection"
- "Go to text to speech"
- "Go to todo list"
- Access different features through the intuitive interface
- Use voice commands or manual controls to interact with each feature
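The voice navigation commands listed above could be handled by a small parser that maps a recognized transcript to a route. This is a sketch under assumptions: the route paths and the `parseNavigationCommand` name are hypothetical, and the actual VisionMATE routes may differ. The extra "to do list" entry is an assumed alias in case a recognizer transcribes the phrase as separate words.

```javascript
// Map of spoken destinations to routes. The paths are hypothetical.
const ROUTES = {
  "home": "/",
  "object detection": "/object-detection",
  "text to speech": "/text-to-speech",
  "todo list": "/todo",
  "to do list": "/todo", // assumed alias for recognizers that split the word
};

// Return the route for a transcript such as "Go to home", or null if the
// transcript is not a recognized navigation command.
function parseNavigationCommand(transcript) {
  const normalized = transcript
    .toLowerCase()
    .replace(/[^a-z\s]/g, " ") // drop punctuation and digits
    .replace(/\s+/g, " ")
    .trim();
  const match = normalized.match(/^go to (.+)$/);
  if (!match) return null;
  const destination = match[1];
  // Own-property check avoids matching inherited keys such as "constructor".
  return Object.prototype.hasOwnProperty.call(ROUTES, destination)
    ? ROUTES[destination]
    : null;
}
```

If the app uses React Router, a non-null result could be passed to the function returned by `useNavigate`; unrecognized transcripts can simply be ignored or answered with a spoken prompt.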
License:
This project is licensed under the MIT License. See the LICENSE.md file for details.