An AI-powered real-time Arabic Sign Language detection system using YOLOv11. This project can detect Arabic sign language gestures through a camera feed and provide both visual feedback and Arabic audio pronunciation of the detected words.
- Real-time Detection: Live camera feed processing for sign language recognition
- YOLOv11 Model: State-of-the-art object detection for accurate gesture recognition
- Arabic Translation: Maps each detected sign's English label to Arabic text
- Audio Feedback: Text-to-speech functionality in Arabic for detected signs
- Web Interface: User-friendly Gradio web interface for easy interaction
- Dual Mode: Terminal-based (`app.py`) and web-based (`main.py`) applications
The model currently recognizes 13 common Arabic sign language gestures:
| English | Arabic | English | Arabic |
|---|---|---|---|
| Hello | مرحبا | Dog | كلب |
| Thanks | شكرا | Love | حب |
| Yes | نعم | Me | أنا |
| No | لا | You | أنت |
| Sorry | آسف | Mother | أم |
| Fine | بخير | Smile | ابتسامة |
| Sunday | الأحد | | |
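
The table above can be expressed as a simple lookup from detected label to Arabic text. This is a minimal sketch, not the project's actual code: the names `SIGN_TO_ARABIC` and `to_arabic` are hypothetical, and it assumes the model's class names match the English column of the table.

```python
# Hypothetical lookup table. The keys assume the model's class names match
# the English column of the table above.
SIGN_TO_ARABIC = {
    "Hello": "مرحبا",
    "Thanks": "شكرا",
    "Yes": "نعم",
    "No": "لا",
    "Sorry": "آسف",
    "Fine": "بخير",
    "Sunday": "الأحد",
    "Dog": "كلب",
    "Love": "حب",
    "Me": "أنا",
    "You": "أنت",
    "Mother": "أم",
    "Smile": "ابتسامة",
}


def to_arabic(label):
    """Return the Arabic word for a detected label, or None if it is unknown."""
    return SIGN_TO_ARABIC.get(label.strip().capitalize())
```

Returning `None` for unknown labels lets the caller skip audio playback instead of speaking an untranslated word.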
- Python 3.8 or higher
- Webcam/Camera
- CUDA-capable GPU (optional, for better performance)
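
The prerequisites above can be checked before installing anything else. The sketch below is illustrative only; `python_ok` and `cuda_available` are hypothetical helper names, and the CUDA check assumes PyTorch is the library used to detect the GPU.

```python
import sys

MIN_PYTHON = (3, 8)  # minimum version listed in the prerequisites


def python_ok(version_info=sys.version_info):
    """True if the interpreter meets the minimum version."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def cuda_available():
    """True if PyTorch is installed and can see a CUDA device."""
    try:
        import torch  # optional: only needed for the GPU check
    except ImportError:
        return False
    return torch.cuda.is_available()
```

Calling `python_ok()` and `cuda_available()` in a Python shell shows whether the interpreter is recent enough and whether GPU acceleration is likely to be used.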
- Clone the repository: `git clone https://github.com/amjadAwad95/sign-language.git`, then `cd sign-language`
- Install dependencies: `pip install -r requirements.txt`
- Run the application
  - Option A: Web Interface (Recommended): run `python main.py`, then open your browser to the displayed local URL (usually http://127.0.0.1:7860)
  - Option B: Terminal Interface: run `python app.py`, then press 'q' to quit the camera feed
- Base Model: YOLOv11s (small variant, chosen for a balance of speed and accuracy)
- Framework: Ultralytics YOLO
- Input Size: 640x640 pixels
- Format: ONNX (optimized for deployment)
- ONNX Runtime for faster inference
- CUDA acceleration when available
- Frame skipping for real-time performance
- Audio cooldown to prevent spam
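
Frame skipping and the audio cooldown are both simple rate limiters. The sketch below shows one way they might work; it is not taken from the project's code. The class names, the default of every third frame, the 2-second cooldown, and the choice of a per-word (rather than global) cooldown are all assumptions for illustration.

```python
import time


class FrameSkipper:
    """Run inference only on every `every_n`-th frame."""

    def __init__(self, every_n=3):
        if every_n < 1:
            raise ValueError("every_n must be at least 1")
        self.every_n = every_n
        self._count = 0

    def should_process(self):
        # The first frame is processed, then every_n - 1 frames are skipped.
        process = self._count % self.every_n == 0
        self._count += 1
        return process


class AudioCooldown:
    """Allow the same word to be spoken at most once per `seconds`."""

    def __init__(self, seconds=2.0, clock=time.monotonic):
        self.seconds = seconds
        self._clock = clock
        self._last_spoken = {}  # word -> time it was last allowed

    def allow(self, word):
        now = self._clock()
        last = self._last_spoken.get(word)
        if last is not None and now - last < self.seconds:
            return False  # still cooling down for this word
        self._last_spoken[word] = now
        return True
```

In a capture loop, `should_process()` would gate the model call and `allow(word)` would gate text-to-speech, so a sign held in front of the camera is announced once rather than on every frame.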
- ultralytics: YOLO model implementation
- torch & torchvision: PyTorch framework
- opencv-python: Computer vision operations
- gradio: Web interface framework
- gTTS: Google Text-to-Speech
- pygame: Audio playback
- onnxruntime-gpu: ONNX inference optimization
- Launch the web app: `python main.py`
- Allow camera permissions in your browser
- Point your camera at sign language gestures
- Watch real-time detection with Arabic translation and audio
- Run: `python app.py`
- Position yourself in front of the camera
- Perform sign language gestures
- Listen for Arabic pronunciation of detected signs
- Press 'q' to quit
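
For readers curious how a terminal mode like this is typically built, here is a minimal OpenCV capture loop that exits on 'q'. It is a sketch under assumptions, not the contents of `app.py`: the function names, the window title, and the `on_frame` hook (where detection and drawing would happen) are hypothetical.

```python
def is_quit_key(key_code):
    """True if the code returned by cv2.waitKey corresponds to 'q'."""
    return key_code != -1 and (key_code & 0xFF) == ord("q")


def run_camera_loop(camera_index=0, on_frame=None):
    """Show the camera feed until 'q' is pressed.

    `on_frame` is a hypothetical hook that takes a frame and returns the
    frame to display (for example with detections drawn on it).
    """
    import cv2  # imported lazily so the helper above works without OpenCV

    cap = cv2.VideoCapture(camera_index)
    if not cap.isOpened():
        raise RuntimeError(f"Could not open camera index {camera_index}")
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break  # camera disconnected or stream ended
            if on_frame is not None:
                frame = on_frame(frame)
            cv2.imshow("Arabic Sign Language Detection", frame)
            if is_quit_key(cv2.waitKey(1)):
                break
    finally:
        # Always free the camera and close the window, even on errors.
        cap.release()
        cv2.destroyAllWindows()
```

Masking the key code with `0xFF` keeps the comparison reliable on platforms where `cv2.waitKey` returns extra high bits.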
The model was trained using YOLOv11 on a custom Arabic sign language dataset:

    yolo detect train model=yolo11s.pt data=data.yaml epochs=60 imgsz=640 project="arabic-sl-yolov11" name="arabic-sl"

Training details:
- Epochs: 60
- Image Size: 640x640
- Base Model: YOLOv11s pre-trained weights
- Tracking: Weights & Biases integration
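
The same training run can also be started from Python through the Ultralytics API. The sketch below mirrors the arguments of the CLI command above. The ONNX export step is an assumption about how a deployable model could be produced and is not confirmed as the author's workflow; `TRAIN_ARGS` and `train_and_export` are hypothetical names.

```python
# Arguments copied from the CLI command above.
TRAIN_ARGS = {
    "data": "data.yaml",
    "epochs": 60,
    "imgsz": 640,
    "project": "arabic-sl-yolov11",
    "name": "arabic-sl",
}


def train_and_export(weights="yolo11s.pt"):
    """Fine-tune the pre-trained weights, then export the result to ONNX."""
    from ultralytics import YOLO  # lazy import: needs `pip install ultralytics`

    model = YOLO(weights)
    model.train(**TRAIN_ARGS)
    # Assumption: exporting right after training uses the trained weights,
    # as in recent Ultralytics releases. export() returns the output path.
    return model.export(format="onnx", imgsz=TRAIN_ARGS["imgsz"])
```

The exported file would then need to be placed wherever the application expects its model.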
We welcome contributions! Here's how you can help:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- 📈 Adding more sign language gestures
- 🌍 Supporting other Arabic dialects
- ⚡ Performance optimizations
- 🎨 UI/UX improvements
- 📱 Mobile app development
- Real-time Processing: 15-30 FPS depending on hardware
- Accuracy: Trained on a diverse Arabic sign language dataset
- Latency: < 100ms inference time with GPU acceleration
- Memory Usage: ~2GB with CUDA, ~1GB CPU-only
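
Because these figures depend on hardware, it can be useful to measure them on your own machine. The helper below is a rough sketch: `measure_latency` is a hypothetical name, and `infer` stands in for whatever function runs the model on one frame.

```python
import time


def measure_latency(infer, frames, warmup=1):
    """Return (mean seconds per call, calls per second) for `infer` over `frames`."""
    frames = list(frames)
    for frame in frames[:warmup]:
        infer(frame)  # warm-up runs are not timed
    timed = frames[warmup:]
    if not timed:
        raise ValueError("need more frames than warm-up runs")
    start = time.perf_counter()
    for frame in timed:
        infer(frame)
    elapsed = time.perf_counter() - start
    mean_latency = elapsed / len(timed)
    fps = float("inf") if mean_latency == 0 else 1.0 / mean_latency
    return mean_latency, fps
```

Note that the FPS value here reflects inference only. The end-to-end frame rate also includes camera capture, drawing, and any frame skipping.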
- Camera not detected
  - Ensure camera permissions are granted
  - Try changing the camera index in code (0, 1, 2...)
- Slow performance
  - Install the CUDA build of PyTorch: `pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118`
  - Reduce inference frequency in config
- Audio not working
  - Check system audio settings
  - Install/update pygame: `pip install --upgrade pygame`
- Model not found
  - Ensure `model/model.onnx` exists
  - Download a pre-trained model or train your own
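
Two of the issues above, a missing model file and a missing GPU, can be caught with a small preflight check. This is a sketch under assumptions: it supposes the model is loaded directly with ONNX Runtime, which may differ from how the project loads it, and `model_exists`, `pick_providers`, and `load_session` are hypothetical helper names.

```python
from pathlib import Path

MODEL_PATH = Path("model/model.onnx")  # path named in the troubleshooting notes


def model_exists(path=MODEL_PATH):
    """True if the ONNX model file is present."""
    return Path(path).is_file()


def pick_providers(available):
    """Prefer CUDA when ONNX Runtime reports it, and always keep a CPU fallback."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available] or ["CPUExecutionProvider"]


def load_session(path=MODEL_PATH):
    """Create an ONNX Runtime session, failing early with a clear message."""
    import onnxruntime as ort  # lazy import: needs onnxruntime or onnxruntime-gpu

    if not model_exists(path):
        raise FileNotFoundError(f"{path} not found; download or train a model first")
    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(str(path), providers=providers)
```

If `pick_providers` returns only `CPUExecutionProvider` on a machine with a GPU, the CUDA-enabled runtime is probably not installed or cannot find the CUDA libraries.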
This project is licensed under the MIT License - see the LICENSE file for details.
- Ultralytics for the amazing YOLO framework
- Arabic Sign Language community for datasets and guidance
- Google Text-to-Speech for Arabic audio generation
- Gradio team for the excellent web interface framework
Made with ❤️ for the Arabic Sign Language community