A real-time vehicle detection and matching system built on the WeChat Miniprogram platform with YOLOX + CLIP, powered by WeChat Cloud Development.
- Reference Image Upload - Upload a target vehicle photo and set matching thresholds
- Real-Time Camera Monitoring - Live camera feed with AI-powered vehicle detection (~10s/frame)
- Smart Matching - Composite scoring: image similarity (70%) + color consistency (20%) + brand consistency (10%)
- Detection Reports - View all matches with timestamps, confidence scores, and screenshots
- Task History - Browse and manage past detection tasks
- Local Video Analysis - Batch analyze video files via web interface with higher accuracy model (~25s/frame)
- Scene Enhancement - Automatic image preprocessing for night, fog, rain, and motion blur conditions
- Smart Deduplication - Feature similarity + position tracking + 60s cooldown to avoid duplicate captures
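The composite score above can be sketched as a simple weighted sum. This is an illustrative reconstruction from the weights listed in the feature list (70% image similarity, 20% color, 10% brand), not the project's actual implementation; the function and parameter names are assumptions.

```python
def composite_score(image_sim: float, color_sim: float, brand_sim: float) -> float:
    """Combine per-signal similarities (each in [0, 1]) using the README's weights."""
    return 0.7 * image_sim + 0.2 * color_sim + 0.1 * brand_sim

def is_match(score: float, threshold: float = 0.55) -> bool:
    """A detection counts as a match once the score clears the user-set threshold."""
    return score >= threshold
```

With the recommended 55% threshold, a crop with strong image similarity but mismatched color can still fall short, which is the point of the weighting.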
+---------------------+ +----------------------+
| WeChat Miniprogram |---->| AI Engine (Python) |
| Real-time Camera | | YOLOX + CLIP |
| Detection Records | | Flask Server :5000 |
+---------------------+ +----------+-----------+
| |
v v
+---------------------+ +----------------------+
| WeChat Cloud Dev | | Web Video Analysis |
| Cloud DB + Storage | | localhost:5000/video |
| Cloud Functions | | |
+---------------------+ +----------------------+
| Component | Technology | License |
|---|---|---|
| Vehicle Detection | YOLOX-S | Apache 2.0 |
| Feature Matching | CLIP ViT-B/32 + ViT-L/14 | MIT |
| AI Server | Python Flask | |
| Frontend | WeChat Miniprogram (Cloud Dev) | |
| Database | WeChat Cloud Database | |
| Tunnel | natapp | |
| Scenario | Model | Speed | Accuracy |
|---|---|---|---|
| Miniprogram real-time | ViT-B/32 | ~10s/frame | Medium |
| Video file analysis | ViT-L/14 | ~25s/frame | High |
| Reference feature extraction | ViT-L/14 | One-time | High |
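The matching step behind both CLIP variants is conceptually the same: the reference photo and each detected vehicle crop are embedded as feature vectors, and cosine similarity scores how alike they are. The pure-Python helper below is a minimal sketch of that idea, not the project's code (which uses the CLIP models and PyTorch tensors listed above).

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```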
- Python 3.10+
- Node.js
- WeChat DevTools
- natapp (for tunnel forwarding)
```
pip install flask torch torchvision Pillow numpy scikit-learn
pip install opencv-python loguru psutil thop tabulate pycocotools
pip install openai-clip
```

Then start the AI engine:

```
cd server/ai_engine
python detection_server.py
```

On first launch, models will be downloaded automatically:
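Once the server is running, a quick reachability check can confirm that port 5000 is serving. This is a sketch that assumes the `/video` page mentioned later in this README; adapt the URL if your setup differs.

```python
from urllib.request import urlopen

def server_is_up(url: str = "http://localhost:5000/video") -> bool:
    """Return True if the AI engine responds with HTTP 200 at the given URL."""
    try:
        with urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, DNS failure, or timeout: server not reachable.
        return False
```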
- YOLOX-S (~80MB)
- CLIP ViT-B/32 (~350MB)
- CLIP ViT-L/14 (~890MB)
Download natapp from https://natapp.cn/, set up a tunnel pointing to port 5000, then update the tunnel URL in:
- `miniprogram/utils/cloud.js` - line 3: `AI_ENGINE_URL`
- `cloudfunctions/uploadReference/index.js` - line 9: `AI_ENGINE_URL`
- `cloudfunctions/analyzeFrame/index.js` - line 10: `AI_ENGINE_URL`
- Fill in your AppID in `project.config.json`
- Fill in your Cloud Environment ID in `miniprogram/app.js`
- Open the project in WeChat DevTools
- Enable Cloud Development and initialize the environment
- Create database collections: `tasks` and `detections`
- Deploy all cloud functions: right-click each folder under `cloudfunctions/` and select "Upload and Deploy: Install Dependencies in Cloud"
- In local settings, check "Do not verify valid domain names"
In Cloud Development Console > Database, set each collection's permission to:
All users can read and write
├── project.config.json # WeChat DevTools project config
├── miniprogram/
│ ├── app.js # Entry point (cloud init)
│ ├── app.json # Page routing + TabBar config
│ ├── app.wxss # Global styles
│ ├── utils/
│ │ └── cloud.js # AI engine communication + cloud DB ops
│ └── pages/
│ ├── index/ # Home: upload reference image + start detection
│ ├── camera/ # Monitoring: camera frames + AI analysis
│ ├── result/ # Results: detection report + sharing
│ └── gallery/ # History: task management
├── cloudfunctions/
│ ├── login/ # User authentication
│ ├── uploadReference/ # Upload reference image
│ ├── analyzeFrame/ # Analyze single frame
│ ├── getResults/ # Query detection results
│ ├── getRecentTasks/ # Fetch recent tasks
│ └── deleteTask/ # Delete task records
├── server/
│ └── ai_engine/
│ ├── detection_server.py # AI engine main server
│ └── templates/
│ └── video.html # Web UI for batch video analysis
└── docs/
├── ARCHITECTURE.md # System architecture details
└── YOLOX_SETUP.md # Model setup instructions
- Open the miniprogram and upload a target vehicle photo
- Adjust the similarity threshold (recommended: 55%)
- Tap "Start Monitoring" and point the camera at the road
- The system captures and analyzes one frame every 5 seconds
- When a match is found: vibration alert + screenshot saved (with green detection box)
- Tap "Stop" to view the detection report
- Share the report with WeChat contacts or save to album
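During monitoring, the 60-second deduplication cooldown from the feature list keeps the same vehicle from being captured repeatedly. The tracker below is a hedged sketch of that idea under illustrative names: it remembers recent feature vectors and skips any new detection that closely matches one seen within the cooldown window.

```python
import time

class DedupTracker:
    """Skips detections whose features match a capture seen within the cooldown."""

    def __init__(self, cooldown_s: float = 60.0, sim_threshold: float = 0.9):
        self.cooldown_s = cooldown_s
        self.sim_threshold = sim_threshold
        self._recent = []  # list of (feature, last_seen_timestamp) pairs

    def is_duplicate(self, feature, similarity_fn, now=None) -> bool:
        now = time.time() if now is None else now
        # Forget captures older than the cooldown window.
        self._recent = [(f, t) for f, t in self._recent if now - t < self.cooldown_s]
        if any(similarity_fn(feature, f) >= self.sim_threshold for f, _ in self._recent):
            return True
        self._recent.append((feature, now))
        return False
```

The real system also uses position tracking alongside feature similarity, which this sketch omits for brevity.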
- Ensure the AI engine is running
- Open `http://localhost:5000/video` in your browser
- Upload a target vehicle photo
- Enter the video file path (e.g., `D:\footage\highway.mp4`)
- Set frame rate and threshold, then click "Start Analysis"
- Monitor progress, logs, and matched screenshots in real time
- Open the screenshot folder or download the report when complete
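Conceptually, the frame-rate setting maps to a sampling step: given the video's FPS and how often you want a frame analyzed, only every N-th frame is sent to the detector. The helper below sketches that mapping with illustrative names; the actual server's parameters may differ.

```python
def frames_to_analyze(total_frames: int, fps: float, analyze_every_s: float):
    """Return the indices of frames to send to the detector,
    sampling one frame per `analyze_every_s` seconds of video."""
    step = max(1, round(fps * analyze_every_s))
    return list(range(0, total_frames, step))
```

For a 25 FPS clip analyzed once per second, this selects frames 0, 25, 50, and so on, which keeps the ~25 s/frame ViT-L/14 path tractable on long footage.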
- YOLOX: Apache License 2.0
- CLIP: MIT License
- This project: For personal and educational use only. Commercial use is not permitted without prior written consent from the author. If you wish to use this project commercially, please contact the repository owner for authorization.