A full-stack AI system that detects wildfires in near real-time using satellite imagery and a custom CNN. The system ingests NASA satellite data, runs inference, and delivers alerts to stakeholders within 10 minutes of image capture.
Course: ENG 4000 — Capstone Design Project, York University (Fall 2024)
Team: Emmanuel Akin-salami · Kellan Ho · Angelique Izere · Noran Kerret · Parmoun Khalkhali Sharifi · Sara Riazi
- Overview
- Architecture
- ML Model
- Results
- How to Run
- Tools & Stack
- Lessons Learned
- 3-Minute Walkthrough
Traditional wildfire detection relies on ground sensors, aerial patrols, and manual observation — methods that fail in remote areas and create dangerous delays. EFDS replaces this with a satellite-fed CNN pipeline:
- Satellite imagery is fetched from NASA Earth Data (MODIS/VIIRS thermal bands)
- Preprocessing patches each image into 32×32 segments and applies noise reduction
- A custom CNN classifies each patch as fire / no-fire and outputs a confidence score
- Alerts with fire location, confidence, and heatmap overlay are sent via SMS/email within 10 minutes
- A web dashboard shows active fire detections on an interactive map in near real-time
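The patching step in the list above can be sketched with NumPy. This is a minimal illustration, not the project's preprocessing code: the function name `to_patches`, the placeholder scene, and the choice to crop (rather than pad) edges that do not fill a whole tile are all assumptions.

```python
import numpy as np

PATCH = 32  # patch edge length in pixels, as stated in the pipeline description


def to_patches(image: np.ndarray, patch: int = PATCH) -> np.ndarray:
    """Split an (H, W, C) image into non-overlapping (patch, patch, C) tiles.

    Rows and columns that do not fill a whole tile are cropped (an assumption;
    padding would be an equally valid choice).
    """
    h, w, c = image.shape
    rows, cols = h // patch, w // patch
    cropped = image[: rows * patch, : cols * patch]
    # Reshape to (rows, patch, cols, patch, C), then bring the two grid axes together.
    tiles = cropped.reshape(rows, patch, cols, patch, c).swapaxes(1, 2)
    return tiles.reshape(rows * cols, patch, patch, c)


scene = np.zeros((100, 70, 3), dtype=np.uint8)  # placeholder scene, not real data
patches = to_patches(scene)
print(patches.shape)  # (6, 32, 32, 3): 3 rows x 2 columns of tiles
```

Tiles come out in row-major order, so tile `i` sits at grid position `(i // cols, i % cols)`, which is what a heatmap overlay needs to map scores back onto the scene.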
Target users: government fire agencies, environmental organizations, communities in fire-prone regions.
NASA Earth Data (MODIS / VIIRS)
│
▼
┌──────────────────┐
│ Preprocessing │ ← Image patching (32×32), noise reduction, thermal band extraction
└────────┬─────────┘
│
▼
┌──────────────────┐
│ CNN Model │ ← Custom 4-layer CNN (trained from scratch on FLAME dataset)
│ (Keras / TF) │ Fire class: 93% precision / 97% recall / F1 = 0.95
└────────┬─────────┘
│ confidence score + bounding region
▼
┌──────────────────────────────────────────┐
│ Flask Backend API │ ← Deployed on Google Cloud
│ - /detect POST image → fire result │
│ - /alerts GET active fire events │
│ - /subscribe POST notification prefs │
└───────────┬──────────────────────────────┘
│ │
┌──────────▼──────┐ ┌─────────▼──────────────┐
│ HTML/CSS │ │ Notification System │
│ Dashboard │ │ SMS / email via │
│ - Live map │ │ cloud messaging │
│ - Confidence │ │ (< 10 min from image) │
│ - Status feed │ └────────────────────────┘
└─────────────────┘
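The "confidence score + bounding region" handed from the model to the backend could be derived from a grid of per-patch probabilities roughly as follows. The function name `fire_region`, the grid layout, and the `(x_min, y_min, x_max, y_max)` pixel format are hypothetical; the project's actual hand-off format is not documented here.

```python
import numpy as np


def fire_region(probs: np.ndarray, threshold: float = 0.5, patch: int = 32):
    """Return (max_confidence, bbox) for a 2-D grid of per-patch fire probabilities.

    bbox is (x_min, y_min, x_max, y_max) in pixels, or None if no patch
    reaches the threshold. The grid layout and bbox format are assumptions.
    """
    hits = np.argwhere(probs >= threshold)  # (row, col) index of each flagged patch
    if hits.size == 0:
        return float(probs.max()), None
    (r0, c0), (r1, c1) = hits.min(axis=0), hits.max(axis=0)
    bbox = (int(c0) * patch, int(r0) * patch, (int(c1) + 1) * patch, (int(r1) + 1) * patch)
    return float(probs.max()), bbox


grid = np.array([[0.02, 0.10, 0.05],
                 [0.04, 0.91, 0.77],
                 [0.01, 0.08, 0.03]])  # made-up probabilities for illustration
print(fire_region(grid))  # (0.91, (32, 32, 96, 64))
```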
A custom Convolutional Neural Network trained from scratch:
Input: 32×32×3 RGB patch (from satellite imagery)
│
├─ Conv2D(32, 3×3, ReLU) → MaxPool(2×2)
├─ Conv2D(64, 3×3, ReLU) → MaxPool(2×2)
├─ Conv2D(128, 3×3, ReLU) → MaxPool(2×2)
├─ Conv2D(128, 3×3, ReLU) → MaxPool(2×2)
├─ Flatten → Dense(512, ReLU) → Dropout(0.5)
└─ Dense(1, Sigmoid) → fire probability [0, 1]
Model size: 1.91 MB | Inference memory: 0.64 MB
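As a sanity check on the reported model size, the parameter count of the layers listed above can be worked out by hand. The sketch assumes `"same"` padding on every convolution, which the diagram does not state; with unpadded 3×3 convolutions the feature map would shrink to 2×2 before the fourth convolution, so some padding is needed for the stack to fit a 32×32 input.

```python
def conv_params(k, c_in, c_out):
    """Weights plus biases for a k x k convolution."""
    return (k * k * c_in + 1) * c_out


def dense_params(n_in, n_out):
    """Weights plus biases for a fully connected layer."""
    return (n_in + 1) * n_out


side, channels, total = 32, 3, 0
for filters in (32, 64, 128, 128):
    total += conv_params(3, channels, filters)
    channels = filters
    side //= 2  # "same" padding keeps the size; 2x2 max-pooling halves it

flat = side * side * channels       # 2 * 2 * 128 = 512 features after Flatten
total += dense_params(flat, 512)    # Dense(512)
total += dense_params(512, 1)       # Dense(1, sigmoid)

size_mib = total * 4 / 2**20        # 4 bytes per float32 weight
print(total, round(size_mib, 2))    # 504001 1.92
```

About 504k float32 weights come to roughly 1.92 MiB, which is in line with the 1.91 MB figure above. The saved `.h5` file may differ slightly because of metadata and how the size was measured.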
Why CNN from scratch vs. transfer learning? We benchmarked the custom CNN against VGG16 transfer learning. The custom model was chosen for deployment due to its significantly smaller footprint (1.91 MB vs ~500 MB for VGG16) and near-equivalent fire-class recall, which is the more safety-critical metric.
- Source: FLAME dataset (aerial wildfire imagery) + NASA MODIS thermal imagery
- Classes: Fire / No Fire (binary classification)
- Preprocessing: 32×32 patch extraction, min-max normalization, horizontal/vertical flip augmentation
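The normalization and flip augmentation listed above might look like the sketch below. Scaling each patch by its own minimum and maximum is an assumption; the inference snippet later in this README divides by 255 instead, which is the fixed-range form of the same idea. The helper names `min_max` and `flips` are illustrative.

```python
import numpy as np


def min_max(patch: np.ndarray) -> np.ndarray:
    """Scale a patch to [0, 1]; a constant patch maps to all zeros."""
    patch = patch.astype(np.float32)
    lo, hi = patch.min(), patch.max()
    if hi == lo:
        return np.zeros_like(patch)
    return (patch - lo) / (hi - lo)


def flips(patch: np.ndarray) -> list:
    """Return the patch plus its horizontal and vertical mirror images."""
    return [patch, np.flip(patch, axis=1), np.flip(patch, axis=0)]


rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)  # synthetic patch
augmented = [min_max(p) for p in flips(raw)]
print(len(augmented), augmented[0].min(), augmented[0].max())
```

Whichever normalization was used in training must also be used at inference time, otherwise the model sees inputs on a different scale.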
| Metric | Value |
|---|---|
| Overall Accuracy | 98% |
| No-Fire Precision | 99% |
| No-Fire Recall | 98% |
| No-Fire F1 | 0.99 |
| Fire Precision | 93% |
| Fire Recall | 97% |
| Fire F1 | 0.95 |
| Throughput | 7,336 images in ~20 s (0.003 s / image) |
| Alert latency | < 10 minutes from satellite image capture |
| Model size | 1.91 MB |
| Inference memory | 0.64 MB |
| Backend CPU usage | 3.6–3.8% (Google Cloud, concurrent load test) |
The 98% overall accuracy exceeds the project's 85% target. A fire-class recall of 97% means roughly 3% of actual fires in the evaluation data are missed, which keeps the most safety-critical failure mode (a missed fire) low.
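The derived figures in the table follow directly from the reported ones. A short check, using only the numbers above:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


fire_f1 = f1(0.93, 0.97)
per_image = 20 / 7336          # seconds per image from the throughput row
miss_rate = 1 - 0.97           # share of real fires classified as no-fire

print(f"{fire_f1:.2f} {per_image:.3f} {miss_rate:.0%}")  # 0.95 0.003 3%
```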
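pip install tensorflow keras flask numpy pillow opencv-python

from tensorflow.keras.models import load_model
from PIL import Image
import numpy as np

# Load the trained 32×32 fire-detection model
model = load_model('best_fire_detection_model_32x32.h5')

def predict(image_path):
    # Force 3-channel RGB so grayscale or RGBA inputs match the 32×32×3 input shape
    img = Image.open(image_path).convert("RGB").resize((32, 32))
    arr = np.array(img) / 255.0            # scale pixel values to [0, 1]
    arr = np.expand_dims(arr, axis=0)      # add batch dimension -> (1, 32, 32, 3)
    prob = model.predict(arr)[0][0]        # sigmoid output = fire probability
    label = "FIRE" if prob > 0.5 else "NO FIRE"
    print(f"{label} (fire probability: {prob:.2%})")

predict("your_satellite_patch.jpg")

cd Backend/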
pip install -r requirements.txt
python app.py
# → Server running at http://localhost:5000

API endpoints:
POST /detect — Upload image, get fire probability + heatmap overlay
GET /alerts — Retrieve active fire detections with location + confidence
POST /subscribe — Set notification preferences (SMS or email)
# Open app/index.html in a browser, or serve with:
cd app/
python3 -m http.server 8080
# → http://localhost:8080

Upload a satellite image patch via the dashboard → the backend runs inference → the result appears on the interactive map with a confidence score and heatmap overlay.
| Layer | Tool / Service |
|---|---|
| ML Framework | TensorFlow / Keras |
| Data source | NASA Earth Data (MODIS, VIIRS) · FLAME dataset |
| Preprocessing | Python (NumPy, OpenCV, PIL) |
| Backend | Flask (Python) · Google Cloud Platform |
| Frontend | HTML / CSS · Leaflet.js (interactive map) |
| Notifications | Google Cloud Messaging (SMS + email) |
| GPU Training | Lambda Cloud (A100, $0.14/hr, capped at $150) |
| Versioning | Git / GitHub |
Satellite image quality is the hardest constraint. MODIS and VIIRS data is free and globally available, but moderate spatial resolution means small fires may not be distinguishable from hot ground in a single patch. Image patching at 32×32 helped, but real-time access to higher-resolution commercial data would meaningfully improve recall.
Fire recall > fire precision. A false positive (alerting when there's no fire) is a nuisance. A false negative (missing a real fire) can be catastrophic. We tuned the confidence threshold below 0.5 during testing to bias toward higher recall, and reported fire-class and no-fire metrics separately so the trade-off stays visible. The deployed threshold is 0.45 based on stakeholder feedback.
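The effect of lowering the threshold can be illustrated on a toy example. The scores and labels below are made up purely for illustration and are not project results; only the 0.50 and 0.45 thresholds come from the text above.

```python
def recall_precision(probs, labels, threshold):
    """Fire-class recall and precision at a given decision threshold."""
    preds = [p >= threshold for p in probs]
    tp = sum(p and y for p, y in zip(preds, labels))
    fp = sum(p and not y for p, y in zip(preds, labels))
    fn = sum(not p and y for p, y in zip(preds, labels))
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision


# Made-up scores and ground truth (True = fire), for illustration only.
probs = [0.95, 0.80, 0.47, 0.46, 0.30, 0.10]
labels = [True, True, True, False, False, False]

for t in (0.50, 0.45):
    r, p = recall_precision(probs, labels, t)
    print(f"threshold={t:.2f} recall={r:.2f} precision={p:.2f}")
```

In this toy set, moving from 0.50 to 0.45 catches the borderline fire at 0.47 but also flags the non-fire at 0.46: recall rises while precision falls, which is the trade-off described above.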
Agile kept 6 people on track. Six-person engineering projects are coordination problems as much as technical ones. Six sprints, each with a defined backlog, a retrospective, and clearly scoped deliverables, prevented scope creep and kept integration from becoming a last-minute scramble.
Flask is enough for an MVP, but the synchronous endpoints won't scale past a few concurrent users. A production system would move inference to an asynchronous queue (Celery with Redis) behind a proper API gateway.
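The queue-based pattern can be sketched with the standard library alone. This is a stand-in for illustration: `queue.Queue` and a thread take the place of Redis and a Celery worker, and `fake_inference` is a placeholder for the real model call.

```python
import queue
import threading

jobs: "queue.Queue" = queue.Queue()
results = {}


def fake_inference(patch_id: str) -> float:
    """Placeholder for model.predict; returns a fixed probability."""
    return 0.5


def worker() -> None:
    """Pull jobs off the queue until a None sentinel arrives."""
    while True:
        patch_id = jobs.get()
        if patch_id is None:
            jobs.task_done()
            break
        results[patch_id] = fake_inference(patch_id)
        jobs.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

# The request handler would only enqueue and return immediately.
for patch_id in ("patch-1", "patch-2", "patch-3"):
    jobs.put(patch_id)

jobs.put(None)   # tell the worker to stop
jobs.join()      # wait until every queued item has been processed
print(sorted(results))  # ['patch-1', 'patch-2', 'patch-3']
```

The point of the design is that the web process never blocks on inference: it accepts the upload, enqueues a job, and lets clients poll or be notified when the result is ready.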
| Time | What to show |
|---|---|
| 0:00–0:30 | Problem framing: show a news clip or satellite image of a wildfire; explain why detection latency matters |
| 0:30–1:00 | Walk through CNN.py — architecture definition, training loop, explain 32×32 patch rationale |
| 1:00–1:30 | Run CNN Test.py live — show the confusion matrix output and 98% accuracy |
| 1:30–2:00 | Show the Flask /detect endpoint — POST a sample satellite patch, get back JSON with confidence score |
| 2:00–2:30 | Open the dashboard — show the interactive map, upload an image, point to the heatmap overlay |
| 2:30–3:00 | Discuss fire recall vs. precision trade-off and what production scaling would look like |