Mithurn/instagram-profile-webscraper

Instagram Analytics Dashboard

A full-stack, deployable web application that scrapes public Instagram profile data and presents it in an interactive analytics dashboard with real-time rankings and filtering capabilities.

Key Features • Architecture • Installation • Usage • API • Deployment • Contributing

🚀 Key Features

📊 Comprehensive Analytics

  • Real-time Profile Metrics: Track followers, following, posts count, and engagement rates
  • Interactive Dashboard: Sort, filter, and rank profiles by any metric
  • Auto-refresh: Live data updates every 5-10 minutes
  • Historical Tracking: Monitor growth trends over time

πŸ” Advanced Scraping Engine

  • Anti-Detection: Randomized delays, User-Agent rotation, and proxy support
  • Robust Error Handling: Graceful handling of private profiles and missing data
  • Scalable Architecture: Process multiple profiles simultaneously
  • Scheduled Updates: Automated daily data collection

🎨 Modern UI/UX

  • Responsive Design: Beautiful interface across all devices
  • Dark/Light Mode: Customizable themes
  • Interactive Elements: Smooth transitions and loading indicators
  • Data Visualization: Charts and graphs for trend analysis

πŸ—οΈ Production Ready

  • Dockerized: Complete containerization with docker-compose
  • RESTful API: Well-documented endpoints
  • Database Optimization: Efficient schema design
  • CI/CD Ready: Automated testing and deployment

🛠️ Technical Stack

| Layer | Technology | Version | Purpose |
|---|---|---|---|
| Frontend | React | 18+ | Modern UI with hooks and functional components |
| Backend | Flask | 2.0+ | Lightweight Python web framework for REST API |
| Database | PostgreSQL | 13+ | Robust relational database for complex queries |
| Scraping | Selenium/Playwright | Latest | JavaScript rendering for dynamic content |
| Containerization | Docker | Latest | Consistent deployment across environments |
| Task Queue | Celery | 5.0+ | Asynchronous task processing for scraping |
| ORM | SQLAlchemy | 2.0+ | Python SQL toolkit for database operations |

Technology Justification

  • React 18+: Chosen for its component-based architecture, excellent performance with concurrent features, and vast ecosystem
  • Flask: Selected for its simplicity, flexibility, and perfect fit for REST API development without unnecessary overhead
  • PostgreSQL: Opted for its ACID compliance, advanced indexing capabilities, and excellent performance with complex analytical queries
  • Selenium/Playwright: Essential for handling Instagram's JavaScript-heavy interface and anti-bot measures
  • Docker: Ensures consistent deployment and easy scaling across different environments

🧠 Challenges & Solutions

Challenge 1: Instagram Anti-Scraping Measures

Problem: Instagram employs sophisticated anti-bot detection including rate limiting, CAPTCHAs, and behavioral analysis.

Solution: Implemented a multi-layered approach:

  • Randomized Delays: Dynamic sleep intervals (2-8 seconds) between requests
  • User-Agent Rotation: Pool of realistic browser user agents
  • Session Management: Proper cookie handling and session persistence
  • Proxy Integration: Optional proxy rotation for IP diversity
  • Request Headers: Mimicking real browser behavior with proper headers
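The first two items in this list can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the `USER_AGENTS` pool and the helper names `pick_headers`, `random_delay`, and `polite_pause` are hypothetical, and only the 2-8 second range comes from the text above.

```python
import random
import time

# Hypothetical pool of browser User-Agent strings (assumption, not from the repo).
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0",
]


def pick_headers(rng=random):
    """Build request headers with a randomly chosen User-Agent."""
    return {
        "User-Agent": rng.choice(USER_AGENTS),
        "Accept-Language": "en-US,en;q=0.9",
    }


def random_delay(min_s=2.0, max_s=8.0, rng=random):
    """Pick a delay in seconds within the 2-8 second range described above."""
    return rng.uniform(min_s, max_s)


def polite_pause(sleep=time.sleep):
    """Sleep for a random interval and return the delay that was used."""
    delay = random_delay()
    sleep(delay)
    return delay
```

Calling `polite_pause()` between profile fetches spaces requests out unevenly. Passing a different `sleep` callable makes the helper easy to exercise without actually waiting.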

Challenge 2: Scalable Data Processing

Problem: Processing hundreds of profiles efficiently while maintaining data consistency and handling failures gracefully.

Solution: Built a robust architecture:

  • Celery Task Queue: Asynchronous processing with Redis backend
  • Database Connection Pooling: Optimized connection management
  • Retry Logic: Exponential backoff for failed requests
  • Data Validation: Comprehensive input validation and error handling
  • Batch Processing: Efficient bulk operations for database updates
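As an illustration of the retry logic item, here is a small sketch of exponential backoff in plain Python. The function name `with_retries` and its parameters are assumptions made for this example; the repository may implement retries differently, for instance through Celery's own task options.

```python
import time


def with_retries(func, attempts=4, base=1.0, cap=60.0, sleep=time.sleep):
    """Call func(), retrying on any exception with exponential backoff.

    The wait before retry number n is min(cap, base * 2**n) seconds,
    so the defaults wait 1 s, 2 s, then 4 s before giving up.
    """
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            # Re-raise once the final attempt has failed.
            if attempt == attempts - 1:
                raise
            sleep(min(cap, base * 2 ** attempt))
```

A scraping call would be wrapped as `with_retries(lambda: fetch(username))`, where `fetch` stands in for whatever function performs the request. Celery tasks also accept `autoretry_for` and `retry_backoff` options, which may be a better fit when the work already runs inside a Celery worker.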

πŸ›οΈ Architecture

graph TB
    A[Frontend - React] --> B[Backend API - Flask]
    B --> C[Database - PostgreSQL]
    D[Scraper - Python] --> C
    E[Scheduler] --> D
    F[Docker Compose] --> A
    F --> B
    F --> C
    F --> D

📦 Installation

Prerequisites

  • Python 3.8+
  • Node.js 16+
  • Docker & Docker Compose
  • PostgreSQL 13+ (optional for local development)

Quick Start with Docker

# Clone the repository
git clone https://github.com/Mithurn/instagram-analytics-dashboard.git
cd instagram-analytics-dashboard

# Start all services
docker-compose up -d

# The application will be available at:
# Frontend: http://localhost:3000
# Backend API: http://localhost:5000

Manual Installation

# Backend Setup
cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
python app.py

# Frontend Setup (in a new terminal)
cd frontend
npm install
npm start

Environment Setup

Create a .env file in the root directory with the following essential variables:

# Database Configuration
DATABASE_URL=postgresql://username:password@localhost:5432/instagram_analytics
REDIS_URL=redis://localhost:6379/0

# API Configuration
SECRET_KEY=your-secret-key-here
FLASK_ENV=development

Note: See .env.example for additional configuration options.
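One way the backend could read these variables at startup is sketched below. It assumes the values have already been placed in the process environment (for example by Docker Compose or a dotenv loader). The `load_config` helper, the choice of which variables are required, and the `production` default for `FLASK_ENV` are assumptions for this example rather than the project's actual behavior.

```python
import os

# Variables treated as mandatory in this sketch (assumption).
REQUIRED = ("DATABASE_URL", "REDIS_URL", "SECRET_KEY")


def load_config(env=os.environ):
    """Collect the settings listed above, failing fast if any are missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
    return {
        "DATABASE_URL": env["DATABASE_URL"],
        "REDIS_URL": env["REDIS_URL"],
        "SECRET_KEY": env["SECRET_KEY"],
        # Fall back to a safe default when FLASK_ENV is not set.
        "FLASK_ENV": env.get("FLASK_ENV", "production"),
    }
```

Failing at startup with a clear message is usually easier to debug than a connection error that surfaces later inside a request or a scraping task.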

🎯 Usage

Adding Profiles to Track

  1. Via Dashboard: Use the web interface to add Instagram usernames
  2. Via API: POST to /api/profiles/update with a list of usernames
  3. Via Config: Add profiles to the configuration file
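For the API route, a request could be prepared with the standard library as sketched below. The JSON body shape (`{"usernames": [...]}`) and the `build_update_request` helper are assumptions. The README only states that the endpoint takes a list of usernames, so check the backend code for the exact payload.

```python
import json
import urllib.request


def build_update_request(usernames, base_url="http://localhost:5000"):
    """Prepare a POST to /api/profiles/update carrying a list of usernames."""
    # Assumed payload shape; the real backend may expect a different key.
    payload = json.dumps({"usernames": list(usernames)}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/api/profiles/update",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the backend running, `urllib.request.urlopen(build_update_request(["nasa", "natgeo"]))` would send the request. The usernames here are placeholders.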

Dashboard Features

  • 📈 Rankings: View profiles ranked by followers, engagement, or growth
  • 🔍 Search & Filter: Find specific profiles or filter by criteria
  • 📊 Analytics: Detailed metrics and trend analysis
  • ⚡ Real-time Updates: Automatic data refresh

API Endpoints

# Get all profiles
GET /api/profiles

# Get ranked profiles
GET /api/profiles/ranked?by=followers

# Get specific profile
GET /api/profiles/{username}

# Update profiles
POST /api/profiles/update
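To make the `by` query parameter concrete, the sketch below shows one way a ranked listing could be computed from profile records. The field names, the `ALLOWED_METRICS` set, and the `rank_profiles` function are hypothetical. The actual endpoint may sort in SQL and return different fields.

```python
# Metrics accepted by this sketch (assumed names, not confirmed by the repo).
ALLOWED_METRICS = {"followers", "following", "posts", "engagement_rate"}


def rank_profiles(profiles, by="followers"):
    """Return profiles sorted by a metric, highest first, with a 1-based rank."""
    if by not in ALLOWED_METRICS:
        raise ValueError("Unsupported ranking metric: " + repr(by))
    # Missing or null values are treated as 0 so they sort last.
    ordered = sorted(profiles, key=lambda p: p.get(by) or 0, reverse=True)
    return [dict(p, rank=i) for i, p in enumerate(ordered, start=1)]
```

Validating `by` against a fixed set keeps arbitrary column names out of the query layer, which matters if the value is later interpolated into an ORDER BY clause.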

🐳 Deployment

Docker Deployment

# Production deployment
docker-compose -f docker-compose.prod.yml up -d

# With environment variables
cp .env.example .env
# Edit .env with your configuration
docker-compose up -d

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

⚖️ Legal & Ethical Considerations

This project is designed for educational and personal use only. Please ensure you:

  • ✅ Only scrape publicly available data
  • ✅ Respect Instagram's Terms of Service
  • ✅ Implement reasonable rate limiting
  • ✅ Use data responsibly and ethically

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

👨‍💻 Author & Contact

Mithurn Jeromme
Full-Stack Developer & Data Enthusiast


Built with ❤️ for the developer community

Report Bug • Request Feature • Documentation
