HarshitPG/Ethereum-RPC-Load-Balancer

Ethereum RPC Load Balancer - Backend

Production-grade JSON-RPC load balancer with intelligent routing, auto-failover, circuit breaking, and observability for Ethereum mainnet RPC providers.


🎯 Overview

A high-performance load balancer that distributes Ethereum JSON-RPC requests across multiple providers (Infura, Alchemy, etc.) with:

  • Intelligent Routing: Round-robin, or weighted routing based on EWMA latency
  • Auto-Failover: Circuit breakers automatically disable unhealthy providers
  • Smart Caching: Redis-backed cache for deterministic RPC calls (finalized blocks, transactions)
  • Observability: Prometheus metrics, structured logging, Grafana dashboards
  • Operator Dashboard: Real-time monitoring UI for provider analytics and health

Tech Stack: TypeScript, Express, Redis, Prometheus, Grafana, Loki, Tempo


Tech Stack

Backend

  • Runtime: Node.js 18+
  • Language: TypeScript 5.9
  • Framework: Express 5.x
  • Cache: Redis 7.0 (ioredis client)
  • Circuit Breaker: Opossum
  • HTTP Client: Axios
  • Validation: Zod

Observability & Monitoring

  • Metrics: Prometheus + prom-client
  • Dashboards: Grafana
  • Logging: Pino (structured JSON logs)
  • Log Aggregation: Loki
  • Distributed Tracing: Tempo
  • Alerting: NodeMailer (email alerts)

Frontend (Dashboard)

  • Framework: React 18+ with TypeScript
  • Build Tool: Vite
  • Styling: TailwindCSS
  • Charts: Chart.js / Recharts
  • HTTP Client: Axios
  • State Management: React Context / Zustand

DevOps & Infrastructure

  • Containerization: Docker + Docker Compose
  • Testing: Vitest
  • Load Testing: Autocannon
  • Code Quality: ESLint + Prettier

💼 Business Purpose

Problem: At Luganodes, we rely on third-party RPC providers (Infura, Alchemy) to interact with Ethereum. Challenges:

  • Cost Optimization: Some providers are expensive; naive round-robin wastes money on slow/unreliable providers
  • Reliability: A single provider outage causes service downtime
  • Performance: Redundant requests (e.g., fetching the same block 1000 times) are wasteful

Solution: This load balancer:

  1. Saves Money: Routes traffic to cost-effective, high-performing providers
  2. Increases Uptime: Auto-failover ensures 99.9%+ availability
  3. Boosts Speed: Caches ~40-60% of requests; cached responses return about 80% faster (roughly 500 ms -> 100 ms)
  4. Provides Visibility: Grafana dashboards show which providers are reliable/expensive

🏗️ High-Level Architecture

System Architecture Diagram

╔═════════════════════════════════════════════════════════════════════════════════╗
║                                                                                 ║
║                            CLIENT APPLICATIONS                                  ║
║                   (Web3 Apps, dApps, Wallets, Backend Services)                 ║
║                                                                                 ║
╚═══════════════════════════════════════╦═════════════════════════════════════════╝
                                        ║
                                        ║  JSON-RPC Requests
                                        ║  POST / {"jsonrpc":"2.0", "method":"...", ...}
                                        ║
                                        ▼
╔════════════════════════════════════════════════════════════════════════════════╗
║                       ETHEREUM RPC LOAD BALANCER                               ║
║                              (Port 8080)                                       ║
╠════════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  ┌───────────────────────────────────────────────────────────────────────────┐ ║
║  │  REQUEST HANDLER (Express.js)                                             │ ║
║  │  - Generate Correlation ID (UUID)                                         │ ║
║  │  - Validate JSON-RPC payload                                              │ ║
║  │  - Structured logging (Pino)                                              │ ║
║  │  - CORS & middleware chain                                                │ ║
║  └───────────────────────────────────┬───────────────────────────────────────┘ ║
║                                      │                                         ║
║                                      ▼                                         ║
║  ┌───────────────────────────────────────────────────────────────────────────┐ ║
║  │  INTELLIGENT CACHE LAYER                                                  │ ║
║  │  ┌─────────────────────────────────────────────────────────────────────┐  │ ║
║  │  │  Cache Decision Engine                                              │  │ ║
║  │  │  - Is method cacheable? (eth_getBlockByNumber YES, eth_call NO)     │  │ ║
║  │  │  - Contains "latest"? -> Skip cache                                 │  │ ║
║  │  │  - Generate cache key: method + params + chain                      │  │ ║
║  │  └───────────────────────────┬─────────────────────────────────────────┘  │ ║
║  │                              │                                            │ ║
║  │         ┌────────────────────┴─────────────────────┐                      │ ║
║  │         │                                          │                      │ ║
║  │    Cache HIT                                  Cache MISS                  │ ║
║  │         │                                          │                      │ ║
║  │         │    ┌──────────────────────────┐          │                      │ ║
║  │         └───>│  REDIS CACHE             │          │                      │ ║
║  │              │  (Port 6379)             │          │                      │ ║
║  │              ├──────────────────────────┤          │                      │ ║
║  │              │ Finalized: INF TTL       │          │                      │ ║
║  │              │ Recent: 5min TTL         │          │                      │ ║
║  │              │ Unfinalized: 30s TTL     │          │                      │ ║
║  │              └──────────────────────────┘          │                      │ ║
║  │                      │                             │                      │ ║
║  │                      │ Return cached               │ Forward request      │ ║
║  │                      │ response                    ▼                      │ ║
║  └──────────────────────┼────────────────────────────────────────────────────┘ ║
║                         │                             │                        ║
║                         │                             ▼                        ║
║  ┌──────────────────────┼────────────────────────────────────────────────────┐ ║
║  │  PROVIDER MANAGER    │                                                    │ ║
║  │                      │                                                    │ ║
║  │  ┌───────────────────▼────────────────────────────────────────────────┐   │ ║
║  │  │  ROUTING STRATEGY SELECTOR                                         │   │ ║
║  │  │  ┌──────────────────────┐  ┌────────────────────────────────────┐  │   │ ║
║  │  │  │  Round-Robin         │  │  Weighted (EWMA Latency)           │  │   │ ║
║  │  │  │  Equal distribution  │  │  Faster providers get more load    │  │   │ ║
║  │  │  └──────────────────────┘  └────────────────────────────────────┘  │   │ ║
║  │  └───────────────────────────┬────────────────────────────────────────┘   │ ║
║  │                              │                                            │ ║
║  │  ┌───────────────────────────▼───────────────────────────────────────┐    │ ║
║  │  │  CIRCUIT BREAKER (Opossum)                                        │    │ ║
║  │  │  ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌────────────┐   │    │ ║
║  │  │  │   CLOSED   │->│    OPEN    │->│ HALF_OPEN  │->│   CLOSED   │   │    │ ║
║  │  │  │  (Normal)  │  │  (Failed)  │  │  (Testing) │  │(Recovered) │   │    │ ║
║  │  │  └────────────┘  └────────────┘  └────────────┘  └────────────┘   │    │ ║
║  │  │  - Failure threshold: 50% errors in window                        │    │ ║
║  │  │  - Timeout: 5s per request                                        │    │ ║
║  │  │  - Reset timeout: 30s exponential backoff                         │    │ ║
║  │  └───────────────────────────────────────────────────────────────────┘    │ ║
║  └───────────────────────────────────────────────────────────────────────────┘ ║
║                                                                                ║
╚═════════════════════════╦═══════════════════╦═══════════════════╦══════════════╝
                          ║                   ║                   ║
                          ▼                   ▼                   ▼
        ╔═════════════════════════╗ ╔════════════════════╗ ╔══════════════════╗
        ║   INFURA PROVIDER       ║ ║ ALCHEMY PROVIDER   ║ ║ QUICKNODE        ║
        ║   (Ethereum Mainnet)    ║ ║ (Ethereum Mainnet) ║ ║ (Backup)         ║
        ╟─────────────────────────╢ ╟────────────────────╢ ╟──────────────────╢
        ║ Status: Healthy         ║ ║ Status: Healthy    ║ ║ Status: Healthy  ║
        ║ Latency: 142ms          ║ ║ Latency: 98ms      ║ ║ Latency: 210ms   ║
        ║ Weight: 32%             ║ ║ Weight: 46%        ║ ║ Weight: 22%      ║
        ║ Requests: 4,521         ║ ║ Requests: 6,783    ║ ║ Requests: 2,156  ║
        ║ Errors: 12 (0.27%)      ║ ║ Errors: 3 (0.04%)  ║ ║ Errors: 45 (2%)  ║
        ╚═══════════════╦═════════╝ ╚══════════╦═════════╝ ╚══════════╦═══════╝
                        ║                      ║                      ║
                        ╚══════════════════════╩══════════════════════╝
                                             ║
                                             ▼
                        ╔═════════════════════════════════════════╗
                        ║  ETHEREUM MAINNET BLOCKCHAIN            ║
                        ║  (Decentralized Network)                ║
                        ╚═════════════════════════════════════════╝


╔═════════════════════════════════════════════════════════════════════════════════╗
║                         OBSERVABILITY & MONITORING STACK                        ║
╠═════════════════════════════════════════════════════════════════════════════════╣
║                                                                                 ║
║  ┌─────────────────────┐  ┌─────────────────────┐  ┌─────────────────────────┐  ║
║  │ PROMETHEUS          │  │ GRAFANA             │  │ LOKI                    │  ║
║  │ (Port 9090)         │  │ (Port 3001)         │  │ (Port 3100)             │  ║
║  ├─────────────────────┤  ├─────────────────────┤  ├─────────────────────────┤  ║
║  │ - Metrics scraping  │->│ - Live dashboards   │  │ - Log aggregation       │  ║
║  │ - Time-series DB    │  │ - Visualization     │  │ - Full-text search      │  ║
║  │ - PromQL queries    │  │ - Alert manager     │  │ - Log retention         │  ║
║  │ - 15s scrape rate   │  │ - Multi-tenancy     │  │ - JSON parsing          │  ║
║  └─────────────────────┘  └─────────────────────┘  └─────────────────────────┘  ║
║                                                                                 ║
║  ┌───────────────────────────────────────────────────────────────────────────┐  ║
║  │ TEMPO (Distributed Tracing - Port 3200)                                   │  ║
║  │ - End-to-end request tracing      - Latency waterfall visualization       │  ║
║  │ - Span correlation                - Performance bottleneck detection      │  ║
║  └───────────────────────────────────────────────────────────────────────────┘  ║
║                                                                                 ║
║  EMAIL ALERTING (NodeMailer)                                                    ║
║  - All providers down (CRITICAL)      - Cache hit rate < 30% (WARNING)          ║
║  - Only 1 provider remaining (WARN)   - Error rate > 5% (WARNING)               ║
║                                                                                 ║
╚═════════════════════════════════════════════════════════════════════════════════╝


╔═════════════════════════════════════════════════════════════════════════════════╗
║                          ADMIN & OPERATOR INTERFACE                             ║
╠═════════════════════════════════════════════════════════════════════════════════╣
║                                                                                 ║
║  ┌───────────────────────────────────────────────────────────────────────────┐  ║
║  │  ADMIN API (Port 3000)                                                    │  ║
║  │  Authentication: X-Admin-Token header                                     │  ║
║  ├───────────────────────────────────────────────────────────────────────────┤  ║
║  │  Endpoints:                                                               │  ║
║  │  - POST   /admin/providers          -> Add new RPC provider               │  ║
║  │  - DELETE /admin/providers/:id      -> Remove provider                    │  ║
║  │  - PATCH  /admin/providers/:id      -> Update weight/status               │  ║
║  │  - POST   /admin/providers/:id/enable   -> Force enable                   │  ║
║  │  - POST   /admin/providers/:id/disable  -> Force disable                  │  ║
║  │  - GET    /admin/cache/stats        -> Cache metrics                      │  ║
║  │  - DELETE /admin/cache              -> Clear cache (all/provider/pattern) │  ║
║  └───────────────────────────────────────────────────────────────────────────┘  ║
║                                                                                 ║
║  ┌───────────────────────────────────────────────────────────────────────────┐  ║
║  │  OPERATOR DASHBOARD (React + TypeScript + Vite)                           │  ║
║  │  Real-time monitoring interface for DevOps/SRE teams                      │  ║
║  ├───────────────────────────────────────────────────────────────────────────┤  ║
║  │  Features:                                                                │  ║
║  │  - Live provider health status cards                                      │  ║
║  │  - Request distribution pie/bar charts (Chart.js)                         │  ║
║  │  - Cache hit/miss rate trends                                             │  ║
║  │  - Circuit breaker state visualization                                    │  ║
║  │  - Provider latency comparison graphs                                     │  ║
║  │  - Error rate alerts & notifications                                      │  ║
║  │  - One-click provider enable/disable                                      │  ║
║  └───────────────────────────────────────────────────────────────────────────┘  ║
║                                                                                 ║
╚═════════════════════════════════════════════════════════════════════════════════╝
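
The request handler in the diagram validates each JSON-RPC payload before it reaches the cache layer. The tech stack lists Zod for validation; the sketch below uses plain TypeScript instead so that it stays self-contained. The names `isJsonRpcRequest` and `invalidRequest` are hypothetical, and rejecting batch (array) requests is an assumption about scope, not a description of the project's behavior.

```typescript
// Shape of a single JSON-RPC 2.0 request envelope.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  method: string;
  params?: unknown[] | Record<string, unknown>;
  id?: string | number | null;
}

// Type guard (hypothetical name): true when `body` is a well-formed single request.
function isJsonRpcRequest(body: unknown): body is JsonRpcRequest {
  if (typeof body !== "object" || body === null || Array.isArray(body)) return false;
  const b = body as Record<string, unknown>;
  if (b["jsonrpc"] !== "2.0") return false;
  const method = b["method"];
  if (typeof method !== "string" || method.length === 0) return false;
  // params, when present, must be an array or an object.
  const params = b["params"];
  if (params !== undefined && (typeof params !== "object" || params === null)) return false;
  const id = b["id"];
  return id === undefined || id === null || typeof id === "string" || typeof id === "number";
}

// Error body for a malformed request; -32600 is the "Invalid Request" code in JSON-RPC 2.0.
function invalidRequest(id: string | number | null = null) {
  return { jsonrpc: "2.0", id, error: { code: -32600, message: "Invalid Request" } };
}
```

A handler could call `isJsonRpcRequest(req.body)` first and reply with `invalidRequest()` when it returns false, so malformed payloads never consume a provider call.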

Request Flow Diagram

                            ┏━━━━━━━━━━━━━━━━━━━━━━━┓
                            ┃   CLIENT              ┃
                            ┃   (Web3 Application)  ┃
                            ┗━━━━━━━━━┯━━━━━━━━━━━━━┛
                                      │
                  ╔═══════════════════▼═══════════════════╗
                  ║  1. POST / (JSON-RPC Request)         ║
                  ║  {                                    ║
                  ║    "jsonrpc": "2.0",                  ║
                  ║    "method": "eth_getBlockByNumber",  ║
                  ║    "params": ["0x12A4B7C", true],     ║
                  ║    "id": 1                            ║
                  ║  }                                    ║
                  ╚═══════════════════════════════════════╝
                                      │
                                      ▼
        ╔═════════════════════════════════════════════════════════╗
        ║  2. REQUEST HANDLER                                     ║
        ║  ┌──────────────────────────────────────────────────┐   ║
        ║  │ - Generate Correlation ID: "req_abc123xyz"       │   ║
        ║  │ - Validate JSON-RPC format                       │   ║
        ║  │ - Log incoming request                           │   ║
        ║  │ - Start latency timer                            │   ║
        ║  └──────────────────────────────────────────────────┘   ║
        ╚═════════════════════════════════════════════════════════╝
                                      │
                                      ▼
        ╔═════════════════════════════════════════════════════════╗
        ║  3. CACHE DECISION ENGINE                               ║
        ║  ┌──────────────────────────────────────────────────┐   ║
        ║  │ Is method cacheable?                             │   ║
        ║  │    -> eth_getBlockByNumber         YES           │   ║
        ║  │    -> eth_blockNumber               NO           │   ║
        ║  │                                                  │   ║
        ║  │ Contains "latest" parameter?        NO           │   ║
        ║  │                                                  │   ║
        ║  │ Generate Cache Key:                              │   ║
        ║  │    "eth:1:getBlockByNumber:0x12A4B7C:true"       │   ║
        ║  └──────────────────────────────────────────────────┘   ║
        ╚═════════════════════════════════════════════════════════╝
                                      │
                    ┌─────────────────┴─────────────────┐
                    │                                   │
                    ▼                                   ▼
        ┌─────────────────────────┐       ┌─────────────────────────┐
        │  Cache HIT              │       │  Cache MISS             │
        │  (Redis lookup: found)  │       │  (No cached data)       │
        └───────────┬─────────────┘       └─────────┬───────────────┘
                    │                               │
                    │                               ▼
                    │              ╔════════════════════════════════════════╗
                    │              ║  4. PROVIDER SELECTION                 ║
                    │              ║  ┌──────────────────────────────────┐  ║
                    │              ║  │ Routing Strategy: WEIGHTED       │  ║
                    │              ║  │                                  │  ║
                    │              ║  │ Available Providers:             │  ║
                    │              ║  │ - Infura    (142ms) -> 32% load  │  ║
                    │              ║  │ - Alchemy   (98ms)  -> 46% load  │  ║
                    │              ║  │ - QuickNode (210ms) -> 22% load  │  ║
                    │              ║  │                                  │  ║
                    │              ║  │ Random: 0.521                    │  ║
                    │              ║  │ Selected: Alchemy (fastest!)     │  ║
                    │              ║  └──────────────────────────────────┘  ║
                    │              ╚════════════════════════════════════════╝
                    │                               │
                    │                               ▼
                    │              ╔════════════════════════════════════════╗
                    │              ║  5. CIRCUIT BREAKER CHECK              ║
                    │              ║  ┌──────────────────────────────────┐  ║
                    │              ║  │ Provider: Alchemy                │  ║
                    │              ║  │ State: CLOSED (Healthy)          │  ║
                    │              ║  │ Recent Errors: 3/1000 (0.3%)     │  ║
                    │              ║  │ Last Success: 2s ago             │  ║
                    │              ║  │ Result: Allow request to pass    │  ║
                    │              ║  └──────────────────────────────────┘  ║
                    │              ╚════════════════════════════════════════╝
                    │                               │
                    │                               ▼
                    │              ╔════════════════════════════════════════╗
                    │              ║  6. FORWARD TO PROVIDER                ║
                    │              ║  POST https://eth-mainnet.g.alchemy... ║
                    │              ║  Timeout: 5000ms                       ║
                    │              ╚════════════════════════════════════════╝
                    │                               │
                    │                               │ Latency: 98ms
                    │                               ▼
                    │              ╔════════════════════════════════════════╗
                    │              ║  7. RESPONSE RECEIVED                  ║
                    │              ║  {                                     ║
                    │              ║    "jsonrpc": "2.0",                   ║
                    │              ║    "id": 1,                            ║
                    │              ║    "result": { ... block data ... }    ║
                    │              ║  }                                     ║
                    │              ╚════════════════════════════════════════╝
                    │                               │
                    │                               ▼
                    │              ╔════════════════════════════════════════╗
                    │              ║  8. CACHE RESPONSE                     ║
                    │              ║  ┌──────────────────────────────────┐  ║
                    │              ║  │ Block: 19,549,052                │  ║
                    │              ║  │ Current: 19,649,052              │  ║
                    │              ║  │ Diff: 100,000 blocks > 64        │  ║
                    │              ║  │ Status: FINALIZED                │  ║
                    │              ║  │ TTL: Infinite (1 year)           │  ║
                    │              ║  │ Store in Redis: SUCCESS          │  ║
                    │              ║  └──────────────────────────────────┘  ║
                    │              ╚════════════════════════════════════════╝
                    │                               │
                    └───────────────────────────────┘
                                      │
                                      ▼
        ╔═════════════════════════════════════════════════════════╗
        ║  9. RETURN RESPONSE WITH METADATA                       ║
        ║  ┌──────────────────────────────────────────────────┐   ║
        ║  │ Headers:                                         │   ║
        ║  │ - X-Cache-Hit: false                             │   ║
        ║  │ - X-Provider-Id: alchemy                         │   ║
        ║  │ - X-Correlation-Id: req_abc123xyz                │   ║
        ║  │ - X-Response-Time: 98ms                          │   ║
        ║  │                                                  │   ║
        ║  │ Body: { "jsonrpc": "2.0", "result": {...} }      │   ║
        ║  └──────────────────────────────────────────────────┘   ║
        ╚═════════════════════════════════════════════════════════╝
                                      │
                                      ▼
                            ┏━━━━━━━━━━━━━━━━━━━━━━━┓
                            ┃   CLIENT              ┃
                            ┃   (Response received) ┃
                            ┗━━━━━━━━━━━━━━━━━━━━━━━┛

        ╔═══════════════════════════════════════════════════════╗
        ║  10. METRICS & LOGGING                                ║
        ║  - Prometheus: rpc_requests_total{provider=alchemy}++ ║
        ║  - Pino: {"correlationId":"req_abc123xyz",            ║
        ║           "method":"eth_getBlockByNumber",            ║
        ║           "cacheHit":false, "latency":98}             ║
        ╚═══════════════════════════════════════════════════════╝
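
Step 3 of the flow builds a cache key from the chain, the method, and the parameters. The sketch below is one way to produce the key shown in the diagram, assuming the layout `eth:<chainId>:<method without the eth_ prefix>:<params joined by ":">`. `buildCacheKey` is a hypothetical name and the layout is inferred from the single example key, so treat it as illustrative.

```typescript
// Values that can appear in a JSON-RPC params array.
type JsonRpcParam = string | number | boolean | null | object;

// Hypothetical helper: derives a key such as
// "eth:1:getBlockByNumber:0x12A4B7C:true" from a JSON-RPC call.
function buildCacheKey(chainId: number, method: string, params: JsonRpcParam[]): string {
  // Drop the "eth_" namespace prefix so the key matches the layout in the diagram.
  const shortMethod = method.startsWith("eth_") ? method.slice(4) : method;
  // Objects (for example filter params) are serialized as JSON; primitives use String().
  const parts = params.map((p) =>
    typeof p === "object" && p !== null ? JSON.stringify(p) : String(p)
  );
  return ["eth", String(chainId), shortMethod, ...parts].join(":");
}
```

The sketch keeps the caller's hex casing so that it reproduces the diagram's key exactly. Lowercasing hex parameters before building the key would let `0x12A4B7C` and `0x12a4b7c` share one cache entry.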

Circuit Breaker State Machine

                    ╔══════════════════════════════════════╗
                    ║                                      ║
                    ║      CLOSED (Healthy)                ║
                    ║                                      ║
                    ║  - All requests pass through         ║
                    ║  - Monitor failure rate              ║
                    ║  - Track error count in window       ║
                    ║  - Normal operation mode             ║
                    ║                                      ║
                    ╚═════════════╦════════════════════════╝
                                  ║
                                  ║  Threshold Exceeded
                                  ║  (e.g., 50% errors in 10s window)
                                  ║  or 3 consecutive failures
                                  ║
                                  ▼
                    ╔══════════════════════════════════════╗
                    ║                                      ║
                    ║        OPEN (Unhealthy)              ║
                    ║                                      ║
                    ║  - Block ALL requests                ║
                    ║  - Fast-fail immediately (no delay)  ║
                    ║  - Return error instantly            ║
                    ║  - Wait for cooldown period          ║
                    ║  - Timer: 30s (exponential backoff)  ║
                    ║                                      ║
                    ╚═════════════╦════════════════════════╝
                                  ║
                                  ║  After resetTimeout
                                  ║  (30s -> 60s -> 120s -> ...)
                                  ║
                                  ▼
                    ╔══════════════════════════════════════╗
                    ║                                      ║
                    ║     HALF_OPEN (Testing)              ║
                    ║                                      ║
                    ║  - Allow limited test requests       ║
                    ║  - Monitor closely for success       ║
                    ║  - One request at a time             ║
                    ║  - Decide: Recover or re-open        ║
                    ║                                      ║
                    ╚══════╦══════════════════════╦════════╝
                           ║                      ║
              SUCCESS      ║                      ║  FAILURE
           (Provider OK)   ║                      ║  (Still broken)
                           ▼                      ▼
              ╔════════════════════╗    ╔═════════════════════╗
              ║  CLOSED            ║    ║  OPEN               ║
              ║  (Auto-recovered)  ║    ║  (Retry later)      ║
              ║  Resume normal ops ║    ║  Increase backoff   ║
              ╚════════════════════╝    ╚═════════════════════╝

┌───────────────────────────────────────────────────────────────────────────┐
│  CIRCUIT BREAKER CONFIGURATION                                            │
├───────────────────────────────────────────────────────────────────────────┤
│  Timeout:              5000ms (per request)                               │
│  Error Threshold:      50% (errors in rolling window)                     │
│  Reset Timeout:        30000ms (initial), exponential backoff             │
│  Rolling Window:       10 seconds                                         │
│  Volume Threshold:     10 requests (minimum before triggering)            │
│  Failure Detector:     HTTP 5xx, Timeout, Network Error                   │
└───────────────────────────────────────────────────────────────────────────┘
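
The state machine and configuration above can be sketched without any library. The project uses Opossum, and this is not its internal implementation; `SketchBreaker` and its field names are hypothetical. Time is passed in explicitly so the transitions are easy to follow, and the sketch does not limit HALF_OPEN to one trial request at a time.

```typescript
type BreakerState = "CLOSED" | "OPEN" | "HALF_OPEN";

interface BreakerConfig {
  errorThresholdPercent: number; // e.g. 50
  volumeThreshold: number;       // e.g. 10 requests before the breaker may trip
  rollingWindowMs: number;       // e.g. 10000
  resetTimeoutMs: number;        // e.g. 30000 (initial cooldown)
}

class SketchBreaker {
  private readonly cfg: BreakerConfig;
  private state: BreakerState = "CLOSED";
  private samples: { at: number; ok: boolean }[] = [];
  private openedAt = 0;
  private currentResetMs: number;

  constructor(cfg: BreakerConfig) {
    this.cfg = cfg;
    this.currentResetMs = cfg.resetTimeoutMs;
  }

  // State at time `now`; OPEN becomes HALF_OPEN once the cooldown has elapsed.
  stateAt(now: number): BreakerState {
    if (this.state === "OPEN" && now - this.openedAt >= this.currentResetMs) {
      this.state = "HALF_OPEN";
    }
    return this.state;
  }

  // Whether a request may be sent at time `now`.
  allows(now: number): boolean {
    return this.stateAt(now) !== "OPEN";
  }

  // Records the outcome of a request that finished at time `now`.
  record(ok: boolean, now: number): void {
    const state = this.stateAt(now);
    if (state === "HALF_OPEN") {
      if (ok) {
        // Trial succeeded: close and reset the backoff.
        this.state = "CLOSED";
        this.currentResetMs = this.cfg.resetTimeoutMs;
        this.samples = [];
      } else {
        // Trial failed: re-open with a doubled cooldown (30s -> 60s -> 120s ...).
        this.currentResetMs *= 2;
        this.trip(now);
      }
      return;
    }
    if (state === "OPEN") return; // late result from before the trip; ignored
    this.samples.push({ at: now, ok });
    this.samples = this.samples.filter((s) => now - s.at < this.cfg.rollingWindowMs);
    const failures = this.samples.filter((s) => !s.ok).length;
    const errorPercent = (failures / this.samples.length) * 100;
    if (
      this.samples.length >= this.cfg.volumeThreshold &&
      errorPercent >= this.cfg.errorThresholdPercent
    ) {
      this.trip(now);
    }
  }

  private trip(now: number): void {
    this.state = "OPEN";
    this.openedAt = now;
    this.samples = [];
  }
}
```

With the values from the configuration box, ten failures inside the 10-second window open the breaker, a trial is allowed 30 seconds later, and a failed trial pushes the next one out to 60 seconds.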


Intelligent Caching Strategy


┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃  TTL STRATEGY REFERENCE TABLE                                               ┃
┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┃                                                                             ┃
┃  ╔════════════════════════╦═══════════════╦══════════════════════════════╗  ┃
┃  ║ BLOCK AGE              ║ TTL           ║ REASON                       ║  ┃
┃  ╠════════════════════════╬═══════════════╬══════════════════════════════╣  ┃
┃  ║ >64 blocks from head   ║ INF (1 year)  ║ Finalized, immutable         ║  ┃
┃  ║ 13-64 blocks from head ║ 5 minutes     ║ Likely finalized, safe       ║  ┃
┃  ║ 1-12 blocks from head  ║ 30 seconds    ║ Unfinalized, may reorg       ║  ┃
┃  ║ "latest" parameter     ║ NEVER CACHE   ║ Always refers to chain head  ║  ┃
┃  ╚════════════════════════╩═══════════════╩══════════════════════════════╝  ┃
┃                                                                             ┃
┃  ╔════════════════════════════════════════╦═══════════════════════════════╗ ┃
┃  ║ METHOD                                 ║ CACHEABLE?                    ║ ┃
┃  ╠════════════════════════════════════════╬═══════════════════════════════╣ ┃
┃  ║ eth_getBlockByNumber (finalized)       ║ YES (with TTL logic)          ║ ┃
┃  ║ eth_getBlockByHash                     ║ YES (infinite TTL)            ║ ┃
┃  ║ eth_getTransactionByHash               ║ YES (infinite TTL)            ║ ┃
┃  ║ eth_getTransactionReceipt              ║ YES (infinite TTL)            ║ ┃
┃  ║ eth_blockNumber                        ║ NO (always current)           ║ ┃
┃  ║ eth_gasPrice                           ║ NO (highly volatile)          ║ ┃
┃  ║ eth_call                               ║ NO (state-dependent)          ║ ┃
┃  ║ eth_getBalance (with "latest")         ║ NO (changes every block)      ║ ┃
┃  ╚════════════════════════════════════════╩═══════════════════════════════╝ ┃
┃                                                                             ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

╔═══════════════════════════════════════════════════════════════════════════════╗
║  CACHE PERFORMANCE METRICS (Expected)                                         ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║  Cache Hit Rate:           40-60% (typical Web3 workload)                     ║
║  Latency Reduction:        ~80% (500ms -> 100ms for cached requests)          ║
║  Cost Savings:             30-50% reduction in provider API calls             ║
║  Memory Usage (Redis):     ~100MB for 50,000 cached blocks                    ║
║  Eviction Policy:          allkeys-lru (Least Recently Used)                  ║
║  Max Memory:               256MB (configurable)                               ║
╚═══════════════════════════════════════════════════════════════════════════════╝
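
The TTL table translates into a small function of how far a block sits behind the chain head. This is a sketch under assumptions: `ttlForBlock` is a hypothetical name, "infinite" is stored as one year in seconds as the table states, and a depth of 0 (the head requested by number) is treated like the 1-12 range because the table does not cover it.

```typescript
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60;

// Returns a TTL in seconds, or null when the response should not be cached.
function ttlForBlock(blockTag: string, headBlock: number): number | null {
  // Named tags such as "latest" are not hex quantities and are never cached.
  if (!/^0x[0-9a-fA-F]+$/.test(blockTag)) return null;
  const depth = headBlock - parseInt(blockTag, 16);
  if (depth < 0) return null;                 // block is ahead of the known head
  if (depth > 64) return ONE_YEAR_SECONDS;    // finalized, immutable
  if (depth >= 13) return 5 * 60;             // 13-64 blocks from head
  return 30;                                  // 0-12 blocks from head, may reorg
}
```

For the request in the flow diagram, `0x12A4B7C` is block 19,549,052, so any head more than 64 blocks past it yields the one-year TTL.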


🏗️ Architecture Choices

Layered Architecture Pattern

This project follows a clean layered architecture for maintainability, testability, and separation of concerns:

Benefits:

  • Separation of Concerns: Each layer has a single, well-defined responsibility
  • Testability: Easy to mock dependencies and write unit tests for each layer
  • Maintainability: Changes in one layer don't cascade to others (loose coupling)
  • Scalability: Individual layers can be optimized or replaced independently
  • Reusability: Service layer logic can be reused across different controllers
  • Clarity: New developers can quickly understand the codebase structure

✨ Features

Core Functionality

✅ HTTP server accepting JSON-RPC requests on port 8080
✅ Multiple backend RPC providers (configurable via environment variables)
✅ Routing Strategies:

  • round-robin: Distribute evenly across healthy providers
  • weighted: Route based on EWMA latency (faster providers get more traffic)

✅ Admin API for provider management (add/remove/enable/disable/update weights)
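
The weighted strategy can be illustrated with three small functions: an EWMA update, a weight calculation, and a weighted pick. This is a sketch, not the repository's code. The smoothing factor `alpha = 0.2` and all function names are assumptions, and weighting by inverse latency is also an assumption, although it does reproduce the 32% / 46% / 22% split shown in the request flow diagram.

```typescript
// EWMA update: `alpha` is the weight given to the newest latency sample.
function updateEwma(previousMs: number, sampleMs: number, alpha = 0.2): number {
  return alpha * sampleMs + (1 - alpha) * previousMs;
}

// Weights proportional to 1 / latency, normalized so they sum to 1.
function latencyWeights(latenciesMs: Record<string, number>): Record<string, number> {
  const inverse = Object.entries(latenciesMs).map(([id, ms]) => [id, 1 / ms] as const);
  const total = inverse.reduce((sum, [, w]) => sum + w, 0);
  const weights: Record<string, number> = {};
  for (const [id, w] of inverse) weights[id] = w / total;
  return weights;
}

// Picks a provider by walking the cumulative distribution with a draw in [0, 1).
function pickWeighted(weights: Record<string, number>, draw: number): string {
  let cumulative = 0;
  let last = "";
  for (const [id, w] of Object.entries(weights)) {
    cumulative += w;
    last = id;
    if (draw < cumulative) return id;
  }
  return last; // fallback if floating-point rounding leaves the sum just below 1
}
```

With latencies of 142 ms, 98 ms, and 210 ms for Infura, Alchemy, and QuickNode, a draw of 0.521 falls in Alchemy's range, matching the selection in the diagram. In production the draw would come from `Math.random()`.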

Health Monitoring

✅ Periodic health checks with staggered jitter (30s cycle)
✅ Success/failure rate tracking per provider
Circuit Breaker: Disables a provider after N failures and re-enables it automatically using exponential backoff
✅ Configurable thresholds (timeout, error threshold, reset timeout)
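The project uses Opossum for circuit breaking. The sketch below does not use Opossum's API. It is a small hand-rolled state machine meant only to illustrate the closed, open, and half-open cycle with exponential backoff described above. The class name, constructor parameters, and the doubling rule are assumptions for this example.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Illustrative breaker, not Opossum. The clock is injectable for testing.
class SimpleBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private currentResetMs: number;
  private readonly failureThreshold: number;
  private readonly baseResetMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, baseResetMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.baseResetMs = baseResetMs;
    this.currentResetMs = baseResetMs;
    this.now = now;
  }

  // An open breaker becomes half-open once the current reset timeout elapses.
  getState(): BreakerState {
    if (this.state === "open" && this.now() - this.openedAt >= this.currentResetMs) {
      this.state = "half-open";
    }
    return this.state;
  }

  canRequest(): boolean {
    return this.getState() !== "open";
  }

  recordSuccess(): void {
    this.state = "closed";
    this.failures = 0;
    this.currentResetMs = this.baseResetMs;
  }

  recordFailure(): void {
    if (this.getState() === "half-open") {
      // Trial request failed: reopen and double the wait (exponential backoff).
      this.currentResetMs *= 2;
      this.trip();
      return;
    }
    this.failures += 1;
    if (this.failures >= this.failureThreshold) this.trip();
  }

  private trip(): void {
    this.state = "open";
    this.openedAt = this.now();
  }
}
```

A router would call `canRequest()` before selecting a provider and report the outcome with `recordSuccess()` or `recordFailure()` afterwards.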

Intelligent Caching

Cacheable Methods: eth_getBlockByNumber, eth_getBlockByHash, eth_getTransactionByHash, eth_getTransactionReceipt
Non-Cacheable Methods: eth_blockNumber, eth_gasPrice, eth_call, any call with "latest" parameter
TTL Strategy:

  • Infinite TTL for finalized blocks (>64 blocks old)
  • 5-minute TTL for recent blocks (within 64 blocks)
  • 30-second TTL for unfinalized blocks

✅ Redis-backed with automatic key generation
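A minimal sketch of how the cacheability check, TTL tiers, and key generation above might fit together. The key format (`rpc:<method>:<params>`) is hypothetical. The sketch also makes an assumption about the tiers: the 30-second tier is applied when the result's block is not yet known (for example, a transaction that has not been included), since the source does not spell out how "recent" and "unfinalized" are distinguished.

```typescript
const CACHEABLE_METHODS = new Set([
  "eth_getBlockByNumber",
  "eth_getBlockByHash",
  "eth_getTransactionByHash",
  "eth_getTransactionReceipt",
]);

const FINALITY_DEPTH = 64; // blocks, taken from the TTL tiers above

function isCacheable(method: string, params: unknown[]): boolean {
  if (!CACHEABLE_METHODS.has(method)) return false;
  // Any call that references the moving "latest" tag is never cached.
  return !params.some((p) => p === "latest");
}

// Returns the TTL in seconds, or null for "no expiry".
function ttlSeconds(resultBlock: number | null, headBlock: number): number | null {
  if (resultBlock === null) return 30; // assumption: block not yet known
  const age = headBlock - resultBlock;
  if (age > FINALITY_DEPTH) return null; // finalized: cache indefinitely
  return 300; // within 64 blocks of the head: 5 minutes
}

// Hypothetical key format: deterministic for identical method and params.
function cacheKey(method: string, params: unknown[]): string {
  return `rpc:${method}:${JSON.stringify(params)}`;
}
```

Because the key is derived only from the method and its parameters, identical requests map to the same Redis entry regardless of which provider served the original response.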

Observability

Structured Logging (Pino): Request tracing with correlation IDs
Prometheus Metrics:

  • Per-provider: request count, success/failure rate, latency, circuit breaker state
  • System-wide: cache hit rate, active providers, error rate by type

Alerting: Email alerts for critical conditions (all providers down, cache issues)
Grafana Dashboards: Pre-configured dashboards for provider health, request distribution, cache stats

Bonus Features

Request Retry: Automatically retries with a different provider on failure
Docker Compose: Full stack (Redis, Prometheus, Grafana, Loki, Tempo)
Load Testing: Autocannon-based load test scripts with realistic traffic patterns
Operator Dashboard: React-based UI for real-time monitoring
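The request-retry behaviour can be sketched as a loop that tries providers in order and returns the first success. The function name `callWithFailover`, the `RpcCall` type, and the `maxAttempts` parameter are assumptions for this illustration; the project's retry policy (ordering, attempt limits, which errors are retryable) may differ.

```typescript
// Hypothetical callback: performs one RPC call against the given provider.
type RpcCall = (providerId: string) => Promise<unknown>;

async function callWithFailover(
  providerIds: string[],
  call: RpcCall,
  maxAttempts: number = providerIds.length,
): Promise<{ providerId: string; result: unknown }> {
  const errors: string[] = [];
  // Try each candidate once, moving to the next provider on failure.
  for (const providerId of providerIds.slice(0, maxAttempts)) {
    try {
      return { providerId, result: await call(providerId) };
    } catch (err) {
      errors.push(`${providerId}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`all providers failed (${errors.join("; ")})`);
}
```

Collecting the per-provider error messages keeps the final failure debuggable when every candidate has been exhausted.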


✅ Requirements Verification

For detailed verification of all problem statement requirements, see:

📄 REQUIREMENTS-VERIFICATION.md

Summary: ✅ ALL REQUIREMENTS MET (100%)

  • ✅ Core Functionality (9/9)
  • ✅ Health Monitoring (7/7)
  • ✅ Intelligent Caching (10/10)
  • ✅ Observability (9/9)
  • ✅ Alerting (6/6)
  • ✅ Bonus Features (3/3)

📡 API Reference

For complete API documentation, see API-REFERENCE.md.

Quick Overview

Public Endpoints (Port 8080):

  • POST / - JSON-RPC proxy for Ethereum requests
  • GET /providers - View provider statistics (read-only)
  • GET /health - Health check
  • GET /health/detailed - Detailed health information
  • GET /metrics - Prometheus metrics

Admin Endpoints (Port 8081) - Basic Auth Required (admin:changeme):

  • GET /admin/providers - View all providers
  • GET /admin/providers/:id - View specific provider
  • POST /admin/providers - Add new provider
  • PATCH /admin/providers/:id - Update provider (weight, enable/disable)
  • DELETE /admin/providers/:id - Remove provider
  • POST /admin/providers/:id/enable - Force enable provider
  • POST /admin/providers/:id/disable - Force disable provider
  • GET /admin/cache/stats - View cache statistics
  • DELETE /admin/cache - Clear cache (all/provider/pattern)

Authentication: Admin endpoints use HTTP Basic Authentication (not JWT).

# Example: View all providers
curl -u admin:changeme http://localhost:8081/admin/providers

# Example: JSON-RPC request
curl -X POST http://localhost:8080 \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "eth_blockNumber",
    "params": [],
    "id": 1
  }'
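For callers working in TypeScript, the same JSON-RPC request can be issued with the global `fetch` available in Node.js 18+. This is a hedged sketch: the helper names (`buildRequest`, `hexToNumber`, `rpcCall`) are hypothetical and not part of this project's API, and error handling is kept minimal.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  method: string;
  params: unknown[];
  id: number;
}

function buildRequest(method: string, params: unknown[] = [], id = 1): JsonRpcRequest {
  return { jsonrpc: "2.0", method, params, id };
}

// Ethereum quantities are hex strings such as "0x10".
// Values beyond Number.MAX_SAFE_INTEGER would need BigInt instead.
function hexToNumber(quantity: string): number {
  if (!/^0x[0-9a-fA-F]+$/.test(quantity)) throw new Error(`not a hex quantity: ${quantity}`);
  return Number.parseInt(quantity, 16);
}

async function rpcCall(url: string, method: string, params: unknown[] = []): Promise<unknown> {
  const response = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildRequest(method, params)),
  });
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  const payload = (await response.json()) as {
    result?: unknown;
    error?: { code: number; message: string };
  };
  if (payload.error) throw new Error(`RPC error ${payload.error.code}: ${payload.error.message}`);
  return payload.result;
}

// Usage (requires the load balancer to be running locally):
// const head = hexToNumber((await rpcCall("http://localhost:8080", "eth_blockNumber")) as string);
```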

→ See API-REFERENCE.md for complete documentation with request/response examples.


🚀 Local Development

Prerequisites

  • Node.js 18+
  • Docker & Docker Compose
  • RPC Provider API Keys: Infura, Alchemy, or any Ethereum JSON-RPC provider (optional - public endpoints available)

Backend Setup

  1. Clone the repository

    git clone https://github.com/HarshitPG/Ethereum-RPC-Load-Balancer.git
    cd Ethereum-RPC-Load-Balancer
  2. Install backend dependencies

    npm install
  3. Configure environment variables

    cp .env.example .env

    Edit .env with your configuration. The project includes working public RPC endpoints, or you can add your own API keys:

    # Public endpoints (works out of the box)
    INFURA_URL=https://eth.llamarpc.com
    ALCHEMY_URL=https://ethereum.publicnode.com
    
    # Or use your own keys
    # INFURA_URL=https://mainnet.infura.io/v3/YOUR_API_KEY
    # ALCHEMY_URL=https://eth-mainnet.g.alchemy.com/v2/YOUR_API_KEY
  4. Start all services (Redis, Prometheus, Grafana, Loki, Tempo)

    docker-compose up -d

    This spins up:

    • Redis (Port 6379)
    • Prometheus (Port 9090)
    • Grafana (Port 3001 - username: admin, password: admin)
    • Loki (Port 3100)
    • Tempo (Port 3200)
  5. Start the backend application

    # Development mode (hot reload)
    npm run dev
    
    # Production mode
    npm run build
    npm start

    Backend runs on:

    • Public API: http://localhost:8080
    • Admin API: http://localhost:8081

  6. Test the backend API

    # Health check
    curl http://localhost:8080/health
    
    # JSON-RPC request
    curl -X POST http://localhost:8080 \
      -H "Content-Type: application/json" \
      -d '{
        "jsonrpc": "2.0",
        "method": "eth_blockNumber",
        "params": [],
        "id": 1
      }'
    
    # View providers
    curl http://localhost:8080/providers
    
    # Admin API (basic auth required)
    curl http://localhost:8081/admin/providers \
      -u admin:changeme

Frontend Setup (Operator Dashboard)

  1. Navigate to dashboard directory

    cd dashboard
  2. Install frontend dependencies

    npm install
  3. Configure API endpoint (if needed)

    Edit dashboard/.env or dashboard/src/config.ts to point to your backend:

    VITE_API_URL=http://localhost:8080
    VITE_ADMIN_API_URL=http://localhost:8081
  4. Start the frontend development server

    npm run dev

    Dashboard runs on: http://localhost:5173

  5. Build for production

    npm run build

    Output will be in dashboard/dist/

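Access Services

  • Load balancer (JSON-RPC, health, metrics): http://localhost:8080
  • Admin API: http://localhost:8081 (basic auth)
  • Operator Dashboard: http://localhost:5173
  • Grafana: http://localhost:3001 (username: admin, password: admin)
  • Prometheus: http://localhost:9090
  • Loki: http://localhost:3100
  • Tempo: http://localhost:3200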


🐳 Docker Compose Stack

To stop all services:

docker-compose down

To view logs:

# All services
docker-compose logs -f

# Specific service
docker-compose logs -f redis
docker-compose logs -f prometheus
docker-compose logs -f grafana

To reset data (clear cache, metrics):

docker-compose down -v
rm -rf data/

Grafana Dashboards

Pre-configured dashboards are automatically provisioned:

  1. RPC Load Balancer Overview

    • Total requests, cache hit rate, error rate
    • Provider health status
    • Request distribution
  2. Provider Analytics

    • Per-provider request count, latency, errors
    • Circuit breaker state transitions
    • Success/failure rates
  3. Cache Performance

    • Hit/miss rates by method
    • TTL distribution
    • Memory usage

🧪 Testing

Unit & Integration Tests

# Run all tests
npm test

# Run specific test file
npm test test/1-core-functionality.test.ts

# Run with coverage
npm test -- --coverage

Test Suites

  1. Core Functionality (1-core-functionality.test.ts)

    • Provider registration, routing strategies, request handling
  2. Health Monitoring (2-health-monitoring.test.ts)

    • Health checks, circuit breakers, auto-failover
  3. Caching Layer (3-caching-layer.test.ts)

    • Cache hit/miss, TTL strategy, invalidation
  4. Observability (4-observability.test.ts)

    • Logging, metrics, correlation IDs
  5. Alerting (5-alerting.test.ts)

    • Email alerts, alert conditions
  6. Circuit Breaker Integration (7-circuit-breaker-integration.test.ts)

    • Failure detection, state transitions
  7. Performance & Load Testing (8-performance-load-testing.test.ts)

    • Throughput, latency under load

Load Testing

# Minimal load test (30 seconds)
bash scripts/load-test-minimal.sh

# Custom load test with autocannon
npx autocannon -c 50 -d 60 -m POST \
  -H "Content-Type: application/json" \
  -b '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
  http://localhost:8080

Expected Results:

  • Throughput: 500-1000 req/s (cached), 100-200 req/s (uncached)
  • Latency (p95): <100ms (cached), <500ms (uncached)
  • Error Rate: <1%
