amk9978/People-Match-Engine

🤝 Match Engine

Find the best connections between people: an AI-powered matching engine for events, recruiting, and communities.

Enterprise-grade Professional Network Analysis Platform

Try the demo

A sophisticated graph-based matching system that discovers optimal professional communities using advanced multi-feature similarity and AI-powered complementarity analysis. Built for scalable network intelligence with real-time processing capabilities.

FastAPI Redis NetworkX OpenAI

🛠️ Quick Start

Prerequisites

  • Docker & Docker Compose
  • OpenAI API key
  • Redis instance

Installation

git clone https://github.com/amk9978/people_match_engine
cd people_match_engine

# Configure environment
cp .env.example .env
# Edit .env with your OpenAI API key and Redis config

# Launch the system
docker compose up -d

Usage Example

curl -X POST "http://localhost:8000/analyze" \
  -H "X-User-ID: startup_founder" \
  -F "file=@team_candidates.csv" \
  -F "prompt=I'm building a fintech startup and need complementary technical and business expertise"

Sample dataset input:

|Person Name  |Person Title           |Person Company  |LinkedIn URL                            |Professional Identity - Role Specification                                                                                                         |Professional Identity - Experience Level                                                                                                                       |Company Identity - Industry Classification                                                                                                 |Company Market - Market Traction                                                                                                                          |Company Offering - Value Proposition                                                                                                                      |All Persona Titles                             |
|-------------|-----------------------|----------------|----------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------|
|Gaurav Bharaj|Co-Founder / AI        |Reality Defender|https://www.linkedin.com/in/gauravbharaj|Co-Founder & Head of AI | C-Suite Executive Level | AI Research Strategy and Team Leadership | Strategic Technology and Research Decision Authority|15+ Years Total Experience | Academic Research to Executive Progression | 4+ Years Leadership Roles | Deep Computer Vision AI Specialization                   |Artificial Intelligence Security | Deepfake Detection Technology | Multimodal Synthetic Media Detection | Financial Services and Government|Government and Enterprise Customers | Early Adopter Market Penetration | Series A $33M Funding Growth | North America Primary with International Expansion|Real-Time Deepfake Detection | 98.6% Detection Accuracy Rate | Eliminates Synthetic Media Fraud Risk | Multimodal Analysis Versus Single-Channel Detection|Fortune 500 Chief Information Security Officers; AI Governance Policy Architects; Cloud Security Platform Leaders; Hypergrowth AI Company Veterans; Government Technology Integration Specialists|
|Melanie Erk  |Event Marketing Manager|IBM             |https://www.linkedin.com/in/melanieerk  |Top Account Event Marketing Manager | Senior Management Level | Strategic Event Marketing Leadership | Enterprise Account Event Decision Authority |10 Years Total Experience | Individual Contributor to Management Progression | 3 Years Senior Marketing Leadership | Deep Event Marketing Domain Specialization|Information Technology Services & Consulting | Hybrid Cloud and AI Solutions | Enterprise Digital Transformation | Regulated Industries    |270,300 Global Employees | Established Enterprise Market Leader | $62.8 Billion Annual Revenue | Global Presence with Regional Specialization             |Accelerates Digital Transformation | 176% ROI Over Three Years | Eliminates Hybrid Cloud Complexity | Integrated AI and Cloud Versus Point Solutions      |Enterprise CMOs Driving AI Transformation; Event Technology Innovation Leaders; Senior Event Marketing Directors; Marketing Technology Analysts; Executive Event Production Specialists|
|Ruma nair    |Principal Product Manager|Twilio          |https://www.linkedin.com/in/ruma-n-41b9459/|Principal Product Manager | Senior Individual Contributor Level | AI Product Strategy and Development | Product Roadmap Decision Authority         |6 Years Current Role Experience | Individual Contributor to Senior IC Progression | Product Strategy Leadership Experience | Deep AI and CX Domain Specialization|Cloud Communications Platform | Customer Engagement Technology | AI-Powered Contact Center Solutions | Enterprise Communications           |335000 Active Customer Accounts | Leading CPaaS Market Position | 7% Annual Revenue Growth | Global Presence with North America Dominance                 |Accelerates Customer Engagement | 50% Reduction in Agent Onboarding Time | Eliminates Communication Infrastructure Complexity | API-First Versus Monolithic Platform Advantage|Enterprise CX Transformation Leaders; AI Model Partnership Executives; AI-Native Contact Center Founders; Regulatory AI Compliance Experts; Enterprise AI Investment Analysts|
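
Before uploading, it can help to confirm that a CSV carries the columns shown in the sample above. The following is a minimal client-side sketch, not the engine's own preprocessing: the column names are copied from the sample header, while `REQUIRED_COLUMNS` and `load_complete_rows` are hypothetical names introduced here for illustration.

```python
import csv
import io

# Column names copied from the sample dataset header above.
REQUIRED_COLUMNS = [
    "Person Name",
    "Person Title",
    "Person Company",
    "LinkedIn URL",
    "Professional Identity - Role Specification",
    "Professional Identity - Experience Level",
    "Company Identity - Industry Classification",
    "Company Market - Market Traction",
    "Company Offering - Value Proposition",
    "All Persona Titles",
]


def load_complete_rows(csv_text: str) -> list[dict[str, str]]:
    """Return only the rows that have a non-empty value in every required column.

    Hypothetical helper. Raises ValueError when a required column is absent.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    rows = []
    for row in reader:
        # Short rows yield None for absent fields, so normalize to "" first.
        if all((row.get(col) or "").strip() for col in REQUIRED_COLUMNS):
            rows.append({col: row[col].strip() for col in REQUIRED_COLUMNS})
    return rows
```

Dropping partial rows locally mirrors the "remove partial records" step the README describes, but the server may apply different rules.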


🚀 What Makes It Powerful

Match Engine revolutionizes professional network analysis by combining traditional graph algorithms with modern AI to find the most strategically valuable professional communities. Unlike simple similarity matching, our system understands the complementary value of professional relationships.

🎪 Behind the System

  1. 🔍 Intelligent Data Processing - Uploads professional CSV data with automatic deduplication and tag extraction
  2. 🤖 AI-Powered Complementarity Analysis - ChatGPT analyzes complete professional profiles to score strategic relationship value
  3. ⚖️ Dynamic Weight Optimization - Auto-tunes similarity vs complementarity weights based on user intent prompts
  4. 🕸️ Advanced Graph Construction - Builds weighted networks combining embedding similarity with AI-scored complementarity
  5. 💎 Dense Subgraph Discovery - Employs sophisticated algorithms to find the most connected professional communities
  6. 📊 Real-Time Insights - WebSocket-powered live updates with comprehensive visualization and analytics

🏗️ System Architecture

System Flow Architecture

User Request → REST API → Preprocessing → Embedding Tags → Semantic Preprocessing →
Similarity Matrices → Complementarity Matrices → Hyperparameter Tuning →
Graph Building → Dense Subgraph Discovery → Subgraph Analysis → Frontend Rendering

Performance Layer:

  • Caching: Redis stores preprocessing results, embeddings, and computed matrices
  • Batching: Bulk operations for semantic deduplication and matrix computation
  • FAISS: High-performance vector similarity search for large datasets
  • Async: Concurrent processing for matrix building and AI scoring operations
  • Generalized Mean: Combines similarity and complementarity scores into a single edge weight using a generalized (power) mean. By the power mean inequality, the result never decreases as the exponent increases, so the exponent sets how strongly a weak score pulls the edge weight down
  • Embedding Providers: Supports OpenAI embeddings as well as FastEmbed behind a common interface
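
To make the generalized mean concrete, here is a minimal sketch of a weighted power mean, M_p = (sum of w_i * x_i^p)^(1/p). The README does not state which exponent or weights the engine uses, so `p`, the function name `power_mean`, and the example values are assumptions for illustration only.

```python
import math


def power_mean(scores: list[float], weights: list[float], p: float) -> float:
    """Weighted generalized (power) mean of non-negative scores.

    p = 1 gives the arithmetic mean, p = 0 the geometric mean (as a limit),
    and p = -1 the harmonic mean. Lower p penalizes weak scores more.
    """
    if len(scores) != len(weights) or not scores:
        raise ValueError("scores and weights must be non-empty and the same length")
    if any(s < 0 for s in scores) or any(w < 0 for w in weights):
        raise ValueError("scores and weights must be non-negative")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")

    # Normalize weights to sum to 1 and drop zero-weight terms.
    pairs = [(w / total, s) for w, s in zip(weights, scores) if w > 0]
    if p <= 0 and any(s == 0 for _, s in pairs):
        return 0.0  # limit value: a zero score forces the mean to zero
    if p == 0:
        return math.exp(sum(w * math.log(s) for w, s in pairs))
    return sum(w * s ** p for w, s in pairs) ** (1.0 / p)


# Illustrative pair: similarity 0.9, complementarity 0.4, weights 0.6 / 0.4.
for p in (-1, 0, 1):
    print(p, round(power_mean([0.9, 0.4], [0.6, 0.4], p), 4))
```

For this pair the harmonic, geometric, and arithmetic means come out at roughly 0.60, 0.65, and 0.70, which is the ordering the power mean inequality guarantees.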

🔄 System Flow Details

  1. 👤 User Request - CSV upload with optional intent prompt via REST API
  2. 🌐 REST API - FastAPI endpoints handle requests with async processing
  3. 🧹 Preprocessing - Remove partial records, validate data integrity
  4. 🏷️ Embedding Tags - Convert text features to vector representations
  5. 🤖 Semantic Preprocessing - AI-powered deduplication of similar tags/profiles
  6. 📊 Similarity Matrices - Compute feature-based similarity scores
  7. 🔄 Complementarity Matrices - AI-analyzed strategic relationship value
  8. ⚙️ Hyperparameter Tuning - Auto-optimize similarity/complementarity weights
  9. 🕸️ Graph Building - Calculate edge weights using generalized mean for optimal aggregation
  10. 💎 Dense Subgraph Discovery - Find maximum density professional communities
  11. 📈 Subgraph Analysis - Extract insights, cycles, and community structure
  12. 🎨 Frontend Rendering - Interactive visualizations with real-time updates
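
The README does not say which algorithm the dense subgraph step uses. As one possibility, the sketch below implements greedy peeling, a well-known approximation for the densest-subgraph problem: repeatedly remove the node with the lowest weighted degree and keep the intermediate subgraph with the highest density (total edge weight divided by node count). The function name, the edge-dictionary input format, and the density definition are assumptions.

```python
def densest_subgraph(edges: dict[tuple[str, str], float]) -> tuple[set[str], float]:
    """Greedy peeling over an undirected weighted graph.

    Returns (nodes, density), where density = total edge weight / node count.
    """
    # Build a symmetric adjacency map, ignoring self-loops.
    adj: dict[str, dict[str, float]] = {}
    for (u, v), w in edges.items():
        if u == v:
            continue
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w

    degree = {node: sum(nbrs.values()) for node, nbrs in adj.items()}
    total = sum(degree.values()) / 2.0  # each edge was counted twice
    remaining = set(adj)
    best_nodes = set(remaining)
    best_density = total / len(remaining) if remaining else 0.0

    while len(remaining) > 1:
        # Peel the weakest-connected node; ties are broken by name.
        node = min(remaining, key=lambda n: (degree[n], n))
        for nbr, w in adj[node].items():
            if nbr in remaining:
                degree[nbr] -= w
                total -= w
        remaining.remove(node)
        density = total / len(remaining)
        if density > best_density:
            best_nodes, best_density = set(remaining), density
    return best_nodes, best_density


# A tight triangle plus one loosely attached person.
graph = {("a", "b"): 0.9, ("b", "c"): 0.9, ("a", "c"): 0.9, ("a", "d"): 0.1}
print(densest_subgraph(graph))
```

In this toy graph, peeling `d` raises the density, so the triangle is returned. The linear scan for the minimum keeps the sketch short; a heap would be the usual choice for larger graphs.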

⚡ Performance Optimizations

  • 💾 Redis Caching - Persistent storage for embeddings, matrices, and results
  • 🔍 FAISS Integration - High-performance vector similarity search for large datasets
  • 📦 Batch Processing - Optimized bulk operations for semantic preprocessing
  • ⚡ Async Operations - Concurrent matrix building and API request handling
  • 🧮 Generalized Mean - Aggregates multiple similarity scores into edge weights with a power mean, whose exponent controls the balance between rewarding strong scores and penalizing weak ones
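
As an illustration of the async point above, the sketch below scores every pair of people concurrently while capping the number of in-flight calls. It is not taken from the project: `score_pair` is a stand-in for a slow scoring call such as an LLM request, and `score_all_pairs` and `max_concurrency` are hypothetical names.

```python
import asyncio
from itertools import combinations


async def score_pair(a: str, b: str) -> float:
    """Placeholder for a slow scoring call; returns a dummy score."""
    await asyncio.sleep(0.01)  # simulate network latency
    return round(1.0 / (1 + abs(len(a) - len(b))), 3)


async def score_all_pairs(
    people: list[str], max_concurrency: int = 8
) -> dict[tuple[str, str], float]:
    """Score all unordered pairs concurrently, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(a: str, b: str) -> tuple[tuple[str, str], float]:
        async with semaphore:  # limit concurrent calls to the scorer
            return (a, b), await score_pair(a, b)

    results = await asyncio.gather(
        *(bounded(a, b) for a, b in combinations(people, 2))
    )
    return dict(results)


if __name__ == "__main__":
    print(asyncio.run(score_all_pairs(["Ann", "Bob", "Carla"])))
```

The semaphore matters when the scorer is a rate-limited external API: `asyncio.gather` alone would launch every request at once.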

🎨 Core Capabilities

🧠 AI-Powered Analysis

  • Multi-Dimensional Matching - Analyzes role, experience, persona, industry, market, and offerings
  • Strategic Complementarity Scoring - ChatGPT evaluates professional synergy potential
  • Intent-Aware Optimization - Dynamically adjusts matching criteria based on user goals
  • Advanced Graph Algorithms - Employs density-based subgraph mining and community detection

⚡ Performance & Scale

  • Real-Time Processing - WebSocket-powered live analysis updates
  • Enterprise Caching - Redis-backed performance optimization
  • FAISS Integration - Vector similarity search for large datasets
  • Async Architecture - Concurrent processing for maximum throughput

📊 Professional Analytics

  • Maximum Weight Cycles - Discovers optimal professional collaboration chains
  • Community Detection - Identifies natural professional clusters using Louvain/Greedy Modularity
  • Feature Importance Analysis - Reveals which attributes drive the strongest connections
  • Interactive Visualizations - D3.js-powered network graphs with MDS layout
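
To illustrate what a maximum weight cycle means, here is a brute-force sketch that searches all simple cycles of a fixed length `k` and returns the one with the highest mean edge weight. The README does not define how the cycle weight is computed or which search the engine performs, so the mean-weight scoring, the fixed length, and the name `max_weight_cycle` are assumptions. Exhaustive search like this is only practical for small subgraphs.

```python
from itertools import combinations, permutations
from typing import Optional


def max_weight_cycle(
    weights: dict[tuple[str, str], float], k: int = 3
) -> tuple[Optional[list[str]], float]:
    """Return (cycle, mean edge weight) for the best simple cycle on k nodes.

    The cycle is a node list that starts and ends at the same node.
    Returns (None, -inf) when no cycle of that length exists.
    """
    nodes = sorted({n for edge in weights for n in edge})

    def weight(u: str, v: str) -> Optional[float]:
        # Edges are undirected, so look up both orientations.
        return weights.get((u, v), weights.get((v, u)))

    best_cycle, best_score = None, float("-inf")
    for group in combinations(nodes, k):
        first, rest = group[0], group[1:]
        for order in permutations(rest):
            if order[0] > order[-1]:
                continue  # skip the mirrored traversal of the same cycle
            path = (first, *order, first)
            edge_weights = [weight(a, b) for a, b in zip(path, path[1:])]
            if any(w is None for w in edge_weights):
                continue  # some consecutive pair has no edge
            score = sum(edge_weights) / k
            if score > best_score:
                best_cycle, best_score = list(path), score
    return best_cycle, best_score


edges = {
    ("a", "b"): 0.90, ("b", "c"): 0.95, ("a", "c"): 0.91,
    ("a", "d"): 0.20, ("b", "d"): 0.30, ("c", "d"): 0.10,
}
print(max_weight_cycle(edges))
```

Here the triangle over `a`, `b`, and `c` wins because its three edges are all strong, while every cycle through `d` includes at least two weak edges.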

🔧 Enterprise Features

  • Job Persistence - Redis-backed result storage with job ID retrieval
  • Dataset Versioning - Track changes, revert, and analyze different data versions
  • Multi-User Support - User profiles with usage statistics and file management
  • Flexible Data Input - CSV upload with automatic validation and preprocessing

📚 API Documentation

Access the interactive OpenAPI documentation at http://localhost:8000/docs

🔑 Core Endpoints

|Endpoint                     |Method   |Purpose                                                   |
|-----------------------------|---------|----------------------------------------------------------|
|/analyze                     |POST     |Upload CSV and initiate analysis with optional user prompt|
|/jobs/{job_id}               |GET      |Monitor analysis progress and retrieve status             |
|/jobs/{job_id}/result        |GET      |Retrieve completed analysis results from Redis            |
|/users/me                    |GET      |User profile, statistics, and file management             |
|/datasets/{filename}/add-rows|POST     |Dynamically modify datasets with new entries              |
|/cache/info                  |GET      |Redis performance metrics and cache statistics            |
|/ws/{client_id}              |WebSocket|Real-time analysis updates and progress monitoring        |
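
A client that does not use the WebSocket can poll the two job endpoints instead. The sketch below is an assumption-heavy illustration: the README does not document the status payload, so the `status` field and its `"completed"` and `"failed"` values are guesses to check against the OpenAPI docs, and `wait_for_result` and `fetch_json` are hypothetical names. The HTTP call is injected as a function so the loop stays independent of any particular client library.

```python
import time
from typing import Callable


def wait_for_result(
    job_id: str,
    fetch_json: Callable[[str], dict],
    poll_seconds: float = 2.0,
    timeout_seconds: float = 300.0,
) -> dict:
    """Poll /jobs/{job_id} until the job finishes, then fetch its result.

    fetch_json takes a URL path and returns the decoded JSON body.
    ASSUMPTION: the status payload has a "status" field that becomes
    "completed" or "failed"; the real field names may differ.
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        state = fetch_json(f"/jobs/{job_id}").get("status")
        if state == "completed":
            return fetch_json(f"/jobs/{job_id}/result")
        if state == "failed":
            raise RuntimeError(f"job {job_id} failed")
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish in {timeout_seconds} seconds")
```

In practice `fetch_json` would wrap an HTTP GET against `http://localhost:8000` and attach the `X-User-ID` header used in the usage example.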

📤 Response Examples

Analysis Results Structure
{
  "job_id": "uuid-string",
  "subgraph_info": {
    "nodes": [
      "member1",
      "member2",
      "member3"
    ],
    "density": 0.85,
    "communities": {
      "community_1": [
        "member1",
        "member2"
      ],
      "community_2": [
        "member3",
        "member4"
      ]
    },
    "maximum_cycle": {
      "cycle": [
        "member1",
        "member2",
        "member3",
        "member1"
      ],
      "weight": 0.92,
      "summary": "High-synergy collaboration chain"
    },
    "feature_analysis": {
      "most_important_features": [
        "experience",
        "industry"
      ],
      "tuned_weights": {
        "similarity": 0.6,
        "complementarity": 0.4
      }
    },
    "dataset_values": {
      "member1": {
        "role": "Engineer",
        "experience": "5 years"
      },
      "member2": {
        "role": "Designer",
        "experience": "3 years"
      }
    }
  },
  "visualization": {
    "stress_layout": {
      "coordinates": {
        ...
      }
    },
    "edge_weights": {
      ...
    }
  }
}
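
A short sketch of reading this structure on the client side follows. It relies only on the keys shown in the example above, uses a trimmed copy of that example as input, and leaves out the elided `visualization` section; `summarize` is a hypothetical helper name.

```python
import json

# Trimmed copy of the example response above (elided sections left out).
SAMPLE = json.loads("""
{
  "job_id": "uuid-string",
  "subgraph_info": {
    "nodes": ["member1", "member2", "member3"],
    "density": 0.85,
    "communities": {
      "community_1": ["member1", "member2"],
      "community_2": ["member3", "member4"]
    },
    "maximum_cycle": {
      "cycle": ["member1", "member2", "member3", "member1"],
      "weight": 0.92
    },
    "feature_analysis": {
      "tuned_weights": {"similarity": 0.6, "complementarity": 0.4}
    }
  }
}
""")


def summarize(result: dict) -> dict:
    """Pull the headline numbers out of an analysis result."""
    info = result["subgraph_info"]
    weights = info["feature_analysis"]["tuned_weights"]
    return {
        "members": len(info["nodes"]),
        "density": info["density"],
        "community_sizes": {
            name: len(members) for name, members in info["communities"].items()
        },
        "cycle_path": " -> ".join(info["maximum_cycle"]["cycle"]),
        "dominant_signal": max(weights, key=weights.get),
    }


print(summarize(SAMPLE))
```

The `dominant_signal` entry simply reports whichever tuned weight is larger, which gives a quick read on whether the prompt pushed the match toward similarity or complementarity.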

🔮 Future Roadmap

  • Replace ChatGPT as the judge with simpler, faster models
  • Dynamic columns support
  • Extend the list of features
  • Add human feedback to the system (RLHF)
  • Use GPU for matrices and graph operations instead of CPU

About

A neat match engine that finds the most closely related people in a dataset as a subgraph. A must-have tool if you are a meeting organizer.
