Skip to content

jaggernaut007/RevGeniAgent

Repository files navigation

RevGeniAgent

Automated multi-agent system for discovering B2B companies, extracting decision‑maker contacts, validating lead quality, and (scaffolded) sending personalized outbound emails. Built with CrewAI and powered by Tavily search, exposed via a FastAPI backend.

Table of Contents

  1. Overview
  2. Core Features
  3. Architecture & Directory Layout
  4. Installation & Setup
  5. Environment Variables
  6. Usage (CLI & Workflows)
  7. API Endpoints
  8. Agents & Tasks
  9. Logging
  10. Roadmap
  11. Contributing
  12. License
  13. Deploy to Google Cloud Run
  14. Automated Deployment Script

1. Overview

RevGeniAgent orchestrates multiple specialized agents to automate early‑stage revenue generation:

  • Find companies matching ICP criteria (size, geography, industry)
  • Extract key account & procurement contacts
  • Qualify leads (website presence + structured data)
  • (Planned) Generate and send personalized outbound emails
  • Compose end‑to‑end JSON objects ready for CRM ingestion

2. Core Features

  • Modular CrewAI agents with clear roles (lead discovery, contact extraction, quality check, email drafting/sending scaffold)
  • Tavily real‑time internet search tool (internet_search) for enrichment & verification
  • Advanced lead intelligence enrichment tool (lead_intel_search) for heuristic extraction of companies, emails, phones & contact titles
  • FastAPI service exposing search, contact extraction and qualified lead generation endpoints
  • Lead parsing & heuristic normalization producing structured Pydantic models
  • Strict logging & observability through centralized logging_config.py
  • End‑to‑end workflow that chains discovery → contact extraction → qualification

3. Architecture & Directory Layout

agents/                # CrewAI Agent subclasses (lead, contact, quality, email)
tasks/                 # Task objects defining goals for each agent
workflows/             # Orchestration logic composing agents & tasks
tools/                 # Reusable CrewAI tools (basic & enriched Tavily search; lead intelligence; email tool to be implemented)
api/                   # FastAPI app, Pydantic models, endpoints
logging_config.py      # Central logging formatter & helper
start_server.py        # Convenience launcher for API (loads env)
main.py                # Example entry point for a single workflow run
env.example            # Template of required/optional environment variables
requirements.txt       # Python dependencies

High-Level Data Flow

            +------------------+
            |   User / API     |
            +---------+--------+
                      |
                      v
        +---------------------------+
        |   Workflow Orchestrator   |
        | (e.g. EndToEndLeadWorkflow)|
        +----+----------+-----------+
             |          |
   Lead Discovery    Contact Extraction
        Agent              Agent
             \          /
              \        /
            Lead Quality Agent
                    |
                    v
           Structured Qualified Leads

4. Installation & Setup

Requires Python 3.10+ (CrewAI & FastAPI compatibility). Recommended to isolate in a virtual environment.

git clone https://github.com/jaggernaut007/RevGeniAgent.git
cd RevGeniAgent
python -m venv .venv
source .venv/bin/activate  # macOS/Linux
pip install -r requirements.txt

Create a .env file (see env.example). At minimum set:

OPENAI_API_KEY=sk-...
TAVILY_API_KEY=tvly-...
LOG_LEVEL=INFO  # optional

5. Environment Variables

Variable Required Purpose
OPENAI_API_KEY Yes Used by CrewAI LLM-powered agents
TAVILY_API_KEY Yes Enables internet search enrichment
LOG_LEVEL No Override default logging level (INFO)
SMTP vars (SMTP_HOST, SMTP_PORT, SMTP_USER, SMTP_PASSWORD, SMTP_FROM_EMAIL, SMTP_USE_TLS) Optional Needed once the email sending tool is implemented

Email sending is scaffolded: EmailSendingAgent references tools/email_tool.py which is not yet implemented. Implement a tool exposing a CrewAI-compatible callable (e.g. send_email) before using that agent.

6. Usage (CLI & Workflows)

Quick Run (Default CRM Lead Workflow)

python main.py

Programmatic Customization

from workflows.crm_lead_workflow import CRMLeadGenerationWorkflow

workflow = CRMLeadGenerationWorkflow(
    size="50-200 employees",
    geography="North America",
    industry="Technology",
    max_leads=15,
    verbose=True
)
result = workflow.run()
print(result)  # Raw agent output; parse via API models for structure

End-to-End Qualified Leads

from workflows.end_to_end_lead_workflow import EndToEndLeadGenerationWorkflow

workflow = EndToEndLeadGenerationWorkflow(
    size="100-500 employees",
    geography="EMEA region",
    industry="Healthcare",
    max_leads=12,
    min_leads=5
)
qualified = workflow.run()  # Returns JSON-like list of qualified leads with contacts

7. API Endpoints

Start the server:

python start_server.py

Base URL: http://localhost:8000

Method Path Purpose
GET /health Health check
POST /api/v1/leads/search Discover leads (returns raw parsed leads)
GET /api/v1/leads/search Convenience GET variant
POST /api/v1/contacts/extract Extract contacts (KAM / Procurement) for company list
POST /api/v1/leads/generate End-to-end: discovery → contact extraction → qualification

Example: End-to-End Lead Generation

curl -X POST http://localhost:8000/api/v1/leads/generate \
  -H "Content-Type: application/json" \
  -d '{
    "size": "50-200 employees",
    "geography": "North America",
    "industry": "Technology",
    "max_leads": 10
  }'

Contact Extraction

curl -X POST http://localhost:8000/api/v1/contacts/extract \
  -H "Content-Type: application/json" \
  -d '{"company_names": ["Example Corp", "Alpha Systems"]}'

Response Models (Simplified)

LeadSearchResponse → criteria + list[LeadInfo] ContactExtractionResponse → companies + contacts QualifiedLead → company + list[QualifiedContact]

8. Agents & Tasks

Agent Role Key Output
CRMLeadGenerationAgent Finds candidate companies Unstructured text (parsed into leads)
ContactExtractionAgent Extracts decision-maker contacts CompanyContactInfo objects
LeadQualityCheckAgent Filters for qualified leads (website present) Clean JSON leads
EmailSendingAgent (Planned) Sends personalized outbound emails Delivery status / message id

Tasks define the specific objectives for each agent; workflows compose tasks + agents for multi-stage automation.

9. Logging

Centralized via logging_config.py. All modules acquire loggers using get_logger(__name__).

  • Set LOG_LEVEL=DEBUG for richer parsing diagnostics.
  • Startup script warns if mandatory environment vars missing.

10. Roadmap

Short-term:

  1. Implement tools/email_tool.py (SMTP / provider abstraction + templates)
  2. Add unit tests for parsers (parse_lead_results, contact extraction heuristics)
  3. Add caching layer for repeated Tavily queries
  4. Basic rate limiting & API key auth for production usage

Mid-term: 5. Export to CRM (HubSpot / Salesforce) via pluggable adapters 6. Vector store for historical lead memory & deduplication 7. Structured metrics (Prometheus) + dashboard

Long-term: 8. Agent feedback reinforcement loop for improving parsing accuracy 9. Multi-channel outreach (LinkedIn, email sequencing) 10. Docker + CI/CD deployment templates

11. Contributing

Pull requests welcome. Please open an issue first for significant changes. Suggested workflow:

  1. Fork & branch from main
  2. Implement feature with minimal scope
  3. Add/adjust README section if behavior is user-facing
  4. Ensure linters & tests (once added) pass

12. License

License information not yet specified. Add a LICENSE file (e.g. MIT) before external distribution.


Changelog (Recent)

  • Added end-to-end qualified lead generation endpoint
  • Added contact extraction & lead quality agent scaffolds
  • Removed stale reference to non-existent services/ directory
  • Enhanced README with architecture, roadmap, and usage examples

Notes

This repository is evolving. Some features (email sending) are intentionally scaffolded and require tool implementation before production use.

13. Deploy to Google Cloud Run

Deploying the FastAPI service to Google Cloud Run (fully managed) involves: containerizing, pushing the image to Artifact Registry, and deploying with required environment variables.

Prerequisites

  • gcloud CLI installed and authenticated (gcloud auth login)
  • A GCP project selected: gcloud config set project YOUR_PROJECT_ID
  • Enable required services:
gcloud services enable artifactregistry.googleapis.com run.googleapis.com

1. Build & Push Image to Artifact Registry

Choose a region (e.g. us-central1). Create a Docker repository:

REGION=us-central1
REPO=revgeniagent
gcloud artifacts repositories create $REPO --repository-format=docker --location=$REGION --description="RevGeniAgent images"

Set helper variables:

PROJECT_ID=$(gcloud config get-value project)
IMAGE=revgeniagent
AR_PATH="$REGION-docker.pkg.dev/$PROJECT_ID/$REPO/$IMAGE:latest"

Build & push:

docker build -t $AR_PATH .
docker push $AR_PATH

2. Deploy to Cloud Run

Provide required environment variables (OPENAI_API_KEY, TAVILY_API_KEY). For simple testing:

gcloud run deploy revgeniagent \
  --image $AR_PATH \
  --region $REGION \
  --platform managed \
  --allow-unauthenticated \
  --set-env-vars OPENAI_API_KEY=sk-REDACTED,TAVILY_API_KEY=tvly-REDACTED,LOG_LEVEL=INFO

Production tip: Use Secret Manager and --set-secrets OPENAI_API_KEY=projects/PROJECT_ID/secrets/openai-api-key:latest etc.

3. Verify Deployment

After a successful deploy the command prints the service URL:

SERVICE_URL=$(gcloud run services describe revgeniagent --region $REGION --format='value(status.url)')
curl -s $SERVICE_URL/health | jq .

4. Sample API Call (Lead Generation)

curl -X POST "$SERVICE_URL/api/v1/leads/search" \
  -H "Content-Type: application/json" \
  -d '{"size":"50-200 employees","geography":"North America","industry":"Technology","max_leads":5}'

5. Using cloudrun.yaml (Optional)

You can customize scaling and resources via the provided cloudrun.yaml:

sed -e "s/PROJECT_ID/$PROJECT_ID/" -e "s/REGION/$REGION/" cloudrun.yaml > cloudrun.rendered.yaml
gcloud run services replace cloudrun.rendered.yaml --region $REGION

6. Local Test

Run locally with Docker:

docker run -e OPENAI_API_KEY=sk-REDACTED -e TAVILY_API_KEY=tvly-REDACTED -p 8080:8080 $AR_PATH
curl http://localhost:8080/health

Notes on Ports & Reload

start_server.py now respects the PORT env var. The container entrypoint uses uvicorn directly; for local dev you can still run python start_server.py (auto-reload enabled unless DISABLE_RELOAD=1). Cloud Run sets PORT=8080 automatically.

Next Steps (CI/CD)

  • Add GitHub Actions workflow to build & push on tag / main merge.
  • Integrate Secret Manager + IAM for secure key management.
  • Add unit tests to validate parsing before deploy.

14. Automated Deployment Script

An opinionated helper script scripts/deploy.sh streamlines build → push → deploy.

Basic Usage

./scripts/deploy.sh --project YOUR_PROJECT --region us-central1 --env-file .env --unauth

Loads environment vars from the provided .env (simple KEY=VALUE lines), builds the Docker image, pushes to Artifact Registry (auto-creates repo if missing), and deploys Cloud Run service revgeniagent.

Flags

Flag Description Default
--project GCP project id (falls back to gcloud config) (gcloud config)
--region Deployment region us-central1
--service Cloud Run service name revgeniagent
--repo Artifact Registry repo name revgeniagent
--image Image name inside repo revgeniagent
--image-tag Docker tag latest
--env-file Path to env file for --set-env-vars (none)
--unauth Allow unauthenticated access disabled
--use-secrets Use Secret Manager (SECRET_* mappings) disabled
--dry-run Print actions without executing disabled

Secrets Mode

Set environment variables like SECRET_OPENAI_API_KEY=openai-api-key then run with --use-secrets to map secrets automatically:

SECRET_OPENAI_API_KEY=openai-api-key SECRET_TAVILY_API_KEY=tavily-api-key \
  ./scripts/deploy.sh --use-secrets --project YOUR_PROJECT

Immutable Tags

./scripts/deploy.sh --image-tag $(date +%Y%m%d%H%M) --project YOUR_PROJECT

Dry Run

./scripts/deploy.sh --dry-run --project YOUR_PROJECT

Outputs the full deploy command without executing.

About

Agentic AI system to generate qualified leads for a target market

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors