RevGeniAgent

Automated multi-agent system for discovering B2B companies, extracting decision‑maker contacts, validating lead quality, and (scaffolded) sending personalized outbound emails. Built with CrewAI and powered by Tavily search, exposed via a FastAPI backend.

1. Overview

RevGeniAgent orchestrates multiple specialized agents to automate early‑stage revenue generation:

Find companies matching ICP criteria (size, geography, industry)
Extract key account & procurement contacts
Qualify leads (website presence + structured data)
(Planned) Generate and send personalized outbound emails
Compose end‑to‑end JSON objects ready for CRM ingestion

2. Core Features

Modular CrewAI agents with clear roles (lead discovery, contact extraction, quality check, email drafting/sending scaffold)
Tavily real‑time internet search tool (internet_search) for enrichment & verification
Advanced lead intelligence enrichment tool (lead_intel_search) for heuristic extraction of companies, emails, phones & contact titles
FastAPI service exposing search, contact extraction and qualified lead generation endpoints
Lead parsing & heuristic normalization producing structured Pydantic models
Consistent logging across modules through a centralized logging_config.py
End‑to‑end workflow that chains discovery → contact extraction → qualification
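
The heuristics used by lead_intel_search are not shown in this README. As a rough illustration only, pulling emails and phone numbers out of search-result text might look like the sketch below. The function name extract_contacts and both regular expressions are assumptions, not the tool's actual code.

```python
import re

# Hypothetical patterns; the real lead_intel_search heuristics may differ.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def extract_contacts(text: str) -> dict:
    """Return de-duplicated emails and phone numbers found in free text."""
    emails = sorted({m.lower() for m in EMAIL_RE.findall(text)})
    # Keep only digits and a leading plus so equivalent numbers compare equal.
    phones = sorted({re.sub(r"[^\d+]", "", m) for m in PHONE_RE.findall(text)})
    return {"emails": emails, "phones": phones}


sample = "Reach Jane Doe (Procurement Lead) at Jane.Doe@example.com or +1 (555) 010-2030."
print(extract_contacts(sample))
```

Patterns this loose will produce false positives on real web pages, so a production version would likely need validation on top.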

3. Architecture & Directory Layout

agents/                # CrewAI Agent subclasses (lead, contact, quality, email)
tasks/                 # Task objects defining goals for each agent
workflows/             # Orchestration logic composing agents & tasks
tools/                 # Reusable CrewAI tools (basic & enriched Tavily search; lead intelligence; email tool to be implemented)
api/                   # FastAPI app, Pydantic models, endpoints
logging_config.py      # Central logging formatter & helper
start_server.py        # Convenience launcher for API (loads env)
main.py                # Example entry point for a single workflow run
env.example            # Template of required/optional environment variables
requirements.txt       # Python dependencies

High-Level Data Flow

            +------------------+
            |   User / API     |
            +---------+--------+
                      |
                      v
        +---------------------------------------+
        |         Workflow Orchestrator         |
        | (e.g. EndToEndLeadGenerationWorkflow) |
        +----+----------+-----------------------+
             |          |
   Lead Discovery    Contact Extraction
        Agent              Agent
             \          /
              \        /
            Lead Quality Agent
                    |
                    v
           Structured Qualified Leads
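
To make the flow concrete, the sketch below chains three stand-in functions in the order used by the end-to-end workflow (discovery, then contact extraction, then qualification). It is illustrative only: discover, extract_contacts_for, qualify and run_pipeline are hypothetical placeholders for the CrewAI agents and orchestrator, and the data is hard-coded rather than fetched.

```python
from typing import Callable

Lead = dict  # simplified stand-in for the structured lead models


def discover(criteria: dict) -> list[Lead]:
    # Placeholder for the lead discovery agent: returns canned companies.
    return [
        {"company": "Example Corp", "website": "https://example.com"},
        {"company": "Alpha Systems", "website": ""},
    ]


def extract_contacts_for(leads: list[Lead]) -> list[Lead]:
    # Placeholder for the contact extraction agent: attaches a contact list.
    return [{**lead, "contacts": [{"title": "Procurement Manager"}]} for lead in leads]


def qualify(leads: list[Lead]) -> list[Lead]:
    # Mirrors the documented rule: a lead needs a website to qualify.
    return [lead for lead in leads if lead.get("website")]


def run_pipeline(criteria: dict, stages: list[Callable]) -> list[Lead]:
    """Feed the output of each stage into the next one."""
    data = stages[0](criteria)
    for stage in stages[1:]:
        data = stage(data)
    return data


qualified = run_pipeline(
    {"industry": "Technology"}, [discover, extract_contacts_for, qualify]
)
print([lead["company"] for lead in qualified])
```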

4. Installation & Setup

Requires Python 3.10+ (for CrewAI and FastAPI compatibility). Installing into a virtual environment is recommended.

git clone https://github.com/jaggernaut007/RevGeniAgent.git
cd RevGeniAgent
python -m venv .venv
source .venv/bin/activate  # macOS/Linux
pip install -r requirements.txt

Create a .env file (see env.example). At minimum set:

OPENAI_API_KEY=sk-...
TAVILY_API_KEY=tvly-...
LOG_LEVEL=INFO  # optional

5. Environment Variables

Variable	Required	Purpose
`OPENAI_API_KEY`	Yes	Used by CrewAI LLM-powered agents
`TAVILY_API_KEY`	Yes	Enables internet search enrichment
`LOG_LEVEL`	No	Override default logging level (INFO)
SMTP vars (`SMTP_HOST`, `SMTP_PORT`, `SMTP_USER`, `SMTP_PASSWORD`, `SMTP_FROM_EMAIL`, `SMTP_USE_TLS`)	No	Needed once the email sending tool is implemented

Email sending is scaffolded: EmailSendingAgent references tools/email_tool.py, which is not yet implemented. Before using that agent, implement a tool there that exposes a CrewAI-compatible callable (e.g. send_email).
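
Since tools/email_tool.py does not exist yet, the following is only one possible starting point. It builds and sends a message from the SMTP_* variables listed above using the standard library. The names build_message and send_email, the default port and the TLS default are assumptions, and the CrewAI tool wrapper that EmailSendingAgent would need is deliberately left out.

```python
import os
import smtplib
from email.message import EmailMessage


def build_message(to_addr: str, subject: str, body: str, env=os.environ) -> EmailMessage:
    """Assemble a plain-text email; the sender comes from SMTP_FROM_EMAIL."""
    msg = EmailMessage()
    msg["From"] = env["SMTP_FROM_EMAIL"]
    msg["To"] = to_addr
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_email(to_addr: str, subject: str, body: str, env=os.environ) -> None:
    """Send via the configured SMTP server (hypothetical helper)."""
    msg = build_message(to_addr, subject, body, env)
    port = int(env.get("SMTP_PORT", "587"))  # 587 is an assumed default
    use_tls = env.get("SMTP_USE_TLS", "true").lower() in ("1", "true", "yes")
    with smtplib.SMTP(env["SMTP_HOST"], port) as server:
        if use_tls:
            server.starttls()
        if env.get("SMTP_USER"):
            server.login(env["SMTP_USER"], env.get("SMTP_PASSWORD", ""))
        server.send_message(msg)


# Building a message needs no network access:
demo = build_message(
    "buyer@example.com", "Hello", "Hi there", {"SMTP_FROM_EMAIL": "sales@example.com"}
)
print(demo["To"], demo["Subject"])
```

Keeping message construction separate from delivery makes the template logic testable without an SMTP server.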

6. Usage (CLI & Workflows)

Quick Run (Default CRM Lead Workflow)

python main.py

Programmatic Customization

from workflows.crm_lead_workflow import CRMLeadGenerationWorkflow

workflow = CRMLeadGenerationWorkflow(
    size="50-200 employees",
    geography="North America",
    industry="Technology",
    max_leads=15,
    verbose=True
)
result = workflow.run()
print(result)  # Raw agent output; parse via API models for structure

End-to-End Qualified Leads

from workflows.end_to_end_lead_workflow import EndToEndLeadGenerationWorkflow

workflow = EndToEndLeadGenerationWorkflow(
    size="100-500 employees",
    geography="EMEA region",
    industry="Healthcare",
    max_leads=12,
    min_leads=5
)
qualified = workflow.run()  # Returns JSON-like list of qualified leads with contacts
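
Because the return value is described only as JSON-like, calling code may want to normalize it defensively. The helper below is a sketch under that assumption: normalize_leads is a hypothetical name, and the "company" and "contacts" keys in the sample are guesses at the shape, not a documented schema.

```python
import json
from typing import Any


def normalize_leads(result: Any) -> list[dict]:
    """Coerce a workflow result into a list of dicts, or return [] if impossible."""
    if isinstance(result, str):
        try:
            result = json.loads(result)
        except json.JSONDecodeError:
            return []
    if isinstance(result, dict):
        result = [result]  # a single lead becomes a one-element list
    if not isinstance(result, list):
        return []
    return [item for item in result if isinstance(item, dict)]


raw = '[{"company": "Example Corp", "contacts": [{"title": "Key Account Manager"}]}]'
for lead in normalize_leads(raw):
    print(lead.get("company"), len(lead.get("contacts", [])))
```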

7. API Endpoints

Start the server:

python start_server.py

Base URL: http://localhost:8000

Method	Path	Purpose
GET	`/health`	Health check
POST	`/api/v1/leads/search`	Discover leads (returns raw parsed leads)
GET	`/api/v1/leads/search`	Convenience GET variant
POST	`/api/v1/contacts/extract`	Extract contacts (KAM / Procurement) for company list
POST	`/api/v1/leads/generate`	End-to-end: discovery → contact extraction → qualification

Example: End-to-End Lead Generation

curl -X POST http://localhost:8000/api/v1/leads/generate \
  -H "Content-Type: application/json" \
  -d '{
    "size": "50-200 employees",
    "geography": "North America",
    "industry": "Technology",
    "max_leads": 10
  }'

Contact Extraction

curl -X POST http://localhost:8000/api/v1/contacts/extract \
  -H "Content-Type: application/json" \
  -d '{"company_names": ["Example Corp", "Alpha Systems"]}'
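
The same calls can be made from Python. The sketch below uses only the standard library and assumes the server is running locally on port 8000 as above. build_request and post_json are hypothetical helper names, the 300-second timeout is an arbitrary guess, and the response is returned as parsed JSON without assuming its fields.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Create a JSON POST request for one of the API paths listed above."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict, timeout: float = 300.0):
    """Send the request and decode the JSON body (agent runs can be slow)."""
    with urllib.request.urlopen(build_request(path, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Constructing the request does not contact the server:
req = build_request("/api/v1/contacts/extract", {"company_names": ["Example Corp"]})
print(req.get_method(), req.full_url)
```

With the server running, post_json("/api/v1/contacts/extract", {"company_names": ["Example Corp"]}) would perform the actual call.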

Response Models (Simplified)

LeadSearchResponse → criteria + list[LeadInfo]
ContactExtractionResponse → companies + contacts
QualifiedLead → company + list[QualifiedContact]
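
The field names of these models are not listed here. Purely to illustrate how QualifiedLead nests QualifiedContact, the sketch below uses standard-library dataclasses in place of the project's Pydantic models; every field shown is an assumption.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class QualifiedContact:
    # All fields are illustrative guesses, not the project's schema.
    name: str
    title: str
    email: str = ""


@dataclass
class QualifiedLead:
    company: str
    website: str = ""
    contacts: list[QualifiedContact] = field(default_factory=list)


lead = QualifiedLead(
    company="Example Corp",
    website="https://example.com",
    contacts=[QualifiedContact(name="Jane Doe", title="Procurement Manager")],
)
print(asdict(lead))  # nested dict, ready for json.dumps
```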

8. Agents & Tasks

Agent	Role	Key Output
`CRMLeadGenerationAgent`	Finds candidate companies	Unstructured text (parsed into leads)
`ContactExtractionAgent`	Extracts decision-maker contacts	CompanyContactInfo objects
`LeadQualityCheckAgent`	Filters for qualified leads (website present)	Clean JSON leads
`EmailSendingAgent`	(Planned) Sends personalized outbound emails	Delivery status / message id

Tasks define the specific objectives for each agent; workflows compose tasks + agents for multi-stage automation.

9. Logging

Centralized via logging_config.py. All modules acquire loggers using get_logger(__name__).

Set LOG_LEVEL=DEBUG for richer parsing diagnostics.
The startup script warns if mandatory environment variables are missing.
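
The contents of logging_config.py are not reproduced in this README. A minimal sketch of the pattern described, assuming a get_logger helper that reads LOG_LEVEL and a startup check for the two required keys, could look like this. The format string and the missing_env_vars helper are hypothetical.

```python
import logging
import os

REQUIRED_VARS = ("OPENAI_API_KEY", "TAVILY_API_KEY")


def get_logger(name: str) -> logging.Logger:
    """Return a logger whose level follows LOG_LEVEL (default INFO)."""
    level_name = os.environ.get("LOG_LEVEL", "INFO").upper()
    level = getattr(logging, level_name, logging.INFO)
    if not isinstance(level, int):  # guard against names that are not levels
        level = logging.INFO
    # basicConfig only configures the root handler on its first call.
    logging.basicConfig(format="%(asctime)s %(levelname)s %(name)s: %(message)s")
    logger = logging.getLogger(name)
    logger.setLevel(level)
    return logger


def missing_env_vars(env=os.environ) -> list[str]:
    """List required variables that are unset or empty."""
    return [var for var in REQUIRED_VARS if not env.get(var)]


logger = get_logger(__name__)
for var in missing_env_vars():
    logger.warning("Environment variable %s is not set", var)
```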

10. Roadmap

Short-term:

Implement tools/email_tool.py (SMTP / provider abstraction + templates)
Add unit tests for parsers (parse_lead_results, contact extraction heuristics)
Add caching layer for repeated Tavily queries
Basic rate limiting & API key auth for production usage

Mid-term:

Export to CRM (HubSpot / Salesforce) via pluggable adapters
Vector store for historical lead memory & deduplication
Structured metrics (Prometheus) + dashboard

Long-term:

Agent feedback reinforcement loop for improving parsing accuracy
Multi-channel outreach (LinkedIn, email sequencing)
Docker + CI/CD deployment templates

11. Contributing

Pull requests welcome. Please open an issue first for significant changes. Suggested workflow:

Fork & branch from main
Implement feature with minimal scope
Add/adjust README section if behavior is user-facing
Ensure linters & tests (once added) pass

12. License

License information not yet specified. Add a LICENSE file (e.g. MIT) before external distribution.

Changelog (Recent)

Added end-to-end qualified lead generation endpoint
Added contact extraction & lead quality agent scaffolds
Removed stale reference to non-existent services/ directory
Enhanced README with architecture, roadmap, and usage examples

Notes

This repository is evolving. Some features (email sending) are intentionally scaffolded and require tool implementation before production use.

13. Deploy to Google Cloud Run

Deploying the FastAPI service to Google Cloud Run (fully managed) involves: containerizing, pushing the image to Artifact Registry, and deploying with required environment variables.

Prerequisites

gcloud CLI installed and authenticated (gcloud auth login)
A GCP project selected: gcloud config set project YOUR_PROJECT_ID
Enable required services:

gcloud services enable artifactregistry.googleapis.com run.googleapis.com

1. Build & Push Image to Artifact Registry

Choose a region (e.g. us-central1). Create a Docker repository:

REGION=us-central1
REPO=revgeniagent
gcloud artifacts repositories create $REPO --repository-format=docker --location=$REGION --description="RevGeniAgent images"

Set helper variables:

PROJECT_ID=$(gcloud config get-value project)
IMAGE=revgeniagent
AR_PATH="$REGION-docker.pkg.dev/$PROJECT_ID/$REPO/$IMAGE:latest"

Build & push:

docker build -t $AR_PATH .
docker push $AR_PATH

2. Deploy to Cloud Run

Provide required environment variables (OPENAI_API_KEY, TAVILY_API_KEY). For simple testing:

gcloud run deploy revgeniagent \
  --image $AR_PATH \
  --region $REGION \
  --platform managed \
  --allow-unauthenticated \
  --set-env-vars OPENAI_API_KEY=sk-REDACTED,TAVILY_API_KEY=tvly-REDACTED,LOG_LEVEL=INFO

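Production tip: store the keys in Secret Manager and reference them with --set-secrets (for example --set-secrets OPENAI_API_KEY=openai-api-key:latest,TAVILY_API_KEY=tavily-api-key:latest for secrets in the same project) instead of passing key values on the command line.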

3. Verify Deployment

After a successful deploy, gcloud prints the service URL. To fetch it again and check the health endpoint:

SERVICE_URL=$(gcloud run services describe revgeniagent --region $REGION --format='value(status.url)')
curl -s $SERVICE_URL/health | jq .

4. Sample API Call (Lead Search)

curl -X POST "$SERVICE_URL/api/v1/leads/search" \
  -H "Content-Type: application/json" \
  -d '{"size":"50-200 employees","geography":"North America","industry":"Technology","max_leads":5}'

5. Using cloudrun.yaml (Optional)

You can customize scaling and resources via the provided cloudrun.yaml:

sed -e "s/PROJECT_ID/$PROJECT_ID/g" -e "s/REGION/$REGION/g" cloudrun.yaml > cloudrun.rendered.yaml
gcloud run services replace cloudrun.rendered.yaml --region $REGION

6. Local Test

Run locally with Docker:

docker run -e OPENAI_API_KEY=sk-REDACTED -e TAVILY_API_KEY=tvly-REDACTED -p 8080:8080 $AR_PATH
curl http://localhost:8080/health

Notes on Ports & Reload

start_server.py now respects the PORT env var. The container entrypoint uses uvicorn directly; for local dev you can still run python start_server.py (auto-reload enabled unless DISABLE_RELOAD=1). Cloud Run sets PORT=8080 automatically.
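
The behavior described above can be pictured with the sketch below. It is not the contents of start_server.py: server_settings is a hypothetical helper, the local default port of 8000 is inferred from the base URL in the API section, binding to 0.0.0.0 is an assumption, and the uvicorn call appears only as a comment.

```python
import os


def server_settings(env=os.environ) -> dict:
    """Derive host, port and reload flag from environment variables."""
    port = int(env.get("PORT", "8000"))  # Cloud Run injects PORT=8080
    reload_enabled = env.get("DISABLE_RELOAD") != "1"
    return {"host": "0.0.0.0", "port": port, "reload": reload_enabled}


print(server_settings({"PORT": "8080", "DISABLE_RELOAD": "1"}))
# A launcher could then call, for example:
#   uvicorn.run("api.main:app", **server_settings())
# where "api.main:app" is an assumed import path for the FastAPI app.
```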

Next Steps (CI/CD)

Add GitHub Actions workflow to build & push on tag / main merge.
Integrate Secret Manager + IAM for secure key management.
Add unit tests to validate parsing before deploy.

14. Automated Deployment Script

An opinionated helper script scripts/deploy.sh streamlines build → push → deploy.

Basic Usage

./scripts/deploy.sh --project YOUR_PROJECT --region us-central1 --env-file .env --unauth

The script loads environment variables from the provided .env file (simple KEY=VALUE lines), builds the Docker image, pushes it to Artifact Registry (creating the repository if it is missing), and deploys the Cloud Run service revgeniagent.

Flags

Flag	Description	Default
`--project`	GCP project id (falls back to gcloud config)	(gcloud config)
`--region`	Deployment region	`us-central1`
`--service`	Cloud Run service name	`revgeniagent`
`--repo`	Artifact Registry repo name	`revgeniagent`
`--image`	Image name inside repo	`revgeniagent`
`--image-tag`	Docker tag	`latest`
`--env-file`	Path to env file for `--set-env-vars`	(none)
`--unauth`	Allow unauthenticated access	disabled
`--use-secrets`	Use Secret Manager (SECRET_* mappings)	disabled
`--dry-run`	Print actions without executing	disabled
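
How scripts/deploy.sh turns the env file into a --set-env-vars argument is not shown here. The Python sketch below illustrates one plausible approach for simple KEY=VALUE lines. parse_env_file and to_set_env_vars are hypothetical names, and skipping comment lines and blank lines is an assumption about the script's behavior.

```python
def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comment lines."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)  # split once so values may contain "="
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def to_set_env_vars(values: dict[str, str]) -> str:
    """Join pairs into the comma-separated KEY=VALUE form."""
    return ",".join(f"{key}={value}" for key, value in values.items())


example = "# keys\nOPENAI_API_KEY=sk-REDACTED\nTAVILY_API_KEY=tvly-REDACTED\n\nLOG_LEVEL=INFO\n"
print("--set-env-vars " + to_set_env_vars(parse_env_file(example)))
```

The sketch does not strip trailing inline comments or escape values that contain commas, so such lines would need extra handling.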

Secrets Mode

Set environment variables like SECRET_OPENAI_API_KEY=openai-api-key then run with --use-secrets to map secrets automatically:

SECRET_OPENAI_API_KEY=openai-api-key SECRET_TAVILY_API_KEY=tavily-api-key \
  ./scripts/deploy.sh --use-secrets --project YOUR_PROJECT
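
The mapping convention can be illustrated as follows. This is a guess at the logic rather than the script itself: secret_mappings is a hypothetical function, and pinning every secret to its latest version is an assumption.

```python
import os

PREFIX = "SECRET_"


def secret_mappings(env=os.environ) -> str:
    """Turn SECRET_<VAR>=<secret-name> entries into a --set-secrets value."""
    pairs = []
    for key in sorted(env):
        if key.startswith(PREFIX) and env[key]:
            target_var = key[len(PREFIX):]  # e.g. OPENAI_API_KEY
            pairs.append(f"{target_var}={env[key]}:latest")
    return ",".join(pairs)


demo_env = {
    "SECRET_OPENAI_API_KEY": "openai-api-key",
    "SECRET_TAVILY_API_KEY": "tavily-api-key",
    "LOG_LEVEL": "INFO",
}
print("--set-secrets " + secret_mappings(demo_env))
```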

Immutable Tags

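Tag each build with a timestamp instead of latest so that every deploy references a distinct image:

./scripts/deploy.sh --image-tag "$(date +%Y%m%d%H%M)" --project YOUR_PROJECT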

Dry Run

./scripts/deploy.sh --dry-run --project YOUR_PROJECT

Prints the full deploy command without executing it.
