Automated multi-agent system for discovering B2B companies, extracting decision‑maker contacts, validating lead quality, and (scaffolded) sending personalized outbound emails. Built with CrewAI and powered by Tavily search, exposed via a FastAPI backend.
- Overview
- Core Features
- Architecture & Directory Layout
- Installation & Setup
- Environment Variables
- Usage (CLI & Workflows)
- API Endpoints
- Agents & Tasks
- Logging
- Roadmap
- Contributing
- License
- Deploy to Google Cloud Run
- Automated Deployment Script
RevGeniAgent orchestrates multiple specialized agents to automate early‑stage revenue generation:
- Find companies matching ICP criteria (size, geography, industry)
- Extract key account & procurement contacts
- Qualify leads (website presence + structured data)
- (Planned) Generate and send personalized outbound emails
- Compose end‑to‑end JSON objects ready for CRM ingestion
- Modular CrewAI agents with clear roles (lead discovery, contact extraction, quality check, email drafting/sending scaffold)
- Tavily real‑time internet search tool (`internet_search`) for enrichment & verification
- Advanced lead intelligence enrichment tool (`lead_intel_search`) for heuristic extraction of companies, emails, phones & contact titles
- FastAPI service exposing search, contact extraction and qualified lead generation endpoints
- Lead parsing & heuristic normalization producing structured Pydantic models
- Strict logging & observability through centralized `logging_config.py`
- End‑to‑end workflow that chains discovery → contact extraction → qualification
agents/ # CrewAI Agent subclasses (lead, contact, quality, email)
tasks/ # Task objects defining goals for each agent
workflows/ # Orchestration logic composing agents & tasks
tools/ # Reusable CrewAI tools (basic & enriched Tavily search; lead intelligence; email tool to be implemented)
api/ # FastAPI app, Pydantic models, endpoints
logging_config.py # Central logging formatter & helper
start_server.py # Convenience launcher for API (loads env)
main.py # Example entry point for a single workflow run
env.example # Template of required/optional environment variables
requirements.txt # Python dependencies

                 +------------------+
                 |    User / API    |
                 +---------+--------+
                           |
                           v
            +-----------------------------+
            |    Workflow Orchestrator    |
            | (e.g. EndToEndLeadWorkflow) |
            +---+---------------------+---+
                |                     |
         Lead Discovery      Contact Extraction
              Agent                 Agent
                  \                 /
                    \             /
                  Lead Quality Agent
                           |
                           v
              Structured Qualified Leads

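The flow in the diagram can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's orchestrator: the stage functions (`discover`, `extract_contacts`, `qualify`) and the `Lead` fields are hypothetical stand-ins for the CrewAI agents, and the only qualification rule shown is the website-presence check.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Lead:
    # Hypothetical shape; the real models live in api/ and may differ.
    company: str
    website: Optional[str] = None
    contacts: list = field(default_factory=list)


def discover(criteria: dict) -> list:
    """Stand-in for the lead discovery agent (returns canned data)."""
    return [
        Lead(company="Example Corp", website="https://example.com"),
        Lead(company="No Site Ltd"),
    ]


def extract_contacts(leads: list) -> list:
    """Stand-in for the contact extraction agent."""
    for lead in leads:
        lead.contacts.append({"title": "Procurement Manager", "company": lead.company})
    return leads


def qualify(leads: list) -> list:
    """Stand-in for the quality agent: keep only leads that have a website."""
    return [lead for lead in leads if lead.website]


def run_pipeline(criteria: dict) -> list:
    # discovery -> contact extraction -> qualification
    return qualify(extract_contacts(discover(criteria)))


qualified = run_pipeline({"industry": "Technology", "max_leads": 10})
print([lead.company for lead in qualified])
```

Keeping each stage as a function that takes and returns a list of leads makes it straightforward to swap a stub for a real agent call one stage at a time.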
Requires Python 3.10+ (CrewAI & FastAPI compatibility). Recommended to isolate in a virtual environment.

    git clone https://github.com/jaggernaut007/RevGeniAgent.git
    cd RevGeniAgent
    python -m venv .venv
    source .venv/bin/activate  # macOS/Linux
    pip install -r requirements.txt

Create a `.env` file (see `env.example`). At minimum set:

    OPENAI_API_KEY=sk-...
    TAVILY_API_KEY=tvly-...
    LOG_LEVEL=INFO  # optional

| Variable | Required | Purpose |
|---|---|---|
| `OPENAI_API_KEY` | Yes | Used by CrewAI LLM-powered agents |
| `TAVILY_API_KEY` | Yes | Enables internet search enrichment |
| `LOG_LEVEL` | No | Override default logging level (INFO) |
| SMTP vars (`SMTP_HOST`, `SMTP_PORT`, `SMTP_USER`, `SMTP_PASSWORD`, `SMTP_FROM_EMAIL`, `SMTP_USE_TLS`) | Optional | Needed once the email sending tool is implemented |

Email sending is scaffolded: EmailSendingAgent references tools/email_tool.py which is not yet implemented. Implement a tool exposing a CrewAI-compatible callable (e.g. send_email) before using that agent.
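As a starting point for that tool, here is a minimal sketch built only on the standard library. It is an assumption about what `tools/email_tool.py` could look like, not the project's implementation: the function names, the return value, the default port of 587 and the TLS default are all hypothetical, and wrapping `send_email` as a CrewAI tool is left out.

```python
import os
import smtplib
from email.message import EmailMessage


def build_message(to_addr: str, subject: str, body: str, from_addr: str) -> EmailMessage:
    """Assemble a plain-text email."""
    msg = EmailMessage()
    msg["From"] = from_addr
    msg["To"] = to_addr
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_email(to_addr: str, subject: str, body: str) -> str:
    """Send one email using the SMTP_* environment variables.

    Hypothetical helper: name, defaults and return value are assumptions.
    """
    host = os.environ["SMTP_HOST"]
    port = int(os.environ.get("SMTP_PORT", "587"))  # assumed default
    use_tls = os.environ.get("SMTP_USE_TLS", "true").lower() in ("1", "true", "yes")
    msg = build_message(to_addr, subject, body, os.environ["SMTP_FROM_EMAIL"])

    with smtplib.SMTP(host, port, timeout=30) as smtp:
        if use_tls:
            smtp.starttls()  # upgrade the connection before authenticating
        user = os.environ.get("SMTP_USER")
        if user:
            smtp.login(user, os.environ["SMTP_PASSWORD"])
        smtp.send_message(msg)
    return f"sent to {to_addr}"
```

Separating `build_message` from `send_email` keeps message construction testable without an SMTP server.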
Run the example entry point:

    python main.py

Lead discovery only:

    from workflows.crm_lead_workflow import CRMLeadGenerationWorkflow

    workflow = CRMLeadGenerationWorkflow(
        size="50-200 employees",
        geography="North America",
        industry="Technology",
        max_leads=15,
        verbose=True,
    )
    result = workflow.run()
    print(result)  # Raw agent output; parse via API models for structure

End-to-end qualified leads:

    from workflows.end_to_end_lead_workflow import EndToEndLeadGenerationWorkflow

    workflow = EndToEndLeadGenerationWorkflow(
        size="100-500 employees",
        geography="EMEA region",
        industry="Healthcare",
        max_leads=12,
        min_leads=5,
    )
    qualified = workflow.run()  # Returns JSON-like list of qualified leads with contacts

Start the server:

    python start_server.py

Base URL: http://localhost:8000

| Method | Path | Purpose |
|---|---|---|
| GET | `/health` | Health check |
| POST | `/api/v1/leads/search` | Discover leads (returns raw parsed leads) |
| GET | `/api/v1/leads/search` | Convenience GET variant |
| POST | `/api/v1/contacts/extract` | Extract contacts (KAM / Procurement) for company list |
| POST | `/api/v1/leads/generate` | End-to-end: discovery → contact extraction → qualification |

    curl -X POST http://localhost:8000/api/v1/leads/generate \
      -H "Content-Type: application/json" \
      -d '{
        "size": "50-200 employees",
        "geography": "North America",
        "industry": "Technology",
        "max_leads": 10
      }'

    curl -X POST http://localhost:8000/api/v1/contacts/extract \
      -H "Content-Type: application/json" \
      -d '{"company_names": ["Example Corp", "Alpha Systems"]}'

Response models:

- `LeadSearchResponse` → criteria + `list[LeadInfo]`
- `ContactExtractionResponse` → companies + contacts
- `QualifiedLead` → company + `list[QualifiedContact]`
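The same request can be issued from Python. The sketch below mirrors the curl body for `/api/v1/leads/generate` using only the standard library; the helper names `build_generate_request` and `generate_leads` are hypothetical, and the 300-second timeout is an assumption for a slow multi-agent run.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # local default used in this README


def build_generate_request(size: str, geography: str, industry: str,
                           max_leads: int = 10,
                           base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for /api/v1/leads/generate."""
    payload = {
        "size": size,
        "geography": geography,
        "industry": industry,
        "max_leads": max_leads,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/v1/leads/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_leads(**criteria):
    """Send the request and decode the JSON response (needs a running server)."""
    req = build_generate_request(**criteria)
    with urllib.request.urlopen(req, timeout=300) as resp:  # timeout is an assumption
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `generate_leads(size="50-200 employees", geography="North America", industry="Technology", max_leads=10)` should return the decoded response body.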
| Agent | Role | Key Output |
|---|---|---|
| `CRMLeadGenerationAgent` | Finds candidate companies | Unstructured text (parsed into leads) |
| `ContactExtractionAgent` | Extracts decision-maker contacts | `CompanyContactInfo` objects |
| `LeadQualityCheckAgent` | Filters for qualified leads (website present) | Clean JSON leads |
| `EmailSendingAgent` | (Planned) Sends personalized outbound emails | Delivery status / message id |
Tasks define the specific objectives for each agent; workflows compose tasks + agents for multi-stage automation.
Centralized via logging_config.py. All modules acquire loggers using get_logger(__name__).
- Set `LOG_LEVEL=DEBUG` for richer parsing diagnostics.
- The startup script warns if mandatory environment variables are missing.
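A minimal sketch of how such a helper could work is shown here. It is not the contents of `logging_config.py`: the log format, the fallback to INFO on an unrecognized level, and the `missing_env_vars` helper are assumptions made for illustration.

```python
import logging
import os

REQUIRED_VARS = ("OPENAI_API_KEY", "TAVILY_API_KEY")


def get_logger(name: str) -> logging.Logger:
    """Return a logger whose level follows LOG_LEVEL (default INFO)."""
    level_name = os.environ.get("LOG_LEVEL", "INFO").upper()
    level = getattr(logging, level_name, logging.INFO)
    if not isinstance(level, int):  # e.g. LOG_LEVEL names a function, not a level
        level = logging.INFO
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


def missing_env_vars(env=os.environ) -> list:
    """List required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


logger = get_logger(__name__)
for name in missing_env_vars():
    logger.warning("Missing required environment variable: %s", name)
```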
Short-term:

1. Implement `tools/email_tool.py` (SMTP / provider abstraction + templates)
2. Add unit tests for parsers (`parse_lead_results`, contact extraction heuristics)
3. Add caching layer for repeated Tavily queries
4. Basic rate limiting & API key auth for production usage

Mid-term:

5. Export to CRM (HubSpot / Salesforce) via pluggable adapters
6. Vector store for historical lead memory & deduplication
7. Structured metrics (Prometheus) + dashboard

Long-term:

8. Agent feedback reinforcement loop for improving parsing accuracy
9. Multi-channel outreach (LinkedIn, email sequencing)
10. Docker + CI/CD deployment templates

Pull requests welcome. Please open an issue first for significant changes. Suggested workflow:
- Fork & branch from `main`
- Implement feature with minimal scope
- Add/adjust README section if behavior is user-facing
- Ensure linters & tests (once added) pass
License information not yet specified. Add a LICENSE file (e.g. MIT) before external distribution.
- Added end-to-end qualified lead generation endpoint
- Added contact extraction & lead quality agent scaffolds
- Removed stale reference to non-existent `services/` directory
- Enhanced README with architecture, roadmap, and usage examples
This repository is evolving. Some features (email sending) are intentionally scaffolded and require tool implementation before production use.
Deploying the FastAPI service to Google Cloud Run (fully managed) involves: containerizing, pushing the image to Artifact Registry, and deploying with required environment variables.
- gcloud CLI installed and authenticated (`gcloud auth login`)
- A GCP project selected: `gcloud config set project YOUR_PROJECT_ID`
- Enable required services: `gcloud services enable artifactregistry.googleapis.com run.googleapis.com`

Choose a region (e.g. `us-central1`). Create a Docker repository:

    REGION=us-central1
    REPO=revgeniagent
    gcloud artifacts repositories create "$REPO" --repository-format=docker --location="$REGION" --description="RevGeniAgent images"

Set helper variables:

    PROJECT_ID=$(gcloud config get-value project)
    IMAGE=revgeniagent
    AR_PATH="$REGION-docker.pkg.dev/$PROJECT_ID/$REPO/$IMAGE:latest"

Build & push:

    # One-time: let Docker authenticate to Artifact Registry in this region
    gcloud auth configure-docker "$REGION-docker.pkg.dev"
    docker build -t "$AR_PATH" .
    docker push "$AR_PATH"

Provide required environment variables (`OPENAI_API_KEY`, `TAVILY_API_KEY`). For simple testing:

    gcloud run deploy revgeniagent \
      --image "$AR_PATH" \
      --region "$REGION" \
      --platform managed \
      --allow-unauthenticated \
      --set-env-vars OPENAI_API_KEY=sk-REDACTED,TAVILY_API_KEY=tvly-REDACTED,LOG_LEVEL=INFO

Production tip: Use Secret Manager and `--set-secrets OPENAI_API_KEY=projects/PROJECT_ID/secrets/openai-api-key:latest` etc.

After a successful deploy the command prints the service URL:

    SERVICE_URL=$(gcloud run services describe revgeniagent --region "$REGION" --format='value(status.url)')
    curl -s "$SERVICE_URL/health" | jq .

    curl -X POST "$SERVICE_URL/api/v1/leads/search" \
      -H "Content-Type: application/json" \
      -d '{"size":"50-200 employees","geography":"North America","industry":"Technology","max_leads":5}'

You can customize scaling and resources via the provided `cloudrun.yaml`:

    sed -e "s/PROJECT_ID/$PROJECT_ID/" -e "s/REGION/$REGION/" cloudrun.yaml > cloudrun.rendered.yaml
    gcloud run services replace cloudrun.rendered.yaml --region "$REGION"

Run locally with Docker:

    docker run -e OPENAI_API_KEY=sk-REDACTED -e TAVILY_API_KEY=tvly-REDACTED -p 8080:8080 "$AR_PATH"
    curl http://localhost:8080/health

`start_server.py` now respects the `PORT` env var. The container entrypoint uses uvicorn directly; for local dev you can still run `python start_server.py` (auto-reload enabled unless `DISABLE_RELOAD=1`). Cloud Run sets `PORT=8080` automatically.

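The `PORT` and `DISABLE_RELOAD` handling described above could be derived along these lines. This is a sketch of the idea rather than the code in `start_server.py`: the `server_settings` name, the `0.0.0.0` bind address and the local fallback port of 8000 (taken from the local base URL) are assumptions.

```python
import os


def server_settings(env=None) -> dict:
    """Derive uvicorn settings from PORT and DISABLE_RELOAD."""
    env = os.environ if env is None else env
    return {
        "host": "0.0.0.0",  # assumed: bind all interfaces so the container is reachable
        "port": int(env.get("PORT", "8000")),  # Cloud Run injects PORT=8080
        "reload": env.get("DISABLE_RELOAD") != "1",  # auto-reload unless disabled
    }


print(server_settings({"PORT": "8080", "DISABLE_RELOAD": "1"}))
```

The resulting dictionary matches the `host`, `port` and `reload` keyword arguments of `uvicorn.run`, so a launcher could pass it through with `**server_settings()` together with the application import path.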
- Add GitHub Actions workflow to build & push on tag / main merge.
- Integrate Secret Manager + IAM for secure key management.
- Add unit tests to validate parsing before deploy.
An opinionated helper script scripts/deploy.sh streamlines build → push → deploy.

    ./scripts/deploy.sh --project YOUR_PROJECT --region us-central1 --env-file .env --unauth

This loads environment vars from the provided `.env` (simple KEY=VALUE lines), builds the Docker image, pushes to Artifact Registry (auto-creates repo if missing), and deploys Cloud Run service `revgeniagent`.

| Flag | Description | Default |
|---|---|---|
| `--project` | GCP project id (falls back to gcloud config) | (gcloud config) |
| `--region` | Deployment region | us-central1 |
| `--service` | Cloud Run service name | revgeniagent |
| `--repo` | Artifact Registry repo name | revgeniagent |
| `--image` | Image name inside repo | revgeniagent |
| `--image-tag` | Docker tag | latest |
| `--env-file` | Path to env file for `--set-env-vars` | (none) |
| `--unauth` | Allow unauthenticated access | disabled |
| `--use-secrets` | Use Secret Manager (`SECRET_*` mappings) | disabled |
| `--dry-run` | Print actions without executing | disabled |
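
To make the `--env-file` behavior concrete, the following sketch shows one way simple KEY=VALUE lines can be turned into the comma-separated value that `--set-env-vars` expects. It illustrates the transformation only and is not the logic of `scripts/deploy.sh`; both function names are hypothetical, and inline comments or values containing commas are not handled.

```python
def parse_env_file(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks, # comments and malformed lines."""
    pairs = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)  # split once so values may contain '='
        pairs[key.strip()] = value.strip()
    return pairs


def to_set_env_vars(pairs: dict) -> str:
    """Join pairs into the comma-separated form used by --set-env-vars."""
    return ",".join(f"{key}={value}" for key, value in pairs.items())


sample = "# local settings\nOPENAI_API_KEY=sk-REDACTED\n\nLOG_LEVEL=INFO\n"
print(to_set_env_vars(parse_env_file(sample)))
```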
Set environment variables like SECRET_OPENAI_API_KEY=openai-api-key then run with --use-secrets to map secrets automatically:

    SECRET_OPENAI_API_KEY=openai-api-key SECRET_TAVILY_API_KEY=tavily-api-key \
      ./scripts/deploy.sh --use-secrets --project YOUR_PROJECT

Custom image tag:

    ./scripts/deploy.sh --image-tag "$(date +%Y%m%d%H%M)" --project YOUR_PROJECT

Dry run:

    ./scripts/deploy.sh --dry-run --project YOUR_PROJECT

Outputs the full deploy command without executing.
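
The `SECRET_*` convention used with `--use-secrets` can be pictured as a mapping from prefixed variables to `--set-secrets` entries of the form `ENV_VAR=secret-name:version`. The sketch below is an assumption about that mapping, not the script's actual code; `to_set_secrets` is a hypothetical name and pinning to `latest` is an assumed default.

```python
def to_set_secrets(env: dict, version: str = "latest") -> str:
    """Turn SECRET_<NAME>=<secret-id> pairs into a --set-secrets value."""
    prefix = "SECRET_"
    items = []
    for key in sorted(env):  # sorted for a stable, reproducible flag value
        if key.startswith(prefix) and env[key]:
            items.append(f"{key[len(prefix):]}={env[key]}:{version}")
    return ",".join(items)


example = {
    "SECRET_OPENAI_API_KEY": "openai-api-key",
    "SECRET_TAVILY_API_KEY": "tavily-api-key",
    "PATH": "/usr/bin",  # ignored: no SECRET_ prefix
}
print(to_set_secrets(example))
```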