Early Beta — functional but APIs change, Apify actors rotate, and edge cases are still being discovered.
Версия на русском / Russian version
Systematic intelligence gathering on individuals. From a name or handle to a scored dossier with psychoprofile, career map, and entry points.
Works with any AI coding agent that supports the SKILL.md format:
Agent
Install
Claude Code
cp -r osint/ ~/.claude/skills/osint/
OpenClaw
Copy to workspace skills/ directory
Codex
Point to osint/SKILL.md in your agent config
OpenCode
Copy to project skills/ directory
Any SKILL.md agent
Place osint/ folder where your agent reads skills
The skill uses standard tools (bash, curl, node, python3) and writes instructions in Markdown — no vendor lock-in.
Phased pipeline (0 → 1 → 1.5 → 2 → 3 → 4 → 5 → 6): from quick search to deep research
Swarm Mode : coordinates 3-5 parallel sub-agents on Sonnet for speed
55+ Apify actors embedded: Instagram (12), Facebook (14), TikTok (14), YouTube (5), Google Maps (4), LinkedIn, and more
Psychoprofile : MBTI / Big Five based on content analysis (YouTube transcripts, Telegram messages, blogs)
Confidence Scoring : every fact gets graded A/B/C/D by number of independent confirmations
Internal Intelligence : checks Telegram history, email, vault contacts BEFORE going external
Research Escalation : 4 levels from free to $0.50, from seconds to minutes
Budget tracking : spends ≤$0.50 without asking, asks permission above that
# Clone the repo
git clone https://github.com/smixs/osint-skill.git
cd osint-skill
# Copy to your agent's skills directory (example for Claude Code)
cp -r osint/ ~ /.claude/skills/osint/
# Run self-diagnostics to check what's available
bash osint/scripts/diagnose.sh
Tool
Purpose
Install
curl
HTTP requests to APIs
Pre-installed on macOS/Linux
python3
JSON parsing, MCP client
Pre-installed on macOS/Linux
jq
JSON processing
brew install jq / apt install jq
For Apify actors (55+ platforms)
Tool
Purpose
Install
Node.js 18+
Runs run_actor.js (embedded Apify runner)
nodejs.org
Tool
Purpose
Install
mcpc
Dynamic actor discovery in Apify Store
npm install -g @apify/mcpc
The skill uses graceful degradation — the more API keys you provide, the deeper it can dig. You need at least one search API to get started.
Service
Env Variable
What It Does
Get It
Brave Search
(built into Claude Code)
2,000 queries/month, basic web search
Built-in, nothing needed
Jina AI
JINA_API_KEY
URL → markdown reader, search, deepsearch
jina.ai/api-key
Apify
APIFY_API_TOKEN
Instagram, TikTok, YouTube, LinkedIn scraping. Free tier ~$5/month
console.apify.com
Parallel AI
PARALLEL_API_KEY
AI-powered search with reasoning and citations
platform.parallel.ai
Service
Env Variable
What It Does
Cost
Get It
Perplexity API
PERPLEXITY_API_KEY
Sonar (fast AI answers), Deep Research
~$5/month
perplexity.ai/settings/api
Exa AI
EXA_API_KEY
Semantic search, people/company research
~$5/month
dashboard.exa.ai
Tavily
TAVILY_API_KEY
Agent-optimized search, $0.005/request basic
~$5/month
app.tavily.com
Service
Env Variable
What It Does
Cost
Get It
Bright Data
BRIGHTDATA_MCP_URL
CAPTCHA bypass, authwall bypass, Facebook scraping, Yandex search
~$10/month+
brightdata.com/mcp
Option 1 — Environment variables (recommended):
export PERPLEXITY_API_KEY=" pplx-..."
export EXA_API_KEY=" exa-..."
export APIFY_API_TOKEN=" apify_api_..."
export JINA_API_KEY=" jina_..."
export TAVILY_API_KEY=" tvly-..."
export PARALLEL_API_KEY=" ..."
export BRIGHTDATA_MCP_URL=" https://mcp.brightdata.com/..."
Option 2 — File fallback (supported by some scripts):
<workspace>/scripts/apify-api-token.txt
<workspace>/scripts/jina-api-key.txt
<workspace>/scripts/parallel-api-key.txt
<workspace>/scripts/brightdata-mcp-url.txt
Phase 0: Tooling Self-Check → diagnose.sh, check available tools
Phase 1: Seed Collection → parallel search across all engines
Phase 1.5: Internal Intelligence → Telegram, email, vault (BEFORE external sources)
Phase 2: Platform Extraction → LinkedIn, Instagram, Facebook, TikTok, YouTube...
Phase 3: Cross-Reference → facts verified, graded A/B/C/D
Phase 4: Psychoprofile → MBTI, Big Five, communication style
Phase 5: Completeness Check → 9 mandatory checks + Depth Score 1-10
Phase 6: Dossier Output → formatted dossier from template
Research Escalation (cheap → expensive)
Level 1: Quick Answers → Perplexity Sonar, Brave, Tavily, Exa (~$0.00)
Level 2: Source Verification → Jina read, Parallel extract (~$0.01)
Level 3: Social Media → Apify scrapers, Bright Data (~$0.01-0.10)
Level 4: Deep Research → Perplexity Deep, Exa Deep, Jina Deep (~$0.05-0.50)
Script
Purpose
diagnose.sh
Self-diagnostics for all tools and APIs
perplexity.sh
search / sonar / deep research
tavily.sh
search / deep / extract
exa.sh
search / company / people / crawl / deep
first-volley.sh
Parallel search across all engines at once
merge-volley.sh
Deduplicate and group search results
apify.sh
LinkedIn / Instagram / any actor / store search
run-actor.sh
Universal Apify runner (55+ actors, polling, CSV/JSON export)
run_actor.js
Node.js engine powering run-actor.sh
jina.sh
read URL / search / deepsearch
parallel.sh
search / extract
brightdata.sh
scrape / search / search-geo / search-yandex
mcp-client.py
Lightweight MCP client for Bright Data (stdlib only)
osint/
├── SKILL.md # Main skill file (452 lines)
├── references/
│ ├── tools.md # Full catalog of 55+ Apify actors + all tools
│ ├── platforms.md # Platform-specific extraction guide
│ ├── content-extraction.md # YouTube/podcast/blog extraction
│ └── psychoprofile.md # MBTI/Big Five methodology
├── assets/
│ └── dossier-template.md # Output dossier template
└── scripts/
├── diagnose.sh # Self-check
├── run-actor.sh # Universal Apify runner (bash wrapper)
├── run_actor.js # Apify runner engine (Node.js, embedded)
├── package.json # ESM support for run_actor.js
├── apify.sh # Apify shortcuts
├── perplexity.sh # Perplexity API
├── tavily.sh # Tavily API
├── exa.sh # Exa AI API
├── jina.sh # Jina AI API
├── parallel.sh # Parallel AI API
├── brightdata.sh # Bright Data MCP
├── mcp-client.py # MCP client (Python, stdlib only)
├── first-volley.sh # Parallel first search
└── merge-volley.sh # Result merging
Shell injection : user input is interpolated into JSON without jq escaping. Do not run with untrusted input.
macOS : first-volley.sh uses tail --pid (Linux-only). Parallel searches work on macOS, but timeout logic may not trigger.
Apify actors : actor IDs can change or get removed without notice. Use apify.sh store-search to find alternatives.
Inconsistent key loading : Perplexity, Tavily, and Exa only load from env vars (no file fallback, unlike Apify/Jina/Parallel).
MIT