Self-hosted search API for AI agents. 17 endpoints. Kill-chain content extraction with optional browser rendering. Optional Tor-anonymized stack. No third-party search API keys, no per-query fees, no vendor lock-in. Optional local bearer auth is supported.
```bash
git clone https://github.com/brcrusoe72/agent-search.git
cd agent-search
./scripts/prepare-searxng.sh
docker compose up -d
curl "http://localhost:3939/search?q=distributed+consensus+algorithms"
```

You now have a deduplicated, multi-engine search API running on :3939.
If you enable auth, pass the token on all non-health endpoints:
```bash
export AGENT_SEARCH_TOKEN="change-me"
curl -H "Authorization: Bearer $AGENT_SEARCH_TOKEN" \
  "http://localhost:3939/search?q=distributed+consensus+algorithms"
```

Prefer not to use Docker for the API server?
```bash
git clone https://github.com/brcrusoe72/agent-search.git
cd agent-search
./scripts/install-native.sh
./scripts/run-native.sh
```

Native mode requires Python 3.11+ and a reachable SearXNG instance with JSON output enabled. It stores AgentSearch state in ./data. See Native Install.
Run the self-contained test suite:
```bash
python -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt pytest requests
pip install -e sdk -e mcp-server
./scripts/prepare-searxng.sh
pytest tests -q
python -m compileall app adapters mcp-server/agent_search_mcp scripts sdk -q
docker compose -f docker-compose.yml config --quiet
docker compose -f docker-compose.yml -f examples/compose.private.yml config --quiet
docker build -t agent-search-api:test .
```

Those tests mock SearXNG, so they do not require Docker or a running local service.
Run the optional live localhost check:
```bash
AGENTSEARCH_INTEGRATION=1 pytest tests -q
```

If your local instance requires auth:

```bash
AGENT_SEARCH_TOKEN="change-me" AGENTSEARCH_INTEGRATION=1 pytest tests -q
```

Run the optional Docker smoke tests against a running direct/private stack:

```bash
AGENT_SEARCH_TOKEN="change-me" \
AGENTSEARCH_DOCKER_INTEGRATION=1 \
pytest tests/test_live_docker.py -q
```

AgentSearch wraps SearXNG with a FastAPI layer that adds everything LLM agents actually need: deduplication, cross-engine scoring, content extraction, query expansion, domain trust scoring, prompt injection scrubbing, and self-improvement.
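
As a rough illustration of what cross-engine deduplication and scoring can look like, the sketch below merges hits that several engines returned for the same page. The normalization rules, the `dedupe` helper, the result keys (`url`, `title`, `engine`), and the scoring rule are assumptions made for this example, not AgentSearch's actual implementation.

```python
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Build a comparison key (assumed rules: ignore scheme, 'www.', fragment, trailing slash)."""
    parts = urlsplit(url)
    host = parts.netloc.lower().removeprefix("www.")
    path = parts.path.rstrip("/")
    query = f"?{parts.query}" if parts.query else ""
    return f"{host}{path}{query}"


def dedupe(results: list[dict]) -> list[dict]:
    """Merge duplicate hits and rank pages found by more engines first."""
    merged: dict[str, dict] = {}
    for hit in results:
        key = normalize_url(hit["url"])
        entry = merged.setdefault(
            key, {"url": hit["url"], "title": hit["title"], "engines": []}
        )
        if hit["engine"] not in entry["engines"]:
            entry["engines"].append(hit["engine"])
    # Illustrative score: the number of distinct engines that agreed on the page.
    return sorted(merged.values(), key=lambda e: len(e["engines"]), reverse=True)


hits = [
    {"url": "https://www.example.com/raft/", "title": "Raft", "engine": "bing"},
    {"url": "http://example.com/raft", "title": "Raft", "engine": "brave"},
    {"url": "https://example.org/paxos", "title": "Paxos", "engine": "bing"},
]
for entry in dedupe(hits):
    print(len(entry["engines"]), entry["url"])
```

A page that two engines agree on sorts ahead of a page only one engine returned, which is the basic idea behind cross-engine scoring.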
Standard stack: `docker compose up` gives you search on :3939.

Private stack: `docker compose -f docker-compose.yml -f examples/compose.private.yml up` adds an anonymized instance on :3940 that routes all traffic through Tor with Snowflake obfuscation. DNS is encrypted via CoreDNS → Cloudflare DoT. Isolation is enforced at the network level: the private stack has no egress route except through Tor.
AgentSearch delegates engine support to the connected SearXNG instance. The authoritative list for a running stack is:
```bash
curl "http://localhost:3939/engines"
```

The bundled searxng/settings.example.yml explicitly enables 25 engines, including best-effort Google/Startpage/Yahoo entries plus Brave, Bing, DuckDuckGo, Google Scholar, Semantic Scholar, arXiv, Crossref, OpenAlex, PubMed, Bing News, Reuters, Wikinews, Wikipedia, Wikidata, Hugging Face, Reddit, Hacker News, Stack Overflow, GitHub, Docker Hub, and Lobsters.
Run ./scripts/prepare-searxng.sh to create ignored local runtime files at searxng/settings.yml and searxng/settings.tor.yml with generated SearXNG instance secrets. Do not commit those generated files.
Because SearXNG is configured with use_default_settings: true, your live instance may expose additional enabled engines from the installed SearXNG catalog. Use the engines= query parameter to request specific engines, and use /engines to verify what is available in that deployment.
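
A minimal sketch of requesting specific engines from Python. It assumes `engines=` takes a comma-separated list (as SearXNG's own search API does) and that you have already collected the available engine names from `/engines`; the helper names are hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:3939"


def pick_available(wanted: list[str], available: set[str]) -> list[str]:
    """Keep only the engines that the deployment reports as available."""
    return [name for name in wanted if name in available]


def search_url(query: str, engines: list[str], count: int = 5) -> str:
    """Build a /search URL that asks for specific engines.

    Assumption: engines= accepts a comma-separated list.
    """
    params = {"q": query, "count": count, "engines": ",".join(engines)}
    return f"{BASE_URL}/search?{urlencode(params)}"


# 'available' would come from GET /engines; it is hard-coded here for illustration.
available = {"bing", "brave", "duckduckgo"}
engines = pick_available(["bing", "google", "brave"], available)
print(search_url("raft consensus", engines))
```

Filtering against `/engines` first avoids asking for an engine that the connected SearXNG instance does not expose.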
Strategy modes can also use direct no-key providers for vertical search. These do not require paid search APIs: GitHub repository search, MDN search, Docker Hub search, PyPI package metadata, Wikipedia, Wikidata, Hacker News, arXiv, Crossref, OpenAlex, and Semantic Scholar are called directly when a mode selects them. SearXNG remains the broad-web provider for Google/Bing/Brave/DuckDuckGo-style engines, but Google, Startpage, Yahoo, and Reddit are best-effort explicit sources rather than defaults because they are commonly blocked or empty.
SearXNG finds pages. AgentSearch finds pages, reads them, scores them, deduplicates them, caches them, scrubs prompt injections out of them, detects paywalls, falls back through escalating extraction strategies when the first one fails, and gets better at it over time. One API call.

| | AgentSearch | Tavily | Exa | SerpAPI | Raw SearXNG |
|---|---|---|---|---|---|
| Cost | Free | $0.005/query | $0.001/query | $50/mo | Free |
| Self-hosted | ✅ | ❌ | ❌ | ❌ | ✅ |
| Content extraction | Kill chain + browser renderer | Basic | Basic | ❌ | ❌ |
| Deduplication | Cross-engine | ❌ | ❌ | ❌ | ❌ |
| Prompt injection scrubbing | ✅ | ❌ | ❌ | ❌ | ❌ |
| Self-improving | ✅ (evolver) | ❌ | ❌ | ❌ | ❌ |
| Tor anonymization | Optional | ❌ | ❌ | ❌ | Manual |

Search endpoints:

| Endpoint | Method | What it does |
|---|---|---|
| `/search` | GET | Multi-engine web search with deduplication and scoring |
| `/search/strategy` | GET | Named search modes: general, code, academic, news, private, reference, community |
| `/search/deep` | GET | Server-side query expansion: runs variations in parallel, fuses results |
| `/search/extract` | GET | Search + inline content extraction in one call |
| `/search/jobs` | GET | Job search across LinkedIn, Indeed, Glassdoor, ZipRecruiter |
| `/search/policy` | GET | Policy and regulatory document search |
| `/search/sources` | GET | Source discovery with institutional filtering |
| `/search/sources/institutions` | GET | List source registry institutions |
| `/search/stats` | GET | Query statistics and cache metrics |
| `/news` | GET | Structured multi-source news with reliable defaults and explicit engine overrides |

Content extraction endpoints:

| Endpoint | Method | What it does |
|---|---|---|
| `/read` | GET | Kill chain extraction for any URL |
| `/read/batch` | POST | Concurrent multi-URL extraction in one request |
| `/providers/browser/fetch` | GET | Ephemeral browser render/extract for JS-rendered target pages |

The kill chain escalates through strategies until one succeeds:
- Direct fetch + smart content selectors
- Readability scoring (paragraph density vs link density)
- User-agent rotation (Chrome/Safari/Firefox/Edge signatures)
- Browser render/extract for JS-rendered target pages
- Wayback Machine (CDX API → latest snapshot)
- Google Cache
- Search-about fallback (find coverage elsewhere)
- Custom adapters (pluggable Python modules from disk)
- PDF extraction (pdfplumber)
- YouTube transcript (yt-dlp)
Every request gets SSRF protection, prompt injection detection, paywall detection, and content length caps automatically. The browser renderer uses an ephemeral context, blocks high-cost resource types by default, and reports CAPTCHA/challenge pages instead of trying to bypass them. It is for rendering target pages, not for scraping blocked search-result pages.
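
The escalation idea can be sketched as a loop over strategies that stops at the first usable result. This is only an illustration: the `run_chain` helper, the `min_chars` threshold, and the stand-in strategies are hypothetical, and the result keys simply mirror the `/read` response fields shown in this README.

```python
from typing import Callable, Optional

Strategy = Callable[[str], Optional[str]]


def run_chain(url: str, strategies: list[tuple[str, Strategy]], min_chars: int = 200) -> dict:
    """Try each strategy in order; stop at the first that returns enough text."""
    tried: list[str] = []
    for name, fetch in strategies:
        tried.append(name)
        try:
            content = fetch(url)
        except Exception:
            # A failing strategy is not fatal; escalate to the next one.
            continue
        if content and len(content) >= min_chars:
            return {"url": url, "content": content, "strategy": name,
                    "chars": len(content), "strategies_tried": tried}
    return {"url": url, "content": None, "strategy": None,
            "chars": 0, "strategies_tried": tried}


# Stand-in strategies; real ones would perform network I/O.
def direct(url: str) -> Optional[str]:
    raise ConnectionError("blocked")


def readability(url: str) -> Optional[str]:
    return "too short"


def wayback(url: str) -> Optional[str]:
    return "archived article text " * 20


result = run_chain(
    "https://example.com/a",
    [("direct", direct), ("readability", readability), ("wayback", wayback)],
)
print(result["strategy"], result["strategies_tried"])
```

Recording every attempted strategy, not just the winner, is what makes a field like `strategies_tried` possible.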

Adaptation endpoints:

| Endpoint | Method | What it does |
|---|---|---|
| `/adapt/report` | POST | Report a fetch failure for a URL |
| `/adapt/stats` | GET | View adaptation metrics and failure patterns |
| `/adapt/evolve` | POST | Trigger self-improvement cycle: analyzes failures, tunes config |

System endpoints:

| Endpoint | Method | What it does |
|---|---|---|
| `/health` | GET | Health check (API + SearXNG status) |
| `/engines` | GET | List available search engines and their status |
| `/providers/health` | GET | Summarize provider health from recorded live attempts |
| `/providers/stats` | GET | Rolling provider/SearXNG attempt telemetry |

```bash
curl "http://localhost:3939/search/extract?q=python+async+patterns&count=3"
```

Returns search results with extracted content inline, so no second round-trip to /read is needed.

```bash
curl "http://localhost:3939/search/strategy?q=fetch+api&mode=code&count=5"
curl "http://localhost:3939/search?q=AI+regulation&mode=academic&count=5"
curl "http://localhost:3939/search?q=Python&mode=reference&count=5"
```

Modes validate or call only their declared sources instead of falling back silently:

- `general` tries Bing first, then uses DuckDuckGo/Brave and direct reference/community providers only when more coverage is needed.
- `code` uses direct GitHub, MDN, Docker Hub, and PyPI providers.
- `academic` uses direct arXiv, Crossref, OpenAlex, and Semantic Scholar providers.
- `news` uses Reuters, Bing News, DuckDuckGo News, and Wikinews through SearXNG.
- `reference` uses direct Wikipedia and Wikidata providers.
- `community` uses direct Hacker News.
- `private` avoids broad general web engines.

Each strategy response includes meta.engine_attempts with source/provider, query, raw result count, latency, and upstream errors so blocked or empty providers stay visible.
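
A small sketch of how an agent might surface blocked or empty providers from `meta.engine_attempts`. The exact key names are not documented here, so `provider`, `raw_count`, and `error` are assumptions; adjust them to match a real response.

```python
def failed_attempts(response: dict) -> list[str]:
    """Return one line per provider that errored or returned nothing.

    Assumed keys on each attempt: 'provider', 'raw_count', 'error'.
    """
    problems = []
    for attempt in response.get("meta", {}).get("engine_attempts", []):
        name = attempt.get("provider", "unknown")
        if attempt.get("error"):
            problems.append(f"{name}: error: {attempt['error']}")
        elif attempt.get("raw_count", 0) == 0:
            problems.append(f"{name}: empty")
    return problems


# Hand-written sample shaped like a strategy response (illustrative values only).
sample = {
    "results": [],
    "meta": {
        "engine_attempts": [
            {"provider": "github", "raw_count": 12, "error": None},
            {"provider": "mdn", "raw_count": 0, "error": None},
            {"provider": "pypi", "raw_count": 0, "error": "HTTP 429"},
        ]
    },
}
for line in failed_attempts(sample):
    print(line)
```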
Provider telemetry is available without running another probe:
```bash
curl "http://localhost:3939/providers/health"
curl "http://localhost:3939/providers/stats"
```

Telemetry is in-memory and reflects live attempts since the API process started. It tracks attempts, successes, empty-result rate, errors, latency, last error, and last success per direct provider or SearXNG pack.

```bash
curl "http://localhost:3939/search/deep?q=ethon+industrial+ai+platform&count=10"
```

Server-side query variation + parallel execution + result fusion. Surfaces results that flat /search misses.
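
To make "result fusion" concrete, here is a sketch using reciprocal rank fusion (RRF). The README does not say which fusion method `/search/deep` uses, so RRF is a stand-in chosen for illustration, and `fuse` and the constant `k = 60` are assumptions.

```python
from collections import defaultdict


def fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: score(url) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, url in enumerate(ranking, start=1):
            scores[url] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Each inner list is the ranked output of one query variation.
rankings = [
    ["https://a.example", "https://b.example", "https://c.example"],
    ["https://b.example", "https://a.example", "https://d.example"],
    ["https://b.example", "https://d.example"],
]
print(fuse(rankings))
```

A URL that ranks well across several query variations rises to the top even if no single variation placed it first.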
```bash
curl "http://localhost:3939/read?url=https://example.com/paywalled-article"
```

```json
{
  "url": "https://example.com/paywalled-article",
  "content": "Full article text extracted via Wayback Machine...",
  "strategy": "wayback",
  "chars": 4821,
  "cached": false,
  "strategies_tried": ["direct", "readability", "ua_rotation", "wayback"]
}
```

```bash
curl "http://localhost:3939/providers/browser/fetch?url=https://example.com/app&max_links=20"
```

Returns rendered text, page title, final URL, extracted links, trust metadata, and challenge_detected=true if the page is a CAPTCHA or bot challenge.
```bash
curl -X POST "http://localhost:3939/read/batch" \
  -H "Content-Type: application/json" \
  -d '{"urls": ["https://a.com", "https://b.com", "https://c.com"]}'
```

```bash
pip install agentsearch-client
```

```python
from agentsearch import AgentSearch

client = AgentSearch()  # defaults to localhost:3939
results = client.search("manufacturing OEE best practices", count=5)
for r in results.results:
    print(f"{r.title} - {r.url}")

# Content extraction
page = client.read("https://example.com/article")
print(page.content[:500])

# Browser-rendered extraction for JS-heavy target pages
rendered = client.browser_fetch("https://example.com/app", max_links=20)
print(rendered.title, rendered.links[:3])

# Batch read
pages = client.read_batch(["https://a.com", "https://b.com"])
print(f"{pages.successful}/{pages.total} succeeded")
```

For authenticated instances, pass token=... or use AGENT_SEARCH_TOKEN, AGENTSEARCH_TOKEN, credentials/agent-search-token.txt, or ~/.config/agent-search/token.
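
If you are writing your own client rather than using the SDK, token lookup along the lines above might be sketched as follows. The precedence (explicit argument, then environment variables, then files) is an assumption based on the order the sources are listed in, and `resolve_token` is a hypothetical helper, not part of the SDK.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_token(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    files: Optional[list[Path]] = None,
) -> Optional[str]:
    """Return the first token found, or None.

    Assumed precedence: explicit argument, environment variables, token files.
    """
    env = os.environ if env is None else env
    if files is None:
        files = [
            Path("credentials/agent-search-token.txt"),
            Path.home() / ".config" / "agent-search" / "token",
        ]
    if explicit:
        return explicit
    for name in ("AGENT_SEARCH_TOKEN", "AGENTSEARCH_TOKEN"):
        if env.get(name):
            return env[name]
    for path in files:
        if path.is_file():
            token = path.read_text(encoding="utf-8").strip()
            if token:
                return token
    return None


token = resolve_token(env={"AGENTSEARCH_TOKEN": "change-me"}, files=[])
headers = {"Authorization": f"Bearer {token}"} if token else {}
print(headers)
```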
```python
from langchain.tools import tool
import requests


@tool
def web_search(query: str) -> str:
    """Search the web using AgentSearch."""
    resp = requests.get(
        "http://localhost:3939/search",
        params={"q": query, "count": 5},
        timeout=30,
    )
    resp.raise_for_status()  # surface HTTP errors, for example 401 when auth is enabled
    return "\n".join(
        f"- {r['title']}: {r['url']}\n  {r['snippet']}"
        for r in resp.json()["results"]
    )
```

```bash
pip install mcp httpx
python mcp-server/server.py
```

For authenticated instances, set AGENT_SEARCH_TOKEN or run:

```bash
python mcp-server/server.py --token "change-me"
```

Add to Claude Desktop config:

```json
{
  "mcpServers": {
    "agent-search": {
      "command": "python",
      "args": ["/path/to/mcp-server/server.py"]
    }
  }
}
```

See mcp-server/README.md for details.
The optional private stack adds a fully anonymized search path:
```
┌──────────┐    ┌──────────────┐    ┌───────────────┐    ┌──────────┐
│  :3940   │───▶│ api-private  │───▶│ searxng-priv  │───▶│   Tor    │──▶ Internet
│ (agent)  │    │  (FastAPI)   │    │   (SearXNG)   │    │(Snowflake│
└──────────┘    └──────────────┘    └───────────────┘    │ + obfs4) │
                                                         └──────────┘
All containers use CoreDNS → Cloudflare DoT
tor-internal network: no direct egress possible
```
What this gives you:
- Your ISP sees TLS to Cloudflare (DNS) and WebRTC-looking traffic (Snowflake). Not search queries.
- The private SearXNG instance lives on an internal-only Docker network with no internet route except through Tor.
- Port 3939 = direct (fast), port 3940 = anonymized (slower, private).
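
A client can pick between the two ports per request. The sketch below shows one way to do that; the timeout values are illustrative assumptions (the source only says the Tor path is slower), and `request_settings` is a hypothetical helper.

```python
DIRECT_URL = "http://localhost:3939"   # fast, direct egress
PRIVATE_URL = "http://localhost:3940"  # Tor-anonymized, slower


def request_settings(private: bool) -> dict:
    """Choose base URL and timeout for a request.

    The timeouts are assumed values; tune them for your own deployment.
    """
    return {
        "base_url": PRIVATE_URL if private else DIRECT_URL,
        "timeout": 60.0 if private else 15.0,
    }


print(request_settings(private=True))
```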
Setup:
```bash
./scripts/prepare-searxng.sh
docker compose -f docker-compose.yml -f examples/compose.private.yml up -d --build
```

All private stack configs live in examples/. Copy and customize as needed.
```
Port 3939 (direct)                  Port 3940 (Tor-anonymized)
     │                                     │
     ▼                                     ▼
┌─────────┐                         ┌─────────────┐
│   API   │                         │ api-private │
│(FastAPI)│                         │  (FastAPI)  │
├─────────┤                         ├─────────────┤
│ dedup   │  ┌─────────────────┐    │ same code   │  ┌───────────────┐
│ scoring │  │ SearXNG         │    │ Tor egress  │  │ SearXNG-priv  │
│ cache   │──│ Google, Bing,   │    │ only        │──│ (tor-internal │
│ scrub   │  │ DDG, Brave,     │    └─────────────┘  │ network)      │
│ killchn │  │ /engines list   │                     └───────┬───────┘
│ trust   │  └─────────────────┘                             │
│ evolver │                                            ┌─────┴─────┐
└─────────┘                                            │    Tor    │
     │                                                 │ Snowflake │
     ▼                                                 │ + obfs4   │
┌─────────┐                                            └───────────┘
│ CoreDNS │──▶ Cloudflare DoT (encrypted DNS)
└─────────┘
```

| Module | LOC | What it does |
|---|---|---|
| `killchain.py` | 1016 | Escalating content extraction and browser-render fallback |
| `browser_renderer.py` | 320 | Ephemeral browser rendering, extraction, and challenge detection |
| `main.py` | 920 | FastAPI app, 17 endpoints, auth, rate limiting |
| `source_tracer.py` | 620 | Source provenance tracking and citation chains |
| `scrubber.py` | 539 | Prompt injection detection and content sanitization |
| `source_library.py` | 310 | Curated institutional source registry |
| `domain_trust.py` | 311 | Domain trust scoring (TLD, age, reputation) |
| `evolver.py` | 301 | Self-improvement engine: failure analysis → config tuning |
| `content_cache.py` | 241 | URL-keyed content cache with TTL |
| `query_expansion.py` | 201 | Server-side query variation and fusion |

Plus: 5 pluggable adapters (Cloudflare bypass, Medium, 403 handler, parse error recovery, empty content fallback), MCP server, Python SDK, test suite.
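
Loading "pluggable Python modules from disk" can be done with the standard library's `importlib`. The sketch below is one possible shape, not the project's loader: the adapter contract (`handles(url)` and `extract(url)`) and the helper names are hypothetical, and the demo writes throwaway adapter files to a temporary directory.

```python
import importlib.util
import tempfile
from pathlib import Path
from types import ModuleType
from typing import Optional


def load_adapters(directory: Path) -> list[ModuleType]:
    """Import every .py file in `directory` that defines handles() and extract()."""
    adapters = []
    for path in sorted(directory.glob("*.py")):
        if path.name.startswith("_"):
            continue
        spec = importlib.util.spec_from_file_location(path.stem, path)
        if spec is None or spec.loader is None:
            continue
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        # Hypothetical contract: an adapter exposes handles(url) and extract(url).
        if callable(getattr(module, "handles", None)) and callable(getattr(module, "extract", None)):
            adapters.append(module)
    return adapters


def extract_with_adapters(url: str, adapters: list[ModuleType]) -> Optional[str]:
    """Return the output of the first adapter that claims the URL, else None."""
    for adapter in adapters:
        if adapter.handles(url):
            return adapter.extract(url)
    return None


# Demo with a throwaway adapter directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "medium.py").write_text(
        "def handles(url):\n    return 'medium.com' in url\n\n"
        "def extract(url):\n    return 'text from ' + url\n",
        encoding="utf-8",
    )
    Path(tmp, "notes.py").write_text("x = 1\n", encoding="utf-8")  # not an adapter
    loaded = load_adapters(Path(tmp))
    names = [m.__name__ for m in loaded]
    demo = extract_with_adapters("https://medium.com/p/1", loaded)
    miss = extract_with_adapters("https://example.com", loaded)

print(names, demo, miss)
```

Only run adapter code you trust: importing a module executes it.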
A real autonomous research agent ("the wolf") uses every AgentSearch endpoint. Before AgentSearch was wired in correctly, the agent's hand-rolled SearXNG client silently 401'd on three of four engines. Every hunt on a low-profile entity returned 0 frameworks.
After: 17 frameworks per hunt, 7/7 gaps closed. Same agent, same model, same prompts. The difference was the search infrastructure underneath.
→ Full walkthrough: case-studies/wolf.md
Environment variables (set in docker-compose.yml or .env):

| Variable | Default | Description |
|---|---|---|
| `SEARXNG_URL` | `http://searxng:8080` | SearXNG instance URL |
| `SEARXNG_IMAGE` | pinned SearXNG digest | SearXNG container image; override only when intentionally upgrading |
| `PYTHON_BASE_IMAGE` | pinned Python digest | API Docker base image; override only when intentionally upgrading |
| `COREDNS_IMAGE` | pinned CoreDNS digest | Private-stack DNS image |
| `SOCAT_IMAGE` | pinned socat digest | Private-stack TCP forwarder image |
| `TOR_BASE_IMAGE` | pinned Debian digest | Private-stack Tor proxy base image |
| `CACHE_TTL` | `3600` | Cache duration in seconds |
| `RATE_LIMIT` | `60` | Max requests per minute |
| `SQLITE_TIMEOUT` | `1.0` | SQLite lock wait timeout in seconds for query stats |
| `FETCH_LOG_RETENTION_DAYS` | `30` | Delete fetch analytics rows older than this many days during SQLite maintenance |
| `QUERY_LOG_RETENTION_DAYS` | `30` | Delete query analytics rows older than this many days during SQLite maintenance |
| `SQLITE_MAINTENANCE_INTERVAL_SECONDS` | `3600` | Interval for expired content-cache cleanup, log retention, and PRAGMA optimize |
| `SQLITE_VACUUM_MIN_DELETED_ROWS` | `1000` | Run VACUUM only when a maintenance pass deletes at least this many rows; set 0 to disable threshold vacuum |
| `AGENT_SEARCH_TOKEN` | (empty) | Bearer token for auth (optional) |
| `ADAPTERS_DIR` | `/app/adapters` | Path to pluggable adapter modules |

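
For reference, reading a few of these variables with their documented defaults might look like the sketch below. The `Settings` dataclass and `load_settings` helper are hypothetical and cover only a subset of the table; the application's real configuration code may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    searxng_url: str
    cache_ttl: int
    rate_limit: int
    sqlite_timeout: float
    token: str
    adapters_dir: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read a subset of the variables above, falling back to the documented defaults."""
    return Settings(
        searxng_url=env.get("SEARXNG_URL", "http://searxng:8080"),
        cache_ttl=int(env.get("CACHE_TTL", "3600")),
        rate_limit=int(env.get("RATE_LIMIT", "60")),
        sqlite_timeout=float(env.get("SQLITE_TIMEOUT", "1.0")),
        token=env.get("AGENT_SEARCH_TOKEN", ""),
        adapters_dir=env.get("ADAPTERS_DIR", "/app/adapters"),
    )


# Override one value and keep the rest at their defaults.
settings = load_settings({"CACHE_TTL": "600"})
print(settings.cache_ttl, settings.rate_limit, bool(settings.token))
```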
- Search engines, news engines, rate limits, and failure modes depend on the connected SearXNG instance. `/engines` is the live source of truth.
- Bearer auth is a simple local API gate, not a multi-user authorization system. Treat `AGENT_SEARCH_TOKEN` as a shared service token.
- Rate limiting is in memory. It resets on restart and is per API process.
- Query statistics use local SQLite with WAL and a bounded lock timeout. For high-volume multi-worker deployments, move query logging to an external database or telemetry backend.
- The MCP package intentionally bounds its `mcp` dependency to the tested 1.27.x line. Upgrade deliberately and run the package/CI checks before publishing.
- Content extraction validates the starting URL and every redirect hop before fetching redirected content, but fetched third-party pages are still untrusted and are scrubbed before being returned.
- Google Cache is unreliable because public cache availability changes frequently.
- The Tor/private stack is intentionally slower than direct search.
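
To see why in-memory rate limiting resets on restart and applies per process, consider this sliding-window sketch. It is an illustration only, not the project's limiter: the class name, the per-client keying, and the injectable clock are assumptions.

```python
import time
from collections import defaultdict, deque


class InMemoryRateLimiter:
    """Sliding-window limiter; all state lives in this process's memory."""

    def __init__(self, limit: int = 60, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.hits: dict[str, deque] = defaultdict(deque)

    def allow(self, client: str) -> bool:
        now = self.clock()
        recent = self.hits[client]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True


# Demo with a fake clock so the behavior is deterministic.
now = [0.0]
limiter = InMemoryRateLimiter(limit=2, window=60.0, clock=lambda: now[0])
first = [limiter.allow("agent") for _ in range(3)]
now[0] = 61.0
after_window = limiter.allow("agent")
print(first, after_window)
```

Because `hits` is an ordinary dictionary, a restart clears it and each worker process keeps its own counts, which matches the limitation noted above.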
```bash
pip install -r requirements.txt
SEARXNG_URL=http://localhost:8080 uvicorn app.main:app --reload --port 3939

# Run tests
pytest tests/
```

- `Test` runs on every push and pull request.
- `CodeQL` runs on push, pull request, and a weekly schedule.
- Dependabot watches Python packages, Dockerfiles, and GitHub Actions.
- Version releases are created from semantic tags such as `v2.0.1` or by manually running the `Release` workflow with a tag input.
- Update `CHANGELOG.md` before creating a release tag.
- Fork → branch → commit → PR.
Issues and PRs welcome. If you're building an agent that needs search, this is for you.
The root AgentSearch API, SDK, Docker stack, and docs are MIT licensed. The MCP server under mcp-server/ is AGPL-3.0 licensed; see mcp-server/LICENSE.