
ENDEVSOLS/LongTracer


RAG hallucination detection, multi-project tracing, and pluggable backends — all batteries included.


📖 Documentation  ·  Quick Start  ·  API Reference  ·  Changelog

Detect hallucinations in LLM-generated responses. LongTracer verifies every claim against source documents using hybrid STS + NLI, works with any RAG framework, and traces the full verification pipeline.

Quick Start

pip install longtracer

One-Liner & Batch API

from longtracer import check, check_batch

# Verify a single response
result = check(
    "The Eiffel Tower is 330 meters tall and located in Berlin.",
    ["The Eiffel Tower is a wrought-iron lattice tower in Paris, France. It is 330 metres tall."]
)

print(result.verdict)             # "FAIL"
print(result.trust_score)         # 0.0 - 1.0
print(result.hallucination_count) # 1 ("Berlin" contradicts "Paris")

# Verify in bulk
results = check_batch([
    {"response": "P is NP.", "sources": ["It is not known if P is NP."]},
    {"response": "Water boils at 100C.", "sources": ["Water boils at 100C."]}
])

CLI (no Python needed)

longtracer check "The Eiffel Tower is in Berlin." "The Eiffel Tower is in Paris."
# ✗ FAIL  trust=0.50  hallucinations=1

Full API

from longtracer import CitationVerifier

verifier = CitationVerifier(cache=True)  # optional result caching
result = verifier.verify_parallel(
    response="The Eiffel Tower is 330 meters tall and located in Berlin.",
    sources=["The Eiffel Tower is a wrought-iron lattice tower in Paris, France. It is 330 metres tall."]
)

No vector store dependency. No LLM dependency. Just strings in, verification out.

How It Works

  1. Claim splitting — LLM response is split into individual sentences/claims
  2. STS matching — Fast bi-encoder (all-MiniLM-L6-v2) finds the best-matching source sentence for each claim
  3. NLI verification — Cross-encoder (nli-deberta-v3-xsmall) classifies entailment/contradiction/neutral
  4. Verdict — Trust score computed, hallucinations flagged
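The four stages above can be sketched in plain Python. This is an illustrative mock, not LongTracer's implementation: the STS and NLI models are replaced with naive word-overlap heuristics, and every function name here is hypothetical.

```python
import re

def words(text: str) -> set[str]:
    return set(re.findall(r"\w+", text.lower()))

def split_claims(response: str) -> list[str]:
    # Stage 1: claim splitting (the real splitter is sentence-aware)
    return [s.strip() for s in response.split(".") if s.strip()]

def best_match(claim: str, sources: list[str]) -> str:
    # Stage 2: STS matching, faked here as highest word overlap
    return max(sources, key=lambda src: len(words(claim) & words(src)))

def entails(claim: str, source: str) -> bool:
    # Stage 3: NLI verification, faked here as word containment
    return words(claim) <= words(source)

def verify(response: str, sources: list[str]) -> dict:
    # Stage 4: verdict — trust score is the fraction of supported claims,
    # with FAIL at or below the default 0.5 threshold
    claims = split_claims(response)
    supported = sum(entails(c, best_match(c, sources)) for c in claims)
    score = supported / len(claims) if claims else 1.0
    return {"trust_score": score, "verdict": "PASS" if score > 0.5 else "FAIL"}
```

The real pipeline swaps the heuristics for model inference but keeps the same shape: split, match, classify, score.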

Framework Adapters

LangChain (3 lines)

pip install "longtracer[langchain]"
from longtracer import LongTracer, instrument_langchain

LongTracer.init(verbose=True)
instrument_langchain(your_chain)
# Your chain.invoke() now auto-verifies every response

LlamaIndex (3 lines)

pip install "longtracer[llamaindex]"
from longtracer import LongTracer, instrument_llamaindex

LongTracer.init(verbose=True)
instrument_llamaindex(your_query_engine)

Direct API (any framework)

from longtracer.guard.verifier import CitationVerifier

verifier = CitationVerifier()
result = verifier.verify_parallel(
    response="LLM said this...",
    sources=["chunk 1 text", "chunk 2 text"],
    source_metadata=[{"source": "doc.pdf", "page": 1}, {"source": "doc.pdf", "page": 2}]
)

Haystack v2

pip install "longtracer[haystack]"
from longtracer.adapters.haystack_handler import LongTracerVerifier

pipeline.add_component("verifier", LongTracerVerifier())
pipeline.connect("generator.replies", "verifier.response")
pipeline.connect("retriever.documents", "verifier.documents")

LangGraph Agents

pip install "longtracer[langgraph]"
from longtracer import instrument_langgraph

handler = instrument_langgraph(graph)
result = agent.invoke(
    {"messages": [("user", "What is X?")]},
    config={"callbacks": [handler]}
)

LangChain Agents

from longtracer import instrument_langchain_agent

handler = instrument_langchain_agent(agent_executor)
result = agent_executor.invoke({"input": "What is X?"})

Async Support

result = await verifier.verify_parallel_async(response, sources)

Works with Haystack, custom pipelines, or any code that produces strings.
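The async method composes with `asyncio.gather` for concurrent batches. The sketch below uses a stub coroutine in place of `verifier.verify_parallel_async` to show the fan-out pattern; the stub's substring check is purely illustrative.

```python
import asyncio

async def verify_async(response: str, sources: list[str]) -> dict:
    # Stand-in for verifier.verify_parallel_async(response, sources)
    await asyncio.sleep(0)  # model inference would happen here
    supported = response in " ".join(sources)
    return {"verdict": "PASS" if supported else "FAIL"}

async def verify_many(items: list[tuple[str, list[str]]]) -> list[dict]:
    # Fan out every verification concurrently
    return await asyncio.gather(*(verify_async(r, s) for r, s in items))

results = asyncio.run(verify_many([
    ("Water boils at 100C.", ["Water boils at 100C."]),
    ("P is NP.", ["P vs NP is an open problem."]),
]))
```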

Multi-Project Tracing

Track multiple RAG applications independently:

from longtracer import LongTracer

LongTracer.init(project_name="chatbot-prod", backend="sqlite")

# Get project-specific tracers
chatbot = LongTracer.get_tracer("chatbot-prod")
search  = LongTracer.get_tracer("search-api")

# Each project's traces are tagged and filterable
chatbot.start_root(inputs={"query": "..."})

Vector Store & LLM Agnostic

The SDK core takes plain str and List[str]. It does not depend on any vector store (Chroma, FAISS, Pinecone, Weaviate, Qdrant, pgvector) or any LLM provider (OpenAI, Anthropic, Ollama, Bedrock). Use whatever you want — LongTracer just verifies the output.
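Because the core takes plain strings, wiring in any stack is just a matter of collecting text. A hypothetical glue function (the `retrieve` and `generate` callables stand in for your retriever and your LLM call; neither is part of LongTracer):

```python
from typing import Callable

def grounded_answer(
    query: str,
    retrieve: Callable[[str], list[str]],       # any vector store or search
    generate: Callable[[str], str],             # any LLM provider or SDK
) -> tuple[str, list[str]]:
    """Run any retriever + any LLM, returning the (response, sources)
    string pair that a verifier call like check(response, sources) expects."""
    sources = retrieve(query)
    prompt = query + "\n\n" + "\n".join(sources)
    return generate(prompt), sources

# Stubs standing in for a real stack:
response, sources = grounded_answer(
    "How tall is the Eiffel Tower?",
    retrieve=lambda q: ["The Eiffel Tower is 330 metres tall."],
    generate=lambda prompt: "It is 330 metres tall.",
)
# result = check(response, sources)   # verdict, trust_score, ...
```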

Trace Storage Backends

LongTracer.init(backend="sqlite")   # default — persists to ~/.longtracer/traces.db
LongTracer.init(backend="memory")   # in-memory, lost on restart
LongTracer.init(backend="mongo")    # production, distributed

Backend      Install                               Where traces live
SQLite       built-in (default)                    ~/.longtracer/traces.db
Memory       built-in                              RAM only, lost on restart
MongoDB      pip install "longtracer[mongo]"       MongoDB database
PostgreSQL   pip install "longtracer[postgres]"    PostgreSQL database
Redis        pip install "longtracer[redis]"       Redis key-value store

Viewing Traces

CLI

longtracer view                        # list recent traces
longtracer view --last                 # view most recent
longtracer view --id <trace_id>        # view specific trace
longtracer view --project chatbot-prod # filter by project
longtracer view --export <trace_id>    # export to JSON
longtracer view --html <trace_id>      # export to HTML report

Console (verbose mode)

[longtracer] span=retrieval    chunks=5
[longtracer] span=llm_call     answer_len=179
[longtracer] span=eval_claims  total=3 supported=2
[longtracer] span=grounding    score=0.67 verdict=FAIL

HTML Report

from longtracer.guard.trace_report import export_trace_html
export_trace_html(tracer, filepath="report.html")

Generates a standalone HTML file with trust scores, a summary stats bar, and clickable per-claim evidence diffs — viewable in any browser, zero external dependencies.

JSON Export

from longtracer.guard.trace_report import export_trace_json
export_trace_json(tracer, filepath="trace.json")

Optional Dependencies

Extra        Install                                What it adds
langchain    pip install "longtracer[langchain]"    LangChain callback adapter
llamaindex   pip install "longtracer[llamaindex]"   LlamaIndex event adapter
haystack     pip install "longtracer[haystack]"     Haystack v2 component adapter
langgraph    pip install "longtracer[langgraph]"    LangGraph & LangChain agent tracing
mongo        pip install "longtracer[mongo]"        MongoDB trace backend
postgres     pip install "longtracer[postgres]"     PostgreSQL trace backend
redis        pip install "longtracer[redis]"        Redis trace backend
chroma       pip install "longtracer[chroma]"       ChromaDB + HuggingFace embeddings
all          pip install "longtracer[all]"          Everything

Configuration

Set project-level defaults via pyproject.toml or environment variables; environment variables take precedence over the file.

pyproject.toml

[tool.longtracer]
project = "my-rag-app"
backend = "sqlite"
threshold = 0.5
verbose = true
log_level = "INFO"

Environment Variables

Variable               Default      Description
LONGTRACER_ENABLED     false        Auto-enable with LongTracer.auto()
LONGTRACER_VERBOSE     false        Print per-span summaries
LONGTRACER_LOG_LEVEL   INFO         Python logging level
LONGTRACER_PROJECT     longtracer   Default project name
TRACE_CACHE_BACKEND    sqlite       Trace storage: sqlite, memory, mongo, postgres, redis
MONGODB_URI            (unset)      MongoDB connection URI
POSTGRES_HOST          (unset)      PostgreSQL host
REDIS_HOST             (unset)      Redis host
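For example, to enable tracing with the in-memory backend for a one-off run (variable names from the table above; how your application then starts is up to you):

```shell
export LONGTRACER_ENABLED=true
export LONGTRACER_VERBOSE=true
export TRACE_CACHE_BACKEND=memory
# Any process launched from this shell now inherits these settings
```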

Demo Application

The examples/ directory contains a complete RAG demo using ChromaDB + Ollama. It is NOT part of the published PyPI package. See examples/README.md for setup instructions.

Documentation

Full documentation at endevsols.github.io/LongTracer

License

MIT
