41 changes: 41 additions & 0 deletions docs/agents/tools.md
result = await tool.run(action="list_products", limit=10)
| `limit` | `10` | Max results for list endpoints (max 100) |

Auth via `STRIPE_API_KEY` env var (use restricted read-only keys in production).

## LinearTool

Manage Linear issues via the Linear GraphQL API. Stdlib `urllib` only — no extra dependencies.

```python
from synapsekit import LinearTool

tool = LinearTool()  # reads LINEAR_API_KEY from the environment

# List issues for a team
result = await tool.run(action="list_issues", team_id="team-uuid")

# Get a single issue
result = await tool.run(action="get_issue", issue_id="ISS-42")

# Create an issue
result = await tool.run(
action="create_issue",
team_id="team-uuid",
title="Add dark mode",
description="Users have been asking for it",
priority=2,
)

# Update an issue's state
result = await tool.run(action="update_issue", issue_id="ISS-42", status="state-uuid")
```

| Parameter | Default | Description |
|---|---|---|
| `action` | — | `list_issues`, `get_issue`, `create_issue`, `update_issue` (required) |
| `team_id` | — | Linear team ID (required for `list_issues` and `create_issue`) |
| `issue_id` | — | Linear issue ID (required for `get_issue` and `update_issue`) |
| `title` | — | Issue title (required for `create_issue`) |
| `description` | — | Issue body markdown |
| `priority` | `0` | `0` none, `1` urgent, `2` high, `3` medium, `4` low |
| `status` | — | New `stateId` for `update_issue` |

Auth via constructor arg or `LINEAR_API_KEY` env var.
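Under the hood each action maps to a single GraphQL request. The shape of such a request can be sketched with nothing but stdlib `urllib`; the mutation string, header handling, and helper below are illustrative assumptions, not `LinearTool`'s actual internals:

```python
import json
import urllib.request

LINEAR_API_URL = "https://api.linear.app/graphql"

# Illustrative mutation -- the exact GraphQL LinearTool sends is internal.
CREATE_ISSUE_MUTATION = """
mutation IssueCreate($input: IssueCreateInput!) {
  issueCreate(input: $input) { success issue { id identifier title } }
}
"""

def build_create_issue_request(api_key: str, team_id: str, title: str,
                               description: str = "", priority: int = 0):
    """Build an urllib Request for Linear's GraphQL endpoint (sketch)."""
    payload = {
        "query": CREATE_ISSUE_MUTATION,
        "variables": {"input": {
            "teamId": team_id,
            "title": title,
            "description": description,
            "priority": priority,
        }},
    }
    return urllib.request.Request(
        LINEAR_API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Linear personal API keys go in the Authorization header as-is.
            "Authorization": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request is then a matter of `urllib.request.urlopen(req)` and decoding the JSON response.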
15 changes: 13 additions & 2 deletions docs/changelog.md
All notable changes to SynapseKit are documented here.
- **`CodeSplitter`** — split source code using language-aware separators; supports Python, JavaScript, TypeScript, Go, Rust, Java, C++; preserves logical structures (classes, functions); falls back to recursive character splitting
- **`SentenceWindowSplitter`** — one chunk per sentence, padded with up to `window_size` surrounding sentences; `split_with_metadata()` adds `target_sentence` to each chunk's metadata; useful for retrieval systems that embed with context but score by target sentence
- **`TwilioTool`** — send SMS and WhatsApp messages via the Twilio REST API; stdlib `urllib` only, no extra deps; auth via constructor args or env vars; automatic `whatsapp:` prefix handling for both sender and recipient; security warning logged on instantiation

- **`NewsTool`** — fetch top headlines and search articles via NewsAPI; actions: `get_headlines`, `search`; stdlib urllib only; auth via constructor arg or `NEWS_API_KEY` env var
- **`WeatherTool`** — get current weather and short-term forecasts via OpenWeatherMap; actions: `current`, `forecast` (1–5 day); async-safe with `run_in_executor`; auth via `OPENWEATHERMAP_API_KEY`
- **`StripeTool`** — read-only Stripe data lookup: `get_customer`, `list_invoices`, `get_charge`, `list_products`; stdlib urllib only; auth via `STRIPE_API_KEY`; async-safe with `run_in_executor`
- **`LinearTool`** — manage Linear issues via the Linear GraphQL API; actions: `list_issues`, `get_issue`, `create_issue`, `update_issue`; stdlib urllib only, no extra deps; auth via constructor arg or `LINEAR_API_KEY`
- **`XaiLLM`** — xAI Grok LLM provider; OpenAI-compatible API; supports `grok-beta`, `grok-2`, `grok-2-mini`; streaming and tool calling; `pip install synapsekit[openai]`
- **`NovitaLLM`** — NovitaAI LLM provider; OpenAI-compatible API; supports Llama, Mistral, Qwen, and other open models; streaming and tool calling; `pip install synapsekit[openai]`
- **`WriterLLM`** — Writer (Palmyra) LLM provider; OpenAI-compatible API; supports `palmyra-x-004`, `palmyra-x-003-instruct`, `palmyra-med`, `palmyra-fin`; streaming and tool calling; `pip install synapsekit[openai]`
- **`HTMLTextSplitter`** — split HTML documents on block-level tags (h1–h6, p, div, section, article, li, blockquote, pre); strips tags to plain text; falls back to `RecursiveCharacterTextSplitter` for long sections; stdlib `html.parser` only
- **`GCSLoader`** — load files from Google Cloud Storage buckets as Documents; service account auth (file path or dict) or default credentials; prefix filtering, `max_files` limit, binary file handling; sync `load()` and async `aload()`; `pip install synapsekit[gcs]`
- **`SQLLoader`** — load rows from any SQLAlchemy-supported database (PostgreSQL, MySQL, SQLite, etc.) as Documents; configurable text/metadata columns; full SQL query support; sync `load()` and async `aload()`; `pip install synapsekit[sql]`
- **`GitHubLoader`** — load README, issues, pull requests, or repository files from GitHub via the REST API; retry with exponential back-off for rate limits and 5xx; optional token auth for higher rate limits; path filtering and limit for files; uses existing `httpx` dep; sync `load_sync()` and async `load()`

**Stats:** 1715 tests · 30 LLM providers · 46 tools · 29 loaders · 9 text splitters · 9 vector store backends

---

12 changes: 6 additions & 6 deletions docs/intro.md
sidebar_position: 1

# Introduction

**SynapseKit** is an async-native Python framework for building LLM applications — RAG pipelines, tool-using agents, and graph workflows. Streaming-first, transparent API, 2 hard deps. 30 providers · 46 tools · 29 loaders · 9 vector stores.

It is designed from the ground up to be **async-native** and **streaming-first**. Every public API is `async`. Streaming tokens is the default, not an opt-in. There are no hidden chains, no magic callbacks, no global state.

Full retrieval-augmented generation with chunking, embedding, vector search, BM2

→ [RAG Pipeline docs](/docs/rag/pipeline)

### 30 LLM providers

OpenAI, Anthropic, Ollama, Cohere, Mistral, Gemini, AWS Bedrock, Azure OpenAI, Groq, DeepSeek, OpenRouter, Together, Fireworks, Perplexity, Cerebras, Vertex AI, Moonshot, Zhipu, Cloudflare, AI21 Labs, Databricks, Baidu ERNIE, llama.cpp, Minimax, Aleph Alpha, Hugging Face, SambaNova, xAI (Grok), NovitaAI, Writer (Palmyra) — all behind `BaseLLM`. Auto-detected from the model name.

→ [LLM Provider docs](/docs/llms/overview)

InMemoryVectorStore (built-in, `.npz` persistence), ChromaDB, FAISS, Qdrant, Pin

→ [Vector store docs](/docs/rag/vector-stores)

### 29 document loaders

`TextLoader`, `StringLoader`, `PDFLoader`, `HTMLLoader`, `CSVLoader`, `JSONLoader`, `YAMLLoader`, `XMLLoader`, `DiscordLoader`, `SlackLoader`, `NotionLoader`, `GoogleDriveLoader`, `DirectoryLoader`, `WebLoader`, `ExcelLoader`, `PowerPointLoader`, `DocxLoader`, `MarkdownLoader`, `AudioLoader`, `VideoLoader`, `WikipediaLoader`, `ArXivLoader`, `EmailLoader`, `ImageLoader`, `ConfluenceLoader`, `RSSLoader`, `GCSLoader`, `SQLLoader`, `GitHubLoader`.

→ [Loader docs](/docs/rag/loaders)

`ReActAgent` — Thought → Action → Observation loop, works with any LLM.
`FunctionCallingAgent` — native `tool_calls` / `tool_use` for OpenAI, Anthropic, Gemini, and Mistral.
`AgentExecutor` — unified runner, picks the right agent from config.
46 built-in tools: Calculator, PythonREPL, FileRead, FileWrite, FileList, WebSearch, DuckDuckGoSearch, SQL, HTTP, GraphQL, DateTime, Regex, JSONQuery, HumanInput, Wikipedia, Summarization, SentimentAnalysis, Translation, WebScraper, Shell, SQLSchemaInspection, PDFReader, ArxivSearch, TavilySearch, Email, GitHubAPI, PubMedSearch, VectorSearch, YouTubeSearch, Slack, Notion, Jira, BraveSearch, APIBuilder, GoogleCalendar, AWSLambda, ImageAnalysis, TextToSpeech, SpeechToText, BingSearch, WolframAlpha, GoogleSearch, Twilio, NewsTool, WeatherTool, StripeTool, LinearTool.

→ [Agent docs](/docs/agents/overview)

97 changes: 97 additions & 0 deletions docs/rag/loaders.md
Each feed entry becomes one `Document`. Metadata fields (`title`, `published`, `

---

## GCSLoader

Load files from a Google Cloud Storage bucket as Documents. Install with `pip install synapsekit[gcs]`.

```python
from synapsekit import GCSLoader

loader = GCSLoader(
bucket_name="my-bucket",
prefix="documents/",
credentials_path="service-account.json",
max_files=100,
)

docs = await loader.aload()
```

| Parameter | Type | Description |
|---|---|---|
| `bucket_name` | `str` | GCS bucket name (required) |
| `prefix` | `str \| None` | Optional prefix filter (e.g. `"documents/"`) |
| `credentials_path` | `str \| None` | Path to a service account JSON file |
| `credentials_dict` | `dict \| None` | Service account credentials as a dict |
| `max_files` | `int \| None` | Maximum number of files to load |

If neither `credentials_path` nor `credentials_dict` is provided, default application credentials are used. Binary files are loaded with a placeholder string as their text, and their content type is recorded in metadata.
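The decode-or-placeholder behaviour can be illustrated with a small helper; the placeholder wording below is an assumption, not GCSLoader's exact output:

```python
def blob_to_text(data: bytes, content_type: str) -> tuple[str, dict]:
    """Decode a downloaded blob as UTF-8, falling back to a placeholder
    for binary data (illustrative sketch of the GCSLoader behaviour)."""
    metadata = {"content_type": content_type}
    try:
        return data.decode("utf-8"), metadata
    except UnicodeDecodeError:
        # Keep a placeholder so the Document still exists and is traceable.
        return f"[binary content: {content_type}]", metadata
```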

---

## SQLLoader

Load rows from any SQLAlchemy-supported database (PostgreSQL, MySQL, SQLite, etc.) as Documents. Install with `pip install synapsekit[sql]`.

```python
from synapsekit import SQLLoader

loader = SQLLoader(
connection_string="postgresql://user:pass@localhost/db",
query="SELECT id, title, body, author FROM articles WHERE published = true",
text_columns=["title", "body"],
metadata_columns=["id", "author"],
)

docs = await loader.aload()
```

| Parameter | Type | Description |
|---|---|---|
| `connection_string` | `str` | SQLAlchemy database URL (required) |
| `query` | `str` | SQL query to execute (required) |
| `text_columns` | `list[str] \| None` | Columns concatenated into the document text. Defaults to all columns. |
| `metadata_columns` | `list[str] \| None` | Columns included in metadata. Defaults to all columns. |

Each Document gets `metadata["source"] = "sql"` and `metadata["row_index"]` automatically.
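The column handling amounts to a plain row-to-document mapping, which can be sketched against an in-memory SQLite table via stdlib `sqlite3`. The `rows_to_documents` helper is illustrative, not SQLLoader's actual code:

```python
import sqlite3

def rows_to_documents(rows, columns, text_columns=None, metadata_columns=None):
    """Concatenate text columns into the document text, collect metadata
    columns, and tag each row (sketch of SQLLoader's column handling)."""
    text_cols = text_columns or columns      # default: all columns
    meta_cols = metadata_columns or columns  # default: all columns
    docs = []
    for i, row in enumerate(rows):
        record = dict(zip(columns, row))
        text = "\n".join(str(record[c]) for c in text_cols)
        metadata = {c: record[c] for c in meta_cols}
        metadata.update(source="sql", row_index=i)
        docs.append({"text": text, "metadata": metadata})
    return docs

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER, title TEXT, body TEXT)")
conn.execute("INSERT INTO articles VALUES (1, 'Hello', 'World')")
cur = conn.execute("SELECT id, title, body FROM articles")
columns = [d[0] for d in cur.description]
docs = rows_to_documents(cur.fetchall(), columns,
                         text_columns=["title", "body"],
                         metadata_columns=["id"])
```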

---

## GitHubLoader

Load README, issues, pull requests, or repository files from a GitHub repository via the REST API. Uses the existing `httpx` dependency — no new install needed if you already have `synapsekit[web]`.

```python
from synapsekit import GitHubLoader

# README
loader = GitHubLoader(repo="SynapseKit/SynapseKit", content_type="readme")

# Issues (filters out PRs automatically)
loader = GitHubLoader(repo="SynapseKit/SynapseKit", content_type="issues", limit=20)

# Pull requests
loader = GitHubLoader(repo="SynapseKit/SynapseKit", content_type="prs", limit=10)

# Repository files (recursive Git Trees API)
loader = GitHubLoader(
repo="SynapseKit/SynapseKit",
content_type="files",
path="src/synapsekit/llm/",
limit=50,
token="ghp_...", # optional but recommended for higher rate limits
)

docs = await loader.load()
```

| Parameter | Type | Description |
|---|---|---|
| `repo` | `str` | Repository in `owner/repo` format (required) |
| `content_type` | `"readme" \| "issues" \| "prs" \| "files"` | What to load. Defaults to `"readme"`. |
| `token` | `str \| None` | GitHub token for higher rate limits |
| `path` | `str \| None` | Path prefix filter (only for `files`) |
| `limit` | `int \| None` | Maximum number of items to load |

Includes retry with exponential back-off for rate limits (HTTP 429) and 5xx errors.
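The retry behaviour follows the standard exponential back-off pattern. A sketch under assumed parameters (the retry count, base delay, and jitter inside GitHubLoader are internal and may differ):

```python
import random
import time

RETRYABLE = {429, 500, 502, 503, 504}  # rate limits and server errors

def fetch_with_backoff(fetch, max_retries=4, base=1.0, cap=30.0):
    """Retry `fetch()` on retryable HTTP statuses with exponential back-off.

    `fetch` is any zero-argument callable returning (status_code, body).
    Illustrative sketch only; parameter values are assumptions.
    """
    for attempt in range(max_retries + 1):
        status, body = fetch()
        if status not in RETRYABLE or attempt == max_retries:
            return status, body
        # Double the delay each attempt, capped, with a little jitter.
        delay = min(cap, base * 2 ** attempt)
        time.sleep(delay + random.uniform(0, delay * 0.1))
```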

---

## Loading into the RAG facade

All loaders return `List[Document]`, which you can pass directly to `add_documents()`: