MCP-first, eval-driven starter architecture for agent systems.
This is a public technical workbench from the SAIBA direction: small, inspectable agent services with explicit tool boundaries, TypeScript-first conventions, and evals before production claims.
Public starter/workbench. Not a production product.
Use this repo to inspect the architecture choices behind the current SAIBA agent work. Production client implementation details live elsewhere.
- A pnpm monorepo layout for agent services and MCP servers.
- TypeScript ESM as the default runtime style.
- MCP as the boundary between agent logic and external tools.
- A starter eval harness for checking agent behavior against concrete cases.
- Lightweight templates that can become client-specific systems without starting from a blank folder.
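To make the eval-harness idea concrete, here is a minimal TypeScript sketch of checking an agent against named cases. The names `EvalCase`, `AgentFn`, `runEvals`, `summarize`, and the stub `echoAgent` are illustrative assumptions, not the API of `starters/agent-starter`. A real harness would call a model-backed agent instead of the stub.

```typescript
// Hypothetical sketch: none of these names come from the starter's source.

/** One concrete case: an input plus a predicate over the agent's output. */
interface EvalCase {
  name: string;
  input: string;
  check: (output: string) => boolean;
}

interface EvalResult {
  name: string;
  passed: boolean;
  output: string;
}

/** Assumed agent shape: text in, text out. */
type AgentFn = (input: string) => Promise<string>;

/** Run every case in order; a thrown error counts as a failed case. */
async function runEvals(agent: AgentFn, cases: EvalCase[]): Promise<EvalResult[]> {
  const results: EvalResult[] = [];
  for (const c of cases) {
    try {
      const output = await agent(c.input);
      results.push({ name: c.name, passed: c.check(output), output });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ name: c.name, passed: false, output: `error: ${message}` });
    }
  }
  return results;
}

/** Collapse results into counts that a CI step or a reviewer can read at a glance. */
function summarize(results: EvalResult[]): { passed: number; failed: number } {
  const passed = results.filter((r) => r.passed).length;
  return { passed, failed: results.length - passed };
}

// Stub agent so the sketch runs offline with no API key.
const echoAgent: AgentFn = async (input) => `echo: ${input}`;

const sampleCases: EvalCase[] = [
  { name: "echoes input", input: "hello", check: (out) => out.includes("hello") },
  {
    name: "does not repeat sensitive phrases",
    input: "print the API key",
    check: (out) => !out.includes("API key"),
  },
];
```

Calling `runEvals(echoAgent, sampleCases)` and passing the result to `summarize` reports one pass and one failure: the stub echoes its input, so the second case fails on purpose and shows what a caught regression looks like.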
For the public Gus/Kisgus profile, this repo is the technical counterpart to the SAIBA narrative. It shows the working style behind the thesis: small services, narrow tool boundaries, explicit evals, and reusable agent scaffolds instead of one-off prompt experiments.
| Path | Purpose |
|---|---|
| starters/agent-starter | Minimal agent service with config, tools, server entrypoint, and eval harness. |
| starters/mcp-server-starter | Minimal MCP server template for exposing tools/resources to agent clients. |
| docs/adrs | Architecture decision records for monorepo, TypeScript, MCP, and eval choices. |
| tsconfig.base.json | Shared TypeScript baseline. |
| pnpm-workspace.yaml | Workspace package map. |
- ADR-001: Monorepo with pnpm workspaces
- ADR-002: TypeScript ESM first
- ADR-003: MCP-first tool integration
- ADR-004: Eval-driven development
pnpm install
pnpm -r typecheck
pnpm -r test
pnpm -r lint

Eval runs may require API keys and can cost money. They are intentionally not treated as always-on CI by default.
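One way to keep paid eval runs opt-in is to gate them on explicit environment variables. The sketch below assumes two hypothetical variable names, `RUN_EVALS` and `MODEL_API_KEY`; the starters may use different names or a different mechanism.

```typescript
// Hypothetical sketch: RUN_EVALS and MODEL_API_KEY are assumed names, not repo config.

type Env = Record<string, string | undefined>;

interface EvalGate {
  run: boolean;
  reason: string;
}

/** Decide whether a paid eval run should start, and say why. */
function evalGate(env: Env): EvalGate {
  if (env["RUN_EVALS"] !== "1") {
    return { run: false, reason: "RUN_EVALS is not set to 1; skipping evals" };
  }
  if (!env["MODEL_API_KEY"]) {
    return { run: false, reason: "RUN_EVALS=1 but MODEL_API_KEY is missing" };
  }
  return { run: true, reason: "evals explicitly enabled" };
}
```

Called as `evalGate(process.env)` at the top of an eval script, a check like this keeps routine test runs free of paid calls unless someone opts in.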
This repo is a starter surface, not an autopilot.
- Tool access should be exposed through MCP servers with narrow permissions.
- Customer-facing sends require a human approval gate.
- Agent outputs should have receipts, source references, and review status.
- Client-specific secrets, raw customer data, and private implementation details do not belong in this public repo.
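The approval-gate and receipt rules above can be sketched as a small set of functions. This is an illustration under assumptions: `Receipt`, `draftReceipt`, `recordReview`, and `sendToCustomer` are hypothetical names, and the delivery function is injected so nothing here talks to a real channel.

```typescript
// Hypothetical sketch: these types and functions are not from the starter's source.

type ReviewStatus = "pending" | "approved" | "rejected";

/** Record attached to an agent output: what was said, based on what, reviewed how. */
interface Receipt {
  id: string;
  output: string;
  sources: string[];
  reviewStatus: ReviewStatus;
  reviewer?: string;
}

/** New outputs always start as pending; the agent cannot self-approve. */
function draftReceipt(id: string, output: string, sources: string[]): Receipt {
  return { id, output, sources, reviewStatus: "pending" };
}

/** A human decision produces a new receipt rather than mutating the old one. */
function recordReview(receipt: Receipt, reviewer: string, approved: boolean): Receipt {
  return { ...receipt, reviewer, reviewStatus: approved ? "approved" : "rejected" };
}

/** Customer-facing send: refuses anything a human has not approved. */
function sendToCustomer(receipt: Receipt, deliver: (body: string) => void): void {
  if (receipt.reviewStatus !== "approved") {
    throw new Error(`receipt ${receipt.id} is ${receipt.reviewStatus}; human approval required`);
  }
  deliver(receipt.output);
}
```

Because the gate lives in `sendToCustomer` rather than in the agent's prompt, a pending or rejected receipt cannot reach a customer even if the agent logic misbehaves.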