CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Commands

# Install required tools (one-time)
go install gotest.tools/gotestsum@latest
go install github.com/swaggo/swag/cmd/swag@latest
# Atlas CLI: https://atlasgo.io/docs

# Run locally (auto-applies migrations on startup)
go run ./cmd/api

# Run tests
make test                          # pretty output via gotestsum
make test-verbose
go test ./internal/modules/ai/...  # single package
go test -run TestName ./internal/modules/router/application/...  # single test

# Build Lambda binaries
make build-ApiFunction
make build-WorkerFunction

# Regenerate Swagger docs (after changing handlers or DTOs)
make swagger

# Database migrations (Atlas)
make migrate-apply                 # apply pending migrations locally
make migrate-diff name=add_col     # generate new migration from GORM models
make migrate-status                # show applied vs pending
make migrate-hash                  # rehash after manual edits

# Production migrations (requires DATABASE_URL + DATABASE_DEV_URL env vars)
make migrate-apply-prod
make migrate-diff-prod name=add_col
make migrate-status-prod

# Seed dev database
make seed-inference-logs           # insert 200 synthetic inference_log rows over 30 days (idempotent)

Default local port is 8090 (not 8080 — see config.go).

API at http://localhost:8090 · Swagger UI at http://localhost:8090/swagger/index.html · Health check at GET /healthz.

Config loads .env then .env.local (override) at startup via godotenv; both files are silently ignored in production.

Swagger annotations

Swagger docs are generated by swag from godoc comments directly above each handler function. The global API metadata lives in cmd/api/main.go. After adding or changing any handler or DTO, run make swagger to regenerate docs/.

Global metadata (`cmd/api/main.go`)

// @title AI Proxy API
// @version 1.0
// @description ...
// @BasePath /
// @securityDefinitions.apikey BearerAuth
// @in header
// @name Authorization

BearerAuth is the only security scheme. It covers both session tokens (JWT) and API keys — the middleware decides which it accepts, not the swagger definition.

Per-handler annotation shape

Every handler follows this exact structure:

// FunctionName godoc
// @Summary     One-line summary (shown in the Swagger UI list)
// @Description Longer description — omit if the summary is already clear
// @Tags        hyperstrate
// @Tags        <feature-tag>     ← one of: auth | catalog | models | routers | jobs | conversations | observability
// @Accept      json              ← only on POST/PUT/PATCH that read a body
// @Produce     json              ← always present unless the response is 204
// @Param       <name>  <in>  <type>  <required>  "<description>"
// @Success     <code>  {object|array}  <ResponseType>
// @Failure     <code>  {object}        ErrorResponse
// @Security    BearerAuth             ← omit only on public endpoints (no auth required)
// @Router      /path/{param} [method]

Always use two @Tags lines: hyperstrate (groups all routes together in the UI) and the feature-specific tag.

`@Param` syntax

Location	Example
Path	`// @Param id path string true "Router ID"`
Query	`// @Param page query int false "Page number (default 1)"`
Query	`// @Param perPage query int false "Items per page (default 30, max 500)"`
Body	`// @Param body body application.CreateFooInput true "Foo input"`

`@Success` / `@Failure` response types

Pattern	When to use
`{object} application.FooResponse`	single JSON object
`{array} domain.FooEntity`	bare JSON array (not paginated)
`{object} pagination.Paginated[application.FooResponse]`	paginated list — use `{object}` not `{array}`
`// @Success 204`	no body (DELETE)
`{object} ErrorResponse`	all `@Failure` lines

Auth security by route group

Each module's module.go registers routes in groups with different middleware. Use @Security BearerAuth accordingly:

Group registered in	Middleware	`@Security`
`RegisterPublicRoutes`	none	omit `@Security`
`RegisterSessionRoutes`	`RequireSession` — valid JWT, any role	`// @Security BearerAuth`
`RegisterAdminRoutes`	`RequireAdmin` — valid JWT + admin role	`// @Security BearerAuth`
`RegisterInferRoutes`	`InferAuth` — API key or session JWT	`// @Security BearerAuth`

Complete examples

Public endpoint (no auth):

// GetSetupStatus godoc
// @Summary     Check if initial setup is required
// @Tags        hyperstrate
// @Tags        auth
// @Produce     json
// @Success     200  {object}  application.SetupStatusResponse
// @Router      /auth/setup/status [get]

Admin endpoint with body + path param:

// UpdateOrganization godoc
// @Summary     Update an organisation
// @Tags        hyperstrate
// @Tags        auth
// @Accept      json
// @Produce     json
// @Param       orgId  path  string                              true  "Org ID"
// @Param       body   body  application.UpdateOrganizationInput true  "Fields to update"
// @Success     200    {object}  application.OrganizationResponse
// @Failure     400    {object}  ErrorResponse
// @Failure     404    {object}  ErrorResponse
// @Security    BearerAuth
// @Router      /auth/organizations/{orgId} [patch]

Paginated list with optional query filter:

// ListAPIKeys godoc
// @Summary     List API keys
// @Tags        hyperstrate
// @Tags        auth
// @Produce     json
// @Param       routerId  query  string  false  "Filter by router ID"
// @Param       page      query  int     false  "Page number (default 1)"
// @Param       perPage   query  int     false  "Items per page (default 30, max 500)"
// @Success     200  {object}  pagination.Paginated[application.APIKeyResponse]
// @Security    BearerAuth
// @Router      /auth/api-keys [get]

Delete (204 no body):

// DeleteOrganization godoc
// @Summary     Delete an organisation
// @Tags        hyperstrate
// @Tags        auth
// @Param       orgId  path  string  true  "Org ID"
// @Success     204
// @Failure     404  {object}  ErrorResponse
// @Security    BearerAuth
// @Router      /auth/organizations/{orgId} [delete]

Architecture

Go 1.25, Gin, Fx (dependency injection), GORM, Atlas migrations, AWS Lambda + SQS.

Module structure

Every feature module lives under internal/modules/<name>/ and follows the same layout:

domain/             entities, repository interfaces, domain errors
application/        use-case service, DTOs, event types
infrastructure/
  persistence/      GORM repositories implementing domain interfaces
  proxy/ or vault/  external integrations
interfaces/http/    Gin handlers
module.go           Fx wiring: provides dependencies, invokes route registration

internal/app/app.go composes all modules into three Fx app factories:

NewHTTPApp() — local dev server
NewLambdaApp() — API Gateway Lambda handler
NewWorkerApp() — SQS worker Lambda (minimal: no HTTP, no router, no auth)

Modules

ai — model catalog, registrations, inference, async jobs, conversations, MCP servers.

domain/catalog.go: static map of all supported models baked into the binary. To add a model, add an entry here; no DB change is needed.
A registration (domain/model.go) links a catalog key to a DB row and API key config.
Inference dispatches via application.JobDispatcher: goroutine locally, SQS when SQS_QUEUE_URL is set (selected at Fx startup in module.go:newJobDispatcher).
Conversations (domain/conversation.go) — multi-turn chat sessions stored in the DB; CRUD at /ai/conversations, messages at /ai/conversations/:id/messages.
MCP servers (domain/mcp_server.go) — registered external tool servers used by the router mcp_tools feature; CRUD at /ai/mcp/servers.
Routes split into two groups: /ai admin (session token + admin role) and /ai infer (API key or session).
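
The dispatcher choice described above (goroutine locally, SQS when SQS_QUEUE_URL is set) can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the Job type, the method set of JobDispatcher, and the queueDispatcher stand-in, which only records jobs and never calls AWS. The real newJobDispatcher in module.go may look quite different.

```go
package main

import (
	"fmt"
	"sync"
)

// Job is a hypothetical unit of inference work.
type Job struct{ ID string }

// JobDispatcher plays the role of application.JobDispatcher; this method set is assumed.
type JobDispatcher interface {
	Dispatch(job Job) error
	Name() string
}

// goroutineDispatcher runs the handler in-process, as in local development.
type goroutineDispatcher struct {
	handle func(Job)
	wg     sync.WaitGroup
}

func (d *goroutineDispatcher) Dispatch(job Job) error {
	d.wg.Add(1)
	go func() {
		defer d.wg.Done()
		d.handle(job)
	}()
	return nil
}

func (d *goroutineDispatcher) Name() string { return "goroutine" }

// queueDispatcher stands in for an SQS-backed dispatcher. It only records
// what would be enqueued; no AWS call is made in this sketch.
type queueDispatcher struct {
	queueURL string
	sent     []Job
}

func (d *queueDispatcher) Dispatch(job Job) error {
	d.sent = append(d.sent, job)
	return nil
}

func (d *queueDispatcher) Name() string { return "queue" }

// newJobDispatcher picks the implementation once, from the queue URL.
func newJobDispatcher(queueURL string, handle func(Job)) JobDispatcher {
	if queueURL != "" {
		return &queueDispatcher{queueURL: queueURL}
	}
	return &goroutineDispatcher{handle: handle}
}

func main() {
	local := newJobDispatcher("", func(j Job) { fmt.Println("handled", j.ID) })
	_ = local.Dispatch(Job{ID: "job-1"})
	local.(*goroutineDispatcher).wg.Wait()

	remote := newJobDispatcher("https://sqs.example/queue", nil)
	fmt.Println(local.Name(), remote.Name())
}
```

Because the choice is made once at construction time, the service that calls Dispatch never needs to know which backend is active.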

router — named routers that proxy inference through a configurable pipeline.

A router has targets (model registrations), features (pipeline stages), interceptors, and a 1:1 RouterConfiguration row.
RouterConfiguration fields: WebhookURL (event notifications), PromptID (linked prompt template), StorePayloads (persist raw request+response to inference_payloads).
application/pipeline.go runs the full request pipeline: rate limit → budget → cache → field transforms → interceptors → target selection → inference → budget accounting → cache store. Additional stages (hedging, quality gate, MCP tool dispatch, semantic memory) activate when the corresponding features are enabled.
Routing strategies: round_robin, weighted, percentage, failover, random, latency_based.
Feature types: response_cache, semantic_cache, retry, fallback, token_optimization, context_trimming, rate_limit, budget, mcp_tools, health_check, prompt_caching, structured_output, request_coalescing, hedging, quality_gate, context_compression, semantic_memory, cost_aware_routing, response_prefetch, response_fingerprinting.
Interceptor types: semantic_classifier, content_filter, pii_detector, prompt_guard, ab_test, prompt_shield, team_budget.
Semantic features (embedding-based cache, classifier, semantic memory) degrade gracefully when no EmbeddingProvider is wired; currently a noopEmbedder is provided.
Routes split into three groups: RegisterCRUDRoutes (admin session required — CRUD for routers, targets, features, interceptors, evaluations), RegisterInferRoutes (InferAuth — /router/:id/infer, /router/:id/v1/chat/completions, /router/:id/v1/messages), and RegisterProxyRoutes (InferAuth — /proxy/router/:id/*path, usable as an OpenAI or Anthropic SDK baseURL).
application/service_eval.go manages evaluation sets: named collections of test cases with exact, contains, or llm scoring, run as scored RouterEvaluationRun records.
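
The stage ordering described for application/pipeline.go can be pictured as a list of stages that run before inference, where any stage may reject the request or answer it early (a cache hit, for example). The sketch below is an assumption-heavy illustration: Request, Response, Stage, and runPipeline are hypothetical names, and post-inference stages such as budget accounting and cache store are left out.

```go
package main

import (
	"errors"
	"fmt"
)

// Request and Response are simplified stand-ins for the router's real types.
type Request struct{ Prompt string }

type Response struct {
	Text      string
	FromCache bool
}

// Stage inspects the request and either lets it continue (nil, nil),
// answers it early (non-nil response), or rejects it (non-nil error).
type Stage struct {
	Name string
	Run  func(req *Request) (*Response, error)
}

var errRateLimited = errors.New("rate limited")

// runPipeline executes the stages in order and calls infer only when no
// stage rejected or answered the request.
func runPipeline(stages []Stage, infer func(*Request) (*Response, error), req *Request) (*Response, error) {
	for _, s := range stages {
		resp, err := s.Run(req)
		if err != nil {
			return nil, fmt.Errorf("%s: %w", s.Name, err)
		}
		if resp != nil {
			return resp, nil
		}
	}
	return infer(req)
}

func main() {
	cache := map[string]string{"hello": "cached hi"}
	stages := []Stage{
		{Name: "rate_limit", Run: func(*Request) (*Response, error) { return nil, nil }},
		{Name: "cache", Run: func(r *Request) (*Response, error) {
			if text, ok := cache[r.Prompt]; ok {
				return &Response{Text: text, FromCache: true}, nil
			}
			return nil, nil
		}},
	}
	infer := func(r *Request) (*Response, error) {
		return &Response{Text: "model answer to " + r.Prompt}, nil
	}

	hit, _ := runPipeline(stages, infer, &Request{Prompt: "hello"})
	miss, _ := runPipeline(stages, infer, &Request{Prompt: "other"})
	fmt.Println(hit.FromCache, miss.FromCache) // true false
}
```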

prompts — named, versioned system-prompt templates.

A Prompt has {{variable}} placeholders extracted on save and stored in a variables JSON column.
Every save creates an immutable PromptVersion snapshot; full version history with restore is supported.
Prompts can be attached to a router via router.PromptID; the pipeline loads and interpolates the prompt before inference.
All routes require admin session (RequireAdmin middleware).
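
To make the {{variable}} handling concrete, here is a minimal sketch of placeholder extraction and interpolation using only the standard library. It is not the code in internal/shared/template: the accepted character set, the map[string]string argument type, and the choice to leave unknown placeholders untouched are all assumptions of this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholder matches {{name}} with optional inner whitespace. The accepted
// character set is an assumption of this sketch.
var placeholder = regexp.MustCompile(`\{\{\s*([A-Za-z0-9_]+)\s*\}\}`)

// ExtractVariables returns the unique placeholder names in first-seen order.
func ExtractVariables(text string) []string {
	seen := map[string]bool{}
	names := []string{}
	for _, m := range placeholder.FindAllStringSubmatch(text, -1) {
		if !seen[m[1]] {
			seen[m[1]] = true
			names = append(names, m[1])
		}
	}
	return names
}

// Interpolate replaces each placeholder that has a value in vars and leaves
// unknown placeholders untouched.
func Interpolate(text string, vars map[string]string) string {
	return placeholder.ReplaceAllStringFunc(text, func(match string) string {
		name := placeholder.FindStringSubmatch(match)[1]
		if v, ok := vars[name]; ok {
			return v
		}
		return match
	})
}

func main() {
	tpl := "You are {{role}}. Answer in {{language}}. Stay in {{role}} mode."
	fmt.Println(ExtractVariables(tpl))
	fmt.Println(Interpolate(tpl, map[string]string{"role": "a tutor", "language": "French"}))
}
```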

auth — organizations, users, teams, API keys, virtual keys, OIDC login.

Two middleware types used across all modules: RequireAdmin(sessionValidator) and InferAuth(keyValidator) (accepts API key or session).
JWT secret comes from JWT_SECRET env var; there's a hardcoded insecure fallback for dev.
vault.Provider is a NoopProvider by default; swap it to store API keys in a real vault.
VirtualKey supports optional spending/request budgets with ResetPeriod (daily, weekly, monthly) and per-key RateLimitRPS (token-bucket, in-process, not persisted).
Team tracks aggregate spending (UsedRequests, UsedCostUSD) with optional MaxRequests/MaxCostUSD caps.
RouterTeamAccess rows restrict a router to specific teams; when no rows exist the router is open to all authenticated callers.
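
The per-key RateLimitRPS limiter is described above as an in-process token bucket. The following is a generic sketch of that technique, not the repository's implementation: the type name, the burst capacity (set equal to the rate, with a floor of one token), and the injected clock are assumptions chosen to keep the example deterministic.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// epoch is a fixed start instant so the demo does not depend on wall-clock time.
var epoch = time.Unix(0, 0)

// tokenBucket allows bursts up to capacity and refills at rate tokens per second.
type tokenBucket struct {
	mu       sync.Mutex
	rate     float64
	capacity float64
	tokens   float64
	last     time.Time
}

// newTokenBucket starts with a full bucket. Capacity equals the rate
// (minimum one token), which is an assumption of this sketch.
func newTokenBucket(rps float64, now time.Time) *tokenBucket {
	capacity := rps
	if capacity < 1 {
		capacity = 1
	}
	return &tokenBucket{rate: rps, capacity: capacity, tokens: capacity, last: now}
}

// Allow refills based on elapsed time, then spends one token if available.
func (b *tokenBucket) Allow(now time.Time) bool {
	b.mu.Lock()
	defer b.mu.Unlock()

	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * b.rate
		if b.tokens > b.capacity {
			b.tokens = b.capacity
		}
		b.last = now
	}
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

func main() {
	b := newTokenBucket(2, epoch)
	fmt.Println(b.Allow(epoch), b.Allow(epoch), b.Allow(epoch))
	fmt.Println(b.Allow(epoch.Add(500 * time.Millisecond)))
}
```

Since the state lives in process memory, each Lambda instance or server process would keep its own bucket, which matches the "not persisted" note above.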

observability — inference logs, health monitoring.

observability/module.go registers listeners on ai/application.InferenceEventBus and router/application.RouterInferenceEventBus to persist inference logs. The observability module owns this wiring — emitting modules know nothing about it.

Cross-module communication: event buses

When one module needs to react to something that happened in another module, use an event bus — never a direct interface dependency between modules.

Pattern (see ai/application/events.go for the canonical example):

The emitting module defines the event type and a typed bus in its application/events.go:

type ThingHappenedEvent struct { ... }
type ThingHappenedListener func(e ThingHappenedEvent)
type ThingEventBus struct { listeners []ThingHappenedListener }
func NewThingEventBus() *ThingEventBus { ... }
func (b *ThingEventBus) OnHappened(l ThingHappenedListener) { ... }
func (b *ThingEventBus) Emit(e ThingHappenedEvent) { ... }

The emitting module provides the bus via fx.Provide(application.NewThingEventBus) in its module.go and calls bus.Emit(...) at the right moment in its service.

The consuming module registers a listener in its own module.go via fx.Invoke:

func registerThingListeners(bus *emitterApp.ThingEventBus, svc application.Service) {
    bus.OnHappened(func(e emitterApp.ThingHappenedEvent) { svc.Handle(e) })
}

Rules:

Emitting modules must not import consuming modules — the bus is the only coupling.
The consuming module owns the listener registration, not the emitter.
Use context.Context in listener signatures only when the listener needs to propagate cancellation. Fire-and-forget listeners (e.g. observability) take the event value directly and must not block.
Buses run listeners synchronously in registration order. If a listener can be slow, have it spawn a goroutine internally.
Existing buses: ai/application.ModelEventBus (model lifecycle), ai/application.InferenceEventBus (direct inference calls), ai/application.MCPServerEventBus (MCP server lifecycle), router/application.RouterInferenceEventBus (routed inference calls), router/application.RouterTargetEventBus (target deletion), prompts/application.PromptEventBus (prompt lifecycle).
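
Filling in the elided bodies of the bus shape shown earlier gives a small runnable sketch. The ID field on the event and the lack of locking are assumptions of this example (it presumes listeners are registered during startup wiring, before any Emit); see ai/application/events.go for the actual code.

```go
package main

import "fmt"

// ThingHappenedEvent is the payload; the ID field is illustrative.
type ThingHappenedEvent struct{ ID string }

type ThingHappenedListener func(e ThingHappenedEvent)

type ThingEventBus struct{ listeners []ThingHappenedListener }

func NewThingEventBus() *ThingEventBus { return &ThingEventBus{} }

// OnHappened registers a listener. This sketch assumes registration happens
// during startup, so it does not lock.
func (b *ThingEventBus) OnHappened(l ThingHappenedListener) {
	b.listeners = append(b.listeners, l)
}

// Emit calls every listener synchronously, in registration order.
func (b *ThingEventBus) Emit(e ThingHappenedEvent) {
	for _, l := range b.listeners {
		l(e)
	}
}

func main() {
	bus := NewThingEventBus()
	bus.OnHappened(func(e ThingHappenedEvent) { fmt.Println("first", e.ID) })
	bus.OnHappened(func(e ThingHappenedEvent) { fmt.Println("second", e.ID) })
	bus.Emit(ThingHappenedEvent{ID: "42"})
}
```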

Cross-module synchronous queries (adapter pattern)

For synchronous reads from another module (not fire-and-forget events), define a narrow interface in your own application/ layer and implement a private adapter in module.go that delegates to the other module's service. This keeps modules decoupled while satisfying compile-time dependencies through Fx.

Example from router/module.go: PromptLoader, HealthChecker, BudgetQuerier, MCPServerLoader, and ModelLookup are interfaces defined in router/application/. Each is fulfilled by a private adapter struct in router/module.go (e.g. promptLoaderAdapter, obsHealthAdapter) that wraps a service from the prompts, observability, or ai module.

The adapter is wired via fx.Provide in the consuming module's Module():

fx.Provide(newPromptLoaderAdapter)  // returns application.PromptLoader

func newPromptLoaderAdapter(svc promptsApp.Service) application.PromptLoader {
    return &promptLoaderAdapter{svc: svc}
}
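
A self-contained sketch of the same idea, without Fx, may help show what the adapter itself does. The PromptsService type, its Get method, and the LoadPromptBody method on PromptLoader are hypothetical; the real interfaces in router/application/ and the real prompts service will have different signatures.

```go
package main

import (
	"errors"
	"fmt"
)

// --- stands in for the prompts module (hypothetical shapes) ---

type Prompt struct{ ID, Body string }

var errPromptNotFound = errors.New("prompt not found")

type PromptsService struct{ byID map[string]Prompt }

func (s *PromptsService) Get(id string) (Prompt, error) {
	p, ok := s.byID[id]
	if !ok {
		return Prompt{}, errPromptNotFound
	}
	return p, nil
}

// --- stands in for router/application: a narrow interface owned by the consumer ---

type PromptLoader interface {
	LoadPromptBody(id string) (string, error)
}

// --- stands in for router/module.go: a private adapter that delegates ---

type promptLoaderAdapter struct{ svc *PromptsService }

func newPromptLoaderAdapter(svc *PromptsService) PromptLoader {
	return &promptLoaderAdapter{svc: svc}
}

func (a *promptLoaderAdapter) LoadPromptBody(id string) (string, error) {
	p, err := a.svc.Get(id)
	if err != nil {
		return "", fmt.Errorf("load prompt %q: %w", id, err)
	}
	return p.Body, nil
}

func main() {
	svc := &PromptsService{byID: map[string]Prompt{"p1": {ID: "p1", Body: "Be brief."}}}
	loader := newPromptLoaderAdapter(svc)

	body, err := loader.LoadPromptBody("p1")
	fmt.Println(body, err)

	_, err = loader.LoadPromptBody("missing")
	fmt.Println(errors.Is(err, errPromptNotFound))
}
```

The consumer only ever sees PromptLoader, so its tests can supply a stub without touching the prompts module.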

Input validation

Validation is the handler's responsibility; services trust their inputs.

Body DTOs (Gin binding tags)

Use binding tags on input structs — Gin's validator runs automatically on ShouldBindJSON. The handler calls respondBindError(c, err, &input) on failure.

Scenario	Tag
Required string	`binding:"required"`
Required string with length cap	`binding:"required,max=255"`
Optional URL	`binding:"omitempty,url"`
Required URL	`binding:"required,url"`
Percentage (0–100)	`binding:"min=0,max=100"`
Non-negative integer/float	`binding:"min=0"`
Enum field	validate in the handler after binding, or use the validator's built-in `oneof=val1 val2` tag

Specific gaps to fix when touching these DTOs:

AddConversationMessageInput.Role — add binding:"required,oneof=user assistant"
AddTargetInput.Percentage — add binding:"min=0,max=100"
CreateVirtualKeyInput.MaxRequests / MaxCostUSD and CreateTeamInput.MaxRequests / MaxCostUSD — add binding:"min=0"
SubmitJobRequest.CallbackURL — add binding:"omitempty,url"
All Name string fields — add max=255 (e.g. binding:"required,max=255")
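
Written out as struct tags, the gaps listed above might look like the sketch below. Only the field names and binding rules come from this document; the Go field types, the json tag names, and the bindingTag helper (used here just to print the tags with the standard library) are assumptions.

```go
package main

import (
	"fmt"
	"reflect"
)

// Field types and json names below are assumed for illustration.

type AddConversationMessageInput struct {
	Role string `json:"role" binding:"required,oneof=user assistant"`
}

type AddTargetInput struct {
	Percentage int `json:"percentage" binding:"min=0,max=100"`
}

type SubmitJobRequest struct {
	CallbackURL string `json:"callbackUrl" binding:"omitempty,url"`
}

type CreateTeamInput struct {
	Name        string  `json:"name" binding:"required,max=255"`
	MaxRequests int64   `json:"maxRequests" binding:"min=0"`
	MaxCostUSD  float64 `json:"maxCostUsd" binding:"min=0"`
}

// bindingTag returns the binding tag of a struct field, or "" if the field
// does not exist. v must be a struct value.
func bindingTag(v any, field string) string {
	f, ok := reflect.TypeOf(v).FieldByName(field)
	if !ok {
		return ""
	}
	return f.Tag.Get("binding")
}

func main() {
	fmt.Println(bindingTag(AddConversationMessageInput{}, "Role"))
	fmt.Println(bindingTag(AddTargetInput{}, "Percentage"))
	fmt.Println(bindingTag(SubmitJobRequest{}, "CallbackURL"))
	fmt.Println(bindingTag(CreateTeamInput{}, "Name"))
}
```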

Path parameters

No path param validation exists today. Every handler passes c.Param("id") straight to the service. When adding or editing a handler that reads a path param, validate it explicitly before calling the service:

id := c.Param("id")
if id == "" || len(id) > 100 {
    c.JSON(http.StatusBadRequest, gin.H{"error": "invalid id"})
    return
}

A shared validateParam(c *gin.Context, name string) (string, bool) helper in each module's handler file is the preferred pattern: it returns the value and true when the param is valid, and writes the 400 and returns false otherwise.
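
One way to keep that helper easy to unit-test is to separate the rule from Gin. The sketch below implements only the rule (non-empty, at most 100 bytes) with the standard library; paramError and maxParamLen are hypothetical names, and the Gin wrapper is shown as a comment rather than compiled code.

```go
package main

import (
	"fmt"
	"strings"
)

// maxParamLen mirrors the 100-character cap used in the snippet above.
const maxParamLen = 100

// paramError returns an error message for an invalid path param value,
// or "" when the value is acceptable.
func paramError(name, value string) string {
	if value == "" || len(value) > maxParamLen {
		return "invalid " + name
	}
	return ""
}

// In a handler file, the Gin wrapper could then look roughly like this:
//
//	func validateParam(c *gin.Context, name string) (string, bool) {
//		v := c.Param(name)
//		if msg := paramError(name, v); msg != "" {
//			c.JSON(http.StatusBadRequest, gin.H{"error": msg})
//			return "", false
//		}
//		return v, true
//	}

func main() {
	for _, v := range []string{"rtr_123", "", strings.Repeat("x", 101)} {
		fmt.Printf("len=%d error=%q\n", len(v), paramError("id", v))
	}
}
```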

Query parameters

Filter IDs passed as query params (e.g. routerId, status) must be at most 100 chars; reject or ignore longer values.
parseDateRangeOptional() in the observability handler silently returns nil on a bad date — it should return an explicit 400 instead.
Numeric query params with explicit range constraints (limit, offset) should clamp or reject out-of-range values consistently. Current pattern (observability limit): values ≤ 0 or > max fall back to the default. Apply this pattern to offset as well.
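
The fall-back-to-default pattern for numeric query params can be captured in one small function. This is a sketch under assumptions: intParamOrDefault is a hypothetical name, and the defaults and bounds in main are made up for the demo. Note that offset needs a lower bound of 0 rather than 1, since an offset of 0 is valid.

```go
package main

import (
	"fmt"
	"strconv"
)

// intParamOrDefault parses a numeric query value and falls back to def when
// the value is missing, not a number, below lo, or above hi.
func intParamOrDefault(raw string, def, lo, hi int) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n < lo || n > hi {
		return def
	}
	return n
}

func main() {
	// limit: values <= 0 or > max fall back to the default (bounds are illustrative).
	fmt.Println(intParamOrDefault("25", 50, 1, 500))
	fmt.Println(intParamOrDefault("0", 50, 1, 500))
	fmt.Println(intParamOrDefault("9999", 50, 1, 500))

	// offset: 0 is valid, negative or non-numeric values fall back to 0.
	fmt.Println(intParamOrDefault("-3", 0, 0, 100000))
	fmt.Println(intParamOrDefault("abc", 0, 0, 100000))
}
```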

Error responses

ErrorResponse{Error string, Fields map[string][]string} is the standard error envelope (defined locally in each module's handler file, not shared). Fields is populated only for binding validation errors via validation.BindingErrors from internal/shared/validation.

Each module's handler file has its own respondError(c, err) and respondBindError(c, err, &input) — intentionally duplicated because respondError switches on that module's own domain sentinel errors to map them to the correct HTTP status code.
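
The status-code mapping inside a module's respondError can be sketched with errors.Is against that module's sentinel errors. The sentinels, the statusFor helper, and the json tags on ErrorResponse are assumptions for illustration; each module's real handler file defines its own set.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Hypothetical domain sentinel errors for one module.
var (
	ErrNotFound = errors.New("not found")
	ErrConflict = errors.New("already exists")
)

// ErrorResponse follows the envelope described above; the json tags are assumed.
type ErrorResponse struct {
	Error  string              `json:"error"`
	Fields map[string][]string `json:"fields,omitempty"`
}

// statusFor maps a (possibly wrapped) domain error to an HTTP status code.
func statusFor(err error) int {
	switch {
	case errors.Is(err, ErrNotFound):
		return http.StatusNotFound
	case errors.Is(err, ErrConflict):
		return http.StatusConflict
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	wrapped := fmt.Errorf("router %q: %w", "r1", ErrNotFound)
	resp := ErrorResponse{Error: wrapped.Error()}
	fmt.Println(statusFor(wrapped), resp.Error)
	fmt.Println(statusFor(errors.New("boom")))
}
```

Using errors.Is rather than == means the mapping still works when a service wraps the sentinel with extra context.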

Shared utilities

internal/shared/ contains packages used across modules — do not import one module from another; use these instead.

dbtype — dialect-aware GORM column types for JSON columns. Use dbtype.JSONMap (map[string]any), dbtype.JSONStringMap (map[string]string), or dbtype.JSONStringSlice ([]string) on GORM struct fields that need JSON storage. They serialize as jsonb on PostgreSQL and text on SQLite automatically.
template — template.Interpolate(text, vars) replaces {{key}} placeholders; template.ExtractVariables(text) returns the unique set of placeholder names. Used by the prompts module; reuse it rather than reimplementing.
validation — validation.BindingErrors(err, input) converts a validator.ValidationErrors into a human-readable summary and a map[string][]string of field-level messages keyed by JSON field name.
pagination — pagination.ParseSlice(c) reads page/perPage query params; pagination.New(items, total, slice) wraps results in pagination.Paginated[T].
audit / webhook — global singletons set once at startup by the observability module. audit.Log(...) and webhook.Send(...) are no-ops until the logger/recorder is registered — they silently do nothing outside a full NewHTTPApp() / NewLambdaApp() context (e.g. in unit tests).
httpserver — httpserver.NewRouter(cfg) creates the shared Gin engine with CORS (origin from FRONTEND_URL), GET /healthz, Swagger UI (GET /swagger/*), and structured slog request logging (5xx→Error, 4xx→Warn, 2xx→Info, healthz suppressed). Provided by app.go — all module handlers attach their routes to this engine.
logger — logger.Init() configures a colored slog handler (via tint) as the global default. Colors are disabled automatically when stderr is not a TTY (CI, Lambda). Called once from cmd/api/main.go at startup.

Metrics endpoint

GET /metrics returns Prometheus text format. It is registered in internal/app/app.go by registerMetricsEndpoint and aggregates all metrics.Collector implementations. Currently only router pipeline metrics are collected (request counts + average latency per router).

Adding a new module

Create internal/modules/<name>/ with the standard layout above.
Export func Module() fx.Option in module.go.
Import and add <name>.Module() in internal/app/app.go.
Register routes inside your module.go via fx.Invoke.

Database migrations

Migrations live in internal/db/migrations/ with separate sqlite/ and postgres/ sub-directories. Atlas drives them; atlas.hcl defines local and production environments. All migration files are embedded into the binary at compile time via embed.FS — no external migration files are needed at runtime.

The app auto-applies pending migrations at startup for local dev — make migrate-apply is only needed if you want to apply without starting the server.

After changing any GORM struct that maps to a DB table, run make migrate-diff name=<description> to generate a new versioned SQL file, then run make migrate-hash if you edit it manually.

SQLite note: the DB layer forces WAL journal mode and SetMaxOpenConns(1) — concurrent writers will queue, not error.

`cmd/migrate`

Standalone binary that registers GORM models with Atlas so make migrate-diff can generate SQL from struct changes. Not invoked directly — Atlas calls it via the atlas.hcl data source.

Testing conventions

Tests operate at the application layer using hand-rolled stub repositories — no test database, no integration harness. Service tests live in package application_test (external); pipeline tests live in package application (internal, so they can reach unexported pipeline helpers).

Each module has an org_security_test.go file that verifies org isolation: stubs return ErrNotFound when the orgID doesn't match the resource owner, and tests confirm the service propagates orgID through every repo call. Add a case here whenever you add a service method that takes orgID.
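
The org-isolation check can be sketched as a stub repository that refuses mismatched orgIDs and records what it was called with. All types here (Router, RouterRepository, Service) are simplified, hypothetical stand-ins, and the real tests would live in a _test.go file using the testing package rather than in main.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrNotFound = errors.New("not found")

type Router struct{ ID, OrgID string }

type RouterRepository interface {
	Get(orgID, id string) (Router, error)
}

// stubRepo returns ErrNotFound when the orgID does not own the row and
// records every orgID it receives so a test can assert on propagation.
type stubRepo struct {
	rows       map[string]Router
	seenOrgIDs []string
}

func (r *stubRepo) Get(orgID, id string) (Router, error) {
	r.seenOrgIDs = append(r.seenOrgIDs, orgID)
	row, ok := r.rows[id]
	if !ok || row.OrgID != orgID {
		return Router{}, ErrNotFound
	}
	return row, nil
}

// Service is the unit under test; it must pass orgID through unchanged.
type Service struct{ repo RouterRepository }

func (s *Service) GetRouter(orgID, id string) (Router, error) {
	return s.repo.Get(orgID, id)
}

func main() {
	repo := &stubRepo{rows: map[string]Router{"r1": {ID: "r1", OrgID: "org-a"}}}
	svc := &Service{repo: repo}

	_, err := svc.GetRouter("org-b", "r1")
	fmt.Println("other org denied:", errors.Is(err, ErrNotFound))

	got, err := svc.GetRouter("org-a", "r1")
	fmt.Println("owner allowed:", got.ID, err)
	fmt.Println("orgIDs seen by repo:", repo.seenOrgIDs)
}
```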

Environment variables

Variable	Default	Notes
`PORT`	`8090`	Local HTTP port
`DATABASE_DSN`	SQLite file	Override to a PostgreSQL DSN in prod
`JWT_SECRET`	insecure default	Must be set in production
`ADMIN_EMAIL`	—	User with this email always gets admin role
`SQS_QUEUE_URL`	—	Set to use SQS dispatcher instead of goroutines
`FRONTEND_URL`	`http://localhost:8080`	OIDC redirect target and allowed CORS origin
`OIDC_JWKS_URL`	—	Required for OIDC token exchange
`OIDC_PROVIDERS`	—	Comma-separated list (e.g. `google,github`)
`CACHE_BACKEND`	`memory`	`memory` or `redis`
`CACHE_REDIS_ADDR`	—	Required when `CACHE_BACKEND=redis` (e.g. `localhost:6379`)
`CACHE_REDIS_PREFIX`	—	Optional key namespace for the Redis cache
`RATE_LIMIT_BACKEND`	`memory`	`memory` or `redis`
`APP_ENV`	`development`	Set to `production` to require `JWT_SECRET`
`LOG_RETENTION_DAYS`	`90`	Days before inference/audit logs are purged
`HEALTH_CHECK_INTERVAL_SECS`	`120`	Interval between provider health probes
