
Memory-Optimized Agent API Documentation

REST API for semantic memory management with LLM agents.

Quick Start

1. Run Migrations

python migrations/run_migration.py

2. Start the Server

python src/api/app.py

Or using uvicorn directly:

uvicorn src.api.app:app --reload --host 0.0.0.0 --port 8000

3. Access Documentation

With FastAPI's defaults, interactive API documentation is served once the app is running:

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc

Authentication

All endpoints (except /health) require API key authentication.

Include the API key in request headers:

X-API-Key: your-api-key-here

Default API key for development: dev-key-12345

Set custom API key via environment variable:

export API_KEY=your-secure-key

Endpoints

Health Check

GET /health

Check API and database connectivity status.

Response:

{
  "status": "healthy",
  "timestamp": "2025-01-15T10:30:00Z",
  "database": "connected"
}

Create Session

POST /sessions

Create a new chat session.

Request:

{
  "user_id": "user123",
  "session_name": "Technical Discussion"
}

Response:

{
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "user_id": "user123",
  "created_at": "2025-01-15T10:30:00Z",
  "message_count": 0
}

Get Session

GET /sessions/{session_id}

Get session details and message count.

Response:

{
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "user_id": "user123",
  "created_at": "2025-01-15T10:30:00Z",
  "message_count": 15
}

Delete Session

DELETE /sessions/{session_id}

Delete session and all associated memories.

Response:

{
  "message": "Session 550e8400-e29b-41d4-a716-446655440000 deleted successfully"
}

Chat

POST /chat

Send a message and receive an AI response grounded in semantically retrieved conversation memory.

Request:

{
  "message": "What did we discuss about Python?",
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "user_id": "user123"
}

Response:

{
  "response": "Based on our previous conversation, we discussed...",
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "relevant_context_count": 5
}

Search Memories

POST /search

Search for relevant memories using semantic similarity.

Request:

{
  "query": "machine learning models",
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "user_id": "user123",
  "limit": 10
}

Response:

{
  "memories": [
    {
      "id": "conversations_user123_msg_1",
      "key": "msg_1",
      "value": {
        "role": "human",
        "content": "Tell me about machine learning models",
        "session_id": "550e8400-e29b-41d4-a716-446655440000"
      },
      "timestamp": "2025-01-15T10:25:00Z",
      "score": 0.92
    }
  ],
  "count": 1
}
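Since each returned memory carries a similarity score, a client can post-filter for high-confidence matches. A minimal sketch over the sample response above (the 0.8 threshold is an arbitrary illustration, not part of the API):

```python
sample_response = {
    "memories": [
        {
            "id": "conversations_user123_msg_1",
            "key": "msg_1",
            "value": {"role": "human", "content": "Tell me about machine learning models"},
            "timestamp": "2025-01-15T10:25:00Z",
            "score": 0.92,
        },
    ],
    "count": 1,
}

def top_memories(response, min_score=0.8):
    """Keep only memories at or above the similarity threshold, best first."""
    hits = [m for m in response["memories"] if m["score"] >= min_score]
    return sorted(hits, key=lambda m: m["score"], reverse=True)
```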

Example Usage

cURL

# Health check
curl -X GET http://localhost:8000/health \
  -H "X-API-Key: dev-key-12345"

# Create session
curl -X POST http://localhost:8000/sessions \
  -H "Content-Type: application/json" \
  -H "X-API-Key: dev-key-12345" \
  -d '{
    "user_id": "user123",
    "session_name": "Test Session"
  }'

# Send chat message
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: dev-key-12345" \
  -d '{
    "message": "Hello, how are you?",
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "user_id": "user123"
  }'

Python

import requests

BASE_URL = "http://localhost:8000"
HEADERS = {
    "X-API-Key": "dev-key-12345",
    "Content-Type": "application/json"
}

# Create session
response = requests.post(
    f"{BASE_URL}/sessions",
    headers=HEADERS,
    json={"user_id": "user123", "session_name": "Test"}
)
session = response.json()
session_id = session["session_id"]

# Chat
response = requests.post(
    f"{BASE_URL}/chat",
    headers=HEADERS,
    json={
        "message": "What is semantic search?",
        "session_id": session_id,
        "user_id": "user123"
    }
)
chat_response = response.json()
print(chat_response["response"])
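The request pattern above repeats for every endpoint, so it is natural to wrap it in a small client class. This is an illustrative sketch, not shipped code; it assumes only the endpoints documented here:

```python
import requests

class MemoryAgentClient:
    """Thin wrapper around the documented endpoints (illustrative sketch)."""

    def __init__(self, base_url="http://localhost:8000", api_key="dev-key-12345"):
        self.base_url = base_url.rstrip("/")
        self.session = requests.Session()
        # Every request carries the API key and JSON content type.
        self.session.headers.update({
            "X-API-Key": api_key,
            "Content-Type": "application/json",
        })

    def _post(self, path, payload):
        response = self.session.post(f"{self.base_url}{path}", json=payload)
        response.raise_for_status()  # surface 4xx/5xx as exceptions
        return response.json()

    def create_session(self, user_id, session_name):
        return self._post("/sessions", {"user_id": user_id, "session_name": session_name})

    def chat(self, message, session_id, user_id):
        return self._post("/chat", {"message": message, "session_id": session_id, "user_id": user_id})

    def search(self, query, session_id, user_id, limit=10):
        return self._post("/search", {"query": query, "session_id": session_id,
                                      "user_id": user_id, "limit": limit})
```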

JavaScript/TypeScript

const BASE_URL = "http://localhost:8000";
const API_KEY = "dev-key-12345";

// Create session
const sessionResponse = await fetch(`${BASE_URL}/sessions`, {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "X-API-Key": API_KEY,
  },
  body: JSON.stringify({
    user_id: "user123",
    session_name: "Test Session",
  }),
});
const session = await sessionResponse.json();

// Chat
const chatResponse = await fetch(`${BASE_URL}/chat`, {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "X-API-Key": API_KEY,
  },
  body: JSON.stringify({
    message: "Tell me about AI",
    session_id: session.session_id,
    user_id: "user123",
  }),
});
const chat = await chatResponse.json();
console.log(chat.response);

Configuration

Environment Variables

# Database Configuration
export POSTGRES_HOST=localhost
export POSTGRES_PORT=5432
export POSTGRES_DB=memory_agent
export POSTGRES_USER=postgres
export POSTGRES_PASSWORD=postgres

# API Configuration
export API_KEY=your-secure-api-key

# Server Configuration
export HOST=0.0.0.0
export PORT=8000
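One way an application might fold these variables into a single PostgreSQL connection string, assuming the same development defaults shown above (the actual code in src/api/app.py may read them differently):

```python
import os

def database_dsn():
    """Build a PostgreSQL DSN from the environment, with dev defaults."""
    host = os.getenv("POSTGRES_HOST", "localhost")
    port = os.getenv("POSTGRES_PORT", "5432")
    db = os.getenv("POSTGRES_DB", "memory_agent")
    user = os.getenv("POSTGRES_USER", "postgres")
    password = os.getenv("POSTGRES_PASSWORD", "postgres")
    return f"postgresql://{user}:{password}@{host}:{port}/{db}"
```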

Error Handling

All endpoints return standard HTTP status codes:

  • 200: Success
  • 201: Created
  • 400: Bad Request (invalid input)
  • 401: Unauthorized (missing API key)
  • 403: Forbidden (invalid API key)
  • 404: Not Found
  • 500: Internal Server Error

Error Response Format:

{
  "detail": "Error description here"
}
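A client can fold the status codes and the detail field into a single error path. A minimal sketch (APIError is a hypothetical helper, not part of the API):

```python
class APIError(Exception):
    """Raised for non-2xx responses; carries the status and the 'detail' field."""

    def __init__(self, status_code, detail):
        super().__init__(f"{status_code}: {detail}")
        self.status_code = status_code
        self.detail = detail

def check_response(status_code, body):
    """Return the body on success, raise APIError otherwise."""
    if 200 <= status_code < 300:
        return body
    detail = body.get("detail", "Unknown error") if isinstance(body, dict) else str(body)
    raise APIError(status_code, detail)
```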

Rate Limiting

Not currently implemented. Consider adding rate limiting middleware for production:

  • slowapi
  • fastapi-limiter
  • Custom middleware
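As an illustration of the "custom middleware" option, a token bucket is a common core for per-client rate limiting. This standalone sketch omits the FastAPI wiring (in practice you would keep one bucket per API key and reject requests when allow() returns False):

```python
import time

class TokenBucket:
    """Token-bucket rate limiter: allows bursts up to `capacity`,
    refilling at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start full
        self.clock = clock        # injectable for testing
        self.last = clock()

    def allow(self):
        """Consume one token if available; return whether the request may proceed."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```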

Production Considerations

  1. Security:

    • Use strong API keys (environment-based)
    • Enable HTTPS/TLS
    • Configure CORS appropriately
    • Add rate limiting
  2. Performance:

    • Use connection pooling (pgbouncer)
    • Enable caching (Redis)
    • Add request validation
    • Monitor response times
  3. Monitoring:

    • Add logging middleware
    • Integrate with Prometheus/Grafana
    • Set up error tracking (Sentry)
    • Monitor database performance
  4. Scaling:

    • Deploy behind load balancer
    • Use container orchestration (Kubernetes)
    • Implement horizontal scaling
    • Add health checks for auto-scaling