A conversational AI agent that enables engineers to inspect and diagnose cloud infrastructure using plain English. Instead of navigating static dashboards, users ask natural language questions and the agent dynamically decides which metrics to query, executes the relevant tools, and synthesizes a diagnostic answer.
The agent is powered by a local LLM (via Ollama) with tool calling. Given a user query, it autonomously selects and calls the appropriate inspection tools — CPU usage, memory usage, error logs, cost breakdown — then composes a final answer based on the retrieved data. No hardcoded query logic; the LLM drives the reasoning.
```
User Query
    ↓
Streamlit Chat UI
    ↓
LLM Agent (Ollama — qwen2.5)
    ↓
Tool Dispatcher
    ↓
┌─────────────────────────────────┐
│ list_services()                 │
│ get_cpu_usage(service, hours)   │
│ get_memory_usage(service, hours)│
│ get_error_logs(service, limit)  │
│ get_cost_summary(days)          │
└─────────────────────────────────┘
    ↓
SQLite Metrics Database
    ↓
Synthesized Answer
```
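The dispatcher step above can be sketched as a plain name-to-callable mapping. This is an illustrative reconstruction, not the repo's actual `agent/agent.py`: the stub tools and the `run_tool_calls` helper are assumptions, and in the real agent the tool calls would come from the Ollama chat response (with each result fed back to the model as a tool message before it composes the final answer).

```python
# Illustrative sketch of the tool-dispatch step; names are assumptions,
# not the repo's actual implementation.

def list_services():
    """Stand-in tool: the real version would query the SQLite database."""
    return ["web-api", "payment-service", "auth-service"]

def get_cpu_usage(service, hours=24):
    """Stand-in tool: the real version would aggregate hourly CPU samples."""
    return {"service": service, "avg_cpu_pct": 41.5, "hours": hours}

# The dispatcher maps tool names emitted by the LLM to Python callables.
TOOLS = {
    "list_services": list_services,
    "get_cpu_usage": get_cpu_usage,
}

def run_tool_calls(tool_calls):
    """Execute each tool call the model requested and collect the results."""
    results = []
    for call in tool_calls:
        fn = TOOLS[call["name"]]
        results.append({"tool": call["name"], "result": fn(**call["arguments"])})
    return results

# Example: the model decided it needs CPU usage for web-api over 6 hours.
out = run_tool_calls(
    [{"name": "get_cpu_usage", "arguments": {"service": "web-api", "hours": 6}}]
)
```

Because the LLM only ever emits tool names and JSON arguments, keeping the dispatch table explicit makes it easy to audit exactly which functions the model can invoke.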
```
cloud-resource-agent/
├── data/
│   └── generate.py       # Generates synthetic cloud metrics (SQLite)
├── agent/
│   ├── tools.py          # Tool functions that query the database
│   └── agent.py          # LLM agent with agentic tool-calling loop
├── ui/
│   └── app.py            # Streamlit chat interface
├── requirements.txt
└── README.md
```
- Natural language interface — ask questions in plain English
- Agentic reasoning — the LLM decides which tools to call and in what order
- Multi-step queries — e.g. "Compare CPU and memory of ml-pipeline and database"
- Tool call transparency — every tool invoked is shown in an expandable panel
- Fully local — runs on Ollama, no external API or internet required
Synthetic cloud metrics are generated for 7 simulated services:
| Service | Type | Region |
|---|---|---|
| web-api | EC2 | us-east-1 |
| payment-service | EC2 | us-east-1 |
| auth-service | Lambda | us-west-2 |
| database | RDS | us-east-1 |
| cache | ElastiCache | us-east-1 |
| ml-pipeline | EC2 | eu-west-1 |
| notification-svc | Lambda | us-west-2 |
For each service, 30 days of data are generated: hourly CPU and memory samples, error log entries, and daily cost records.
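A minimal sketch of what the generation step could look like, assuming a simple `cpu_metrics` table (the table and column names are assumptions, not the actual schema in `data/generate.py`, and an in-memory database stands in for the file the real script writes):

```python
import random
import sqlite3
from datetime import datetime, timedelta

# Hypothetical schema: one row per service per hour.
conn = sqlite3.connect(":memory:")  # the real script would write a file
conn.execute("CREATE TABLE cpu_metrics (service TEXT, ts TEXT, cpu_pct REAL)")

services = ["web-api", "payment-service", "database"]
start = datetime(2024, 1, 1)
rows = [
    (svc, (start + timedelta(hours=h)).isoformat(), round(random.uniform(5, 95), 1))
    for svc in services
    for h in range(30 * 24)  # 30 days of hourly samples
]
conn.executemany("INSERT INTO cpu_metrics VALUES (?, ?, ?)", rows)

# A tool such as get_cpu_usage(service, hours) can then aggregate a window:
avg = conn.execute(
    "SELECT AVG(cpu_pct) FROM cpu_metrics WHERE service = ?", ("web-api",)
).fetchone()[0]
```

Keeping the metrics in SQLite means every tool reduces to a parameterized query, which keeps the agent's tool layer thin and easy to test.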
| Layer | Technology |
|---|---|
| LLM | qwen2.5 via Ollama (local) |
| Agent | Python with tool calling loop |
| Database | SQLite |
| UI | Streamlit |
- Python 3.8+
- Ollama installed and running
```bash
ollama pull qwen2.5
pip install -r requirements.txt
python data/generate.py
streamlit run ui/app.py
```

Open http://localhost:8501 in your browser.
- "List all running services"
- "Which service has the highest CPU usage?"
- "Show me recent errors from payment-service"
- "What is the most expensive service this month?"
- "Compare the CPU and memory usage of ml-pipeline and database"
- "Which service has the most errors?"
- The SQLite database is not committed — regenerate it with `python data/generate.py`
- Ollama must be running before starting the app (`ollama serve` if not auto-started)
- Tool call logs appear in the terminal for debugging response times