LLM-Based Agentic Workflow for Cloud Resource Analysis

A conversational AI agent that enables engineers to inspect and diagnose cloud infrastructure using plain English. Instead of navigating static dashboards, users ask natural language questions and the agent dynamically decides which metrics to query, executes the relevant tools, and synthesizes a diagnostic answer.

Overview

The agent is powered by a local LLM (via Ollama) with tool calling. Given a user query, it autonomously selects and calls the appropriate inspection tools — CPU usage, memory usage, error logs, cost breakdown — then composes a final answer based on the retrieved data. No hardcoded query logic; the LLM drives the reasoning.
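The loop described above can be sketched as follows. This is a minimal illustration only: `run_agent`, `call_llm`, and the message shapes are assumptions for the sketch, not the actual API of the project's agent.py (which talks to Ollama directly).

```python
# Illustrative agentic loop: ask the LLM, run any tools it requests,
# feed the results back, and repeat until it produces a final answer.
def run_agent(query, call_llm, tools):
    messages = [{"role": "user", "content": query}]
    while True:
        reply = call_llm(messages)
        if not reply.get("tool_calls"):
            return reply["content"]  # no more tools requested: final answer
        for call in reply["tool_calls"]:
            # Execute the requested tool and append its result to the history
            result = tools[call["name"]](**call["arguments"])
            messages.append(
                {"role": "tool", "name": call["name"], "content": str(result)}
            )
```

In the real agent the `call_llm` step is an Ollama chat request with the tool schemas attached; the loop structure is the same.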

Architecture

User Query
    ↓
Streamlit Chat UI
    ↓
LLM Agent (Ollama — qwen2.5)
    ↓
Tool Dispatcher
    ↓
┌─────────────────────────────────┐
│ list_services()                 │
│ get_cpu_usage(service, hours)   │
│ get_memory_usage(service, hours)│
│ get_error_logs(service, limit)  │
│ get_cost_summary(days)          │
└─────────────────────────────────┘
    ↓
SQLite Metrics Database
    ↓
Synthesized Answer
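The "Tool Dispatcher" box above can be as simple as a name-to-callable map. A hedged sketch (the stub implementations below are placeholders, not the project's actual tools.py):

```python
# Illustrative dispatcher: the LLM emits a tool name plus arguments,
# and the dispatcher looks up and invokes the matching Python function.
TOOLS = {
    "list_services": lambda: ["web-api", "database"],
    "get_cpu_usage": lambda service, hours: {"service": service, "avg_cpu": 42.0},
}

def dispatch(name, arguments):
    """Call the tool the LLM requested; report unknown names instead of crashing."""
    if name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    return TOOLS[name](**arguments)
```

Returning an error dict for unknown tool names lets the agent loop surface the problem back to the LLM rather than aborting mid-conversation.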

Project Structure

cloud-resource-agent/
├── data/
│   └── generate.py         # Generates synthetic cloud metrics (SQLite)
├── agent/
│   ├── tools.py            # Tool functions that query the database
│   └── agent.py            # LLM agent with agentic tool-calling loop
├── ui/
│   └── app.py              # Streamlit chat interface
├── requirements.txt
└── README.md
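A tool function in agent/tools.py is essentially a parameterized SQL query. A sketch of what `get_cpu_usage` might look like, assuming a `cpu_metrics(service, hour, cpu_pct)` table (the real schema and column names may differ):

```python
import sqlite3

def get_cpu_usage(conn, service, hours=24):
    """Average CPU percentage for `service` over its most recent `hours` rows."""
    row = conn.execute(
        "SELECT AVG(cpu_pct) FROM ("
        "  SELECT cpu_pct FROM cpu_metrics"
        "  WHERE service = ? ORDER BY hour DESC LIMIT ?)",
        (service, hours),
    ).fetchone()
    return row[0]
```

Keeping each tool a small, side-effect-free query makes it easy to expose to the LLM as a schema and to test in isolation.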

Features

  • Natural language interface — ask questions in plain English
  • Agentic reasoning — the LLM decides which tools to call and in what order
  • Multi-step queries — e.g. "Compare CPU and memory of ml-pipeline and database"
  • Tool call transparency — every tool invoked is shown in an expandable panel
  • Fully local — runs on Ollama, no external API or internet required

Dataset

Synthetic cloud metrics are generated for 7 simulated services:

Service            Type         Region
----------------   -----------  ---------
web-api            EC2          us-east-1
payment-service    EC2          us-east-1
auth-service       Lambda       us-west-2
database           RDS          us-east-1
cache              ElastiCache  us-east-1
ml-pipeline        EC2          eu-west-1
notification-svc   Lambda       us-west-2
For each service, the generator produces 30 days of hourly CPU and memory metrics, error logs, and daily cost records.
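The generation step can be sketched as a bulk insert of random hourly samples. This is an assumption-laden illustration (table name, columns, and value ranges are made up), not the actual data/generate.py:

```python
import random
import sqlite3

def generate_cpu_metrics(conn, services, hours=24 * 30):
    """Seed `hours` of synthetic hourly CPU samples for each service."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cpu_metrics"
        " (service TEXT, hour INTEGER, cpu_pct REAL)"
    )
    rows = [
        (svc, h, round(random.uniform(5.0, 95.0), 2))
        for svc in services
        for h in range(hours)
    ]
    conn.executemany("INSERT INTO cpu_metrics VALUES (?, ?, ?)", rows)
    conn.commit()
```

The real script additionally generates memory, error-log, and cost tables along the same lines.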

Tech Stack

Layer      Technology
--------   ---------------------------
LLM        qwen2.5 via Ollama (local)
Agent      Python with tool-calling loop
Database   SQLite
UI         Streamlit

Setup & Run

Prerequisites

  • Python 3.8+
  • Ollama installed and running

1. Install Ollama model

ollama pull qwen2.5

2. Install Python dependencies

pip install -r requirements.txt

3. Generate synthetic data

python data/generate.py

4. Start the app

streamlit run ui/app.py

Open http://localhost:8501 in your browser.

Example Queries

  • "List all running services"
  • "Which service has the highest CPU usage?"
  • "Show me recent errors from payment-service"
  • "What is the most expensive service this month?"
  • "Compare the CPU and memory usage of ml-pipeline and database"
  • "Which service has the most errors?"

Notes

  • The SQLite database is not committed — regenerate it with python data/generate.py
  • Ollama must be running before starting the app (ollama serve if not auto-started)
  • Tool-call logs, including response times, appear in the terminal for debugging
