VulnAI is an AI-powered Static Application Security Testing (SAST) engine that uses machine learning to detect vulnerabilities in source code. Built on a transformer model (CodeBERT), it classifies findings by CWE category and explains each result through attention visualization.
- ML-Based Detection: Transformer-based vulnerability classification
- Multi-Language Support: Python, Java, JavaScript, TypeScript, C/C++
- 10+ Vulnerability Categories: SQL Injection, XSS, Code Injection, and more
- REST API: FastAPI-based detection service
- CLI Tool: Easy command-line scanning
- Vector Database: Similarity search for vulnerability intelligence
- False Positive Reduction: Rule-based filtering + taint analysis
- Explainable AI: Attention visualization for vulnerability highlighting
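
The false positive reduction step listed above can be pictured as a post-filter applied to the model's findings. The sketch below is only an illustration of rule-based filtering, not VulnAI's actual filter: the `Finding` fields, the three rules, and the 0.5 default threshold are assumptions made for the example, and taint analysis is not covered.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical finding record; the field names are illustrative."""
    cwe_id: str
    line_number: int   # 1-based line in the scanned source
    confidence: float  # model score in [0, 1]


def filter_findings(source: str, findings: list[Finding],
                    threshold: float = 0.5) -> list[Finding]:
    """Drop findings that are low-confidence or point at a comment or blank line."""
    lines = source.splitlines()
    kept = []
    for f in findings:
        if f.confidence < threshold:
            continue  # rule 1: below the confidence threshold
        if not 1 <= f.line_number <= len(lines):
            continue  # rule 2: line number falls outside the file
        text = lines[f.line_number - 1].strip()
        if not text or text.startswith("#"):
            continue  # rule 3: blank line or Python comment
        kept.append(f)
    return kept


code = (
    '# cursor.execute("SELECT 1")\n'
    'cursor.execute("SELECT * FROM t WHERE id=" + uid)\n'
)
findings = [
    Finding("CWE-89", 1, 0.91),  # flagged line is commented out
    Finding("CWE-89", 2, 0.88),  # live call built by string concatenation
    Finding("CWE-79", 2, 0.20),  # below the threshold
]
kept = filter_findings(code, findings)
print([(f.cwe_id, f.line_number) for f in kept])
```

Only the second finding survives here. A real filter would need language-aware comment detection rather than the Python-only `#` check used in this sketch.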
┌─────────────────────────────────────────────────────────────────────────────┐
│ AI-Powered SAST Engine Architecture │
└─────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│ USER INTERFACE LAYER │
│ ┌─────────────────┐ ┌────────────────────────────────┐│
│ │ REST API │ │ CLI Detection Tool ││
│ │ (FastAPI) │ │ (Python-based Scanner) ││
│ └────────┬────────┘ └───────────────┬────────────────┘│
└───────────┼─────────────────────────────────────────────┼──────────────────┘
│ │
▼ ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ DETECTION ENGINE LAYER │
│ ┌─────────────────┐ ┌─────────────────┐ ┌────────────────────────────────┐│
│ │ Code Parser │ │ AST Generator │ │ Rule-Based Filter ││
│ │ (Multi-lang) │ │ (Tree-sitter) │ │ (False Positive Reduction) ││
│ └────────┬────────┘ └────────┬────────┘ └───────────────┬────────────────┘│
│ │ │ │ │
│ └──────────┬─────────┘ │ │
│ ▼ │ │
│ ┌──────────────────────────────────────────────────────────▼─────────────┐│
│ │ MODEL INFERENCE ENGINE ││
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────────┐││
│ │ │ CodeBERT │ │ Similarity │ │ Output Formatter │││
│ │ │ Embedding │ │ Search │ │ (JSON Results) │││
│ │ └────────┬────────┘ └────────┬────────┘ └──────────────┬────────┘││
│ └───────────┼────────────────────┼─────────────────────────┼──────────┘│
└──────────────┼────────────────────┼─────────────────────────┼─────────────┘
│ │ │
▼ ▼ ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ VULNERABILITY INTELLIGENCE LAYER │
│ ┌─────────────────────────────────────────────────────────────────────────┐│
│ │ PostgreSQL + pgvector Database ││
│ │ ┌──────────────────────┐ ┌─────────────────────────────────────────┐││
│ │ │ vulnerabilities │ │ detected_issues │││
│ │ │ - id │ │ - id │││
│ │ │ - cwe_id │ │ - file_name │││
│ │ │ - name │ │ - line_number │││
│ │ │ - description │ │ - detected_cwe │││
│ │ │ - severity │ │ - confidence │││
│ │ │ - remediation │ │ - timestamp │││
│ │ │ - embedding_vector │ │ │││
│ │ └──────────────────────┘ └─────────────────────────────────────────┘││
│ └─────────────────────────────────────────────────────────────────────────┘│
└──────────────────────────────────────────────────────────────────────────────┘
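
To make the similarity search component in the diagram more concrete, the sketch below ranks stored vulnerability records by cosine similarity to a query embedding. It is a pure-Python illustration under stated assumptions: the record layout borrows the `cwe_id` and `embedding_vector` column names from the diagram, the 3-dimensional vectors are toy values (CodeBERT base embeddings have 768 dimensions), and `nearest_vulnerabilities` is a hypothetical helper. With pgvector, the ranking would more likely be done in SQL using its cosine distance operator.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_vulnerabilities(query: list[float], records: list[dict],
                            top_k: int = 2) -> list[tuple[float, str]]:
    """Return the top_k (similarity, cwe_id) pairs, most similar first."""
    scored = [
        (cosine_similarity(query, r["embedding_vector"]), r["cwe_id"])
        for r in records
    ]
    scored.sort(reverse=True)
    return scored[:top_k]


# Toy records standing in for rows of the vulnerabilities table.
records = [
    {"cwe_id": "CWE-89", "embedding_vector": [1.0, 0.0, 0.0]},
    {"cwe_id": "CWE-79", "embedding_vector": [0.0, 1.0, 0.0]},
    {"cwe_id": "CWE-78", "embedding_vector": [0.6, 0.8, 0.0]},
]
query = [0.9, 0.1, 0.0]

results = nearest_vulnerabilities(query, records)
for score, cwe_id in results:
    print(f"{cwe_id}: {score:.3f}")
```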
- Python 3.10+
- PostgreSQL 15+ with pgvector (optional, for the vulnerability database)
- CUDA-capable GPU (recommended for training)
# Clone the repository
git clone https://github.com/vulnai/sast-engine.git
cd sast-engine
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Install package
pip install -e .

# Scan a single file
vulnai scan -f path/to/code.py
# Scan a directory
vulnai scan -d ./src
# Output JSON
vulnai scan -f app.py -o json
# Specify language
vulnai scan -f main.js -l javascript
# Scan with verbose output
vulnai scan -f app.py -v

# Start the API server
uvicorn vulnai.api.main:app --host 0.0.0.0 --port 8000
# API Documentation
# Open http://localhost:8000/docs in your browser

from vulnai.detection.engine import DetectionEngine
# Initialize engine
engine = DetectionEngine(
    model_path="models/trained/vulnai_classifier.pt",
    confidence_threshold=0.5
)
# Detect vulnerabilities
result = engine.detect(code="your code here")
print(f"Is vulnerable: {result.is_vulnerable}")
for vuln in result.vulnerabilities:
    print(f"  {vuln.cwe_id} at line {vuln.line_number}")

from vulnai.models.trainer import ModelTrainer, TrainingConfig
from vulnai.data.loader import load_training_data
# Load data
# load_training_data() is assumed to return train/validation/test data loaders
train_loader, val_loader, test_loader = load_training_data()
# Configure training
config = TrainingConfig(
    model_name="microsoft/codebert-base",
    num_epochs=10,
    batch_size=16,
    learning_rate=2e-5
)
# Train model
trainer = ModelTrainer(config)
history = trainer.train(train_loader, val_loader)

| CWE ID | Vulnerability Type | Severity |
|---|---|---|
| CWE-89 | SQL Injection | HIGH |
| CWE-79 | Cross-Site Scripting (XSS) | MEDIUM |
| CWE-94 | Code Injection | HIGH |
| CWE-78 | OS Command Injection | HIGH |
| CWE-287 | Improper Authentication | HIGH |
| CWE-862 | Missing Authorization | MEDIUM |
| CWE-434 | Unrestricted File Upload | HIGH |
| CWE-502 | Insecure Deserialization | HIGH |
| CWE-119 | Buffer Overflow | HIGH |
| CWE-200 | Information Exposure | LOW |
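
The severity column above lends itself to triage logic, for example ordering findings or failing a build when a HIGH finding appears. The following is a minimal sketch of that idea; the mapping is copied from the table, while `sort_by_severity` and `has_blocking_finding` are hypothetical helpers and not part of the VulnAI package.

```python
# Severity per CWE, copied from the table above.
SEVERITY_BY_CWE = {
    "CWE-89": "HIGH", "CWE-79": "MEDIUM", "CWE-94": "HIGH", "CWE-78": "HIGH",
    "CWE-287": "HIGH", "CWE-862": "MEDIUM", "CWE-434": "HIGH", "CWE-502": "HIGH",
    "CWE-119": "HIGH", "CWE-200": "LOW",
}
# Lower rank means more severe.
SEVERITY_RANK = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}


def sort_by_severity(cwe_ids: list[str]) -> list[str]:
    """Order CWE IDs from most to least severe; ties keep their input order."""
    return sorted(cwe_ids, key=lambda c: SEVERITY_RANK[SEVERITY_BY_CWE[c]])


def has_blocking_finding(cwe_ids: list[str], block_at: str = "HIGH") -> bool:
    """True if any finding is at least as severe as block_at."""
    limit = SEVERITY_RANK[block_at]
    return any(SEVERITY_RANK[SEVERITY_BY_CWE[c]] <= limit for c in cwe_ids)


detected = ["CWE-200", "CWE-79", "CWE-89", "CWE-78"]
print(sort_by_severity(detected))
print(has_blocking_finding(detected))
```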
Environment variables can be set in a .env file:
# Database
DATABASE_URL=postgresql://user:pass@localhost:5432/vulnai_db
# Model
MODEL_NAME=microsoft/codebert-base
MAX_SEQ_LENGTH=512
# API
API_HOST=0.0.0.0
API_PORT=8000
# Detection
CONFIDENCE_THRESHOLD=0.5

vulnai/
├── api/ # FastAPI REST API
│ ├── main.py
│ ├── models/
│ └── routes/
├── cli/ # CLI tool
├── core/ # Configuration & logging
├── data/ # Data collection & loading
├── detection/ # Detection engine
├── models/ # ML models & training
├── preprocessing/ # Code preprocessing
└── storage/ # Database & vector store
Run evaluation on test data:
from vulnai.models.evaluator import evaluate_model
# test_loader is assumed to be a data loader over the held-out test split
results = evaluate_model(
    model_path="models/trained/vulnai_classifier.pt",
    dataloader=test_loader,
    output_dir="evaluation"
)
print(results.accuracy)
print(results.f1_score)
print(results.false_positive_rate)

| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/detect | Detect vulnerabilities in code |
| GET | /api/v1/vulnerabilities | List stored vulnerabilities |
| GET | /api/v1/vulnerabilities/{cwe_id} | Get specific vulnerability |
| POST | /api/v1/feedback | Submit feedback for learning |
| GET | /api/v1/stats | Get detection statistics |
| GET | /api/v1/health | Health check |
# Build image
docker build -t vulnai/sast-engine .
# Run container
docker run -p 8000:8000 vulnai/sast-engine

Contributions are welcome! Please read our contributing guidelines first.
This project is licensed under the MIT License - see the LICENSE file for details.
