CodeAtlas AI

CodeAtlas AI is a full-stack AI repository analysis platform. Paste a public GitHub repository URL, let the backend index it asynchronously, then explore the codebase through file metadata, a dependency graph, an architecture summary, semantic search, and repository-aware chat with cited files.

The goal of this project is to demonstrate production-style full-stack engineering: background jobs, database-backed indexing, vector search, AI integration, and a polished developer-facing dashboard.

Demo Screenshots

[Screenshots: Homepage, Indexing Status, Dashboard Overview, File Explorer, Dependency Graph, RAG Chat]

Features

  • Submit and index public GitHub repositories.
  • Run indexing asynchronously with Redis + RQ.
  • Store repositories, files, chunks, summaries, dependencies, and chat messages in PostgreSQL.
  • Use pgvector for embedding-backed code retrieval.
  • Generate architecture summaries from indexed repository context.
  • Ask repository-aware questions and receive answers with cited source files.
  • Browse an indexed file tree and preview file contents.
  • Generate AI explanations for individual files.
  • View a basic file-level dependency graph built from local imports.
  • Re-index existing repositories from the UI.
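The embedding-backed retrieval mentioned above can be illustrated without a database. The sketch below is a pure-Python stand-in for what a pgvector similarity query does: it ranks stored chunk embeddings against a query embedding. The function names, the tiny two-dimensional vectors, and the choice of cosine similarity are assumptions made for illustration; the project's actual query and distance metric may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, chunks, k=3):
    """Return the file paths of the k chunks most similar to the query.

    `chunks` is a list of (file_path, embedding) pairs; this shape is hypothetical.
    """
    scored = [(cosine_similarity(query_vec, emb), path) for path, emb in chunks]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [path for _, path in scored[:k]]
```

In a chat flow, the top-ranked chunks would typically be passed to the chat model as context, and their file paths are what allows the answer to cite source files.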

Tech Stack

| Area | Technology |
| --- | --- |
| Frontend | Next.js, React, TypeScript, Tailwind CSS |
| Backend | FastAPI, SQLAlchemy, Alembic |
| Database | PostgreSQL, pgvector |
| Background jobs | Redis, RQ |
| AI | OpenAI embeddings and chat models |
| Graph UI | React Flow |
| Local development | Docker Compose |

Architecture

Browser
  |
  v
Next.js frontend
  |
  v
FastAPI backend  --->  PostgreSQL + pgvector
  |
  v
Redis queue
  |
  v
RQ worker  ---> clone repo, scan files, chunk code, embed chunks, build graph, summarize
  |
  v
OpenAI API

Indexing flow:

  1. The user submits a public GitHub repository URL.
  2. FastAPI validates the URL, creates a repositories row, and enqueues an RQ job.
  3. The worker clones the repository into a temporary directory.
  4. The worker ignores generated folders, binaries, oversized files, lock files, and common build artifacts.
  5. Source files are scanned, language-tagged, counted, and stored.
  6. File contents are chunked and embedded.
  7. Local import relationships are parsed into a dependency graph.
  8. An architecture summary is generated and stored.
  9. The frontend polls status until the repository is ready.
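Steps 4 and 6 of the flow above can be sketched roughly as follows. This is a minimal illustration, not the project's worker code: the ignore lists, the 200,000-byte size limit, and the line-window chunking with overlap are all assumed values and strategies chosen for the example.

```python
from pathlib import PurePosixPath

# All of these lists and limits are illustrative assumptions.
IGNORED_DIRS = {"node_modules", ".git", "dist", "build", "__pycache__", ".next"}
IGNORED_NAMES = {"package-lock.json", "yarn.lock", "poetry.lock"}
BINARY_SUFFIXES = {".png", ".jpg", ".gif", ".pdf", ".zip", ".exe", ".so"}
MAX_FILE_BYTES = 200_000


def should_index(path: str, size_bytes: int) -> bool:
    """Decide whether a repository-relative path is worth indexing."""
    p = PurePosixPath(path)
    # Skip anything that lives under a generated or vendored directory.
    if any(part in IGNORED_DIRS for part in p.parts[:-1]):
        return False
    # Skip lock files and files with a known binary extension.
    if p.name in IGNORED_NAMES or p.suffix.lower() in BINARY_SUFFIXES:
        return False
    return size_bytes <= MAX_FILE_BYTES


def chunk_lines(text: str, max_lines: int = 40, overlap: int = 5):
    """Split text into overlapping line windows as (start_line, chunk_text) pairs."""
    if not 0 <= overlap < max_lines:
        raise ValueError("overlap must be smaller than max_lines")
    lines = text.splitlines()
    chunks = []
    step = max_lines - overlap
    for start in range(0, len(lines), step):
        chunks.append((start + 1, "\n".join(lines[start:start + max_lines])))
        if start + max_lines >= len(lines):
            break  # the last window already reached the end of the file
    return chunks
```

Keeping the 1-based start line with each chunk is one way to let later steps point back to a location in the original file.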

Local Setup

Prerequisites

Install:

  • Docker Desktop
  • Git
  • An OpenAI API key

You do not need to install PostgreSQL, Redis, Python packages, or Node packages manually for the default setup. Docker Compose runs those services for you.

1. Clone The Repository

git clone https://github.com/YOUR_USERNAME/codeAtlas.git
cd codeAtlas

2. Create Your Local Environment File

cp .env.example .env

Open .env and add your own OpenAI API key:

OPENAI_API_KEY=your_openai_api_key_here

Important: .env is intentionally ignored by Git. Do not commit it. Commit .env.example, not .env.

3. Start The App

docker compose up --build

Open the app:

FastAPI also exposes developer API docs at http://localhost:8000/docs when the backend is running.

4. Run A Demo

Try a small public repository first:

https://github.com/pallets/markupsafe

Then try a larger full-stack repository:

https://github.com/fastapi/full-stack-fastapi-template

Ask questions like:

  • Where is authentication handled?
  • Which files should I read first?
  • How does the backend connect to the database?
  • What are the main entry points?

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| DATABASE_URL | Yes | SQLAlchemy database URL used by the backend and worker |
| REDIS_URL | Yes | Redis connection URL for RQ jobs |
| OPENAI_API_KEY | Yes (for AI features) | Your OpenAI API key for embeddings, summaries, chat, and file explanations |
| GITHUB_TOKEN | No | Optional token for higher GitHub rate limits |
| REPOSITORY_TMP_DIR | Yes | Temporary clone directory inside the backend/worker container |
| FRONTEND_ORIGIN | Yes | Allowed frontend origin for backend CORS |
| NEXT_PUBLIC_API_URL | Yes | Browser-visible backend API URL |
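One way a backend process could load and validate these variables at startup is sketched below. The `Settings` class and `load_settings` function are hypothetical names, not taken from the project. NEXT_PUBLIC_API_URL is left out on the assumption that only the frontend reads it.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Variables listed as required for the backend and worker.
REQUIRED = ("DATABASE_URL", "REDIS_URL", "REPOSITORY_TMP_DIR", "FRONTEND_ORIGIN")


@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    repository_tmp_dir: str
    frontend_origin: str
    openai_api_key: Optional[str] = None  # needed only for AI features
    github_token: Optional[str] = None    # optional, raises GitHub rate limits


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping, failing fast on missing values."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return Settings(
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        repository_tmp_dir=env["REPOSITORY_TMP_DIR"],
        frontend_origin=env["FRONTEND_ORIGIN"],
        openai_api_key=env.get("OPENAI_API_KEY") or None,
        github_token=env.get("GITHUB_TOKEN") or None,
    )
```

Accepting the environment as a parameter keeps the loader easy to test with a plain dictionary.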

API Overview

| Method | Route | Description |
| --- | --- | --- |
| POST | /repositories | Submit a public GitHub repo for indexing |
| GET | /repositories | List recent repositories |
| GET | /repositories/{repo_id} | Get repository status and metadata |
| POST | /repositories/{repo_id}/reindex | Re-index an existing repository |
| GET | /repositories/{repo_id}/files | List indexed files |
| GET | /repositories/{repo_id}/files/{file_id} | Get one file with content |
| GET | /repositories/{repo_id}/summary | Get generated architecture summary |
| GET | /repositories/{repo_id}/graph | Get dependency graph nodes and edges |
| POST | /repositories/{repo_id}/search | Search relevant code chunks |
| POST | /repositories/{repo_id}/chat | Ask a repository-aware question |
| POST | /repositories/{repo_id}/explain-file | Generate an explanation for one file |
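A small standard-library client for the submit-then-poll pattern might look like the sketch below. The request body field `url` and the status values `ready` and `failed` are assumptions; check the generated API docs at http://localhost:8000/docs for the real schema.

```python
import json
import time
import urllib.request


def build_submit_request(base_url: str, repo_url: str) -> urllib.request.Request:
    """Build a POST /repositories request. The "url" field name is an assumption."""
    body = json.dumps({"url": repo_url}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/repositories",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(request: urllib.request.Request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def poll_until_ready(get_status, interval=2.0, max_attempts=150, sleep=time.sleep):
    """Call get_status() until it returns a terminal status or attempts run out.

    The terminal status names "ready" and "failed" are assumed, not confirmed.
    """
    for _ in range(max_attempts):
        status = get_status()
        if status in ("ready", "failed"):
            return status
        sleep(interval)
    raise TimeoutError("repository did not reach a terminal status in time")
```

With the backend running, `fetch_json(build_submit_request("http://localhost:8000", repo_url))` would submit a repository, and `get_status` could wrap a GET on `/repositories/{repo_id}`.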

Testing

Backend tests:

docker compose exec backend pytest -q

Frontend typecheck:

docker compose exec frontend npm run typecheck

GitHub Safety Checklist

Before pushing:

  • Confirm .env is not staged.
  • Confirm frontend/node_modules/ is not staged.
  • Confirm frontend/.next/ is not staged.
  • Confirm frontend/tsconfig.tsbuildinfo is not staged.
  • Confirm .DS_Store is not staged.
  • Commit .env.example so other people know what variables they need.

Useful check:

git status --short

If you ever accidentally commit an API key, revoke that key immediately in the OpenAI dashboard and create a new one.
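The checklist can also be automated. The sketch below parses the text printed by `git status --short` and reports staged paths that match the items above. The helper names and the exact patterns are illustrative assumptions, and paths that Git quotes because of special characters are not handled.

```python
import fnmatch

# Patterns derived from the checklist; adjust them to the repository layout.
RISKY_PATTERNS = [
    ".env",
    "frontend/node_modules/*",
    "frontend/.next/*",
    "frontend/tsconfig.tsbuildinfo",
    ".DS_Store",
    "*/.DS_Store",
]


def staged_paths(status_output: str):
    """Extract staged paths from `git status --short` output.

    Each line is two status characters, a space, then the path. The first
    character describes the index (staged) state.
    """
    paths = []
    for line in status_output.splitlines():
        if len(line) < 4:
            continue
        index_state, path = line[0], line[3:]
        if index_state in " ?!":
            continue  # unstaged change, untracked file, or ignored file
        if " -> " in path:
            path = path.split(" -> ", 1)[1]  # for renames, keep the new name
        paths.append(path)
    return paths


def risky_staged(status_output: str):
    """Return staged paths that match any risky pattern."""
    return [
        path
        for path in staged_paths(status_output)
        if any(fnmatch.fnmatchcase(path, pattern) for pattern in RISKY_PATTERNS)
    ]
```

An empty result from `risky_staged` means none of the listed paths are staged; `.env.example` is deliberately not matched.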

Limitations

  • Public GitHub repositories only; private repositories are not supported.
  • No user accounts or saved private workspaces.
  • Import parsing is regex-based and intentionally lightweight.
  • Large repositories can take longer and may use more OpenAI API credits.
  • Dependency graph quality depends on language and import style.
  • The app is currently designed for local development and is not yet ready for production deployment.
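To make the regex-based import parsing limitation concrete, here is a minimal sketch of how a lightweight parser could turn Python imports into file-level edges. The pattern, the helper names, and the module-to-path resolution rule are assumptions for illustration and are not the project's actual parser. The sketch ignores relative imports and reads only the first module in `import a, b`, which is the kind of gap that tree-sitter or a language server would close.

```python
import re

# Matches "from pkg.mod import ..." or "import pkg.mod" at the start of a line.
PY_IMPORT = re.compile(
    r"^\s*(?:from\s+([\w.]+)\s+import|import\s+([\w.]+))",
    re.MULTILINE,
)


def python_imports(source: str):
    """Return the module names imported by a Python source string."""
    return [from_mod or plain_mod for from_mod, plain_mod in PY_IMPORT.findall(source)]


def resolve_module(module: str, known_paths):
    """Map a dotted module name to a repository file, or None if it is not local."""
    base = module.replace(".", "/")
    for candidate in (base + ".py", base + "/__init__.py"):
        if candidate in known_paths:
            return candidate
    return None


def build_edges(files: dict):
    """Build (importer, imported) edges from a {path: source} mapping."""
    edges = []
    for path, source in files.items():
        for module in python_imports(source):
            target = resolve_module(module, files)
            if target and target != path:
                edges.append((path, target))
    return edges
```

Imports that do not resolve to an indexed file, such as standard-library or third-party modules, are dropped, so the resulting graph contains only local edges.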

Future Improvements

  • User authentication and saved workspaces.
  • Private GitHub repository support.
  • Better dependency parsing with tree-sitter or language servers.
  • Streaming chat responses.
  • More granular indexing progress per file/chunk.
  • Hosted deployment with managed Postgres, Redis, and object storage.
  • Shareable repository reports.
